From guido at python.org Thu Jan 1 00:39:09 2004 From: guido at python.org (Guido van Rossum) Date: Thu Jan 1 00:39:16 2004 Subject: [Python-Dev] Pie-thon benchmark code ready In-Reply-To: Your message of "Wed, 31 Dec 2003 17:21:27 PST." <200401010121.i011LRP06664@c-24-5-183-134.client.comcast.net> References: <5.2.1.1.0.20040101014159.03a31f50@pop.bluewin.ch> <200401010050.i010oXA06606@c-24-5-183-134.client.comcast.net> <200401010121.i011LRP06664@c-24-5-183-134.client.comcast.net> Message-ID: <200401010539.i015d9N07010@c-24-5-183-134.client.comcast.net> > The ordering is supposed to be an unimportant detail. The external > program solution won't work due to the nature of the benchmark. I'll > fix the benchmark some time next week. The fix should be fairly > simple and localized. It was easier than I thought, so it's in place. Version 1.0.2 uses a subclass of dict whose __repr__ sorts the keys. The running time doesn't seem to be substantially affected. (Oh how I love subclassing builtin classes!) --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Thu Jan 1 00:53:02 2004 From: skip at pobox.com (Skip Montanaro) Date: Thu Jan 1 00:53:20 2004 Subject: [Python-Dev] Specifying Python version when running piethon benchmark In-Reply-To: References: <200312312334.hBVNY7506326@c-24-5-183-134.client.comcast.net> Message-ID: <16371.46526.669386.381784@montanaro.dyndns.org> Dennis> I suspect there are other folks who have run the pie-con Dennis> benchmarks on their machines. Perhaps we should construct a Dennis> chart. 800 MHz Ti PowerBook (G4), using Python from CVS, output of make (second run of two, to make sure pyo files were already generated): time python -O b.py >@out real 0m49.999s user 0m46.610s sys 0m1.030s cmp @out out The best-of-three pystone measurement is 10964.9 pystones/second. The Makefile should probably parameterize "python" like so: PYTHON = python time: time $(PYTHON) -O b.py >@out cmp @out out ... 
so people can specify precisely which version of Python to use. Skip From anthony at interlink.com.au Thu Jan 1 01:22:13 2004 From: anthony at interlink.com.au (Anthony Baxter) Date: Thu Jan 1 01:22:34 2004 Subject: [Python-Dev] Re: Are we collecting benchmark results across machines In-Reply-To: <200401010014.i010EGt06465@c-24-5-183-134.client.comcast.net> Message-ID: <200401010622.i016MDs8002169@localhost.localdomain> On this box (3.05G P4, 512M, Fedora Core 1): Using Python2.3.3: real 0m16.027s user 0m14.940s sys 0m0.280s Pystone(1.1) time for 50000 passes = 1.43 This machine benchmarks at 34965 pystones/second Using current-CVS python (and the bytecode compiled with 2.3.3): real 0m14.556s user 0m14.130s sys 0m0.280s Pystone(1.1) time for 50000 passes = 1.41 This machine benchmarks at 35461 pystones/second (Hm: should the default number of passes for pystone be raised? One and a half seconds seems rather a short time for a benchmark...) -- Anthony Baxter It's never too late to have a happy childhood. From python at rcn.com Thu Jan 1 02:53:55 2004 From: python at rcn.com (Raymond Hettinger) Date: Thu Jan 1 02:54:26 2004 Subject: [Python-Dev] SF patch 864863: Bisect C implementation In-Reply-To: <1072879822.28895.245.camel@anthem> Message-ID: <001001c3d03c$6894c020$e841fea9@oemcomputer> [Skip Montanaro] > > It would be nice to keep the old Python implementations around for two > > reasons. First, there's the obvious educational value. Second, the > PyPy > > folks will know where to find it when they need it. I think a Demo/Lib > > subdirectory (with appropriate README) would be a reasonable place to > put > > such code out to pasture. It could be made available in source > > distributions but never installed. [Barry] > The problem is that relegating stuff to Demo would very likely lead to > bitrot, which is not a good thing. 
Another advantage of having current, > working (and unit tested) pure-Python alternatives is that they are much > easier to modify if you're experimenting with new features, or bug > fixes, etc. > > OTOH, some of the really ancient Python modules will occasionally need > attention to bring them up to modern coding standards, idioms, and > practices.

There are several ways to go:

* have two visible modules like StringIO and cStringIO -- personally, I see the approach as clutter but it makes it possible to unittest both.

* put the C module behind the scenes and import it into the python module while leaving the old code conditionally excluded or just commented out.

* if the code is short, move it to the docs -- the itertools module has pure python equivalents in the docs -- that makes the code accessible and makes the docs more informative -- this might work for bisect and heapq where the amount of code is small.

Raymond

From martin at v.loewis.de Thu Jan 1 07:43:16 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Thu Jan 1 07:43:31 2004 Subject: [Python-Dev] Switching to VC.NET 2003 In-Reply-To: <3FF2E0BA.9090105@stackless.com> References: <3FF047E0.8070905@v.loewis.de> <3FF1C625.7070103@stackless.com> <3FF20197.1090802@v.loewis.de> <3FF2E0BA.9090105@stackless.com> Message-ID: <3FF415E4.6020907@v.loewis.de> Christian Tismer wrote: > The problem is: On Windows, and with VC++6.0, I really can > fly, while on Linux I'm so-so. Well, Python will continue to build with VC6 - you just have to maintain the project files yourself.
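Raymond's second option above (put the C module behind the scenes and import it into the python module) is conventionally spelled as a fallback import at the bottom of the module. A minimal sketch, assuming a C accelerator named `_bisect`; the pure-Python body below is illustrative, not the stdlib's exact code:

```python
# bisect.py -- pure-Python implementation stays visible and testable
def bisect_right(a, x, lo=0, hi=None):
    """Return the index where x should be inserted to keep a sorted."""
    if hi is None:
        hi = len(a)
    while lo < hi:
        mid = (lo + hi) // 2
        if x < a[mid]:
            hi = mid
        else:
            lo = mid + 1
    return lo

# At the bottom, silently prefer the C versions when they exist.
try:
    from _bisect import *  # overrides the pure-Python definitions above
except ImportError:
    pass  # no accelerator built: the Python code stays in force

bisect = bisect_right  # historical alias, bound after the override
```

Callers simply `import bisect` and get the fast version when it is available, while the readable version remains in the file for study and experimentation.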
Regards, Martin From martin at v.loewis.de Thu Jan 1 07:44:43 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Thu Jan 1 07:45:04 2004 Subject: [Python-Dev] Switching to VC.NET 2003 In-Reply-To: <3FF2E108.9060101@stackless.com> References: <3FF047E0.8070905@v.loewis.de> <3FF1C625.7070103@stackless.com> <200312301934.hBUJYPL01492@c-24-5-183-134.client.comcast.net> <3FF2E108.9060101@stackless.com> Message-ID: <3FF4163B.8080303@v.loewis.de> Christian Tismer wrote: >> MS also alluded to a free downloadable compiler, but I haven't tracked >> it down yet. And of course theer's always mingw (or whatever it's >> called). > > > If there is a way I can provide Windows binaries, and I can do > some interactive debugging, I'm willing to switch tools, sure. I doubt the free compiler provides any interactive debugging support. So you either have to continue to use VC6, or switch to VC.NET, or switch to Borland C, or use gdb (on cygwin). Regards, Martin From pedronis at bluewin.ch Thu Jan 1 08:06:32 2004 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Thu Jan 1 08:02:58 2004 Subject: [Python-Dev] Pie-thon benchmark code ready In-Reply-To: <200401010444.i014i1o06815@c-24-5-183-134.client.comcast.ne t> References: <5.2.1.1.0.20040101014159.03a31f50@pop.bluewin.ch> <5.2.1.1.0.20040101023157.02851b70@pop.bluewin.ch> Message-ID: <5.2.1.1.0.20040101135802.0288c280@pop.bluewin.ch> At 20:44 31.12.2003 -0800, Guido van Rossum wrote: >But I've seen enough people write code that parses <...> reprs in some >way to make me think that maybe they should be standardized somewhat, >at least to the point where different Python implementations should >not differ gratuitously. E.g. Jython could be much closer to CPython >by inserting 'at 0x'. It's not like standardizing this would close >off an important implementation freedom for other Python >implementation. (I won't go as far as requiring that the number >should be the same as hex(id(x)). 
:-) I still think that depending on <...> reprs should be non-portable and discouraged, also CPython is already rather whimsical in its own evolution (Python 2.3):

>>> class X: pass
...
>>> X
<class __main__.X at 0x...>
>>> class X(object): pass
...
>>> X
<class '__main__.X'>
>>>

I may change my opinion if someone writes a (unit) test pinning down what is exactly meant by that somewhat. regards.

From jjl at pobox.com Thu Jan 1 11:14:05 2004 From: jjl at pobox.com (John J Lee) Date: Thu Jan 1 11:14:14 2004 Subject: [Python-Dev] regrtest.py broken? Message-ID: When I run $ ./regrtest.py I get: [... lots of normal-looking output snipped ...] test_xmllib test_xmlrpc test_xpickle test_xreadline test_zipfile test_zipimport test_zlib 219 tests OK. 5 tests failed: test_future test_longexp test_md5 test_tarfile test_types 36 tests skipped: test__locale test_aepack test_al test_applesingle test_bsddb185 test_bsddb3 test_cd test_cl test_curses test_dbm test_email_codecs test_gdbm test_gl test_imgfile test_linuxaudiodev test_list test_locale test_macfs test_macostools test_mpz test_nis test_normalization test_ossaudiodev test_pep277 test_plistlib test_scriptpackages test_set test_socket_ssl test_socketserver test_sunaudiodev test_timeout test_tuple test_unicode_file test_urllibnet test_winreg test_winsound

Traceback (most recent call last):
  File "./regrtest.py", line 1018, in ?
    main()
  File "./regrtest.py", line 305, in main
    e = _ExpectedSkips()
  File "./regrtest.py", line 962, in __init__
    self.expected = set(s.split())
NameError: global name 'set' is not defined

John

From jjl at pobox.com Thu Jan 1 11:23:05 2004 From: jjl at pobox.com (John J Lee) Date: Thu Jan 1 11:23:16 2004 Subject: [Python-Dev] regrtest.py broken? In-Reply-To: References: Message-ID: John wrote: [...] > NameError: global name 'set' is not defined [...] Ah, new builtin. Sorry.
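The NameError above comes from `set` becoming a builtin only in later interpreters, so a CVS regrtest.py assumed it while an older installed Python did not have it. A hedged compatibility shim of that era looked roughly like this (a sketch; on any interpreter with a builtin `set` the fallback never triggers):

```python
try:
    set  # builtin in newer Pythons
except NameError:
    # Older interpreters: fall back to the sets module of that era
    from sets import Set as set

# e.g. regrtest-style use: a set of expected-skip test names
skips = set("test__locale test_aepack test_al".split())
```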
John

From guido at python.org Thu Jan 1 11:25:34 2004 From: guido at python.org (Guido van Rossum) Date: Thu Jan 1 11:25:42 2004 Subject: [Python-Dev] Pie-thon benchmark code ready In-Reply-To: Your message of "Thu, 01 Jan 2004 14:06:32 +0100." <5.2.1.1.0.20040101135802.0288c280@pop.bluewin.ch> References: <5.2.1.1.0.20040101014159.03a31f50@pop.bluewin.ch> <5.2.1.1.0.20040101023157.02851b70@pop.bluewin.ch> <5.2.1.1.0.20040101135802.0288c280@pop.bluewin.ch> Message-ID: <200401011625.i01GPYs07746@c-24-5-183-134.client.comcast.net>

> I still think that depending on <...> reprs should be non-portable
> and discouraged, also CPython is already rather whimsical in its own
> evolution (Python 2.3):
>
> >>> class X: pass
> ...
> >>> X
> <class __main__.X at 0x...>
> >>> class X(object): pass
> ...
> >>> X
> <class '__main__.X'>
> >>>
>
> I may change my opinion if someone writes a (unit) test pinning down what
> is exactly meant by that somewhat.

That's a good point. I'll add a SF entry to request these unit tests. What you see as whimsical was actually done for compatibility reasons; the new-style classes look more like built-in classes, whose repr is <type 'foo'> or perhaps <class 'foo'>. (It says 'type' if it's pure C, 'class' if it was created by a Python class statement.)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From gerrit at nl.linux.org Thu Jan 1 11:31:52 2004 From: gerrit at nl.linux.org (Gerrit Holl) Date: Thu Jan 1 11:33:07 2004 Subject: [Python-Dev] Are we collecting benchmark results across machines In-Reply-To: <200401010026.i010Q3B06511@c-24-5-183-134.client.comcast.net> References: <200401010026.i010Q3B06511@c-24-5-183-134.client.comcast.net> Message-ID: <20040101163152.GA15524@nl.linux.org> [Dennis Allison] > > Well, add in: > > > > real 1m20.412s > > user 1m1.580s > > sys 0m2.100s > > > > for this somewhat creaky 600MHz G3 iBook... [Guido van Rossum] > I've been looking at user times only, but on that box the discrepancy > between user and real time is enormous! > Could you run pystone too?
> > python -c 'from test.pystone import main; main(); main(); main()' > > and then report the smallest pystones/second value. Some data from an older machine, 199 MhZ Pentium Pro, 128 MB, Fedora 1: $ time python2.3 -O 'b.py' >@out real 2m8.363s user 2m53.420s sys 0m1.980s $ python2.3 -c 'from test.pystone import main; main(); main(); main()' Pystone(1.1) time for 50000 passes = 17 This machine benchmarks at 2941.18 pystones/second (showing only the fastest) I know nothing about benchmarking, and I don't know whether anyone can use this data either, but it sure won't hurt anybody either ;) Happy 2004, Gerrit Holl. -- 274. If any one hire a skilled artizan, he shall pay as wages of the ... five gerahs, as wages of the potter five gerahs, of a tailor five gerahs, of ... gerahs, ... of a ropemaker four gerahs, of ... gerahs, of a mason ... gerahs per day. -- 1780 BC, Hammurabi, Code of Law -- Asperger's Syndrome - a personal approach: http://people.nl.linux.org/~gerrit/english/ From guido at python.org Thu Jan 1 11:49:20 2004 From: guido at python.org (Guido van Rossum) Date: Thu Jan 1 11:49:24 2004 Subject: [Python-Dev] Pie-thon benchmark code ready In-Reply-To: Your message of "Wed, 31 Dec 2003 21:39:09 PST." <200401010539.i015d9N07010@c-24-5-183-134.client.comcast.net> References: <5.2.1.1.0.20040101014159.03a31f50@pop.bluewin.ch> <200401010050.i010oXA06606@c-24-5-183-134.client.comcast.net> <200401010121.i011LRP06664@c-24-5-183-134.client.comcast.net> <200401010539.i015d9N07010@c-24-5-183-134.client.comcast.net> Message-ID: <200401011649.i01GnKF07866@c-24-5-183-134.client.comcast.net> > > The ordering is supposed to be an unimportant detail. The external > > program solution won't work due to the nature of the benchmark. I'll > > fix the benchmark some time next week. The fix should be fairly > > simple and localized. > > It was easier than I thought, so it's in place. Version 1.0.2 uses a > subclass of dict whose __repr__ sorts the keys. 
The running time > doesn't seem to be substantially affected. > > (Oh how I love subclassing builtin classes!) Make that 1.0.3. There were two relevant dict usages -- one for globals and one for locals. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From bkc at murkworks.com Thu Jan 1 12:59:08 2004 From: bkc at murkworks.com (Brad Clements) Date: Thu Jan 1 12:03:04 2004 Subject: [Python-Dev] Re: Are we collecting benchmark results across machines In-Reply-To: <200401010622.i016MDs8002169@localhost.localdomain> References: <200401010014.i010EGt06465@c-24-5-183-134.client.comcast.net> Message-ID: <3FF40C59.31756.38CC66EA@localhost> Python 2.3.2 (#49) pystone test on my Win2k SP4 machine. Note this is a 1.6 ghz Pentium M (Centrino) Pystone(1.1) time for 50000 passes = 1.3762 This machine benchmarks at 36331.9 pystones/second Faster than Anthony's 3Ghz P4. I believe this processor has a very large on-board cache.. I'm not sure how large. It's a Dell Lattitude D800. My point being, I think pystone performance is greatly effected by CPU cache, more so than CPU clock speed. -- Brad Clements, bkc@murkworks.com (315)268-1000 http://www.murkworks.com (315)268-9812 Fax http://www.wecanstopspam.org/ AOL-IM: BKClements From pedronis at bluewin.ch Thu Jan 1 12:27:16 2004 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Thu Jan 1 12:23:41 2004 Subject: [Python-Dev] Pie-thon benchmark code ready In-Reply-To: <200401011625.i01GPYs07746@c-24-5-183-134.client.comcast.ne t> References: <5.2.1.1.0.20040101014159.03a31f50@pop.bluewin.ch> <5.2.1.1.0.20040101023157.02851b70@pop.bluewin.ch> <5.2.1.1.0.20040101135802.0288c280@pop.bluewin.ch> Message-ID: <5.2.1.1.0.20040101181638.03198218@pop.bluewin.ch> At 08:25 01.01.2004 -0800, Guido van Rossum wrote: > > I still think that depending on <...> reprs should be non-portable > > and discouraged, also CPython is already rather whimsical in its own > > evolution (Python 2.3): > > > > >>> class X: pass > > ... 
> > >>> X > > > > >>> class X(object): pass > > ... > > >>> X > > > > >>> > > > > I may change my opinion if someone writes a (unit) test pinning down what > > is exactly meant by that somewhat. > >That's a good point. I'll add a SF entry to request these unit tests. > >What you see as whimsical was actually done for compatibility reasons; >the new-style classes look more like built-in classes, whose repr is > or perhaps . (It says 'type' if it's >pure C, 'class' if it was created by a Python class statement.) yes, but from the point of view of a classic -> new-style migration is not really compatible. seem also reasonable variations for a new-style X. I understand that the first output is from class_repr while the second is from the evolution of type_repr. The problem is that what was expected (as compatible) isn't defined. From pedronis at bluewin.ch Thu Jan 1 13:05:12 2004 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Thu Jan 1 13:01:36 2004 Subject: [Python-Dev] Pie-thon benchmark code ready In-Reply-To: <200401010444.i014i1o06815@c-24-5-183-134.client.comcast.ne t> References: <5.2.1.1.0.20040101014159.03a31f50@pop.bluewin.ch> <5.2.1.1.0.20040101023157.02851b70@pop.bluewin.ch> Message-ID: <5.2.1.1.0.20040101185446.03187a88@pop.bluewin.ch> At 20:44 31.12.2003 -0800, Guido van Rossum wrote: >But I've seen enough people write code that parses <...> reprs in some >way to make me think that maybe they should be standardized somewhat, >at least to the point where different Python implementations should >not differ gratuitously. E.g. Jython could be much closer to CPython >by inserting 'at 0x'. It's not like standardizing this would close >off an important implementation freedom for other Python >implementation. (I won't go as far as requiring that the number >should be the same as hex(id(x)). 
:-) btw in 2.1 there was no 0x (at least on Windows):

Python 2.1.3 (#35, Apr 8 2002, 17:47:50) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> class C: pass
...
>>> C
<class __main__.C at ...>
>>>

so in Jython:

Jython 2.1 on java1.3.0 (JIT: null)
Type "copyright", "credits" or "license" for more information.
>>> class C: pass
...
>>> C
<class __main__.C at ...>

(and that number was not in hexadecimal) In the new world where Jython has a correct unique id:

>>> class C: pass
...
>>> C
<class __main__.C at 1>
>>> 1 is id(C)

but that's really not an address, so I'm not so keen in putting a 'at 0x' in there because it is misleading and confusing considering the Java default repr for objects: java.lang.Object@ac7fbb which uses System.identityHashCode which is not unique, and is what we were using in Jython 2.1 (without going to hex). If I were to put an 'at 0x' in there, I would be tempted to revert to that value (System.identityHashCode) but given that is not unique and != id(.), it is rather less useful.

From tim.one at comcast.net Thu Jan 1 14:52:11 2004 From: tim.one at comcast.net (Tim Peters) Date: Thu Jan 1 14:52:13 2004 Subject: [Python-Dev] Pie-thon benchmark code ready In-Reply-To: <5.2.1.1.0.20040101185446.03187a88@pop.bluewin.ch> Message-ID: [Samuele Pedroni] > btw in 2.1 there was no 0x (at least on Windows): > > Python 2.1.3 (#35, Apr 8 2002, 17:47:50) [MSC 32 bit (Intel)] on > win32 Type "copyright", "credits" or "license" for more information. > >>> class C: pass > ... > >>> C > <class __main__.C at ...> > >>> Yup. Everywhere Python uses printf's %p format code to display pointers, C doesn't define anything about what the output "looks like". In really old versions of MS C, %p used to embed a colon in the address too (segment:offset). For stuff going thru PyString_FromFormat(), I added crap to that function to special-case %p formats, near this comment: /* %p is ill-defined: ensure leading 0x.
*/ But older versions of Python don't have that hack -- and not all uses of %p in Python go thru PyString_FromFormat (but most do now). > ... > If I were to put an 'at 0x' in there, I would be tempted to revert > to that value (System.identityHashCode) but given that is not unique > and != id(.), it is rather less useful. CPython certainly intends to force a leading 0x (not 0X either -- it's forcing lowercase 'x' too) in such displays now. It's not doing anything to force hex display of the rest, although I don't know of a platform C %p that doesn't produce hex. Still, displaying id() in hex is also an intent. IOW, CPython is enduring some pain to try to make this consistent too; it's not only Jython that can't get a free ride here. From pedronis at bluewin.ch Thu Jan 1 15:09:39 2004 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Thu Jan 1 15:06:04 2004 Subject: [Python-Dev] Pie-thon benchmark code ready In-Reply-To: References: <5.2.1.1.0.20040101185446.03187a88@pop.bluewin.ch> Message-ID: <5.2.1.1.0.20040101205935.031a41c0@pop.bluewin.ch> At 14:52 01.01.2004 -0500, Tim Peters wrote: >IOW, CPython is enduring some pain to try to make this consistent too; it's >not only Jython that can't get a free ride here. is not about having a free ride, it's just that printing the id(.) in Jython as a 'at 0x' is misleading because id(.) now is different from System.identityHashCode which is what goes into the java cls@ repr. From bac at OCF.Berkeley.EDU Thu Jan 1 17:26:26 2004 From: bac at OCF.Berkeley.EDU (Brett) Date: Thu Jan 1 17:26:10 2004 Subject: [Python-Dev] SF patch 864863: Bisect C implementation In-Reply-To: <001001c3d03c$6894c020$e841fea9@oemcomputer> References: <001001c3d03c$6894c020$e841fea9@oemcomputer> Message-ID: <3FF49E92.4000208@ocf.berkeley.edu> Raymond Hettinger wrote: > > [Barry] > >> The problem is that relegating stuff to Demo would very likely lead >> to bitrot, which is not a good thing. 
Another advantage of having > > current, > >> working (and unit tested) pure-Python alternatives is that they are >> > > much > >> easier to modify if you're experimenting with new features, or bug >> fixes, etc. >> >> OTOH, some of the really ancient Python modules will occasionally >> need attention to bring them up to modern coding standards, idioms, >> and practices. > > > There a several ways to go: > > * have two visible modules like StringIO and cStringIO -- personally, > I see the approach as clutter but it makes it possible to unittest > both. > I agree with Raymond on this option; having two versions of the same thing when there is no difference sans performance just clutters up the stdlib. Kind of goes against TIOOWTDI. > * put the C module behind the scenes and import it into the python > module while leaving the old code conditionally excluded or just > commented out. > But that has the same issue as moving it to Demo since it still won't be used and thus tested. > * if the code is short, move it to the docs -- the itertools module > has pure python equivalents in the docs -- that makes the code > accessible and makes the docs more informative -- this might work for > bisect and heapq where the amount of code is small. > This is the best solution, but I am afraid that for the modules we are discussing this does not work because of the amount of code. I am afraid there is no good solution to this. Either we move the code to a place where it is not really used (such as Demo, but we can add a flag to regrtest.py to test modules in that directory so the bitrot is not horrendous) or not used at all which begins to wear away the usefulness of keeping the code around in the first place. Perhaps we need to just consider the fact that some modules have such beautiful code in Python that we just leave them as such. 
Not everything needs to be coded in C (and this is in no way meant to not show appreciation to Raymond and anyone else who has converted Python code to C! Some modules do have a basic performance need like bisect that should be dealt with when possible). The main reason I was able to get involved in python-dev was that I was able to jump in and help with the Python code without much of a learning curve since it did not require learning anything new beyond basic practices of the list. Maintainability is a really important thing and each C module makes that a little bit much harder. I think now that this issue has come up it should be enough for people to be aware that some people care enough about the Python code that this probably won't be an issue in the future. If someone has doubts as to whether someone will miss the Python code then they can just ask the list for a quick opinion. But I trust everyone with checkin rights who feels up to replacing a module to not be rash about it so I don't see a need for a vote for every module. As for dealing with the modules that have already been converted, I say put them in Demo. There is no need to do the importing in a Python module wrapper if there was enough of a performance reason to write it in C. And if the code is wanted as reference stick it in the C code's comments. OK, back to the Summary. -Brett From tim.one at comcast.net Thu Jan 1 19:01:53 2004 From: tim.one at comcast.net (Tim Peters) Date: Thu Jan 1 19:01:55 2004 Subject: [Python-Dev] SF patch 864863: Bisect C implementation In-Reply-To: <005801c3cf56$703a1e60$e841fea9@oemcomputer> Message-ID: [Raymond] > I would be happy to bring back the python version and make it coexist > with the C version. string.py was the first poster child for this, but it's changed a lot over the years. Originally, string.py was pure Python. 
As the years went by, it *remained* pure Python, but more and more of the functions got overridden, at the bottom of the file, by importing C functions with the same names. That's kinda lost now, because the module functions turned into string methods, and most of the Python code that remains in string.py is trivial delegation to the string methods now. Several years ago, it all looked a lot like maketrans() looks today (a complete implementation in Python, and then at the bottom that's thrown away). I don't really care whether the Python implementations get tested. And I definitely don't want more messes like pickle+cPickle and StringIO+cStringIO -- they're *always* out of synch in some way, and I doubt that will ever be repaired. Putting them under Demo doesn't do any good, because nobody maintains that directory; we don't even ship that directory in the Windows installer, so putting stuff there renders it invisible to most of Python's users. > IMO, it was a programming pearl unto itself and efforts were already > made to preserve the extensive module docstring and make the > code match the original line for line with the variable names and > logic intact. Most Python users don't know C, and will never see the C implementation. > ... > Also, with the PyPy project, there is a chance that cleanly coded pure > python modules will survive the test of time longer than their C > counterparts. In the itertools module, I documented pure python > equivalents to make the documentation more specific, to enhance > understanding, and to make it easier to backport. For the most > part, that effort was successful. I love the Python-workalike docs for itertools! That's a very helpful form of documentation, building on Python knowledge to supply a more precise explanation than is possible with informal English. OTOH, the itertools functions have (unlike, e.g., pickle) pretty simple, well-defined jobs.
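The "Python-workalike docs" style praised here can be illustrated with a sketch in the manner of the itertools documentation's pure-Python equivalents (the equivalent style, not necessarily the docs' exact text):

```python
def chain(*iterables):
    # Pure-Python equivalent of itertools.chain, in the style of
    # the documented recipes: chain('ABC', 'DEF') yields A B C D E F
    for iterable in iterables:
        for element in iterable:
            yield element
```

A few lines like these often pin down a function's behavior more precisely than a paragraph of informal English.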
Needing to present half a page (or more) of Python code instead wouldn't make for pleasant docs. For that reason, I was happy to leave datetime.py to die in the sandbox. > The only downside is the double maintenance problem of keeping the two > in sync. Given the stability of bisect, that is not much of an issue. The double-maintenance problem is severe for non-trivial code. The current string module, and bisect module, functions are trivial. heapq is somewhere in the middle, more like the string module used to be. I'd be perfectly happy if heapq.py were restored just as it was, and then a new section added at the bottom to replace it all with C functions. I don't think putting the Python implementations of heapq functions in the docs would be satisfying, because the details of their implementations don't shed light on the *purpose* of the functions. In contrast, the Python implementations of most itertools functions define the purpose of those functions more succinctly than English. The implementations of heapq functions are much more about a specific way of implementing a full binary tree than they are about heap operations. From guido at python.org Thu Jan 1 20:45:07 2004 From: guido at python.org (Guido van Rossum) Date: Thu Jan 1 20:45:14 2004 Subject: [Python-Dev] Pie-thon benchmark code ready In-Reply-To: Your message of "Thu, 01 Jan 2004 21:09:39 +0100." <5.2.1.1.0.20040101205935.031a41c0@pop.bluewin.ch> References: <5.2.1.1.0.20040101185446.03187a88@pop.bluewin.ch> <5.2.1.1.0.20040101205935.031a41c0@pop.bluewin.ch> Message-ID: <200401020145.i021j7M08464@c-24-5-183-134.client.comcast.net> > At 14:52 01.01.2004 -0500, Tim Peters wrote: > >IOW, CPython is enduring some pain to try to make this consistent too; it's > >not only Jython that can't get a free ride here. [Samuele] > is not about having a free ride, it's just that printing the id(.) > in Jython as a 'at 0x' is misleading because id(.) 
> now is different from System.identityHashCode which is what goes > into the java cls@ repr. I don't care either way. The importance of having *some* number here is that for ordinary objects, if you print two different ones, the repr() will show you that they are different, and if you print the same one twice, repr() will show that too. I don't recall ever having taken the id() of an object or the address from a repr() and looked it up in memory using a debugger. OTOH if there's a way to get the Java runtime to print things like <classname>@<hexnumber> it might make sense to get Python's runtime to print the same number after 'at'; in general giving developers a warm fuzzy feeling of comfort through familiarity is good if it doesn't interfere with other goals. So I'd be happy if we could standardize on 'at 0x...' at least for all objects where CPython currently uses that, letting implementations choose what to put after the 0x, as long as it's hex digits. We can discuss what's the best repr() for new-style classes going forward; since the module is included in the repr I'm less concerned about having the address in there (multiple classes with the same name are an anomaly, while multiple instances with the same class are normal). --Guido van Rossum (home page: http://www.python.org/~guido/) From bac at OCF.Berkeley.EDU Thu Jan 1 21:13:10 2004 From: bac at OCF.Berkeley.EDU (Brett) Date: Thu Jan 1 21:12:52 2004 Subject: [Python-Dev] PEP 292 and templating In-Reply-To: <001601c3bec9$2f258d80$e841fea9@oemcomputer> References: <001601c3bec9$2f258d80$e841fea9@oemcomputer> Message-ID: <3FF4D3B6.106@ocf.berkeley.edu> Raymond Hettinger wrote: > Is there interest in having a templating module with two functions one > for simple substitutions and the other with more tools? > > > > The first would be Barry's simple substitutions using only $name or > ${name} for templates exposed to the user.
> > > The second would extend the first with Cheetah style dotted names for > more advanced templates controlled by the programmer. > I am down with the first and basically okay with the second. Regardless I would like to see a module with helpful string manipulation functions. -Brett

From pedronis at bluewin.ch Thu Jan 1 21:24:42 2004 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Thu Jan 1 21:21:07 2004 Subject: [Python-Dev] Pie-thon benchmark code ready In-Reply-To: <200401020145.i021j7M08464@c-24-5-183-134.client.comcast.net> References: <5.2.1.1.0.20040101185446.03187a88@pop.bluewin.ch> <5.2.1.1.0.20040101205935.031a41c0@pop.bluewin.ch> Message-ID: <5.2.1.1.0.20040102031220.0288a308@pop.bluewin.ch> > > At 14:52 01.01.2004 -0500, Tim Peters wrote: > > >IOW, CPython is enduring some pain to try to make this consistent too; > it's > > >not only Jython that can't get a free ride here. > >[Samuele] > > is not about having a free ride, it's just that printing the id(.) > > in Jython as a 'at 0x' is misleading because id(.) > > now is different from System.identityHashCode which is what goes > > into the java cls@ repr. > >I don't care either way. The importance of having *some* number here >is that for ordinary objects, if you print two different ones, the >repr() will show you that they are different, and if you print the >same one twice, repr() will show that too. I don't recall ever having >taken the id() of an object or the address from a repr() and looked it >up in memory using a debugger. > >OTOH if there's a way to get the Java runtime to print things like ><classname>@<hexnumber> it might make sense to get Python's runtime to print the >same number after 'at'; in general giving developers a warm fuzzy >feeling of comfort through familiarity is good if it doesn't interfere >with other goals.

yes, familiarity really has that on its side, but:

>>> import java.lang.System
>>> _ihc = java.lang.System.identityHashCode
>>> class C:
...     def __repr__(self): return "<C %x>" % _ihc(self)
...
>>> c=C()
>>> c
<C instance at 0x...>
>>> d={}
>>> for i in range(10000):
...     new_c = C()
...     ihc = _ihc(new_c)
...     if d.has_key(ihc):
...         print "clash"
...         c = d[ihc]
...         break
...     else:
...         d[ihc] = new_c
...
clash
>>> java.lang.Object.toString(c)
'org.python.core.PyInstance@e07d3e'
>>> java.lang.Object.toString(new_c)
'org.python.core.PyInstance@e07d3e'
>>> c
<C instance at 0xe07d3e>
>>> new_c
<C instance at 0xe07d3e>
>>> c is new_c
0

if we standardize on 'at 0x<id>', then 'at 0x<identityHashCode>' has the
above problem, and 'at 0x<id>' suggests too strongly that what comes after
the 0x is the same as what comes after the @ in the Java repr, which, not
being the case, would be confusing.

From aleaxit at yahoo.com Fri Jan 2 05:46:10 2004
From: aleaxit at yahoo.com (Alex Martelli)
Date: Fri Jan 2 05:46:16 2004
Subject: [Python-Dev] Re: Are we collecting benchmark results across machines
In-Reply-To: <200401010622.i016MDs8002169@localhost.localdomain>
References: <200401010622.i016MDs8002169@localhost.localdomain>
Message-ID: <200401021146.10939.aleaxit@yahoo.com>

On Thursday 01 January 2004 07:22 am, Anthony Baxter wrote:
   ...
> (Hm: should the default number of passes for pystone be raised? One and
> a half seconds seems rather a short time for a benchmark...)

An old rule of thumb, when benchmarking something on many machines that
exhibit a wide range of performance, is to organize the benchmark not in
terms of completing a set number of repetitions, but rather of running as
many repetitions as feasible within a (more or less) set time. Since, at
the end, you'll be reporting in terms of "<units> per second", it should
make no real difference, BUT it makes it more practical to run the "same"
benchmark on machines (or, more generally, implementations) spanning
orders of magnitude in performance. (If the check for "how long have I
been running so far" is very costly, you'll of course do it only "once
every N repetitions" rather than doing it at every single pass.)
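Alex's scheme -- loop until a target elapsed time has passed, checking the clock only once every N passes -- can be sketched as follows. This is a hypothetical driver for illustration only; the function name and parameters are made up and this is not pystone's actual code:

```python
import time

def rate_per_second(work, target_seconds=0.5, check_every=1000):
    """Run `work` repeatedly for roughly `target_seconds`, checking the
    clock only once every `check_every` passes, and report the rate
    (passes per second)."""
    passes = 0
    start = time.time()
    while True:
        for _ in range(check_every):
            work()
        passes += check_every
        elapsed = time.time() - start
        if elapsed >= target_seconds:
            # elapsed >= target_seconds > 0, so no division by zero
            return passes / elapsed

rate = rate_per_second(lambda: sum(range(10)), target_seconds=0.2)
```

Because the stop condition is time-based, the same driver is usable whether one pass takes microseconds (CPython) or a full second (early pypy), at the cost of the repetition count no longer being fixed.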
Wanting to run pystone on pypy, but not wanting to alter it TOO much from
the CPython original so as to keep the results easily comparable, I just
made the number of loops an optional command line parameter of the pystone
script (as checked in within the pypy project). Since the laptops we had
around at the Amsterdam pypy sprint ran at about 0.8 to 3 pystones/sec with
pypy (a few thousand times slower than with CPython 2.3.*), running all of
the 50,000 iterations was just not practical at all.

Anyway, the ability of optionally passing in the number of iterations on
the command line would also help with your opposite problem of too-fast
machines -- if 50k loops just aren't enough for a reasonably-long run, you
could use more.

Alex

From mwh at python.net Fri Jan 2 07:12:37 2004
From: mwh at python.net (Michael Hudson)
Date: Fri Jan 2 07:12:41 2004
Subject: [Python-Dev] Specifying Python version when running piethon benchmark
In-Reply-To: <16371.46526.669386.381784@montanaro.dyndns.org> (Skip Montanaro's message of "Wed, 31 Dec 2003 23:53:02 -0600")
References: <200312312334.hBVNY7506326@c-24-5-183-134.client.comcast.net>
	<16371.46526.669386.381784@montanaro.dyndns.org>
Message-ID: <2mu13e7dmy.fsf@starship.python.net>

Skip Montanaro writes:

>     Dennis> I suspect there are other folks who have run the pie-con
>     Dennis> benchmarks on their machines.  Perhaps we should construct a
>     Dennis> chart.
>
> 800 MHz Ti PowerBook (G4), using Python from CVS, output of make (second
> run of two, to make sure pyo files were already generated):
>
>     time python -O b.py >@out
>
>     real    0m49.999s
>     user    0m46.610s
>     sys     0m1.030s
>     cmp @out out
>
> The best-of-three pystone measurement is 10964.9 pystones/second.

Bloody hell, that's about what I get on my 600MHz G3 iBook (same model as
Dan's, sounds like). Does your TiBook have no cache or a *really* slow bus
or something?
Cheers, mwh -- I've reinvented the idea of variables and types as in a programming language, something I do on every project. -- Greg Ward, September 1998 From mwh at python.net Fri Jan 2 07:17:09 2004 From: mwh at python.net (Michael Hudson) Date: Fri Jan 2 07:17:12 2004 Subject: [Python-Dev] SF patch 864863: Bisect C implementation In-Reply-To: <001001c3d03c$6894c020$e841fea9@oemcomputer> (Raymond Hettinger's message of "Thu, 1 Jan 2004 02:53:55 -0500") References: <001001c3d03c$6894c020$e841fea9@oemcomputer> Message-ID: <2mpte27dfe.fsf@starship.python.net> "Raymond Hettinger" writes: > * have two visible modules like StringIO and cStringIO -- > personally, I see the approach as clutter but it makes it possible > to unittest both. At least for pickle, and at least historically, pickle has offered something cPickle hasn't: the ability to subclass. Cheers, mwh -- 34. The string is a stark data structure and everywhere it is passed there is much duplication of process. It is a perfect vehicle for hiding information. -- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html From walter at livinglogic.de Fri Jan 2 07:31:43 2004 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Fri Jan 2 07:31:48 2004 Subject: [Python-Dev] regrtest.py mods for code coverage In-Reply-To: <16370.64551.432152.383782@gargle.gargle.HOWL> References: <16370.64551.432152.383782@gargle.gargle.HOWL> Message-ID: <3FF564AF.1010304@livinglogic.de> Barry A. Warsaw wrote: > Hi all, > > I wanted to improve the code coverage for base64.py (see previous > message, re: RFC 3548 support) in the test_base64.py unit test. So I > did a fairly simpleminded adaptation of Zope3's test.py -T code > coverage flag to regrtest.py. Very handy for upping coverage of a > module's code! > > The patch is available here: > > http://sourceforge.net/tracker/index.php?func=detail&aid=868499&group_id=5470&atid=305470 > > Currently assigned to Jeremy for review. 
Please let me know what you think.

Somewhat related: It would be useful if configure had an option
--enable-coverage that adds the appropriate compiler options. Currently
I'm patching the Makefile by hand (or rather by script) for the stuff on
http://coverage.livinglogic.de/

Bye,
   Walter Dörwald

From pedronis at bluewin.ch Fri Jan 2 07:45:53 2004
From: pedronis at bluewin.ch (Samuele Pedroni)
Date: Fri Jan 2 07:42:19 2004
Subject: [Python-Dev] Pie-thon benchmark code ready
In-Reply-To: <200401020145.i021j7M08464@c-24-5-183-134.client.comcast.net>
References: <5.2.1.1.0.20040101185446.03187a88@pop.bluewin.ch>
	<5.2.1.1.0.20040101205935.031a41c0@pop.bluewin.ch>
Message-ID: <5.2.1.1.0.20040102134231.0289a318@pop.bluewin.ch>

>So I'd be happy if we could standardize on 'at 0x...' at least for all
>objects where CPython currently uses that, letting implementations
>choose what to put after the 0x, as long as it's hex digits.

my proposal is that both the following forms should be acceptable:

1. <... at 0x<hex digits>>
2. <... id# <id>>

and I would change Jython to use the 2nd one. If we really want to just go
with what is the status quo, i.e. 1., then I would revert to using
identityHashCode, although it's not unique.

regards.

From skip at pobox.com Fri Jan 2 08:54:04 2004
From: skip at pobox.com (Skip Montanaro)
Date: Fri Jan 2 08:54:24 2004
Subject: [Python-Dev] SF patch 864863: Bisect C implementation
In-Reply-To:
References: <005801c3cf56$703a1e60$e841fea9@oemcomputer>
Message-ID: <16373.30716.718676.434794@montanaro.dyndns.org>

>>>>> "Tim" == Tim Peters writes:

    Tim> [Raymond]
    >> I would be happy to bring back the python version and make it coexist
    >> with the C version.

    Tim> Putting them under Demo doesn't do any good, because nobody
    Tim> maintains that directory; we don't even ship that directory in the
    Tim> Windows installer, so putting stuff there renders it invisible to
    Tim> most of Python's users.
I still don't see why putting the original Python implementations under
Demo somewhere is necessarily bad. They would still be available in source
distributions, and they could be tested from there without being installed
- at least on Unix-like systems - by tweaking the "test" target in the
Makefile. Something like:

    test:           all platform
            # first two runs - test all modules which will be installed
            -find $(srcdir)/Lib -name '*.py[co]' -print | xargs rm -f
            -$(TESTPYTHON) $(TESTPROG) $(TESTOPTS)
            $(TESTPYTHON) $(TESTPROG) $(TESTOPTS)
            # second two runs - as above but force the Python versions of
            # certain modules to be used
            -find $(srcdir)/Lib -name '*.py[co]' -print | xargs rm -f
            -PYTHONPATH=$(srcdir)/Demo/lib $(TESTPYTHON) $(TESTPROG) $(TESTOPTS)
            PYTHONPATH=$(srcdir)/Demo/lib $(TESTPYTHON) $(TESTPROG) $(TESTOPTS)

might be sufficient. (You might have to rename some .so files in certain
cases. Also, statically linked modules would pose problems.)

    >> In the itertools module, I documented pure python equivalents ...

    Tim> I love the Python-workalike docs for itertools!

That's fine, but they can't easily be tested from there.

    Tim> I'd be perfectly happy if heapq.py were restored just as it was,
    Tim> and then a new section added at the bottom to replace it all with C
    Tim> functions.

That's also tough to test. I believe there were situations over the years
where the Python implementations of some functions in string.py and the
corresponding C implementations in stropmodule.c got subtly out-of-sync.
In part I think that's because the Python versions never got any exercise
during the unit tests. The hack to suppress importing functions from strop
would be ugly: I think you'd have to set an environment variable, then test
its presence in string.py before importing symbols from strop.

If we are going to optimize standard library modules by implementing C
counterparts but don't want the Python and C implementations to drift
apart, we have to be able to test the Python versions somehow.
Whatever that mechanism turns out to be has to be fairly convenient to use
and shouldn't clutter up Python modules with checks solely used by the test
framework.

Skip

From pedronis at bluewin.ch Fri Jan 2 09:47:04 2004
From: pedronis at bluewin.ch (Samuele Pedroni)
Date: Fri Jan 2 09:43:33 2004
Subject: [Python-Dev] SF patch 864863: Bisect C implementation
In-Reply-To: <16373.30716.718676.434794@montanaro.dyndns.org>
References: <005801c3cf56$703a1e60$e841fea9@oemcomputer>
Message-ID: <5.2.1.1.0.20040102153342.028a6078@pop.bluewin.ch>

At 07:54 02.01.2004 -0600, Skip Montanaro wrote:
>If we are going to optimize standard library modules by implementing C
>counterparts but don't want the Python and C implementations to drift apart,
>we have to be able to test the Python versions somehow. Whatever that
>mechanism turns out to be has to be fairly convenient to use and shouldn't
>clutter up Python modules with checks solely used by the test framework.

I think that something like:

    ... heapq python code ...

    try:
        from _heapq import *  # or list the names explicitly
    except ImportError:
        pass

can be made to work. The tricky part is more in test_heapq, but I think it
can be rewritten in some form like:

    import heapq   # import likely the C code
    test(heapq)    # or heapq could be a global
    heapq = test.support.get_pure_python_version(heapq)
    test(heapq)

or some variation. Assuming some conventions like _heapq/heapq etc.,
get_pure_python_version would return a module not in sys.modules, populated
just with the python code, basically tricking the last import momentarily
to fail. Barring some strange intermodule dependency issues, which should
not be the case for this kind of modules, this should be workable.

regards.

From guido at python.org Fri Jan 2 10:18:04 2004
From: guido at python.org (Guido van Rossum)
Date: Fri Jan 2 10:18:16 2004
Subject: [Python-Dev] Re: Are we collecting benchmark results across machines
In-Reply-To: Your message of "Fri, 02 Jan 2004 11:46:10 +0100."
<200401021146.10939.aleaxit@yahoo.com>
References: <200401010622.i016MDs8002169@localhost.localdomain>
	<200401021146.10939.aleaxit@yahoo.com>
Message-ID: <200401021518.i02FI5909318@c-24-5-183-134.client.comcast.net>

> Anyway, the ability of optionally passing in the number of
> iterations on the command line would also help with your opposite
> problem of too-fast machines -- if 50k loops just aren't enough for
> a reasonably-long run, you could use more.

Yup. As a matter of historical detail, pystone used to have LOOPS set
to 1000; in 1997 I changed it to 10K, and in 2002 I bumped it again to
50K.

BTW, I'd gladly receive your patch for parameterizing LOOPS for
inclusion into the standard Python library.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From pedronis at bluewin.ch Fri Jan 2 10:31:28 2004
From: pedronis at bluewin.ch (Samuele Pedroni)
Date: Fri Jan 2 10:27:56 2004
Subject: [Python-Dev] SF patch 864863: Bisect C implementation
In-Reply-To: <5.2.1.1.0.20040102153342.028a6078@pop.bluewin.ch>
References: <16373.30716.718676.434794@montanaro.dyndns.org>
	<005801c3cf56$703a1e60$e841fea9@oemcomputer>
Message-ID: <5.2.1.1.0.20040102155659.028ae438@pop.bluewin.ch>

At 15:47 02.01.2004 +0100, Samuele Pedroni wrote:

>import heapq   # import likely the C code
>test(heapq)    # or heapq could be a global
>heapq = test.support.get_pure_python_version(heapq)

oops, that should have been test_support

>test(heapq)

here's a tentative, minimally tested(*) version of get_pure_python_version:

    def get_pure_python_version(mod, bltin_mod_name=None):
        DUMMY = object()
        mod_name = mod.__name__
        if bltin_mod_name is None:
            bltin_mod_name = '_' + mod_name
        import inspect, sys, new
        pure_py_ver = new.module('pure-python-' + mod_name)  # ?
        saved_bltin_mod = sys.modules.get(bltin_mod_name, DUMMY)
        # import <bltin_mod_name> should fail momentarily
        sys.modules[bltin_mod_name] = None  # or an empty module
        execfile(inspect.getsourcefile(mod), pure_py_ver.__dict__)
        if saved_bltin_mod is DUMMY:
            del sys.modules[bltin_mod_name]
        else:
            sys.modules[bltin_mod_name] = saved_bltin_mod
        return pure_py_ver

we could also detect the case where there's no builtin version, which means
that the first test likely already tested just the pure Python version.

*: tested with an array2.py containing:

    class array(object):
        def __init__(*args, **kws):
            pass
        def __repr__(self):
            return "<python array>"

    try:
        from array import array
    except ImportError:
        pass

and then:

    import array2
    ppy = get_pure_python_version(array2)

From pedronis at bluewin.ch Fri Jan 2 10:35:41 2004
From: pedronis at bluewin.ch (Samuele Pedroni)
Date: Fri Jan 2 10:32:07 2004
Subject: [Python-Dev] SF patch 864863: Bisect C implementation
In-Reply-To: <5.2.1.1.0.20040102155659.028ae438@pop.bluewin.ch>
References: <5.2.1.1.0.20040102153342.028a6078@pop.bluewin.ch>
	<16373.30716.718676.434794@montanaro.dyndns.org>
	<005801c3cf56$703a1e60$e841fea9@oemcomputer>
Message-ID: <5.2.1.1.0.20040102163344.028a2f70@pop.bluewin.ch>

At 16:31 02.01.2004 +0100, Samuele Pedroni wrote:

>import array2
>ppy = get_pure_python_version(array2)

when you have a real interactive session transcript, why not simply copy it
(grr):

>>> import array2
>>> array2.array
<type 'array.array'>
>>> array2.array('i')
array('i')
>>> ppy = get_pure_python_version(array2,'array')
>>> ppy.array
<class 'pure-python-array2.array'>
>>> ppy.array('i')
<python array>

From guido at python.org Fri Jan 2 10:59:41 2004
From: guido at python.org (Guido van Rossum)
Date: Fri Jan 2 10:59:48 2004
Subject: [Python-Dev] SF patch 864863: Bisect C implementation
In-Reply-To: Your message of "Fri, 02 Jan 2004 15:47:04 +0100."
<5.2.1.1.0.20040102153342.028a6078@pop.bluewin.ch>
References: <005801c3cf56$703a1e60$e841fea9@oemcomputer>
	<5.2.1.1.0.20040102153342.028a6078@pop.bluewin.ch>
Message-ID: <200401021559.i02FxfU09478@c-24-5-183-134.client.comcast.net>

> I think that something like:
>
>     ... heapq python code ...
>
>     try:
>         from _heapq import *  # or list the names explicitly
>     except ImportError:
>         pass
>
> can be made to work.

I like this too.

> The tricky part is more in test_heapq, but I think it can be rewritten in
> some form like:
>
>     import heapq   # import likely the C code
>     test(heapq)    # or heapq could be a global
>     heapq = test.support.get_pure_python_version(heapq)
>     test(heapq)
>
> or some variation.

Note that you could force the _heapq import to fail by putting an explicit
None in sys.modules, like so:

    import heapq
    ...test the C impl...
    sys.modules['_heapq'] = None
    reload(heapq)  # Now 'from _heapq import ...' will fail
    ...test the Python impl...
    # Restore the default situation
    del sys.modules['_heapq']
    reload(heapq)

I wish cStringIO and cPickle would go away, but unfortunately the Python
versions have features that the C code doesn't have. Hopefully the use of
new-style classes can avoid this issue for new module pairs.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim.one at comcast.net Fri Jan 2 11:38:09 2004
From: tim.one at comcast.net (Tim Peters)
Date: Fri Jan 2 11:38:15 2004
Subject: [Python-Dev] SF patch 864863: Bisect C implementation
In-Reply-To: <200401021559.i02FxfU09478@c-24-5-183-134.client.comcast.net>
Message-ID:

[Guido]
> ...
> I wish cStringIO and cPickle would go away, but unfortunately the
> Python versions have features that the C code doesn't have.

Alas, for pickle it's worse than that -- cPickle has lots of undocumented
features that pickle.py knows nothing about, and those features are used in
Zope. I believe cStringIO also has undocumented features in real use ...
yup, cStringIO objects at least have an object.reset() method not in the Python version, equivalent to object.seek(0) (but, I guess, saves the expense of doing a LOAD_CONST on 0 ). cStringIO also provides a C-level API. > Hopefully the use of new-style classes can avoid this issue for new > module pairs. Or you can just stop accepting C code written by Zope Corp . From skip at pobox.com Fri Jan 2 11:54:26 2004 From: skip at pobox.com (Skip Montanaro) Date: Fri Jan 2 11:54:42 2004 Subject: [Python-Dev] Specifying Python version when running piethon benchmark In-Reply-To: <2mu13e7dmy.fsf@starship.python.net> References: <200312312334.hBVNY7506326@c-24-5-183-134.client.comcast.net> <16371.46526.669386.381784@montanaro.dyndns.org> <2mu13e7dmy.fsf@starship.python.net> Message-ID: <16373.41538.86844.939008@montanaro.dyndns.org> >> 800 MHz Ti PowerBook (G4), using Python from CVS, output of make >> (second run of two, to make sure pyo files were already generated): >> >> time python -O b.py >@out >> >> real 0m49.999s >> user 0m46.610s >> sys 0m1.030s >> cmp @out out >> >> The best-of-three pystone measurement is 10964.9 pystones/second. Michael> Bloody hell, that about what I get on my 600Mhz G3 iBook (same Michael> model as Dan's, sounds like). Does your TiBook have no cache Michael> or a *really* slow bus or something? The Apple System Profiler says my bus speed is 133MHz. I have a 256K L2 cache and a 1MB L3 cache. The machine has 1GB of RAM. Some rather simple operations slow down considerably after the system's been up awhile (and I do tend to leave it up for days or weeks at a time). I don't recall how long it had been up when I ran those tests. 
I just ran pystone again - it's been up 2 days, 19 hrs at the moment - and
got significantly better numbers:

    % python ~/src/python/head/dist/src/Lib/test/pystone.py
    Pystone(1.1) time for 50000 passes = 3.82
    This machine benchmarks at 13089 pystones/second
    % python ~/src/python/head/dist/src/Lib/test/pystone.py
    Pystone(1.1) time for 50000 passes = 3.83
    This machine benchmarks at 13054.8 pystones/second
    % python ~/src/python/head/dist/src/Lib/test/pystone.py
    Pystone(1.1) time for 50000 passes = 3.78
    This machine benchmarks at 13227.5 pystones/second

Rerunning the parrotbench code shows a decided improvement as well:

    % make
    time python -O b.py >@out

    real    0m40.018s
    user    0m38.620s
    sys     0m0.900s
    cmp @out out

Skip

From mcherm at mcherm.com Fri Jan 2 12:08:20 2004
From: mcherm at mcherm.com (Michael Chermside)
Date: Fri Jan 2 12:08:26 2004
Subject: [Python-Dev] Pie-thon benchmark code ready
Message-ID: <1073063300.3ff5a584ac289@mcherm.com>

Samuele Pedroni writes:
> at 0x<identityHashCode> [is not unique],
> at 0x<id> suggests too much that what comes after the 0x is the same
> as what comes after the @ in the Java repr, which not being the case
> would be confusing.

Does "0x<id>" also have a performance problem? I seem to recall your saying
that a bit of work needed to be done to make id() unique, and I thought I
remembered that this work could be skipped whenever id() wasn't used.
That's all well and good since id() is not used all that much, but __repr__
*WILL* be used all over the place. Of course, it's likely my recollections
are incorrect.
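The connection Michael is worried about is direct in CPython: the default repr embeds the very value that id() returns, printed as hex, so any cost attached to producing that number is paid on every default repr. A quick illustration (this holds on typical CPython builds on Linux/macOS; other implementations -- which is exactly what this thread is about -- and other platforms' pointer formatting are free to differ):

```python
class C(object):
    pass

c = C()
# CPython's default repr looks like '<__main__.C object at 0x7f...>',
# and the hex number after 'at' is the same value id() returns.
print(repr(c))
print(hex(id(c)))
assert hex(id(c)) in repr(c)
```

An implementation whose id() is computed lazily (as Jython's unique id() was) cannot skip that work for any object whose default repr is ever printed.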
-- Michael Chermside From aleaxit at yahoo.com Fri Jan 2 12:14:09 2004 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Jan 2 12:14:14 2004 Subject: [Python-Dev] Re: Are we collecting benchmark results across machines In-Reply-To: <200401021518.i02FI5909318@c-24-5-183-134.client.comcast.net> References: <200401010622.i016MDs8002169@localhost.localdomain> <200401021146.10939.aleaxit@yahoo.com> <200401021518.i02FI5909318@c-24-5-183-134.client.comcast.net> Message-ID: <200401021814.09030.aleaxit@yahoo.com> On Friday 02 January 2004 04:18 pm, Guido van Rossum wrote: > > Anyway, the ability of optionally passing in the number of > > iterations on the command line would also help with your opposite > > problem of too-fast machines -- if 50k loops just aren't enough for > > a reasonably-long run, you could use more. > > Yup. As a matter of historical detail, pystone used to have LOOPS set > to 1000; in 1997 I changed it to 10K, and in 2002 I bumped it again to > 50K. > > BTW, I'd gladly receive your patch for parameterizing LOOPS for > inclusion into the standard Python library. OK, I committed the modified pystone.py directly (I hope the change is small and "marginal" enough not to need formal review -- easy enough to back off if I've made some 'oops', anyway...). Alex From pedronis at bluewin.ch Fri Jan 2 12:37:26 2004 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Fri Jan 2 12:33:53 2004 Subject: [Python-Dev] Pie-thon benchmark code ready In-Reply-To: <1073063300.3ff5a584ac289@mcherm.com> Message-ID: <5.2.1.1.0.20040102183124.028ae438@pop.bluewin.ch> At 09:08 02.01.2004 -0800, Michael Chermside wrote: >Samuele Pedroni writes: > > at 0x [is not unique], > > at 0x suggests to much the what comes after the 0x is the same > > as what come after @ in the Java repr, which not being the case would be > > confusing. > >Does "0x" also have a performance problem? 
I seem to recall
>your saying that a bit of work needed to be done to make id() unique,
>and I thought I remembered that this work could be skipped whenever
>id() wasn't used.

yes, OTOH although id() is convenient and often used in Python, a generic
repr is the only place where it is kind of useful. There are other
languages that do fine without id() and have only an 'is' operator, and
their default repr OTOH is lacking because of non-uniqueness issues or
"addresses" changing under someone's nose.

>That's all well and good since id() is not used all
>that much, but __repr__ *WILL* be used all over the place. Of course, it's
>likely my recollections are incorrect.

all over the place yes, but I don't think there is a lot of
performance-critical code that produces tons of output with default
<... at 0x...> reprs in it, Guido's benchmark notwithstanding.

From guido at python.org Fri Jan 2 12:38:26 2004
From: guido at python.org (Guido van Rossum)
Date: Fri Jan 2 12:38:31 2004
Subject: [Python-Dev] Re: Are we collecting benchmark results across machines
In-Reply-To: Your message of "Fri, 02 Jan 2004 18:14:09 +0100."
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Jan 2 12:57:43 2004 From: guido at python.org (Guido van Rossum) Date: Fri Jan 2 12:57:48 2004 Subject: [Python-Dev] Re: Are we collecting benchmark results across machines In-Reply-To: Your message of "Thu, 01 Jan 2004 12:59:08 EST." <3FF40C59.31756.38CC66EA@localhost> References: <200401010014.i010EGt06465@c-24-5-183-134.client.comcast.net> <3FF40C59.31756.38CC66EA@localhost> Message-ID: <200401021757.i02Hvhw09872@c-24-5-183-134.client.comcast.net> > Python 2.3.2 (#49) pystone test on my Win2k SP4 machine. > > Note this is a 1.6 ghz Pentium M (Centrino) > > Pystone(1.1) time for 50000 passes = 1.3762 > This machine benchmarks at 36331.9 pystones/second > > Faster than Anthony's 3Ghz P4. > > I believe this processor has a very large on-board cache.. I'm not sure how large. It's a > Dell Lattitude D800. > > My point being, I think pystone performance is greatly effected by > CPU cache, more so than CPU clock speed. My goal with this benchmark was not to compare CPUs but to see whether the pystone count times the time for the parrotbench is ~constant, and if it's not, what influences there are on the product. Can you report the parrotbench times? ftp://ftp.python.org/pub/python/parrotbench/parrotbench.tgz I'm only interested in the user time. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Jan 2 13:23:20 2004 From: guido at python.org (Guido van Rossum) Date: Fri Jan 2 13:23:32 2004 Subject: [Python-Dev] Re: Are we collecting benchmark results across machines In-Reply-To: Your message of "Thu, 01 Jan 2004 17:22:13 +1100." 
<200401010622.i016MDs8002169@localhost.localdomain>
References: <200401010622.i016MDs8002169@localhost.localdomain>
Message-ID: <200401021823.i02INKI09915@c-24-5-183-134.client.comcast.net>

> On this box (3.05G P4, 512M, Fedora Core 1):
>
> Using Python 2.3.3:
> real    0m16.027s
> user    0m14.940s
> sys     0m0.280s
>
> Pystone(1.1) time for 50000 passes = 1.43
> This machine benchmarks at 34965 pystones/second
>
> Using current-CVS python (and the bytecode compiled with 2.3.3):
> real    0m14.556s
> user    0m14.130s
> sys     0m0.280s
>
> Pystone(1.1) time for 50000 passes = 1.41
> This machine benchmarks at 35461 pystones/second
>
> (Hm: should the default number of passes for pystone be raised? One and
> a half seconds seems rather a short time for a benchmark...)

Probably. Maybe you can try the new pystone from CVS, which has a command
line option. (If I were to write it over, I'd use the strategy from timeit,
which self-adjusts to pick an appropriate number of loops.)

I multiplied the two numbers (pystones/sec and parrotbench seconds) for
your two runs and found the product to be much higher for your first run
than for the second. This is suspicious (perhaps it points at too much
variation in the pystone timing); for contrast, Skip's two runs before and
after rebooting give a product within 0.05 percent of each other. Hm, of
course this could also have to do with Python versions. Let me try... Yes,
it looks like for me, on one machine I have handy here, Python 2.3.3 gives
a higher product than current CVS (though not by as much as you measured).

So what does this product mean? It's higher if pystone is faster, or
parrotbench is slower. Any factor that makes both faster (or slower) by
the same amount has no effect. Ah (thinking aloud here): the unit is
"pystones". Not pystones per second, just pystones.
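For concreteness, the multiplication Guido describes, computed from the pystone rates and parrotbench user times Anthony posted (numbers copied from the quote above):

```python
# (pystones/sec, parrotbench user seconds) from Anthony's two runs
runs = [
    ("Python 2.3.3", 34965, 14.94),
    ("current CVS", 35461, 14.13),
]
for name, stones_per_sec, bench_sec in runs:
    # the product's unit is plain "pystones": pystone loop
    # iterations per one parrotbench run
    product = stones_per_sec * bench_sec
    print("%s: %d pystones per parrotbench" % (name, round(product)))
```

This gives roughly 522377 for the 2.3.3 run and 501064 for the CVS run: the first product is indeed noticeably higher, which is the discrepancy being discussed.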
And this leads to an interpretation: it is how many pystone loops execute in the time needed to complete the parrotbench benchmark (so the unit is really "pystones per parrotbench"). So what makes a machine run more pystones in a parrotbench? I don't know enough about CPUs and caches to draw a conclusion, and I've got to start doing real work today... --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz at pythoncraft.com Fri Jan 2 13:28:29 2004 From: aahz at pythoncraft.com (Aahz) Date: Fri Jan 2 13:28:33 2004 Subject: [Python-Dev] Specifying Python version when running piethon benchmark In-Reply-To: <16373.41538.86844.939008@montanaro.dyndns.org> References: <200312312334.hBVNY7506326@c-24-5-183-134.client.comcast.net> <16371.46526.669386.381784@montanaro.dyndns.org> <2mu13e7dmy.fsf@starship.python.net> <16373.41538.86844.939008@montanaro.dyndns.org> Message-ID: <20040102182829.GA3322@panix.com> On Fri, Jan 02, 2004, Skip Montanaro wrote: >Michael Hudson: >>Skip: >>> >>> 800 MHz Ti PowerBook (G4), using Python from CVS, output of make >>> (second run of two, to make sure pyo files were already generated): >>> >>> time python -O b.py >@out >>> >>> real 0m49.999s >>> user 0m46.610s >>> sys 0m1.030s >>> cmp @out out >>> >>> The best-of-three pystone measurement is 10964.9 pystones/second. >> >> Bloody hell, that about what I get on my 600Mhz G3 iBook (same >> model as Dan's, sounds like). Does your TiBook have no cache >> or a *really* slow bus or something? > > The Apple System Profiler says my bus speed is 133MHz. I have a 256K L2 > cache and a 1MB L3 cache. The machine has 1GB of RAM. > > Some rather simple operations slow down considerably after the system's been > up awhile (and I do tend to leave it up for days or weeks at a time). I > don't recall how long it had been up when I ran those tests. 
I just ran > pystone again - it's been up 2 days, 19 hrs at the moment - and got > significantly better numbers: What version of OS X and developer tools? I'm a tiny bit surprised that you say your machine stays up for weeks at a time; that implies you don't install security updates. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Weinberg's Second Law: If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization. From skip at pobox.com Fri Jan 2 13:45:46 2004 From: skip at pobox.com (Skip Montanaro) Date: Fri Jan 2 13:45:53 2004 Subject: [Python-Dev] Specifying Python version when running piethon benchmark In-Reply-To: <20040102182829.GA3322@panix.com> References: <200312312334.hBVNY7506326@c-24-5-183-134.client.comcast.net> <16371.46526.669386.381784@montanaro.dyndns.org> <2mu13e7dmy.fsf@starship.python.net> <16373.41538.86844.939008@montanaro.dyndns.org> <20040102182829.GA3322@panix.com> Message-ID: <16373.48218.780027.196723@montanaro.dyndns.org> >> Some rather simple operations slow down considerably after the >> system's been up awhile (and I do tend to leave it up for days or >> weeks at a time). I don't recall how long it had been up when I ran >> those tests. I just ran pystone again - it's been up 2 days, 19 hrs >> at the moment - and got significantly better numbers: aahz> What version of OS X and developer tools? I'm a tiny bit aahz> surprised that you say your machine stays up for weeks at a time; aahz> that implies you don't install security updates. On the contrary. Apple doesn't release security updates all that often. I run Software Update daily. If the presence of new software updates was the only reason to reboot I'd almost never reboot. Just to make sure that was the case I just ran Software Update manually. Nothing is out-of-date (well, except for the fact that I haven't coughed up the dough for 10.3.x yet). I'm running an up-to-date version of 10.2.8. 
Skip

From bkc at murkworks.com Fri Jan 2 15:23:35 2004
From: bkc at murkworks.com (Brad Clements)
Date: Fri Jan 2 14:27:17 2004
Subject: [Python-Dev] Re: Are we collecting benchmark results across machines
In-Reply-To: <200401021757.i02Hvhw09872@c-24-5-183-134.client.comcast.net>
References: Your message of "Thu, 01 Jan 2004 12:59:08 EST." <3FF40C59.31756.38CC66EA@localhost>
Message-ID: <3FF57FAE.7422.3E770405@localhost>

On 2 Jan 2004 at 9:57, Guido van Rossum wrote:

> > Pystone(1.1) time for 50000 passes = 1.3762
> > This machine benchmarks at 36331.9 pystones/second

> Can you report the parrotbench times?

I'm on windows and don't have a 'time' executable, so I created a .bat
file like this:

    O:\Python\parrot>type timit.bat
    echo "start"
    time < nul
    python -O b.py > xout
    echo "end"
    time < nul

It said:

    start
    14:23:54.85

    end
    14:24:10.09

that's about 16 seconds

-- 
Brad Clements, bkc@murkworks.com   (315)268-1000
http://www.murkworks.com           (315)268-9812 Fax
http://www.wecanstopspam.org/      AOL-IM: BKClements

From guido at python.org Fri Jan 2 14:39:44 2004
From: guido at python.org (Guido van Rossum)
Date: Fri Jan 2 14:39:49 2004
Subject: [Python-Dev] Re: Are we collecting benchmark results across machines
In-Reply-To: Your message of "Fri, 02 Jan 2004 15:23:35 EST." <3FF57FAE.7422.3E770405@localhost>
References: Your message of "Thu, 01 Jan 2004 12:59:08 EST." <3FF40C59.31756.38CC66EA@localhost>
	<3FF57FAE.7422.3E770405@localhost>
Message-ID: <200401021939.i02JdiK10072@c-24-5-183-134.client.comcast.net>

> > > Pystone(1.1) time for 50000 passes = 1.3762
> > > This machine benchmarks at 36331.9 pystones/second
>
> > Can you report the parrotbench times?
>
> I'm on windows and don't have a 'time' executable, so I created a .bat
> file like this:
>
>     O:\Python\parrot>type timit.bat
>     echo "start"
>     time < nul
>     python -O b.py > xout
>     echo "end"
>     time < nul
>
> It said:
>
>     start
>     14:23:54.85
>
>     end
>     14:24:10.09
>
> that's about 16 seconds

Thanks!
For others wanting to do the same on Windows, the latest parrotbench distro (1.0.4) contains a program t.py that should make this a bit easier: you can simply invoke python -O t.py (The version of t.py included in 1.0.3 was faulty. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From bkc at murkworks.com Fri Jan 2 15:44:55 2004 From: bkc at murkworks.com (Brad Clements) Date: Fri Jan 2 14:48:42 2004 Subject: [Python-Dev] Re: Are we collecting benchmark results across machines In-Reply-To: <200401021939.i02JdiK10072@c-24-5-183-134.client.comcast.net> References: Your message of "Fri, 02 Jan 2004 15:23:35 EST." <3FF57FAE.7422.3E770405@localhost> Message-ID: <3FF584AE.15948.3E8A8E41@localhost> I fixed up my t.py to time both the imports and b.main() It reports .. 15.188 seconds on my P(M) 1.6 GHz machine -- Brad Clements, bkc@murkworks.com (315)268-1000 http://www.murkworks.com (315)268-9812 Fax http://www.wecanstopspam.org/ AOL-IM: BKClements From guido at python.org Fri Jan 2 14:53:02 2004 From: guido at python.org (Guido van Rossum) Date: Fri Jan 2 14:53:08 2004 Subject: [Python-Dev] Re: Are we collecting benchmark results across machines In-Reply-To: Your message of "Fri, 02 Jan 2004 15:44:55 EST." <3FF584AE.15948.3E8A8E41@localhost> References: Your message of "Fri, 02 Jan 2004 15:23:35 EST." <3FF57FAE.7422.3E770405@localhost> <3FF584AE.15948.3E8A8E41@localhost> Message-ID: <200401021953.i02Jr2Y10129@c-24-5-183-134.client.comcast.net>

> I fixed up my t.py to time both the imports and b.main()

Right, that's what the latest version (1.0.4) now does.

> It reports .. 15.188 seconds on my P(M) 1.6 GHz machine

Hm... My IBM T40 with 1.4 GHz P(M) reports 15.608. I bet the caches are more similar, and affect performance more than CPU speed...
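The t.py idea (timing the import of b.py and the call to b.main() separately) can be sketched portably in a few lines of Python; this is a hypothetical stand-in, not the t.py actually shipped with parrotbench:

```python
import time

def time_import_and_main(module_name):
    # Time the import and the module's main() separately, as the
    # fixed-up t.py is described as doing for b.py.
    t0 = time.time()
    mod = __import__(module_name)
    t1 = time.time()
    if hasattr(mod, "main"):
        mod.main()
    t2 = time.time()
    return t1 - t0, t2 - t1

# "json" is just a placeholder; the benchmark itself would pass "b".
import_secs, main_secs = time_import_and_main("json")
print("import: %.3fs  main: %.3fs" % (import_secs, main_secs))
```

This sidesteps the missing `time` executable on Windows entirely, since the measurement lives inside the Python process.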
--Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Fri Jan 2 14:56:56 2004 From: skip at pobox.com (Skip Montanaro) Date: Fri Jan 2 14:57:02 2004 Subject: [Python-Dev] 2.3.3 test_pty sometimes fails on Solaris Message-ID: <16373.52488.714921.417371@montanaro.dyndns.org> I'm installing 2.3.3 on a Solaris 8 system. If I run "make test", it reports that test_pty failed and gives this error: test test_pty failed -- slave_fd is not a tty but if I run "./python Lib/test/test_pty.py" or ./python Lib/test/regrtest test_pty it seems the test succeeds. It smells sort of like the bug report at http://www.python.org/sf/864374. Any suggestions where I should look for culprits? I know zilch about ptys. Thx, Skip From barry at barrys-emacs.org Fri Jan 2 15:16:18 2004 From: barry at barrys-emacs.org (Barry Scott) Date: Fri Jan 2 15:16:27 2004 Subject: [Python-Dev] Switching to VC.NET 2003 In-Reply-To: <3FF4163B.8080303@v.loewis.de> References: <3FF047E0.8070905@v.loewis.de> <3FF1C625.7070103@stackless.com> <200312301934.hBUJYPL01492@c-24-5-183-134.client.comcast.net> <3FF2E108.9060101@stackless.com> <3FF4163B.8080303@v.loewis.de> Message-ID: <6.0.1.1.2.20040102200635.021a7620@torment.chelsea.private> There is the "Microsoft Visual C++ .NET 2003 Standard" that is cheap, ?90. Full set of debugger and project tools, just won't do release builds. Barry At 01-01-2004 12:44, Martin v. Loewis wrote:

>Christian Tismer wrote:
>>>MS also alluded to a free downloadable compiler, but I haven't tracked
>>>it down yet. And of course there's always mingw (or whatever it's
>>>called).
>>
>>If there is a way I can provide Windows binaries, and I can do
>>some interactive debugging, I'm willing to switch tools, sure.
>
>I doubt the free compiler provides any interactive debugging
>support. So you either have to continue to use VC6, or switch
>to VC.NET, or switch to Borland C, or use gdb (on cygwin).
>
>Regards,
>Martin
>
>
>_______________________________________________
>Python-Dev mailing list
>Python-Dev@python.org
>http://mail.python.org/mailman/listinfo/python-dev
>Unsubscribe:
>http://mail.python.org/mailman/options/python-dev/nospam%40barrys-emacs.org

From martin at v.loewis.de Fri Jan 2 16:40:32 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Fri Jan 2 16:40:36 2004 Subject: [Python-Dev] Switch to VC.NET 7.1 completed Message-ID: <3FF5E550.3030006@v.loewis.de> I now have committed the bits for building Python with VC.NET. Anybody regularly building on Windows should delete all object files, i.e. the PCBuild/temp folders, PCBuild/*.{lib,dll,pyd,exe} (probably more, but they should get generated anew), and object files of prereq libraries you regularly use. In the process, I also updated a few prerequisite libraries to the current versions: Tcl, Sleepycat, openssl (updating Tcl is not strictly needed - you do need to rebuild it, though). I have also incorporated a few more modules into python24.dll. I have moved the VC6 files to PC/VC6. I have no plans of maintaining them, and they are as they were last in PCBuild. I see no problem though with somebody else maintaining them for the time being, if somebody volunteers. Please let me know of any problems you encounter. Regards, Martin From bac at OCF.Berkeley.EDU Fri Jan 2 18:25:11 2004 From: bac at OCF.Berkeley.EDU (Brett) Date: Fri Jan 2 18:24:51 2004 Subject: [Python-Dev] python-dev Summary for 2003-12-01 through 2003-12-31 (rough draft) Message-ID: <3FF5FDD7.5000403@ocf.berkeley.edu> python-dev Summary for 2003-12-01 through 2003-12-31 ++++++++++++++++++++++++++++++++++++++++++++++++++++ This is a summary of traffic on the `python-dev mailing list`_ from December 1, 2003 through December 31, 2003. It is intended to inform the wider Python community of on-going developments on the list.
To comment on anything mentioned here, just post to `comp.lang.python`_ (or email python-list@python.org which is a gateway to the newsgroup) with a subject line mentioning what you are discussing. All python-dev members are interested in seeing ideas discussed by the community, so don't hesitate to take a stance on something. And if all of this really interests you then get involved and join `python-dev`_! This is the thirty-first and -second summaries written by Brett Cannon (a friend of a friend actually reads this thing! Hi, Elle). To contact me, please send email to brett at python.org ; I do not have the time to keep up on comp.lang.python and thus do not always catch follow-ups posted there. All summaries are archived at http://www.python.org/dev/summary/ . Please note that this summary is written using reStructuredText_ which can be found at http://docutils.sf.net/rst.html . Any unfamiliar punctuation is probably markup for reST_ (otherwise it is probably regular expression syntax or a typo =); you can safely ignore it, although I suggest learning reST; it's simple and is accepted for `PEP markup`_ and gives some perks for the HTML output. Also, because of the wonders of programs that like to reformat text, I cannot guarantee you will be able to run the text version of this summary through Docutils_ as-is unless it is from the original text file. .. _PEP Markup: http://www.python.org/peps/pep-0012.html The in-development version of the documentation for Python can be found at http://www.python.org/dev/doc/devel/ and should be used when looking up any documentation on something mentioned here. PEPs (Python Enhancement Proposals) are located at http://www.python.org/peps/ . To view files in the Python CVS online, go to http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/ . Reported bugs and suggested patches can be found at the SourceForge_ project page. .. _python-dev: http://www.python.org/dev/ .. 
_SourceForge: http://sourceforge.net/tracker/?group_id=5470 .. _python-dev mailing list: http://mail.python.org/mailman/listinfo/python-dev .. _comp.lang.python: http://groups.google.com/groups?q=comp.lang.python .. _Docutils: http://docutils.sf.net/ .. _reST: .. _reStructuredText: http://docutils.sf.net/rst.html .. contents:: .. _last summary: http://www.python.org/dev/summary/2003-10-16_2003-11-15.html ===================== Summary Announcements ===================== Sorry if this summary comes off as light, but I caught the flu the week of Christmas and it developed into walking pneumonia which I still have. On a more positive note, PyCon is hitting its stride. Online registration is available at http://pycon.org/dc2004 and early bird registration ends January 31. Online talk proposal submission is online at http://submit.pycon.org/ and the deadline is January 15. ========= Summaries ========= ---------------------------- 2.3.3 released to the masses ---------------------------- `Python 2.3.3`_ has gone out the door. Thanks to Anthony Baxter for being release manager (again!) and to all of python-dev and anyone who contributed code for this release. With this being a bugfix release this supersedes 2.3.2 and thus people should upgrade if possible. .. _Python 2.3.3: http://python.org/2.3.3/ Contributing threads: - `2.3.3 cycle `__ - `release23-maint branch CLOSED for release `__ - `Berkeley support in release23-maint `__ - `RELEASED Python 2.3.3 (release candidate 1) `__ - `2.3.3 portability audit `__ - `2.3.3 and beyond `__ - `RELEASED Python 2.3.3 (final) `__ - `status of 2.3 branch for maintenance checkins `__ ---------------------------------- Pie-thon competition work ramps up ---------------------------------- `Dan Sugalski`_, project leader of the Parrot_ VM that will be used for `Perl 6`_, reminded the list that the benchmark to be used for the `Pie-thon`_ needed to be written since the bytecode for the benchmark needed to be frozen.
So Guido wrote some benchmarks. They are in CVS under nondist/sandbox/parrotbench . .. _Dan Sugalski: http://www.sidhe.org/~dan/blog/ .. _Parrot: http://www.parrotcode.org/ .. _Perl 6: http://dev.perl.org/perl6/ .. _Pie-thon: http://www.sidhe.org/~dan/blog/archives/000219.html Contributing threads: - `Merry December `__ - `Pie-thon benchmarks `__ - `Pie-thon benchmark code ready `__ -------------- PyCon is a go! -------------- http://www.pycon.org/ has gone live! Registration_ is live (early-bird ends January 31)! Online talk proposal submission is live (deadline is January 15)! .. _Registration: http://www.pycon.org/dc2004 Contributing threads: - `PyCon DC 2004 - Registration about to open! `__ - `PyCon DC 2004 - Submissions Now Open `__ ---------------------------------------- operator gains attrgetter and itemgetter ---------------------------------------- The operator module has now gained two new functions: attrgetter and itemgetter "which are useful for creating fast data extractor functions for map(), list.sort(), itertools.groupby(), and other functions that expect a function argument" according to Misc/NEWS . Contributing threads: - `Re: "groupby" iterator `__ ------------------- CObjects and safety ------------------- Michael Hudson pointed out how CObjects could be misused in Python code. Various ideas of how to make them safer by checking that the proper CObject was passed were proposed. The thread seemed to end without a resolution, though. Contributing threads: - `are CObjects inherently unsafe? `__ ----------------- Unicode is a pain ----------------- Want proof? How about the fact that you can store a character like "ä" either as two characters ("a" followed by "previous character has an umlaut") or as one ("a with an umlaut"). The former is called "decomposed normal form" and is used in OS X. Windows, of course, uses the latter version.
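These two representations are what Unicode calls the NFD (decomposed) and NFC (composed) normalization forms; the standard unicodedata module converts between them, as a quick illustration:

```python
import unicodedata

composed = "\u00e4"                                  # one code point: a with umlaut
decomposed = unicodedata.normalize("NFD", composed)  # "a" + combining umlaut

print(len(composed), len(decomposed))                        # 1 2
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
```

The two strings print identically but compare unequal, which is exactly the kind of surprise the OS X filesystem discussion is about.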
Contributing threads: - `test_unicode_file failing on Mac OS X `__ ------------------ Two new developers ------------------ Hye-Shik Chang has become a developer. You probably know him from his work on the CJK codecs. He is now an official developer. Vinay Sajip, implementor of the logging package, has also been granted CVS checkin rights. Contributing threads: - `New developer `__ ------------------------ Compiling 2.4 under .NET ------------------------ Martin v. Löwis has started sandbox work on an MSI installer and moving Python 2.4 over to VC 7. Contributing threads: - `Py2.4 pre-alpha snapshot `__ - `First version of Python MSI available `__ - `Switching to VC.NET 2003 `__ ----------------------------- New method flag: METH_COEXIST ----------------------------- Raymond Hettinger, in his continual pursuit of speed, came up with a new method flag, METH_COEXIST, which causes a method to be used in place of a slot wrapper. The example that actually led to this is __contains__: a PyCFunction defining __contains__ tends to be faster than one in the sq_contains slot thanks to METH_O and other tricks. Contributing threads: - `FW: METH_COEXIST `__ ------------------------------ Better relative import support ------------------------------ There was a huge discussion on a better way to handle relative imports (think of the situation where you have your module ``import sys`` and you happen to have a module named sys in the same directory; should that local module be imported or the sys module from the stdlib?). Luckily Aahz volunteered to write a PEP on the whole thread so I am being spared from having to summarize the thing. =) Thanks, Aahz.
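The shadowing ambiguity behind the relative-import discussion is easy to reproduce; in this sketch json stands in for sys (sys itself is always preloaded, so it cannot be shadowed this way):

```python
import os
import sys
import tempfile

# Create a directory containing a module that shadows a stdlib name.
tmp = tempfile.mkdtemp()
with open(os.path.join(tmp, "json.py"), "w") as f:
    f.write("value = 'local shadow'\n")

sys.modules.pop("json", None)  # forget any cached stdlib import
sys.path.insert(0, tmp)        # a script's own directory is searched first
import json

print(json.value)  # the local file wins over the stdlib module
```

Because the script's directory sits at the front of sys.path, the local file silently wins, which is precisely the behavior the thread argued about.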
Contributing threads: - `Re: Christmas Wishlist `__ - `Re: Python-Dev Digest, Vol 5, Issue 57 `__ - `Relative import `__ - `Another Strategy for Relative Import `__ ------------------------------ list.sorted becomes a built-in ------------------------------ Just as the title says, list.sorted has now been moved out of the list type and has been made a built-in. Contributing threads: - `python/dist/src/Python bltinmodule.c,2.304,2.305 `__ -------------------------------- What to do with old Python code? -------------------------------- Someone rewrote the bisect module in C. This brought up the question of what to do with the old Python version. Some suggest moving it to the Demo directory. Others suggest keeping the code but importing the C version in the Python module. The idea of keeping both was quickly shot down, though, like in the pickle/cPickle situation. This discussion is still going at this time. Contributing threads: - `SF patch 864863: Bisect C implementation `__ From kbk at shore.net Fri Jan 2 19:31:07 2004 From: kbk at shore.net (Kurt B. Kaiser) Date: Fri Jan 2 19:31:12 2004 Subject: [Python-Dev] Re: Are we collecting benchmark results across machines References: <3FF57FAE.7422.3E770405@localhost> <3FF584AE.15948.3E8A8E41@localhost> <200401021953.i02Jr2Y10129@c-24-5-183-134.client.comcast.net> Message-ID: <87hdzd27qs.fsf@hydra.localdomain> Guido van Rossum writes:

> Hm... My IBM T40 with 1.4 GHz P(M) reports 15.608. I bet the caches
> are more similar, and affect performance more than CPU speed...

You have a 32K L1 and a 1024K (!) L2. What a great machine! As an example of the other end of the spectrum, I'm running current, but low end hardware: a 2.2 GHz Celeron, 400 MHz FSB. 256MB DDR SDRAM. Verified working as expected by Intel Processor Speed Test. My L1 is 12K trace, 8K data. My L2 is only 128K, and when there's a miss there's no L3 to fall back on. Cost for processor and mobo, new, $120.
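The best-of-N discipline used for the pystone figures in this thread is what timeit.repeat automates: run several trials and take the minimum, since the minimum is the least disturbed by caches and other processes. A small sketch:

```python
import timeit

# Best-of-3 timing: min() over repeated trials suppresses cache and
# scheduler noise, the same idea as the "best of 3" pystone numbers.
times = timeit.repeat(stmt="sum(range(10000))", repeat=3, number=100)
print("best of 3: %.4f seconds" % min(times))
```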
I find this setup pretty snappy for what I do on it: development and home server. It's definitely not my game machine :-) Python 2.2.1 (#1, Oct 17 2003, 16:36:36) [GCC 2.95.3 20010125 (prerelease, propolice)] on openbsd3 [best of 3]: Pystone(1.1) time for 10000 passes = 0.93 This machine benchmarks at 10752.7 pystones/second Python 2.3.3 (#15, Jan 2 2004, 14:39:36) [best of 3]: Pystone(1.1) time for 50000 passes = 3.46 This machine benchmarks at 14450.9 pystones/second Python 2.4a0 (#40, Jan 1 2004, 22:22:45) [current cvs] [best of 3]: Pystone(1.1) time for 50000 passes = 2.91 This machine benchmarks at 17182.1 pystones/second (but see the p.s. below) Now the parrotbench, version 1.0.4. [make extra passes to get .pyo first] First, python 2.3.3: best 3: 31.1/31.8/32.3 Next, python 2.4a0, current cvs: best 3: 31.8/31.9/32.1 Since I noticed quite different ratios between the individual tests compared to what was posted by Seo Sanghyeon on the pypy list, here's my numbers (2.4a0): hydra /home/kbk/PYTHON/python/nondist/sandbox/parrotbench$ make times for i in 0 1 2 3 4 5 6; do echo b$i.py; time /home/kbk/PYSRC/python b$i.py >@out$i; cmp @out$i out$i; done b0.py 5.48s real 5.30s user 0.05s system b1.py 1.36s real 1.22s user 0.10s system b2.py 0.44s real 0.42s user 0.04s system b3.py 2.01s real 1.94s user 0.04s system b4.py 1.69s real 1.63s user 0.05s system b5.py 4.80s real 4.73s user 0.02s system b6.py 1.84s real 1.56s user 0.26s system I notice that some of these tests are a little faster on 2.3.3 while others are faster on 2.4, resulting in the overall time being about the same on both releases. N.B. compiling Python w/o the stack protector doesn't make a noticeable difference ;-) There may be some other problem with this box that I haven't yet discovered, but right now I'm blaming the tiny cache for performance being 2 - 3 x lower than expected from the clock rate, compared to what others are getting. -- KBK p.s.
I saw quite a large outlier on 2.4 pystone when I first tried it. I didn't believe it, but was able to scroll back and clip it: Python 2.4a0 (#40, Jan 1 2004, 22:22:45) [GCC 2.95.3 20010125 (prerelease, propolice)] on openbsd3 Type "help", "copyright", "credits" or "license" for more information. >>> from test.pystone import main >>> main(); main(); main() Pystone(1.1) time for 50000 passes = 4.22 This machine benchmarks at 11848.3 pystones/second Pystone(1.1) time for 50000 passes = 4.21 This machine benchmarks at 11876.5 pystones/second Pystone(1.1) time for 50000 passes = 4.21 This machine benchmarks at 11876.5 pystones/second This is 30% lower than the rate quoted above. I haven't been able to duplicate it. Maybe the OS or X was doing something which tied up the cache. This is a fairly lightly loaded machine running X, Ion, and emacs. I've also seen 20% variations in the 2.2.1 pystone benchmark. It seems to me that this benchmark is pretty cache sensitive and should be done on an unloaded system, preferably w/o X, and with the results averaged over many random trials if comparisons are desired, especially if the cache is small. I don't see the same variation in the parrotbench. It's just consistently low for this box. From martin at v.loewis.de Sat Jan 3 05:43:58 2004 From: martin at v.loewis.de (Martin v.
Loewis) Date: Sat Jan 3 05:44:07 2004 Subject: [Python-Dev] Head broken when --enable-unicode=ucs4 is used In-Reply-To: <1072798884.9216.77.camel@anthem> References: <1072798884.9216.77.camel@anthem> Message-ID: <3FF69CEE.4050103@v.loewis.de> Barry Warsaw wrote:

> I've been building Python with --enable-unicode=ucs4 ever since I
> upgraded to RedHat 9. IIUC, this option is required to get Python and
> RH9's default Tk to play nicely together.

I can't reproduce this. Does your pyconfig.h have Py_UNICODE_SIZE? Regards, Martin From martin at v.loewis.de Sat Jan 3 06:44:52 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Sat Jan 3 06:45:03 2004 Subject: [Python-Dev] Relaxing Unicode error handling Message-ID: <3FF6AB34.1000502@v.loewis.de> People keep complaining that they get Unicode errors when funny characters start showing up in their inputs. In some cases, these people would apparently be happy if Python would just "keep going", IOW, they want to get moji-bake (garbled characters), instead of exceptions that abort their applications. I'd like to add a static property unicode.errorhandling, which defaults to "strict". Applications could set this to "replace" or "ignore", silencing the errors, and risking loss of data. What do you think? Regards, Martin From astrand at lysator.liu.se Sat Jan 3 08:47:41 2004 From: astrand at lysator.liu.se (=?iso-8859-1?Q?Peter_=C5strand?=) Date: Sat Jan 3 08:47:49 2004 Subject: [Python-Dev] PEP 324: popen5 - New POSIX process module Message-ID: There's a new PEP available: PEP 324: popen5 - New POSIX process module A copy is included below. Comments are appreciated. ---- PEP: 324 Title: popen5 - New POSIX process module Version: $Revision: 1.4 $ Last-Modified: $Date: 2004/01/03 10:32:53 $ Author: Peter Astrand Status: Draft Type: Standards Track (library) Created: 19-Nov-2003 Content-Type: text/plain Python-Version: 2.4 Abstract This PEP describes a new module for starting and communicating with processes on POSIX systems.
Motivation Starting new processes is a common task in any programming language, and very common in a high-level language like Python. Good support for this task is needed, because: - Inappropriate functions for starting processes could mean a security risk: If the program is started through the shell, and the arguments contain shell meta characters, the result can be disastrous. [1] - It makes Python an even better replacement language for over-complicated shell scripts. Currently, Python has a large number of different functions for process creation. This makes it hard for developers to choose. The popen5 module provides the following enhancements over previous functions: - One "unified" module provides all functionality from previous functions. - Cross-process exceptions: Exceptions happening in the child before the new process has started to execute are re-raised in the parent. This means that it's easy to handle exec() failures, for example. With popen2, for example, it's impossible to detect if the execution failed. - A hook for executing custom code between fork and exec. This can be used for, for example, changing uid. - No implicit call of /bin/sh. This means that there is no need for escaping dangerous shell meta characters. - All combinations of file descriptor redirection are possible. For example, the "python-dialog" [2] needs to spawn a process and redirect stderr, but not stdout. This is not possible with current functions, without using temporary files. - With popen5, it's possible to control if all open file descriptors should be closed before the new program is executed. - Support for connecting several subprocesses (shell "pipe"). - Universal newline support. - A communicate() method, which makes it easy to send stdin data and read stdout and stderr data, without risking deadlocks.
Most people are aware of the flow control issues involved with child process communication, but not all have the patience or skills to write a fully correct and deadlock-free select loop. This means that many Python applications contain race conditions. A communicate() method in the standard library solves this problem. Rationale The following points summarize the design: - popen5 was based on popen2, which is tried-and-tested. - The factory functions in popen2 have been removed, because I consider the class constructor equally easy to work with. - popen2 contains several factory functions and classes for different combinations of redirection. popen5, however, contains one single class. Since popen5 supports 12 different combinations of redirection, providing a class or function for each of them would be cumbersome and not very intuitive. Even with popen2, this is a readability problem. For example, many people cannot tell the difference between popen2.popen2 and popen2.popen4 without using the documentation. - One small utility function is provided: popen5.run(). It aims to be an enhancement over os.system(), while still very easy to use: - It does not use the Standard C function system(), which has limitations. - It does not call the shell implicitly. - No need for quoting; using a variable argument list. - The return value is easier to work with. - The "preexec" functionality makes it possible to run arbitrary code between fork and exec. One might ask why there are special arguments for setting the environment and current directory, but not for, for example, setting the uid. The answer is: - Changing environment and working directory is considered fairly common. - Old functions like spawn() have support for an "env"-argument. - env and cwd are considered quite cross-platform: They make sense even on Windows. - No MS Windows support is available, currently.
To be able to provide more functionality than what is already available from the popen2 module, help from C modules is required. Specification This module defines one class called Popen: class Popen(args, bufsize=0, argv0=None, stdin=None, stdout=None, stderr=None, preexec_fn=None, preexec_args=(), close_fds=0, cwd=None, env=None, universal_newlines=0) Arguments are: - args should be a sequence of program arguments. The program to execute is normally the first item in the args sequence, but can be explicitly set by using the argv0 argument. The Popen class uses os.execvp() to execute the child program. - bufsize, if given, has the same meaning as the corresponding argument to the built-in open() function: 0 means unbuffered, 1 means line buffered, any other positive value means use a buffer of (approximately) that size. A negative bufsize means to use the system default, which usually means fully buffered. The default value for bufsize is 0 (unbuffered). - stdin, stdout and stderr specify the executed programs' standard input, standard output and standard error file handles, respectively. Valid values are PIPE, an existing file descriptor (a positive integer), an existing file object, and None. PIPE indicates that a new pipe to the child should be created. With None, no redirection will occur; the child's file handles will be inherited from the parent. Additionally, stderr can be STDOUT, which indicates that the stderr data from the applications should be captured into the same file handle as for stdout. - If preexec_fn is set to a callable object, this object will be called in the child process just before the child is executed, with arguments preexec_args. - If close_fds is true, all file descriptors except 0, 1 and 2 will be closed before the child process is executed. - If cwd is not None, the current directory will be changed to cwd before the child is executed. - If env is not None, it defines the environment variables for the new process. 
- If universal_newlines is true, the file objects fromchild and childerr are opened as text files, but lines may be terminated by any of '\n', the Unix end-of-line convention, '\r', the Macintosh convention or '\r\n', the Windows convention. All of these external representations are seen as '\n' by the Python program. Note: This feature is only available if Python is built with universal newline support (the default). Also, the newlines attribute of the file objects fromchild, tochild and childerr is not updated by the communicate() method. The module also defines one shortcut function: run(*args): Run command with arguments. Wait for command to complete, then return the returncode attribute. Example: retcode = popen5.run("stty", "sane") Exceptions ---------- Exceptions raised in the child process, before the new program has started to execute, will be re-raised in the parent. Additionally, the exception object will have one extra attribute called 'child_traceback', which is a string containing traceback information from the child's point of view. The most common exception raised is OSError. This occurs, for example, when trying to execute a non-existent file. Applications should prepare for OSErrors. A PopenException will also be raised if Popen is called with invalid arguments. Security -------- popen5 will never call /bin/sh implicitly. This means that all characters, including shell metacharacters, can safely be passed to child processes. Popen objects ------------- Instances of the Popen class have the following methods: poll() Returns -1 if child process hasn't completed yet, or its exit status otherwise. See below for a description of how the exit status is encoded. wait() Waits for and returns the exit status of the child process. The exit status encodes both the return code of the process and information about whether it exited using the exit() system call or died due to a signal.
    Functions to help interpret the status code are defined in the
    os module (the W*() family of functions).

communicate(input=None)
    Interact with process: Send data to stdin.  Read data from
    stdout and stderr, until end-of-file is reached.  Wait for
    process to terminate.  The optional input argument should be a
    string to be sent to the child process, or None, if no data
    should be sent to the child.

    communicate() returns a tuple (stdout, stderr).

    Note: The data read is buffered in memory, so do not use this
    method if the data size is large or unlimited.

The following attributes are also available:

fromchild
    A file object that provides output from the child process.

tochild
    A file object that provides input to the child process.

childerr
    A file object that provides error output from the child process.

pid
    The process ID of the child process.

returncode
    The child return code.  A None value indicates that the process
    hasn't terminated yet.  A negative value means that the process
    was terminated by a signal with number -returncode.


Open Issues

Perhaps the module should be called something like "process",
instead of "popen5".


Reference Implementation

A reference implementation is available from
http://www.lysator.liu.se/~astrand/popen5/.


References

[1] Secure Programming for Linux and Unix HOWTO, section 8.3.
    http://www.dwheeler.com/secure-programs/

[2] Python Dialog
    http://pythondialog.sourceforge.net/


Copyright

This document has been placed in the public domain.


Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End:

-- 
/Peter Åstrand

From claird at lairds.com Sat Jan 3 09:15:48 2004
From: claird at lairds.com (Cameron Laird)
Date: Sat Jan 3 09:16:00 2004
Subject: [Python-Dev] HP-UX clean-up
Message-ID: 

Python generation for HP-UX is broken.  I want to fix it.

Do I need to make the case for the first proposition?  I think the
threading problems are widely recognized ...
This isn't an accusation of moral turpitude, incidentally; it just
hasn't worked out to solve HP-UX difficulties, for plenty of
legitimate reasons.  I understand that.

I happened to try a vanilla generation of 2.3.3 under HP-UX 10.20
yesterday, and encountered multiple non-trivial faults (curses,
threading, ...).  Experience tells me I'd find roughly the same
level of problems with other releases of Python and HP-UX.

Do I need to make the case that fixing HP-UX (and eventually Irix
and so on) is a Good Thing?  I'll assume not, at least for now.

I recognize that there's been a lot of churn in HP-UX fixes--
tinkering with threading libraries, compilation options, and so on.
I'm sensitive to such.  My plan is to make conservative changes,
ones unlikely to degrade the situation for other HP-UX users.  I
have plenty of porting background; I'm sure we can improve on what
we have now.

Here's what I need:

1.  Is it best to carry this on here, or should those of us with
    HP-UX interests go off by ourselves for a while, and return once
    we've established consensus?  I've been away, and don't feel I
    know the culture here well.  Who *are* the folks with an HP-UX
    interest?

2.  Does anyone happen to have an executable built that can "import
    Tkinter"?  That's my most immediate need.  If you can share that
    with me, it'd help my own situation, and I'll continue to push
    for corrections in the standard distribution, anyway.  I can
    compile _tkinter.c "by hand", but (more details, later--the
    current point is just that stuff doesn't work automatically ...

3.  I'm very rusty with setup.py.  While I was around for its birth,
    as I recall, I feel clumsy with it now.  Is there a write-up
    anyone would recommend on good setup.py style?
From aahz at pythoncraft.com Sat Jan 3 09:29:55 2004
From: aahz at pythoncraft.com (Aahz)
Date: Sat Jan 3 09:29:58 2004
Subject: [Python-Dev] Re: Relaxing Unicode error handling
Message-ID: <20040103142955.GA8199@panix.com>

Martin:
> I'd like to add a static property unicode.errorhandling, which
> defaults to "strict".  Applications could set this to "replace" or
> "ignore", silencing the errors, and risking loss of data.
>
> What do you think?

Two traditional questions:

* How does this work for library writers?

* How does this work with threads?
-- 
Aahz (aahz@pythoncraft.com)  <*>  http://www.pythoncraft.com/

Weinberg's Second Law: If builders built buildings the way
programmers wrote programs, then the first woodpecker that came
along would destroy civilization.

From perky at i18n.org Sat Jan 3 09:33:32 2004
From: perky at i18n.org (Hye-Shik Chang)
Date: Sat Jan 3 09:33:38 2004
Subject: [Python-Dev] Relaxing Unicode error handling
In-Reply-To: <3FF6AB34.1000502@v.loewis.de>
References: <3FF6AB34.1000502@v.loewis.de>
Message-ID: <20040103143332.GA14313@i18n.org>

On Sat, Jan 03, 2004 at 12:44:52PM +0100, Martin v. Loewis wrote:
> People keep complaining that they get Unicode errors
> when funny characters start showing up in their inputs.
>
> In some cases, these people would apparently be happy
> if Python would just "keep going", IOW, they want to
> get moji-bake (garbled characters), instead of
> exceptions that abort their applications.
>
> I'd like to add a static property unicode.errorhandling,
> which defaults to "strict".  Applications could set this
> to "replace" or "ignore", silencing the errors, and
> risking loss of data.

I'd like to see this feature in python very much. :)

But don't we keep these sort of variables in sys module?

Hye-Shik

From pje at telecommunity.com Sat Jan 3 10:51:51 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat Jan 3 10:50:44 2004
Subject: [Python-Dev] Relaxing Unicode error handling
In-Reply-To: <3FF6AB34.1000502@v.loewis.de>
Message-ID: <5.1.0.14.0.20040103102300.04171880@mail.telecommunity.com>

At 12:44 PM 1/3/04 +0100, Martin v. Loewis wrote:
>People keep complaining that they get Unicode errors
>when funny characters start showing up in their inputs.
>
>In some cases, these people would apparently be happy
>if Python would just "keep going", IOW, they want to
>get moji-bake (garbled characters), instead of
>exceptions that abort their applications.
>
>I'd like to add a static property unicode.errorhandling,
>which defaults to "strict".  Applications could set this
>to "replace" or "ignore", silencing the errors, and
>risking loss of data.
>
>What do you think?

When I've gotten UnicodeErrors, it pointed out an error in my
programming that needed to be fixed - i.e., that I forgot what kind
of strings I was dealing with, and needed to be explicit about it, or
at least use the replace/ignore option *at the point of decoding*.
(Errors should not pass silently, unless explicitly silenced.)

A global setting makes it possible to create code that relies on the
setting being one way or the other, and those pieces of code will not
then work together.  (Only one obvious way to do it.)

Admittedly, my experience with using Unicode is very limited, dealing
primarily with the ISO-8859-x and Japanese language codecs, with
decoding fairly centralized.  It's possible that there are use cases
I'm unfamiliar with that would scatter decode()'s all over the place,
and that would make adding the "ignore" parameter to each use
unbearably tedious.

OTOH, I don't think that adding more stateful globals to Python is a
good idea, and what's the harm of having somebody write:

def garble(s, codec):
    return s.decode(codec, 'ignore')

Or, if it's desired that this be available as part of Python, perhaps
adding 'decode_replace' and 'decode_ignore' staticmethods to the
Unicode class?
Or, am I missing the point entirely, and there's some other
circumstance where one gets UnicodeErrors besides .decode()?  If the
use case is mixing strings and unicode objects (i.e. adding,
joining, searching, etc.), then I'd have to say a big fat -1, as
opposed to merely a -0 for having other ways to spell
.decode(codec, "ignore").

If I in my youth had seen such a flag as you describe, I'd have used
it, and then missed out on lots of very educational error messages.

From aleax at aleax.it Sat Jan 3 11:15:06 2004
From: aleax at aleax.it (Alex Martelli)
Date: Sat Jan 3 11:15:14 2004
Subject: [Python-Dev] Relaxing Unicode error handling
In-Reply-To: <20040103143332.GA14313@i18n.org>
References: <3FF6AB34.1000502@v.loewis.de>
	<20040103143332.GA14313@i18n.org>
Message-ID: 

[[ I _hope_ I've figured out how to convince the Mail app on my
brand-new iBook to send out plain text, NOT HTML, mails -- if this
comes as HTML mail, I apologize -- please let me know so I'll keep
digging (advice from more experienced Mac'ers welcome too!) ]]

On Jan 3, 2004, at 3:33 PM, Hye-Shik Chang wrote:
> On Sat, Jan 03, 2004 at 12:44:52PM +0100, Martin v. Loewis wrote:
>> People keep complaining that they get Unicode errors
>> when funny characters start showing up in their inputs.
>>
>> In some cases, these people would apparently be happy
>> if Python would just "keep going", IOW, they want to
>> get moji-bake (garbled characters), instead of
>> exceptions that abort their applications.
>>
>> I'd like to add a static property unicode.errorhandling, ...
>
> I'd like to see this feature in python very much. :)

I agree it would be an enhancement.

> But don't we keep these sort of variables in sys module?

Why would it need to be a module top-level variable?  As it strongly
relates to a single, specific type, making it an attribute of the
type feels MUCH better to me.
Alex

From guido at python.org Sat Jan 3 11:41:20 2004
From: guido at python.org (Guido van Rossum)
Date: Sat Jan 3 11:41:26 2004
Subject: [Python-Dev] Relaxing Unicode error handling
In-Reply-To: Your message of "Sat, 03 Jan 2004 12:44:52 +0100."
	<3FF6AB34.1000502@v.loewis.de>
References: <3FF6AB34.1000502@v.loewis.de>
Message-ID: <200401031641.i03GfLL11733@c-24-5-183-134.client.comcast.net>

> People keep complaining that they get Unicode errors
> when funny characters start showing up in their inputs.
>
> In some cases, these people would apparently be happy
> if Python would just "keep going", IOW, they want to
> get moji-bake (garbled characters), instead of
> exceptions that abort their applications.
>
> I'd like to add a static property unicode.errorhandling,
> which defaults to "strict".  Applications could set this
> to "replace" or "ignore", silencing the errors, and
> risking loss of data.
>
> What do you think?

This is a global default, right?  Seems to work for the socket
timeout; I think the case for a global default is similar.  So +eps
from me (no more than epsilon because I'm not a Unicode user myself).

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org Sat Jan 3 11:47:59 2004
From: guido at python.org (Guido van Rossum)
Date: Sat Jan 3 11:48:04 2004
Subject: [Python-Dev] HP-UX clean-up
In-Reply-To: Your message of "Sat, 03 Jan 2004 09:15:48 EST."
References: 
Message-ID: <200401031647.i03Glxt11783@c-24-5-183-134.client.comcast.net>

> Python generation for HP-UX is broken.  I want to fix it.

You have my support!  This has always been a problem because we
could never find people who had (a) good developer-style access to
HP-UX and (b) enough understanding of Python to be able to debug it.
Hopefully you qualify for both; for (b) python-dev can act as a
crutch. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From astrand at lysator.liu.se Sat Jan 3 12:03:01 2004
From: astrand at lysator.liu.se (Peter Astrand)
Date: Sat Jan 3 12:03:05 2004
Subject: [Python-Dev] Re: PEP 324: popen5 - New POSIX process module
In-Reply-To: 
Message-ID: 

On Sat, 3 Jan 2004, Peter Åstrand wrote:

> PEP 324: popen5 - New POSIX process module
>
> A copy is included below.  Comments are appreciated.

There are some issues wrt Windows support.  Currently, popen5 does
not support Windows at all.  To be able to do so, we must choose
between:

1) Rely on Mark Hammond's Win32 extensions.  This makes it possible
to keep popen5 as a pure Python module.  The drawback is that
Windows support won't be available unless the user installs
win32all.  Does anyone think there's a problem with adding a module
to the Python standard library that uses win32all?

or

2) Write supporting code in C (basically, copy the required process
handling functions from win32all).

-- 
/Peter Åstrand

From neal at metaslash.com Sat Jan 3 12:27:21 2004
From: neal at metaslash.com (Neal Norwitz)
Date: Sat Jan 3 12:27:29 2004
Subject: [Python-Dev] HP-UX clean-up
In-Reply-To: 
References: 
Message-ID: <20040103172721.GZ1274@epoch.metaslash.com>

On Sat, Jan 03, 2004 at 09:15:48AM -0500, Cameron Laird wrote:
>
> I happened to try a vanilla generation of 2.3.3 under HP-UX
> 10.20 yesterday, and encountered multiple non-trivial faults
> (curses, threading, ...).  Experience tells me I'd find roughly
> the same level of problems with other releases of Python and
> HP-UX.

I think newer versions of HPUX are a little better, since most of
the HPUX resources available (snake-farm and HP Test Drive machines)
tend to be 11.x+.  But very few have access to HPUX, Irix, or AIX.

> Who *are* the folks with an HP-UX interest?

I wouldn't say I have an interest in HPUX beyond the fact that I'd
like to see it work and have access to some boxes. :-)  I can try to
make some time available to help.

> 2.  Does anyone happen to have an executable
>     built that can "import Tkinter"?  That's my
>     most immediate need.  If you can share that
>     with me, it'd help my own situation, and I'll
>     continue to push for corrections in the
>     standard distribution, anyway.

If so it would probably be for HPUX 11.something.

Neal

From martin at v.loewis.de Sat Jan 3 12:34:43 2004
From: martin at v.loewis.de (Martin v. Loewis)
Date: Sat Jan 3 12:35:06 2004
Subject: [Python-Dev] HP-UX clean-up
In-Reply-To: 
References: 
Message-ID: <3FF6FD33.1050604@v.loewis.de>

Cameron Laird wrote:
> Do I need to make the case that fixing HP-UX (and eventually
> Irix and so on) is a Good Thing?

Go ahead and fix it, as long as you don't break anything else.

Regards,
Martin

From martin at v.loewis.de Sat Jan 3 12:37:37 2004
From: martin at v.loewis.de (Martin v. Loewis)
Date: Sat Jan 3 12:38:13 2004
Subject: [Python-Dev] Re: Relaxing Unicode error handling
In-Reply-To: <20040103142955.GA8199@panix.com>
References: <20040103142955.GA8199@panix.com>
Message-ID: <3FF6FDE1.8040607@v.loewis.de>

Aahz wrote:
> * How does this work for library writers?

Library writers should avoid using it.  If the application uses it,
libraries should not notice, since they won't get exceptions that
they should not have gotten in the first place.

> * How does this work with threads?

It's shared across all threads, and thread-safe (i.e. you can
continue to not pass error handlers in any thread, and the thread
will use either the old or the new error handler, not something
inbetween - thanks to the GIL).

Regards,
Martin

From martin at v.loewis.de Sat Jan 3 12:40:53 2004
From: martin at v.loewis.de (Martin v. Loewis)
Date: Sat Jan 3 12:41:09 2004
Subject: [Python-Dev] Relaxing Unicode error handling
In-Reply-To: <20040103143332.GA14313@i18n.org>
References: <3FF6AB34.1000502@v.loewis.de>
	<20040103143332.GA14313@i18n.org>
Message-ID: <3FF6FEA5.1000106@v.loewis.de>

Hye-Shik Chang wrote:
> I'd like to see this feature in python very much. :)
>
> But don't we keep these sort of variables in sys module?

Yes, but a) sys.setdefaultunicodeerrorhandling is quite an ugly
function name, and b) those things are in the sys module only
because the builtins had no attributes in the past (and neither did
types).

Regards,
Martin

From martin at v.loewis.de Sat Jan 3 12:46:11 2004
From: martin at v.loewis.de (Martin v. Loewis)
Date: Sat Jan 3 12:46:43 2004
Subject: [Python-Dev] Relaxing Unicode error handling
In-Reply-To: <5.1.0.14.0.20040103102300.04171880@mail.telecommunity.com>
References: <5.1.0.14.0.20040103102300.04171880@mail.telecommunity.com>
Message-ID: <3FF6FFE3.60608@v.loewis.de>

Phillip J. Eby wrote:
> When I've gotten UnicodeErrors, it pointed out an error in my
> programming that needed to be fixed - i.e., that I forgot what
> kind of strings I was dealing with, and needed to be explicit
> about it, or at least use the replace/ignore option *at the point
> of decoding*.

I agree, and I would always advise against using the feature I'm
proposing.  However, now people say "I have followed this advice,
and I got banana software: matures at the customer.  We would rather
be certain that the customer won't see exceptions that we never got
in testing."

It is for these users that the feature would be used.

> Or, am I missing the point entirely, and there's some other
> circumstance where one gets UnicodeErrors besides .decode()?  If
> the use case is mixing strings and unicode objects (i.e. adding,
> joining, searching, etc.), then I'd have to say a big fat -1, as
> opposed to merely a -0 for having other ways to spell
> .decode(codec, "ignore").

Yes, it is these use cases: Somebody invokes an SQL method, which
happens to return a Unicode string, and then adds a latin-1 byte
string to it.  It works for all ASCII byte strings, but then the
customer happens to enter accented characters, and the application
crashes without offering to save recent changes.

So I guess that's -1 from you.
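[The failure mode Martin describes, and the per-call error handlers the thread keeps referring to, can be sketched with today's bytes/str literals.  This is an illustrative sketch only; the sample data is invented, and "strict", "replace" and "ignore" are the standard codec error handlers under discussion.]

```python
# Latin-1 bytes arriving from outside, containing an accented character.
data = b"caf\xe9 au lait"

# "strict" is the default handler: the decode raises, which is the
# exception the thread's complainers want to avoid.
try:
    text = data.decode("ascii")
except UnicodeDecodeError:
    text = None

assert text is None

# The two existing per-call escape hatches: mojibake or silent loss.
assert data.decode("ascii", "replace") == "caf\ufffd au lait"
assert data.decode("ascii", "ignore") == "caf au lait"
```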
Regards,
Martin

From claird at lairds.com Sat Jan 3 13:22:56 2004
From: claird at lairds.com (Cameron Laird)
Date: Sat Jan 3 13:23:02 2004
Subject: [Python-Dev] HP-UX clean-up
In-Reply-To: <20040103172721.GZ1274@epoch.metaslash.com>
Message-ID: 

> From neal@metaslash.com Sat Jan 03 12:28:17 2004
>	.
>	.
>	.
> I think newer versions of HPUX are a little better, since most of
> the HPUX resources available (snake-farm and HP Test Drive
> machines) tend to be 11.x+.  But very few have access to HPUX,
> Irix, or AIX.
>
> > Who *are* the folks with an HP-UX interest?
>
> I wouldn't say I have an interest in HPUX beyond the fact that
> I'd like to see it work and have access to some boxes. :-)
> I can try to make some time available to help.
>
> > 2.  Does anyone happen to have an executable
> >     built that can "import Tkinter"?  That's my
> >     most immediate need.  If you can share that
> >     with me, it'd help my own situation, and I'll
> >     continue to push for corrections in the
> >     standard distribution, anyway.
>
> If so it would probably be for HPUX 11.something.
>
> Neal

Your characterization is likely apt: several of us simply want to
see it work and have access to boxes.  It's hard to have much more
enthusiasm for it than H-P (the company to cashier OSF1, MPEix, ...)
does--and that's precious little.  I'll try to confine myself to
technical points though, especially in anything broadcast to all of
python-dev.

Does someone have a Tkinter-importable Python for 11.x?  I'd like to
know that ...

Whether to support just 11.x, or aim also for 10.20 and above, is a
thorny question; it's probably easier just to do the work than to
settle that one by pure reasoning.  It suits me to start with 10.20,
so that's what I'll do, until someone makes a specific argument to
the contrary.

I can probably get Irix and AIX later in the year--maybe by
February.  First things first ...

Has anyone been in contact with the HP-UX Porting Center folks
recently?  I wonder if I should enlist their help ...
From barry at python.org Sat Jan 3 13:26:17 2004
From: barry at python.org (Barry Warsaw)
Date: Sat Jan 3 13:26:30 2004
Subject: [Python-Dev] Head broken when --enable-unicode=ucs4 is used
In-Reply-To: <3FF69CEE.4050103@v.loewis.de>
References: <1072798884.9216.77.camel@anthem>
	<3FF69CEE.4050103@v.loewis.de>
Message-ID: <1073154377.6563.55.camel@anthem>

On Sat, 2004-01-03 at 05:43, Martin v. Loewis wrote:
> Barry Warsaw wrote:
> > I've been building Python with --enable-unicode=ucs4 ever since
> > I upgraded to RedHat 9.  IIUC, this option is required to get
> > Python and RH9's default Tk to play nicely together.
>
> I can't reproduce this.  Does your pyconfig.h have
> Py_UNICODE_SIZE?

It does, and I've just done a 'make distclean; cvs up; configure
--enable-unicode=ucs4'.  The problem has gone away.

go-bother-the-perlies-little-gremlin-ly y'rs,
-Barry

From barry at python.org Sat Jan 3 13:29:49 2004
From: barry at python.org (Barry Warsaw)
Date: Sat Jan 3 13:30:00 2004
Subject: [Python-Dev] Relaxing Unicode error handling
In-Reply-To: <3FF6FFE3.60608@v.loewis.de>
References: <5.1.0.14.0.20040103102300.04171880@mail.telecommunity.com>
	<3FF6FFE3.60608@v.loewis.de>
Message-ID: <1073154588.6563.59.camel@anthem>

On Sat, 2004-01-03 at 12:46, Martin v. Loewis wrote:
> However, now people say "I have followed this advice, and I got
> banana software: matures at the customer.  We would rather be
> certain that the customer won't see exceptions that we never got
> in testing."
>
> It is for these users that the feature would be used.

Agreed.  Even when you know what to do, it can be very difficult to
track everything down, especially when you're trying to wedge
Unicode and i18n into an application that wasn't originally designed
for it.

+1 from me, although I agree with Alex's very persuasive plain text
message that it probably ought to live on the type object.
-Barry

From aahz at pythoncraft.com Sat Jan 3 13:39:32 2004
From: aahz at pythoncraft.com (Aahz)
Date: Sat Jan 3 13:39:35 2004
Subject: [Python-Dev] Re: Relaxing Unicode error handling
In-Reply-To: <3FF6FDE1.8040607@v.loewis.de>
References: <20040103142955.GA8199@panix.com>
	<3FF6FDE1.8040607@v.loewis.de>
Message-ID: <20040103183932.GA25221@panix.com>

On Sat, Jan 03, 2004, Martin v. Loewis wrote:
> Aahz wrote:
>>
>> * How does this work for library writers?
>
> Library writers should avoid using it.  If the application uses
> it, libraries should not notice, since they won't get exceptions
> that they should not have gotten in the first place.

What if a library wants to ensure that it *does* get appropriate
exceptions so that it can handle them?

>> * How does this work with threads?
>
> It's shared across all threads, and thread-safe (i.e. you can
> continue to not pass error handlers in any thread, and the thread
> will use either the old or the new error handler, not something
> inbetween - thanks to the GIL).

Right, but if different parts of an application are turning this off
and on, you'll get invalid results, such as unexpected exceptions.
The question isn't whether Python's internal state will be
consistent, but whether applications can ensure a consistent state
for themselves.
-- 
Aahz (aahz@pythoncraft.com)  <*>  http://www.pythoncraft.com/

Weinberg's Second Law: If builders built buildings the way
programmers wrote programs, then the first woodpecker that came
along would destroy civilization.

From jcarlson at uci.edu Sat Jan 3 14:02:44 2004
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sat Jan 3 14:05:16 2004
Subject: [Python-Dev] Re: PEP 324: popen5 - New POSIX process module
In-Reply-To: 
References: 
Message-ID: <20040103105257.FAA5.JCARLSON@uci.edu>

> There are some issues wrt Windows support.  Currently, popen5 does
> not support Windows at all.  To be able to do so, we must choose
> between:
>
> 1) Rely on Mark Hammond's Win32 extensions.
> This makes it possible to keep popen5 as a pure Python module.
> The drawback is that Windows support won't be available unless the
> user installs win32all.  Does anyone think there's a problem with
> adding a module to the Python standard library that uses win32all?

I am not familiar with any other standard library module that
requires a non-standard module.  I don't believe this is a good
idea.

> 2) Write supporting code in C (basically, copy the required
> process handling functions from win32all).

Depending on how stable the code has been, this may be the best
idea.  As long as the maintainer of the popen5 windows support code
kept themselves updated on the status of applicable changes to
win32all, this should go on without a hitch.  It does bring up the
fact that doing so would result in a possibly nontrivial amount of
code duplication between a new portion of the python standard
library and a platform-specific module.

- Josiah

From martin at v.loewis.de Sat Jan 3 15:06:30 2004
From: martin at v.loewis.de (Martin v. Loewis)
Date: Sat Jan 3 15:06:44 2004
Subject: [Python-Dev] Re: Relaxing Unicode error handling
In-Reply-To: <20040103183932.GA25221@panix.com>
References: <20040103142955.GA8199@panix.com>
	<3FF6FDE1.8040607@v.loewis.de> <20040103183932.GA25221@panix.com>
Message-ID: <3FF720C6.9060802@v.loewis.de>

Aahz wrote:
>> Library writers should avoid using it.  If the application uses
>> it, libraries should not notice, since they won't get exceptions
>> that they should not have gotten in the first place.
>
> What if a library wants to ensure that it *does* get appropriate
> exceptions so that it can handle them?

It would explicitly need to encode/decode strings as us-ascii,
instead of relying on the default encoding (which it shouldn't do in
the first place).

> Right, but if different parts of an application are turning this
> off and on, you'll get invalid results, such as unexpected
> exceptions.  The question isn't whether Python's internal state
> will be consistent, but whether applications can ensure a
> consistent state for themselves.

When you turn this on, you will get invalid results, period.  This
is precisely what people request.

Regards,
Martin

From pf_moore at yahoo.co.uk Sat Jan 3 15:45:04 2004
From: pf_moore at yahoo.co.uk (Paul Moore)
Date: Sat Jan 3 15:45:13 2004
Subject: [Python-Dev] Re: Switching to VC.NET 2003
References: <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com>
	<3FF059A1.8090202@v.loewis.de> <3cb2xhgn.fsf@yahoo.co.uk>
	<3FF20589.2030702@v.loewis.de>
Message-ID: 

"Martin v. Loewis" writes:
> Paul Moore wrote:
>
>> Hmm, for example, does the installer set the registry key
>> HKLM\Software\Microsoft\Windows\CurrentVersion\App Paths\python.exe
>
> Yes, it does.
>
>> Maybe just a single option in the installer, "Make this the
>> default version of Python", which is true by default, would be
>> best.  This could control file associations, App Paths, and
>> anything else related.
>
> Authoring a new user interface is a tedious task.
> It is much easier to just have a public property, so you would
> have to install it as
>
> msiexec.exe python2.4.0.msi DEFAULTPYTHON=1

That sounds good enough - I know little about MSI, so when you say
"public property" it doesn't mean much to me.  But if it translates
to "command line option with no GUI interface" then I have no
problem at all with this - it's for advanced users only, so that's
not a problem.

Paul.
-- 
This signature intentionally left blank

From jcarlson at uci.edu Sat Jan 3 15:58:41 2004
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sat Jan 3 16:01:20 2004
Subject: [Python-Dev] Re: Are we collecting benchmark results across machines
In-Reply-To: <87llopzv7u.fsf@hydra.localdomain>
References: <200401021953.i02Jr2Y10129@c-24-5-183-134.client.comcast.net>
	<87llopzv7u.fsf@hydra.localdomain>
Message-ID: <20040103121039.FAAB.JCARLSON@uci.edu>

Another small set of pystone and parrotbench results...

Best of 7 runs on a standard windows build of Python on Windows 2000:

Processor      P4 2.26GHz          P4 Cel 1.7GHz  P2 Cel 400MHz
bus            533 MHz             400 MHz        66 MHz
l2cache        512K                128K           128K
memory         1G DualChan DDR333  256M DDR200    384M SDR66
2.2.2 Pystone  ~25600              ~18500         ~6086
2.3.2 Pystone  ~31800                             ~6830
2.3.2 Parrot   16.06s                             83.4s
2.3.2 Pys*Par  510708                             569622

A few things to note:

I don't have consistent access to the 1.7GHz Celeron, which is why
it didn't get any 2.3 tests.

The P2 Celeron is a dual-processor system.  There does not seem to
be an easy way to give a process single-processor affinity on the
command line.  However, even inserting a pause into pystone in order
to alter processor affinity does not seem to appreciably alter the
pystone speed.

This may suggest that Python has a working set so large that it
doesn't fit into the 128K of L2 cache on either of the Celerons.
This is supported by the fact that, based on the Python 2.2.2
pystone results, the 2.26GHz P4 performs like a 2.35GHz P4 Celeron
(25600/18500*1.7), which could be due to the larger cache, higher
speed memory, or more likely both.  It is also likely that the P4s
aren't as fast per clock as the P2 due to the heavy branch
misprediction penalties that the architecture suffers from.

Strange thing: The 2.26GHz P4 gains more in the Pystone benchmark
from the 2.2.2 -> 2.3.2 transition than the Celeron 400: 24% and
12% increases in Pystones, respectively.
- Josiah

From martin at v.loewis.de Sat Jan 3 16:02:42 2004
From: martin at v.loewis.de (Martin v. Loewis)
Date: Sat Jan 3 16:02:52 2004
Subject: [Python-Dev] Re: Switching to VC.NET 2003
In-Reply-To: 
References: <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com>
	<3FF059A1.8090202@v.loewis.de> <3cb2xhgn.fsf@yahoo.co.uk>
	<3FF20589.2030702@v.loewis.de>
Message-ID: <3FF72DF2.30006@v.loewis.de>

Paul Moore wrote:
> That sounds good enough - I know little about MSI, so when you say
> "public property" it doesn't mean much to me.  But if it
> translates to "command line option with no GUI interface" then I
> have no problem at all with this - it's for advanced users only,
> so that's not a problem.

Meanwhile, I have done both: I had to add an "Advanced" dialog
anyway, to support selection of ALLUSERS (or just me).  Still, with
public properties, configuration in "unattended installation" (i.e.
no or restricted GUI) becomes possible, so I leave the public
property in.

I'll release an updated installer shortly.

Regards,
Martin

From raymond.hettinger at verizon.net Sat Jan 3 16:27:45 2004
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sat Jan 3 16:28:19 2004
Subject: [Python-Dev] Toowtdi: Datatype conversions
Message-ID: <004101c3d240$6df330a0$bb29a044@oemcomputer>

Choosing between:

    list(d)  or  d.keys()

Which is the one obvious way of turning a dictionary into a list?
IMO, list(d) is it.

Which is the fastest?  timeit.Timer() says d.keys() is several times
faster.

So, one question is whether set() and frozenset() should grow an
analogue to the keys() method:

>>> set('banana').elements()
['a', 'b', 'n']

This speeds up the uniquification use case but comes at the expense
of fattening the API and adding a second way to do it.

Another question is whether there should be a method for conversion
to a dictionary.  Given the absence of use cases, the answer is no,
but assuming there were, what would be the right way to go?
set.asdict() would be several times faster than dict.fromkeys(s).

One bright idea is to make the constructors a little bit smarter so
that list(d) would automagically invoke d.keys() whenever d is a
dictionary.  The problem with this idea is that list creation is
separate from list initialization.  list.__init__() starts with an
existing list and replaces its contents with the initializer.  This
precludes returning a list built by d.keys().  The situation for
dicts is similar, although there is a curious difference:
dict.__init__() updates rather than replaces the contents of the
underlying dictionary.

Another bright idea is to support faster datatype conversion by
adding an optional __len__() method to the iteration protocol so
that list(), tuple(), dict(), and set() could allocate sufficient
space for loading any iterable that knows its own length.  The
advantages are faster type conversion (by avoiding resizing),
keeping the APIs decoupled, and keeping the visible API thin.  The
disadvantage is that it clutters the C code with special-case
handling and that it doesn't work with generators or custom
iterators (unless they add support for __len__).

Raymond Hettinger

From Jack.Jansen at cwi.nl Sat Jan 3 16:39:03 2004
From: Jack.Jansen at cwi.nl (Jack Jansen)
Date: Sat Jan 3 16:39:08 2004
Subject: [Python-Dev] Re: Relaxing Unicode error handling
In-Reply-To: <3FF720C6.9060802@v.loewis.de>
References: <20040103142955.GA8199@panix.com>
	<3FF6FDE1.8040607@v.loewis.de> <20040103183932.GA25221@panix.com>
	<3FF720C6.9060802@v.loewis.de>
Message-ID: <3FCD9F6C-3E35-11D8-89E7-000A27B19B96@cwi.nl>

On 3-jan-04, at 21:06, Martin v. Loewis wrote:
> Aahz wrote:
>>> Library writers should avoid using it.  If the application uses
>>> it, libraries should not notice, since they won't get exceptions
>>> that they should not have gotten in the first place.
>>
>> What if a library wants to ensure that it *does* get appropriate
>> exceptions so that it can handle them?
> > It would explicitly need to encode/decode strings as us-ascii, > instead of relying on the default encoding (which it shouldn't do > in the first place). Do I understand correctly then that this relaxed error handling could be seen as an "argument" to the encoding, i.e. it turns "us-ascii" into "us-ascii-relaxed"? Because if that is so, then isn't the best way to implement this to not bother with sys.relaxedunicodeerrors, but instead use a special encoding name (or a parameter to an encoding name, such as "us-ascii;relaxed")? -- Jack Jansen, http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From martin at v.loewis.de Sat Jan 3 16:39:30 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Sat Jan 3 16:39:50 2004 Subject: [Python-Dev] Toowtdi: Datatype conversions In-Reply-To: <004101c3d240$6df330a0$bb29a044@oemcomputer> References: <004101c3d240$6df330a0$bb29a044@oemcomputer> Message-ID: <3FF73692.9010103@v.loewis.de> Raymond Hettinger wrote: > Choosing between: > > list(d) or d.keys() > > Which is the one obvious way of turning a dictionary into a list? > IMO, list(d) is it. Neither. There is no obvious way to turn a dictionary into a list; lists and dictionaries are completely different things. Dictionaries are similar to sets; in fact, Smalltalk has an Association class (Association key: value:), and Dictionary is a set of associations. Then, assuming there is an obvious way to convert a set into a list, the most obvious way to convert a dictionary into a list is d.items() > So, one question is whether set() and frozenset() should grow an > analogue to the keys() method: But this is a completely different issue! For sets, there is an obvious way (if you accept that the list will have the same elements in arbitrary order), then list(a_set) is the most obvious way, and it should work fastest. > Another question is whether there should be a method for conversion > to a dictionary.
Given the absence of use cases, the answer is no, > but assuming there were, what would be the right way to go? There is no obvious way to convert a set into a dictionary, as you don't know what the values should be (refuse the temptation to guess). If there was a use case, that use case would indicate what the values should be, and, from the use case, it would be clear what the method name would be. It would not be "asdict". > One bright idea is to make the constructors a little bit smarter > so that list(d) would automagically invoke d.keys() whenever > d is a dictionary. But who needs list(d)? > Another bright idea is to support faster datatype conversion > by adding an optional __len__() method to the iteration > protocol so that list(), tuple(), dict(), and set() could > allocate sufficient space for loading any iterable that > knows its own length. That is useful, also for list comprehension. > The advantages are faster type conversion (by avoiding resizing), > keeping the APIs decoupled, and keeping the visible API thin. > This disadvantage is that it clutters the C code with special > case handling and that it doesn't work with generators or > custom iterators (unless they add support for __len__). I see no reason why it should not work for custom iterators. For generators, you typically don't know how many results you will get in the end, so it is no loss that you cannot specify that. Regards, Martin From fumanchu at amor.org Sat Jan 3 16:37:38 2004 From: fumanchu at amor.org (Robert Brewer) Date: Sat Jan 3 16:43:34 2004 Subject: [Python-Dev] Toowtdi: Datatype conversions Message-ID: Raymond Hettinger wrote: > Choosing between: > > list(d) or d.keys() > > Which is the one obvious way of turning a dictionary into a list? > IMO, list(d) is it. Except that, "turning a dictionary into a list" as an English phrase is indeterminate--do you want the keys or the values or both? list(d) happens to choose the keys, but that doesn't make it the "one obvious way". 
If I were being introduced to list() for the first time, I would probably expect/desire it to be lossless, producing something like d.items() instead. > So, one question is whether set() and frozenset() should grow an > analogue to the keys() method: > >>> set('banana').elements() > ['a', 'b', 'n'] I don't think so, for the reason that .keys() is effectively a disambiguator as I just described. With sets, there is no mapping, and therefore no ambiguity. Robert Brewer MIS Amor Ministries fumanchu@amor.org From martin at v.loewis.de Sat Jan 3 16:50:24 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Sat Jan 3 16:50:49 2004 Subject: [Python-Dev] PEP 324: popen5 - New POSIX process module In-Reply-To: References: Message-ID: <3FF73920.8090402@v.loewis.de> Peter Åstrand wrote: > This PEP describes a new module for starting and communicating > with processes on POSIX systems. I see many aspects in this PEP that improve the existing implementation without changing the interface. I would suggest that you try to enhance the existing API (making changes to its semantics where reasonable), instead of coming up with a completely new module. With that approach, existing applications could use these features with no or little change. > - One "unified" module provides all functionality from previous > functions. I doubt this is a good thing. Different applications have different needs - having different API for them is reasonable. > > - Cross-process exceptions: Exceptions happening in the child > before the new process has started to execute are re-raised in > the parent. This means that it's easy to handle exec() > failures, for example. With popen2, for example, it's > impossible to detect if the execution failed. This is a bug in popen2, IMO. Fixing it is a good thing, but does not require a new module. > - A hook for executing custom code between fork and exec. This > can be used for, for example, changing uid.
Such a hook could be merged as a keyword argument into the existing API. > - No implicit call of /bin/sh. This means that there is no need > for escaping dangerous shell meta characters. This could be an option to the existing API. Make sure it works on all systems, though. > - All combinations of file descriptor redirection is possible. > For example, the "python-dialog" [2] needs to spawn a process > and redirect stderr, but not stdout. This is not possible with > current functions, without using temporary files. Sounds like a new function on the popen2 module. > - With popen5, it's possible to control if all open file > descriptors should be closed before the new program is > executed. This should be an option on the existing API. > - Support for connecting several subprocesses (shell "pipe"). Isn't this available already, as the shell supports pipe creation, anyway? > - Universal newline support. This should be merged into the existing code. > - A communicate() method, which makes it easy to send stdin data > and read stdout and stderr data, without risking deadlocks. > Most people are aware of the flow control issues involved with > child process communication, but not all have the patience or > skills to write a fully correct and deadlock-free select loop. Isn't asyncore supposed to simplify that? So in short, I'm -1 on creating a new module, but +1 on merging most of these features into the existing code base - they are good features. Regards, Martin From martin at v.loewis.de Sat Jan 3 16:53:36 2004 From: martin at v.loewis.de (Martin v. 
Loewis) Date: Sat Jan 3 16:53:54 2004 Subject: [Python-Dev] Re: Relaxing Unicode error handling In-Reply-To: <3FCD9F6C-3E35-11D8-89E7-000A27B19B96@cwi.nl> References: <20040103142955.GA8199@panix.com> <3FF6FDE1.8040607@v.loewis.de> <20040103183932.GA25221@panix.com> <3FF720C6.9060802@v.loewis.de> <3FCD9F6C-3E35-11D8-89E7-000A27B19B96@cwi.nl> Message-ID: <3FF739E0.9010701@v.loewis.de> Jack Jansen wrote: > Because if that is so, then isn't the best way to implement this to > not bother with sys.relaxedunicodeerrors, but instead use a special > encoding name (or a parameter to an encoding name, such as > "us-ascii;relaxed")? I would not know how to implement this scheme, if you want it to work for all codecs. Also, I want the error handling to be changeable at run-time, whereas I maintain that the system default encoding should always remain ASCII. Regards, Martin From nhodgson at bigpond.net.au Sat Jan 3 17:07:50 2004 From: nhodgson at bigpond.net.au (Neil Hodgson) Date: Sat Jan 3 17:07:13 2004 Subject: [Python-Dev] Relaxing Unicode error handling References: <5.1.0.14.0.20040103102300.04171880@mail.telecommunity.com> <3FF6FFE3.60608@v.loewis.de> Message-ID: <00ec01c3d246$07594360$3da48490@neil> Martin v. Loewis: > However, now people say "I have followed this advice, and I got > banana software: matures at the customer. We would rather be certain > that the customer won't see exceptions that we never got in testing". > ... > Yes, it is these use cases: Somebody invokes an SQL method, which > happens to return a Unicode string, and then adds a latin-1 byte > string to it. It works for all ASCII byte strings, but then the > customer happens to enter accented characters, and the application > crashes without offering to save recent changes. The problem here, to me, is that the failure is input-data dependent and therefore often hidden. There should be a way to remove this dependency on input data. Is there any way to specify that implicit conversions raise an exception?
Perhaps by setting the default encoding to "undefined". Then these developers can find and fix their bugs before delivery to customers. Neil From astrand at lysator.liu.se Sat Jan 3 17:27:12 2004 From: astrand at lysator.liu.se (Peter Astrand) Date: Sat Jan 3 17:27:21 2004 Subject: [Python-Dev] PEP 324: popen5 - New POSIX process module In-Reply-To: <3FF73920.8090402@v.loewis.de> Message-ID: On Sat, 3 Jan 2004, Martin v. Loewis wrote: > > - One "unified" module provides all functionality from previous > > functions. > > I doubt this is a good thing. Different applications have different > needs - having different API for them is reasonable. I don't agree. I have used all of the existing mechanisms in lots of apps, and it's just a pain. There are lots of functions to choose between, but none does what you really want. > > - Cross-process exceptions: Exceptions happening in the child > > before the new process has started to execute are re-raised in > > the parent. This means that it's easy to handle exec() > > failures, for example. With popen2, for example, it's > > impossible to detect if the execution failed. > > This is a bug in popen2, IMO. Fixing it is a good thing, but does not > require a new module. "Fixing popen2" would mean breaking old applications; exceptions will happen which apps are not prepared for. > > - A hook for executing custom code between fork and exec. This > > can be used for, for example, changing uid. > > Such a hook could be merged as a keyword argument into the existing > API. Into which module/method/function? There is no one flexible enough. The case for redirecting only stderr is just one example; this is simply not possible with the current API. > > - All combinations of file descriptor redirection is possible. > > For example, the "python-dialog" [2] needs to spawn a process > > and redirect stderr, but not stdout. This is not possible with > > current functions, without using temporary files.
> > Sounds like a new function on the popen2 module. To support all combinations, 12 different functions are necessary. Who will remember what popen2.popen11() means? > > - Support for connecting several subprocesses (shell "pipe"). > > Isn't this available already, as the shell supports pipe creation, > anyway? With popen5, you can do it *without* using the shell. > > - Universal newline support. > > This should be merged into the existing code. There's already a bug about this; bug 788035. This is what one of the comments says: "But this whole popen{,2,3,4} section of posixmodule.c is so fiendishly complicated with all the platform special cases that I'm loath to touch it..." I haven't checked if this is really true, though. > > - A communicate() method, which makes it easy to send stdin data > > and read stdout and stderr data, without risking deadlocks. > > Most people are aware of the flow control issues involved with > > child process communication, but not all have the patience or > > skills to write a fully correct and deadlock-free select loop. > > Isn't asyncore supposed to simplify that? Probably not. The description says: "This module provides the basic infrastructure for writing asynchronous socket service clients and servers." It's not obvious to me how this module could be used as a "shell backquote" replacement (which is what communicate() is about). It's probably possible though; I haven't tried. Even if this is possible I guess we need some kind of "entry" or "wrapper" method in the popen module to simplify things for the user. My guess is that a communicate() method that uses asyncore would be as long/complicated as the current implementation. The current implementation is only 68 lines, including comments.
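For readers coming to this thread later: PEP 324 was ultimately accepted, and the module shipped as subprocess in Python 2.4, with communicate() much as Peter describes it here. A short sketch of the pattern it replaces:

```python
import subprocess
import sys

# A child that echoes stdin to stdout and writes a warning to stderr.
child_code = ("import sys; data = sys.stdin.read(); "
              "sys.stdout.write(data); sys.stderr.write('warn')")
p = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
# communicate() feeds stdin and drains both output pipes concurrently,
# so neither side can deadlock on a full pipe buffer.
out, err = p.communicate(input=b"hello")
print(out, err)  # b'hello' b'warn'
```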
Well, I don't see how this could be done easily: The current API is not flexible enough, and some things (like cross-process exceptions) break compatibility. Writing a good popen module is hard. Providing cross-platform support (for Windows, for example) is even harder. Trying to retrofit a good popen implementation into an old API without breaking compatibility seems impossible to me. I'm not prepared to try. -- /Peter Åstrand From martin at v.loewis.de Sat Jan 3 19:14:52 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Sat Jan 3 19:15:10 2004 Subject: [Python-Dev] PEP 324: popen5 - New POSIX process module In-Reply-To: References: Message-ID: <3FF75AFC.3020204@v.loewis.de> Peter Astrand wrote: > I don't agree. I have used all of the existing mechanisms in lots of apps, > and it's just a pain. There are lots of functions to choose between, but > none does what you really want. So enhance them, instead of replacing them. >>> - Cross-process exceptions: Exceptions happening in the child >>> before the new process has started to execute are re-raised in >>> the parent. >>> >>This is a bug in popen2, IMO. Fixing it is a good thing, but does not >>require a new module. > > "Fixing popen2" would mean breaking old applications; exceptions will > happen which apps are not prepared for. I find that an acceptable incompatibility, and it will likely break no existing application. Applications usually expect that the program they start actually exists; it is a good thing that they can now detect the error of a missing or non-executable application. Errors should never pass silently. >>> - A hook for executing custom code between fork and exec. This >>> can be used for, for example, changing uid. >> >>Such a hook could be merged as a keyword argument into the existing >>API. > > > Into which module/method/function? For example, popen2.popen2, as argument preexec_fn. > There is no one flexible enough.
The > case for redirecting only stderr is just one example; this is simple not > possible with the current API. Can you elaborate? What is the specific problem, how does your preexec function look like, and how is it used with popen5. I can then show you how it could be used with popen2, if that was enhanced appropriately. >>> - All combinations of file descriptor redirection is possible. >>> For example, the "python-dialog" [2] needs to spawn a process >>> and redirect stderr, but not stdout. This is not possible with >>> current functions, without using temporary files. >> >>Sounds like a new function on the popen2 module. > > > To support all combinations, 12 different functions are necessary. Who > will remember what popen2.popen11() means? Why is that? Just add a single function, with arguments stdin/stdout/stderr. No need for 12 functions. Then explain the existing functions in terms of your new function (if possible). >>> - Support for connecting several subprocesses (shell "pipe"). >> >>Isn't this available already, as the shell supports pipe creation, >>anyway? > > > With popen5, you can do it *without* using the shell. Why is that a good thing? > There's already a bug about this; bug 788035. This is what one of the > comment says: > > "But this whole popen{,2,3,4} section of posixmodule.c is so fiendishly > complicated with all the platform special cases that I'm loath to touch > it..." > > I haven't checked if this is really true, though. You really should work with the existing code base. Ignoring it is a guarantee that your PEP will be rejected. (Studying it, and then providing educated comments about it, might get you through) I think this is the core problem of your approach: You throw away all past history, and imply that you can do better than all prior contributors could. Honestly, this is doubtful. The current code is so complicated because implementing pipes is complicated. 
> Well, I don't see how this could be done easily: The current API is not > flexible enough, and some things (like cross-process exceptions) breaks > compatibility. I never said it would be easy. However, introducing a new popen module is a major change, and there must be strong indications that the current API cannot be enhanced before throwing it away. There should be one-- and preferably only one --obvious way to do it. As for breaking compatibility: This is what the PEP should study in detail. It is sometimes acceptable to break compatibility, if applications are likely to be improved by the change. *Any* change can, in principle, break compatibility. Suppose I had an application that did from popen5 import open This application might break if your proposed change is implemented, as a new module is added. So you can't claim "I will break no programs". > Writing a good popen module is hard. Providing cross-platform support (for > Windows, for example) is even harder. Trying to retrofit a good popen > implementation into an old API without breaking compatibility seems > impossible to me. I'm not prepared to try. So I continue to be -1 with your PEP. Regards, Martin From tim.one at comcast.net Sat Jan 3 19:31:11 2004 From: tim.one at comcast.net (Tim Peters) Date: Sat Jan 3 19:31:14 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: Message-ID: [Cameron Laird] > ... > Has anyone been in contact with the HP-UX Porting Center > folks recently? I wonder if I should enlist their help ... If and only if they know something about HP-UX <0.5 wink>. Nobody here really seems to. Have you tried their port? 2.3.1 appears to be the most recent: http://hpux.cs.utah.edu/hppd/hpux/Languages/python-2.3.1/ By memory, in the past we've gotten complaints from HP-UX users about previous Porting Center Pythons in two areas: 1. Apparently some (all?) of their Python ports were built with threading disabled entirely. 2. Flaky floating-point behavior. 
An example of the latter complaint is here (not a particularly good example, just the first I bumped into via Googling): http://www.ifost.org.au/Software/ The python for HP-UX 11 that is found at the HP public domain software porting centre has buggy floating point problems. I recompiled it with gcc 3 and while there are some round-off issues, at least it's not totally deranged... [a link to python-211-hpux11-gcc.gz] From barry at barrys-emacs.org Sat Jan 3 19:44:42 2004 From: barry at barrys-emacs.org (Barry Scott) Date: Sat Jan 3 19:44:48 2004 Subject: [Python-Dev] Re: PEP 324: popen5 - New POSIX process module In-Reply-To: <20040103105257.FAA5.JCARLSON@uci.edu> References: <20040103105257.FAA5.JCARLSON@uci.edu> Message-ID: <6.0.1.1.2.20040104003136.022157d8@torment.chelsea.private> At 03-01-2004 19:02, you wrote: > > There are some issues wrt Windows support. Currently, popen5 does not > > support Windows at all. To be able to do so, we must chose between: When it does support windows please make it work the same on all platforms. The existing popen code for unix is buggy and not compatible with the windows version or the docs. > > 2) Write supporting code in C (basically, copy the required process > > handling functions from win32all). > >Depending on how stable the code has been, this may be the best idea. As >long as the maintainer of the popen5 windows support code kept >themselves updated on the status of applicable changes to win32all, this >should go on without a hitch. The win32all extension is a thin wrapper over the windows API. The proposed popen5 code would simply be some windows specific code that calls windows API directly. There is no code in win32all that would be needed to be duplicated as far as I can see. Did I miss something? 
> - Josiah Barry From guido at python.org Sat Jan 3 20:15:04 2004 From: guido at python.org (Guido van Rossum) Date: Sat Jan 3 20:15:10 2004 Subject: [Python-Dev] Toowtdi: Datatype conversions In-Reply-To: Your message of "Sat, 03 Jan 2004 16:27:45 EST." <004101c3d240$6df330a0$bb29a044@oemcomputer> References: <004101c3d240$6df330a0$bb29a044@oemcomputer> Message-ID: <200401040115.i041F4n29619@c-24-5-183-134.client.comcast.net> > Choosing between: > > list(d) or d.keys() > > Which is the one obvious way of turning a dictionary into a list? > IMO, list(d) is it. Ugh. I really hope we aren't going to teach people to write list(d) instead of d.keys(). The latter is totally clear. The former requires one to stop and remember that this uses the keys only. This is different for sets, where there's no ambiguity in what list(d) could possibly *mean* (except for the ordering, which is a second-order issue that doesn't affect the *type* of the result). While I *like* being able to write for key in d: ... instead of for key in d.keys(): ... I'm not so sure that having list(d) do anything at all was such a great idea. Not because of TOOWTDI, but because it doesn't tell the reader enough. And the polymorphism properties are really weird: if d could be either a mapping or a sequence, list(d) either loses information or it doesn't. --Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Sat Jan 3 20:37:22 2004 From: python at rcn.com (Raymond Hettinger) Date: Sat Jan 3 20:38:43 2004 Subject: [Python-Dev] Toowtdi: Datatype conversions In-Reply-To: <3FF73692.9010103@v.loewis.de> Message-ID: <006e01c3d263$68e8fa40$bb29a044@oemcomputer> [Raymond Hettinger] > > Which is the one obvious way of turning a dictionary into a list? [Martin v. Loewis] > There is no obvious way to turn a dictionary into a list; > lists and dictionaries are completely different things. [Guido] > Ugh. 
I really hope we aren't going to teach people to write list(d) > instead of d.keys(). Put another way, which is preferable: list(d.iterkeys()) vs. d.keys() list(d.itervalues()) vs. d.values() list(d.iteritems()) vs. d.items() [Martin] > Dictionaries are similar to sets; in fact, Smalltalk has an > Association class (Association key: value:), and Dictionary > is a set of associations. Your perceptiveness is uncanny! This issue arose for me while writing an extension module that implements Smalltalk bags which *do* have meaningful conversions to sets, lists, and dicts: >>> list(b) ['dog', 'dog', 'cat'] >>> dict(b.iter_with_counts) {'dog':2, 'cat':1} >>> set(b) set(['dog', 'cat']) >>> list(b.iter_unique) ['dog', 'cat'] The alternative API is: bag.asList() bag.asDict() bag.asSet() bag.unique() I'm trying to decide which API is the cleanest and has reasonable performance. The former has two fewer methods. The latter has much better performance but won't support casts to subclasses of list/set/dict. > > So, one question is whether set() and frozenset() should grow an > > analogue to the keys() method: [Robert Brewer] > I don't think so, for the reason that .keys() is effectively > a disambiguator as I just described. With sets, there is no mapping, > and therefore no ambiguity. [Martin] > But this is a completely different issue! For sets, there is an > obvious way (if you accept that the list will have the same elements > in arbitrary order), then > > list(a_set) > > is the most obvious way, and it should work fastest. Right! Unfortunately, it can never be as fast as a set.elements() method -- the underlying d.keys() method has too many advantages (looping with an in-lined version of PyDict_Next(), knowing the size of the dictionary, writing with PyList_SET_ITEM, and having all steps in-lined with no intervening function calls).
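The bag type Raymond sketches here is recognizably similar to collections.Counter, which arrived years later (Python 2.7/3.1) and settled the API question with methods rather than constructor magic:

```python
from collections import Counter

b = Counter(["dog", "dog", "cat"])

print(sorted(b.elements()))  # ['cat', 'dog', 'dog'] - with multiplicity
print(dict(b))               # {'dog': 2, 'cat': 1}  - counts as a dict
print(sorted(set(b)))        # ['cat', 'dog']        - unique elements
```

Note the correspondence is loose: Counter's list(b) yields unique elements, while the multiplicity-preserving iteration Raymond spells list(b) became the elements() method.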
> > Another bright idea is to support faster datatype conversion > > by adding an optional __len__() method to the iteration > > protocol so that list(), tuple(), dict(), and set() could > > allocate sufficient space for loading any iterable that > > knows its own length. [Martin] > That is useful, also for list comprehension. > > > The advantages are faster type conversion (by avoiding resizing), > > keeping the APIs decoupled, and keeping the visible API thin. > > This disadvantage is that it clutters the C code with special > > case handling and that it doesn't work with generators or > > custom iterators (unless they add support for __len__). > > I see no reason why it should not work for custom iterators. > For generators, you typically don't know how many results you > will get in the end, so it is no loss that you cannot specify > that. That makes sense. Looking at the code for list_fill, I see that some length checking is already done but only if the underlying object fills sq_length. I think that check should be replaced by a call to PyObject_Size(). That leaves a question as to how to best empower the dictionary constructor. If the source has an underlying dictionary (a Bag is a good example), then nothing beats PyDict_Copy(). For sets, that only works if you accept the default value of True. The set constructor has the same issue when the length of the iterable is knowable. The problem is that there is no analogue to PyList(n) which returns a presized collection. On a separate issue, does anyone care that dict.__init__() has update behavior instead of replace behavior like list.__init__()? 
Raymond From python at rcn.com Sat Jan 3 20:50:13 2004 From: python at rcn.com (Raymond Hettinger) Date: Sat Jan 3 20:50:47 2004 Subject: [Python-Dev] Toowtdi: Datatype conversions In-Reply-To: <200401040115.i041F4n29619@c-24-5-183-134.client.comcast.net> Message-ID: <006f01c3d265$185d3f80$bb29a044@oemcomputer> [Raymond Hettinger] > > Which is the one obvious way of turning a dictionary into a list? [Martin v. Loewis] > There is no obvious way to turn a dictionary into a list; lists and > dictionaries are completely different things. [Guido] > Ugh. I really hope we aren't going to teach people to write list(d) > instead of d.keys(). Put another way, which is preferable: list(d.iterkeys()) vs. d.keys() list(d.itervalues()) vs. d.values() list(d.iteritems()) vs. d.items() [Martin] > Dictionaries are similar to sets; in fact, Smalltalk as an Association > class (Assocation key: value:), and Dictionary is a set of > associations. Your perceptiveness is uncanny! This issue arose for me while writing an extension module that implements Smalltalk bags which *do* have meaningful conversions to sets, lists, and dicts: >>> list(b) ['dog', 'dog', 'cat'] >>> dict(b.iter_with_counts) {'dog':2, 'cat':1} >>> set(b) set(['dog', 'cat']) >>> list(b.iter_unique) ['dog', 'cat'] The alternative API is: bag.asList() bag.asDict() bag.asSet() bag.unique() I'm trying to decide which API is the cleanest and has reasonable performance. The former has two fewer methods. The latter has much better performance but won't support casts to subclasses of list/set/dict. [Raymond] > > So, one question is whether set() and frozenset() should grow an > > analogue to the keys() method: [Robert Brewer] > I don't think so, for the reason that .keys() is effectively > a disambiguator as I just described. With sets, there is no mapping, > and therefore no ambiguity. [Martin] > But this is a completely different issue! 
For sets, there is an > obvious way (if you accept that the list will have the same elements > in arbitrary order), then > > list(a_set) > > is the most obvious way, and it should work fastest. Right! Unfortunately, it can never be as fast as a set.elements() method -- the underlying d.keys() method has too many advantages (looping with an in-lined version of PyDict_Next(), knowing the size of the dictionary, writing with PyList_SET_ITEM, and having all steps in-lines with no intervening function calls). > > Another bright idea is to support faster datatype conversion by > > adding an optional __len__() method to the iteration protocol so > > that list(), tuple(), dict(), and set() could allocate sufficient > > space for loading any iterable that knows its own length. [Martin] > That is useful, also for list comprehension. > > > The advantages are faster type conversion (by avoiding resizing), > > keeping the APIs decoupled, and keeping the visible API thin. This > > disadvantage is that it clutters the C code with special case > > handling and that it doesn't work with generators or custom > > iterators (unless they add support for __len__). > > I see no reason why it should not work for custom iterators. For > generators, you typically don't know how many results you will get in > the end, so it is no loss that you cannot specify that. That makes sense. Looking at the code for list_fill, I see that some length checking is already done but only if the underlying object fills sq_length. I think that check should be replaced by a call to PyObject_Size(). That leaves a question as to how to best empower the dictionary constructor. If the source has an underlying dictionary (a Bag is a good example), then nothing beats PyDict_Copy(). For sets, that only works if you accept the default value of True. The set constructor has the same issue when the length of the iterable is knowable. 
The problem is that there is no analogue to PyList(n) which returns a presized collection. On a separate issue, does anyone care that dict.__init__() has update behavior instead of replace behavior like list.__init__()? Raymond From jcarlson at uci.edu Sat Jan 3 21:01:01 2004 From: jcarlson at uci.edu (Josiah Carlson) Date: Sat Jan 3 21:03:23 2004 Subject: [Python-Dev] Re: PEP 324: popen5 - New POSIX process module In-Reply-To: <6.0.1.1.2.20040104003136.022157d8@torment.chelsea.private> References: <20040103105257.FAA5.JCARLSON@uci.edu> <6.0.1.1.2.20040104003136.022157d8@torment.chelsea.private> Message-ID: <20040103165802.E77C.JCARLSON@uci.edu> [Barry] > When it does support windows please make it work the same on all platforms. > The existing popen code for unix is buggy and not compatible with the > windows version or the docs. Agreed. > The win32all extension is a thin wrapper over the windows API. The > proposed popen5 code would simply be some windows specific code that > calls windows API directly. There is no code in win32all that would > be needed to be duplicated as far as I can see. Did I miss something? I was not familliar with the structure of win32all. My vote goes for 'add in the required C portions for windows support'. And 'make all versions work the same'. [Peter in regards to asyncore] > Probably not. The description says: > > "This module provides the basic infrastructure for writing asynchronous > socket service clients and servers." > > It's not obvious to me how this module could be use as a "shell backquote" > replacement (which is what communicate() is about). It's probably possible > though; I haven't tried. Even if this is possible I guess we need some > kind of "entry" or "wrapper" method in the popen module to simplify things > for the user. My guess is that an communicate() method that uses asyncore > would be as long/complicated as the current implementation. The current > implementation is only 68 lines, including comments. 
Using asyncore in *nix to do file IO doesn't seem to be that bad. If my memory serves me correctly, it was fairly trivial. I tried to do the same thing on Windows (to get stdin, stdout, and stderr to be non-blocking), but select doesn't work for file handles or pipes in Windows. I've actually got a bit of other code for buffered IO on sockets that is an asynchat work-alike, is around 70 lines, and uses asyncore. Using asynchat itself and setting a proper terminator would be easy, less than 20 lines (by my estimation). You include all the logic of asyncore.poll in popen5.Popen.communicate. Using asyncore.poll may be to your benefit. One thing to note is that string concatenation is not necessarily fast. That is: STR += DATA can be slow for an initially large STR. Using a list with list.append and ''.join(list) is significantly faster for large buffers. I have been contemplating submitting a patch for asynchat that makes it behave better for large buffers. The only issue that I can see in using asyncore or asynchat is that each of {stdin, stdout, stderr} may need to have its own instance of asyncore.dispatcher or asynchat.async_chat. That would be pretty ugly. I still like the idea of popen5, but I think you may want to at least take a look at asyncore and friends. - Josiah From martin at v.loewis.de Sat Jan 3 21:21:26 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Sat Jan 3 21:21:41 2004 Subject: [Python-Dev] Toowtdi: Datatype conversions In-Reply-To: <006e01c3d263$68e8fa40$bb29a044@oemcomputer> References: <006e01c3d263$68e8fa40$bb29a044@oemcomputer> Message-ID: <3FF778A6.6020406@v.loewis.de> Raymond Hettinger wrote: > Put another way, which is preferable: > > list(d.iterkeys()) vs. d.keys() > list(d.itervalues()) vs. d.values() > list(d.iteritems()) vs. d.items() Clearly, the direct methods. Why do you ask?
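Picking up Josiah's note above about buffer growth: the two accumulation strategies can be sketched side by side (illustrative only, not asynchat's actual code):

```python
def read_by_concat(chunks):
    # Worst case quadratic: each += may copy the whole accumulated
    # buffer before appending the new chunk.
    data = ""
    for chunk in chunks:
        data += chunk
    return data

def read_by_join(chunks):
    # Linear: collect the pieces in a list, then join exactly once.
    parts = []
    for chunk in chunks:
        parts.append(chunk)
    return "".join(parts)
```

Both return the same string; the difference only shows up in running time once the buffer gets large.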
> The alternative API is: > > bag.asList() > bag.asDict() > bag.asSet() > bag.unique() For bags, I would say that this API is most appropriate, from a pure Python point of view. Of course, as you are modelling Smalltalk, you should let yourself be guided by the methods that Smalltalk provides (which is just asSet, asArray, and sortedByCount, AFAICT). It appears that Smalltalk has no way of converting the Bag to a Dictionary. I may be missing something here, since that is an obvious conversion - but then, also an unnecessary one. > The former has two fewer methods. The latter has much better > performance but won't support casts to subclasses of list/set/dict. Users in need of such conversions could always convert the result of some .as method. >> list(a_set) >> >>is the most obvious way, and it should work fastest. > > > Right! Unfortunately, it can never be as fast as a set.elements() > method -- the underlying d.keys() method has too many advantages Well, list construction could special-case sets. > That leaves a question as to how to best empower the dictionary > constructor. Why is that an interesting question, again (for the set case)? Regards, Martin From tim.one at comcast.net Sat Jan 3 21:53:54 2004 From: tim.one at comcast.net (Tim Peters) Date: Sat Jan 3 21:53:56 2004 Subject: [Python-Dev] Switch to VC.NET 7.1 completed In-Reply-To: <3FF5E550.3030006@v.loewis.de> Message-ID: [Martin v. Loewis] > ... > I have moved the VC6 files to PC/VC6. I have no plans of > maintaining them, and they are as they were last in > PCBuild. I see no problem though with somebody else > maintaining them for the time being, if somebody > volunteers. A clean checkout of VC6 now builds as well as it did before this. I didn't update it to use any newer versions of external modules, and am out of time for this so won't. _sre.c still generates lots of warnings about mixed signed/unsigned comparisons. test_bsddb3 still has tons of failures on Win98SE (many more than in 2.3.3).
test_unicode_file still fails on Win98SE: test_unicode_file test test_unicode_file failed -- Traceback (most recent call last): File "C:\CODE\PYTHON\lib\test\test_unicode_file.py", line 142, in test_single_files self._test_single(TESTFN_UNICODE) File "C:\CODE\PYTHON\lib\test\test_unicode_file.py", line 116, in _test_single self._do_single(filename) File "C:\CODE\PYTHON\lib\test\test_unicode_file.py", line 32, in _do_single os.utime(filename, None) UnicodeEncodeError: 'ascii' codec can't encode characters in position 6-7: ordinal not in range(128) test_urllib2 still fails on Win98SE: test_urllib2 test test_urllib2 failed -- Traceback (most recent call last): File "C:\CODE\PYTHON\lib\test\test_urllib2.py", line 345, in test_file r = h.file_open(Request(url)) File "C:\CODE\PYTHON\lib\urllib2.py", line 1058, in file_open return self.open_local_file(req) File "C:\CODE\PYTHON\lib\urllib2.py", line 1073, in open_local_file stats = os.stat(localfile) OSError: [Errno 2] No such file or directory: '\\test.txt' test___all__ just started failing on Win98SE, but I assume that's due to a very recent checkin from Barry, and that it will fail the same way everywhere now: test___all__ test test___all__ failed -- Traceback (most recent call last): File "C:\CODE\PYTHON\lib\test\test___all__.py", line 69, in test_all self.check_all("base64") File "C:\CODE\PYTHON\lib\test\test___all__.py", line 42, in check_all exec "from %s import *" % modname in names File "", line 1, in ? AttributeError: 'module' object has no attribute 'freenet_b64encode' From pje at telecommunity.com Sat Jan 3 22:28:46 2004 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Sat Jan 3 22:28:22 2004 Subject: [Python-Dev] Relaxing Unicode error handling In-Reply-To: <00ec01c3d246$07594360$3da48490@neil> References: <5.1.0.14.0.20040103102300.04171880@mail.telecommunity.com> <3FF6FFE3.60608@v.loewis.de> Message-ID: <5.1.0.14.0.20040103221807.03f7b990@mail.telecommunity.com> At 09:07 AM 1/4/04 +1100, Neil Hodgson wrote: > The problem here to me is that the failure is input data dependent so >often hidden. There should be a way to remove this dependency on input data. >Is there any way to specify that implicit conversions raise an exception? >Perhaps by setting the default encoding to "undefined". Then these >developers can find and fix their bugs before delivery to customers. Gosh, that sure would have made my last i18n project go faster, and I'd have had more confidence that I'd found all possible conversion issues. OTOH, I would probably have had to make all my string constants Unicode. On the *other* other hand, that wouldn't necessarily have been a bad thing, except for the size, age, and brittleness of the code base in question. From pje at telecommunity.com Sat Jan 3 22:57:32 2004 From: pje at telecommunity.com (Phillip J. Eby) Date: Sat Jan 3 22:57:16 2004 Subject: [Python-Dev] Switching to VC.NET 2003 In-Reply-To: <3FF059A1.8090202@v.loewis.de> References: <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com> <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com> Message-ID: <5.1.0.14.0.20040103225221.035d77f0@mail.telecommunity.com> At 05:43 PM 12/29/03 +0100, Martin v. Loewis wrote: >Phillip J. Eby wrote: > >>Was the question of mingw32-built extensions ever resolved? That is, >>will we be able to build extensions for the standard Windows Python using >>the distutils' "mingw32" compiler, as is possible with Python 2.2? > >I don't know; I don't use mingw32. > >OTOH, I cannot see what the problem might be - except that you will >have to link with msvcr71.dll at the end, not with msvcrt40.dll. 
> >>If this hasn't been resolved, and somebody can send me a binary for a >>.NET-built Python, I'll be happy to test. > >Sure: Please try my installer at > >http://www.dcl.hpi.uni-potsdam.de/home/loewis/python2.4.0.msi I just had a chance to try this out. One problem: pyconfig.h is missing from the installation. I copied over my pyconfig.h from Python 2.2, and was able to compile all my target extensions using the distutils 'mingw32' compiler selection. However, they crash the interpreter immediately upon import. >Notice that this doesn't include msvcr71.dll, which I could provide >in private. > >It also does not include msvcr71.lib - I have no clue how you would >get a copy of that. But then, somehow, msvcrt40.lib (or is it >msvcrt40.a) must have gotten into mingw32 also. I'm guessing that this is related to the coredumps. I probably need to update my cygwin installation. Note that one doesn't explicitly link the msvcrt40 with earlier Pythons, so finding out how to change it to the 71 version is going to be... interesting. Will let you know what I come up with. From guido at python.org Sat Jan 3 23:07:40 2004 From: guido at python.org (Guido van Rossum) Date: Sat Jan 3 23:07:49 2004 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib timeit.py, 1.15, 1.16 In-Reply-To: Your message of "Sat, 03 Jan 2004 19:47:54 PST." References: Message-ID: <200401040407.i0447es29897@c-24-5-183-134.client.comcast.net> > Make timings more consistent by temporarily disabling GC. Shouldn't this be an option at best? What if I want to time some code that expects GC to be on? 
--Guido van Rossum (home page: http://www.python.org/~guido/) From andymac at bullseye.apana.org.au Sat Jan 3 21:41:45 2004 From: andymac at bullseye.apana.org.au (Andrew MacIntyre) Date: Sat Jan 3 23:37:46 2004 Subject: [Python-Dev] Are we collecting benchmark results across machines In-Reply-To: References: Message-ID: <20040104133013.Q3659@bullseye.apana.org.au> System is an AMD Athlon XP1600+ (1.4GHz) with 512MB PC2100 RAM, running FreeBSD 4.8. ---------------------------------------------------------------- $ python2.3 Python 2.3.3 (#4, Jan 4 2004, 12:39:32) [GCC 2.95.4 20020320 [FreeBSD]] on freebsd4 Type "help", "copyright", "credits" or "license" for more information. >>> ^D $ make time time python2.3 -O b.py >@out 21.19 real 20.88 user 0.26 sys cmp @out out $ python2.3 /usr/local/lib/python2.3/test/pystone.py Pystone(1.1) time for 50000 passes = 1.89062 This machine benchmarks at 26446.3 pystones/second -- Andrew I MacIntyre "These thoughts are mine alone..." E-mail: andymac@bullseye.apana.org.au (pref) | Snail: PO Box 370 andymac@pcug.org.au (alt) | Belconnen ACT 2616 Web: http://www.andymac.org/ | Australia From pje at telecommunity.com Sat Jan 3 23:49:46 2004 From: pje at telecommunity.com (Phillip J. Eby) Date: Sat Jan 3 23:49:36 2004 Subject: [Python-Dev] SUCCESS! 2.4 extensions w/mingw (was Re: Switching to VC.NET 2003) In-Reply-To: <5.1.0.14.0.20040103225221.035d77f0@mail.telecommunity.com> References: <3FF059A1.8090202@v.loewis.de> <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com> <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com> Message-ID: <5.1.0.14.0.20040103233904.031e00e0@mail.telecommunity.com> (and there was much rejoicing...) At 10:57 PM 1/3/04 -0500, Phillip J. Eby wrote: >At 05:43 PM 12/29/03 +0100, Martin v. Loewis wrote: >> >>Sure: Please try my installer at >> >>http://www.dcl.hpi.uni-potsdam.de/home/loewis/python2.4.0.msi > >I just had a chance to try this out. One problem: pyconfig.h is missing >from the installation. 
> >I copied over my pyconfig.h from Python 2.2, and was able to compile all >my target extensions using the distutils 'mingw32' compiler >selection. However, they crash the interpreter immediately upon import. But, they work now that I'm using the 2.4 pyconfig.h, copied and pasted from viewcvs. :) I didn't realize at first that it lived in 'PC' rather than 'Include' or 'PCBuild'. >>It also does not include msvcr71.lib - I have no clue how you would >>get a copy of that. But then, somehow, msvcrt40.lib (or is it >>msvcrt40.a) must have gotten into mingw32 also. > >I'm guessing that this is related to the coredumps. I probably need to >update my cygwin installation. Note that one doesn't explicitly link the >msvcrt40 with earlier Pythons, so finding out how to change it to the 71 >version is going to be... interesting. Will let you know what I come up with. As it turns out, it was unnecessary to do anything special, once I had the right pyconfig.h. I just used my normal setup.py, after upgrading to all the latest gcc/mingw/libtool/automake/etc. packages of my cygwin installation. I was able to build kjbuckets (Aaron Watters' data structure library, used by gadfly), an older version of ZODB 4's C '_persistence' support module, and the Pyrex-generated C speedups code for PyProtocols. All my tests for these and a handful of other modules passed. It looks like everything's cool for building 2.4 extensions with mingw. My only suggestion at this point would be to add 'pyconfig.h' to be installed in Python24/include. It'd be nice to throw in the libs/libpython24.a file, too, but it's big (603K) and I guess everybody who's been using mingw to this point either knows the procedure for making it by heart, or at least has a bookmark to one of the webpages that explains how to do it. (Such as http://sebsauvage.net/python/mingw.html ) From pje at telecommunity.com Sat Jan 3 23:52:50 2004 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Sat Jan 3 23:52:39 2004 Subject: [Python-Dev] Re: Switching to VC.NET 2003 In-Reply-To: References: <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com> <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com> <3FF059A1.8090202@v.loewis.de> <5.1.1.6.0.20031229123908.023122e0@telecommunity.com> Message-ID: <5.1.0.14.0.20040103235009.01f21990@mail.telecommunity.com> At 12:02 PM 12/30/03 +0000, Paul Moore wrote: >One thing that I imagine does need looking at is modifying distutils >to check whether Python was built with VC7.1, and if so add a >-lmsvcr71 flag to the gcc command line when compiling with mingw. This >can be hacked by hand, but only at the expense of making setup.py >version-specific (or doing your own test in setup.py). No changes needed in either the distutils or setup.py, at least on my machine as of this evening. Apparently mingw is smart enough to figure this out on its own. Yay! Of course, I haven't exactly made an exhaustive test that *all* extensions are compilable with mingw, but I'm at least happy that all of the extensions I normally build for Windows are okay. From python at rcn.com Sun Jan 4 00:35:49 2004 From: python at rcn.com (Raymond Hettinger) Date: Sun Jan 4 00:36:23 2004 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib timeit.py, 1.15, 1.16 In-Reply-To: <200401040407.i0447es29897@c-24-5-183-134.client.comcast.net> Message-ID: <007901c3d284$9c63ad40$bb29a044@oemcomputer> > > Make timings more consistent by temporarily disabling GC. > > Shouldn't this be an option at best? I think not. The default behavior should be the one that gives the most consistent timings between runs. > What if I want to time some code > that expects GC to be on? >>> Timer(stmt, 'import gc; gc.enable()').timeit() If desired, I can add a note to that effect in the docs.
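Written out as a runnable sketch (the setup string executes inside timeit's harness, so re-enabling GC there leaves it on while the statement is timed):

```python
import timeit

stmt = "sum(range(100))"

# Default behavior: the Timer disables GC around the timed loop.
gc_off = timeit.Timer(stmt).timeit(number=1000)

# Opting back in: enable GC from the setup string, which runs just
# before the timed loop inside the harness.
gc_on = timeit.Timer(stmt, "import gc; gc.enable()").timeit(number=1000)
```

Either way the Timer restores the interpreter's previous GC state when it finishes.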
However, GC is such a random event (in terms of when it occurs, what it affects, how much time it takes, and the state of the system) that you would always want it off unless you're specifically measuring GC performance across two sets of code that handle GC differently. GC performance does make a difference when measuring the performance of a whole, memory-intensive application; however, that tends to fall outside the scope and purpose of the module. If you really want an option, I can put it in, but why clutter the API with something that few people care about or understand? An alternative is to stick with my original idea of running a gc.collect() before each timing pass. I would have done that but Tim's comment left me with doubts about whether it was the right thing to do. On the surface it makes sense and empirical tests show that the timings do become more stable. However, trashing the cache before each run may create timings that don't reflect a normal operating environment (though the timeit environment is already anything but normal). Raymond From astrand at lysator.liu.se Sun Jan 4 06:02:59 2004 From: astrand at lysator.liu.se (Peter Astrand) Date: Sun Jan 4 06:03:18 2004 Subject: [Python-Dev] PEP 324: popen5 - New POSIX process module In-Reply-To: <3FF75AFC.3020204@v.loewis.de> Message-ID: On Sun, 4 Jan 2004, Martin v. Loewis wrote: > > "Fixing popen2" would mean a break old applications; exceptions will > > happen, which apps are not prepared of. > > I find that an acceptable incompatibility, and it will likely break > no existing application. Not true. There are lots of apps out there that use fallback commands: they try to execute one, and if it doesn't exist, try another one. (One example is jakarta-gump, see http://cvs.apache.org/viewcvs.cgi/jakarta-gump/python/gump/utils/launcher.py?rev=1.6&view=auto) With the current API, you do this by checking if the return code is 127. No-one is prepared for an exception.
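On a POSIX system the fallback idiom being described looks roughly like this (a sketch; surely_missing_cmd_xyz is a made-up name assumed not to exist on the machine):

```python
import os

def run_first_available(commands):
    # Try each shell command line in turn; shell exit code 127 means
    # "command not found", so treat it as the cue to fall back.
    for cmd in commands:
        status = os.system(cmd)          # a wait status on POSIX
        code = os.WEXITSTATUS(status)
        if code != 127:
            return cmd, code
    return None, 127

picked, code = run_first_available(
    ["surely_missing_cmd_xyz 2>/dev/null", "true"])
```

This also shows the ambiguity at issue: a command that legitimately exits with 127 would be indistinguishable from a missing one.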
The return code stuff is also very problematic, and is another reason to make a new module and not "enhance" the old ones. With the current API (which calls /bin/sh most of the time), some returncodes are overloaded by the shell. The shell uses these return codes:

126: the command was found but is not executable
127: the command was not found
128+n: the command was terminated by signal n

This means that it is currently impossible to use these return codes for programs launched via the current API, since you cannot tell the difference between a 127 generated by a successful call to your command, and a 127 generated by the shell. I don't see how this can be solved by "enhancing" the current functions, without breaking old applications. >Applications usually expect that the program > they start actually exists; it is a good thing that they now can > detect the error of a missing/non-executable application. There are lots of other errors as well, not just missing/non-executable programs. > > There is no one flexible enough. The > > case for redirecting only stderr is just one example; this is simply not > > possible with the current API. > > Can you elaborate? What is the specific problem, how does your preexec > function look like, and how is it used with popen5. I can then show you > how it could be used with popen2, if that was enhanced appropriately. Yes, the preexec function feature could possibly be added to popen2. This is not the problem. > >>Sounds like a new function on the popen2 module. > > > > > > To support all combinations, 12 different functions are necessary. Who > > will remember what popen2.popen11() means? > > Why is that? Just add a single function, with arguments > stdin/stdout/stderr. No need for 12 functions. Then explain the existing > functions in terms of your new function (if possible). Just like popen5.Popen? Yes, that could be done. We would still have the problem with returncode incompatibilities, exceptions and such.
> > With popen5, you can do it *without* using the shell. > > Why is that a good thing?

1) Performance. No need for parsing .bashrc on every call...
2) Security. You can do pipes without having to deal with all the quoting issues.
3) Getting rid of the shell's overloading of return codes.

It's also much more elegant, IMHO. > I think this is the core problem of your approach: You throw away all > past history, and imply that you can do better than all prior > contributors could. Honestly, this is doubtful. In a discussion like this, I think it's important to separate the new API from the new implementation: 1) The new API. If you look at the popen5 implementation and PEP, it's obvious that I haven't thrown away the history. I have tried to take all the good parts from the various existing functions. The documentation contains 140 lines describing how to migrate from the earlier functions. Much of the current API has really never been designed. The API for the functions os.popen, os.system, os.popen2 comes from the old POSIX functions. These were never intended to be flexible, cross-platform, or anything like that. So, it's not hard to do better than these. 2) The new implementation. When I wrote popen5, I took some good ideas out of popen2. The rest of the code is written from scratch. >The current code > is so complicated because implementing pipes is complicated. Let's keep the POSIX stuff separated from the Windows stuff. popen2.py does not depend on posixmodule.c on POSIX systems, and popen2.py is not complicated at all. The popen* stuff for Windows (and OS2 etc) in posixmodule.c is complicated because:

1) It's written in low-level C
2) It contains lots of old DOS stuff
3) It tries to launch the program through the shell (which is always a pain).

> > Well, I don't see how this could be done easily: The current API is not > flexible enough, and some things (like cross-process exceptions) break > compatibility. > > I never said it would be easy.
However, introducing a new popen module > is a major change, and there must be strong indications that the current > API cannot be enhanced before throwing it away. I wouldn't say that introducing a new module is a "major change". Of course, we don't want to end up writing "popen6" in two years, because we've realized that "popen5" is too limited. That's why we should try to get it exactly right this time. I think it would be more useful if we put our energy into trying to accomplish that. > As for breaking compatibility: This is what the PEP should study in > detail. It is sometimes acceptable to break compatibility, if > applications are likely to be improved by the change. *Any* change > can, in principle, break compatibility. Suppose I had an application > that did > > from popen5 import open > > This application might break if your proposed change is implemented, > as a new module is added. So you can't claim "I will break no programs". Isn't this quite a silly example? -- /Peter Åstrand From doko at cs.tu-berlin.de Sun Jan 4 06:06:46 2004 From: doko at cs.tu-berlin.de (Matthias Klose) Date: Sun Jan 4 06:10:54 2004 Subject: [Python-Dev] trying to build 2.3.3 with db-4.2.52 Message-ID: <16375.62406.740857.546472@gargle.gargle.HOWL> Trying to build python2.3.3 with db-4.2.52, I get many failures in various tests: 5 tests failed: test_anydbm test_bsddb test_dircache test_shelve test_whichdb all of the same type: test test_whichdb failed -- Traceback (most recent call last): File "/home/packages/python/2.3/python2.3-2.3.3/Lib/test/test_whichdb.py", line 49, in test_whichdb_name f = mod.open(_fname, 'c') File "/home/packages/python/2.3/python2.3-2.3.3/Lib/dbhash.py", line 16, in open return bsddb.hashopen(file, flag, mode) File "/home/packages/python/2.3/python2.3-2.3.3/Lib/bsddb/__init__.py", line 192, in hashopen d.open(file, db.DB_HASH, flags, mode) DBError: (38, 'Function not implemented -- process-private: unable to initialize environment lock: Function not
implemented') using db-4.1.25, all the tests succeed. any hints? The build environment is a current Debian GNU/Linux unstable i386 system. Matthias From martin at v.loewis.de Sun Jan 4 06:31:21 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Sun Jan 4 06:31:56 2004 Subject: [Python-Dev] Relaxing Unicode error handling In-Reply-To: <5.1.0.14.0.20040103221807.03f7b990@mail.telecommunity.com> References: <5.1.0.14.0.20040103102300.04171880@mail.telecommunity.com> <3FF6FFE3.60608@v.loewis.de> <5.1.0.14.0.20040103221807.03f7b990@mail.telecommunity.com> Message-ID: <3FF7F989.4070601@v.loewis.de> Phillip J. Eby wrote: >> Is there any way to specify that implicit conversions raise an exception? >> Perhaps by setting the default encoding to "undefined". > > > Gosh, that sure would have made my last i18n project go faster, and I'd > have had more confidence that I'd found all possible conversion issues. I don't know whether Neil was actually aware of that feature: In site.py, if you uncomment the block if 0: # Enable to switch off string to Unicode coercion and implicit # Unicode to string conversion. encoding = "undefined" you get precisely that. Regards, Martin From martin at v.loewis.de Sun Jan 4 06:35:54 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Sun Jan 4 06:36:09 2004 Subject: [Python-Dev] Switching to VC.NET 2003 In-Reply-To: <5.1.0.14.0.20040103225221.035d77f0@mail.telecommunity.com> References: <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com> <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com> <5.1.0.14.0.20040103225221.035d77f0@mail.telecommunity.com> Message-ID: <3FF7FA9A.5080903@v.loewis.de> Phillip J. Eby wrote: > I just had a chance to try this out. One problem: pyconfig.h is missing > from the installation. > > I copied over my pyconfig.h from Python 2.2, and was able to compile all > my target extensions using the distutils 'mingw32' compiler selection. 
This is now fixed with today's installer: http://www.dcl.hpi.uni-potsdam.de/home/loewis/python2.4.0.12421.msi It might be useful to try again. > I'm guessing that this is related to the coredumps. I probably need to > update my cygwin installation. Note that one doesn't explicitly link > the msvcrt40 with earlier Pythons, so finding out how to change it to > the 71 version is going to be... interesting. Will let you know what I > come up with. I believe mingw has a default crt it links with, and you probably need to change that. Regards, Martin From martin at v.loewis.de Sun Jan 4 06:42:24 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Sun Jan 4 06:42:35 2004 Subject: [Python-Dev] SUCCESS! 2.4 extensions w/mingw (was Re: Switching to VC.NET 2003) In-Reply-To: <5.1.0.14.0.20040103233904.031e00e0@mail.telecommunity.com> References: <3FF059A1.8090202@v.loewis.de> <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com> <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com> <5.1.0.14.0.20040103233904.031e00e0@mail.telecommunity.com> Message-ID: <3FF7FC20.6060406@v.loewis.de> Phillip J. Eby wrote: > It looks like everything's cool for building 2.4 extensions > with mingw. Very good! You should still double-check what DLLs your .pyds depend on; this is best done with depends.exe (if you have that). GNU objdump may also be able to do that. > My only suggestion at this point would be to add 'pyconfig.h' to be > installed in Python24/include. Done (see other message). > It'd be nice to throw in the > libs/libpython24.a file, too, but it's big (603K) and I guess everybody > who's been using mingw to this point either knows the procedure for > making it by heart, or at least has a bookmark to one of the webpages > that explains how to do it. It would also require to have a certain amount of cygwin on the packaging machine, right? 
If the procedure could be modified to only use tools available on the target machine, I could look into generating that library at installation time. Alternatively, distutils could be modified to generate it on first use. Regards, Martin From pyth at devel.trillke.net Sun Jan 4 06:43:19 2004 From: pyth at devel.trillke.net (Holger Krekel) Date: Sun Jan 4 06:43:23 2004 Subject: [Python-Dev] Enhancing PEP 310 (or conflating, your decision :-) Message-ID: <20040104114319.GB6482@solar.trillke.net> hello python-dev, and hello Michael and Paul, here are, eventually, some remarks, ideas and food for thought regarding PEP 310. ++ with with yield The 'with' as proposed by PEP 310 implements semantics that you can today implement with a try-finally construct. (see the PEP for explanation). But I think that we can make PEP 310 do better than try-finally in that we can allow yield inside with-blocks. Let's recap PEP 310 quickly: __enter__ is called upon entering a with code block __exit__ is called upon leaving a with code block (should be renamed '__leave__' which I do subsequently :-) Compared to the try-finally statement we explicitly have some crucial information: the __enter__ method which encapsulates the prepare/acquire code for the block. Thus we could specifically allow:

    with handler():
        for x in someiter:
            yield x

and define the semantics so that handler.__enter__ is called every time the generator frame resumes execution and handler.__leave__ when it is suspended/destroyed [*]. (It *may* also be feasible to indicate the specific reason for leaving the code block (suspend/finish) to the __leave__ method or probably better to call another method). PEP325 proposes a coarser-grained "close" method on generators that is triggered when the generator has finished work. In contrast, the above semantic extension of PEP 310 allows finer-grained locking as seen when using locks/critical regions as it directly allows yield within try-finally-alike semantics.
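The resume/suspend semantics Holger proposes can be approximated today as a plain wrapper generator, which may make the intent easier to see (a sketch only: reacquiring and _Log are made-up names, and this is ordinary code, not the PEP 310 syntax):

```python
def reacquiring(handler, gen):
    # Call handler.__enter__() each time the wrapped generator resumes
    # and handler.__leave__() each time it suspends or finishes, so a
    # lock held by the handler would be released while the frame sleeps.
    it = iter(gen)
    while True:
        handler.__enter__()
        try:
            try:
                value = next(it)
            except StopIteration:
                return
        finally:
            handler.__leave__()
        yield value

class _Log:
    """Stand-in handler that just records enter/leave events."""
    def __init__(self):
        self.events = []
    def __enter__(self):
        self.events.append("enter")
    def __leave__(self):
        self.events.append("leave")

log = _Log()
items = list(reacquiring(log, iter([1, 2])))
# Two yields plus the final resume give three enter/leave pairs.
```

With a real lock as the handler, the lock is held only while the generator's frame is actually executing, which is the finer-grained behavior the mail contrasts with PEP 325's close().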
++ with and exceptions In an independent consideration, it would be useful if we had a way to react to exceptions from the with-object, say with an __except__ method that retrieves exceptions from 'its' code block. I am not sure, however, that this can be implemented in a straightforward way (Michael commented on this earlier with a similar sentiment IIRC). But it does seem to be a nice abstraction that you can provide an object that encapsulates how you deal with exceptions because you can change exception-handling code in *one* place instead of possibly many scattered try-finally/except-locations. ++ with loops In another independent consideration, we could extend PEP 310 to enable loop-constructs: the __leave__ method returns True if it wants the interpreter to revisit the code-block. Well, this one is really more food for thought than anything else but ... taking and combining the above ideas we might be able to make

    with line in file(somefn):
        # work with line (probably even yield it ...)
    # the file is guaranteed to be closed here!

work with clean semantics. Hmmm, IIRC Armin Rigo had a similar suggestion to reuse the 'for' statement instead of introducing a new with-keyword altogether. This doesn't sound too far away now, anymore :-) ++ cheers and a happy new year, holger [*] disclaimer: I am not entirely sure if this can be implemented in a straightforward way but it shouldn't be too hard :-) From martin at v.loewis.de Sun Jan 4 06:48:09 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Sun Jan 4 06:48:45 2004 Subject: [Python-Dev] PEP 324: popen5 - New POSIX process module In-Reply-To: References: Message-ID: <3FF7FD79.5000209@v.loewis.de> Peter Astrand wrote: > The popen* stuff for Windows (and OS2 etc) in posixmodule.c is complicated > because: > > 1) It's written in low-level C > > 2) It contains lots of old DOS stuff Such code could be eliminated now; DOS is not supported anymore. Would you like to contribute a patch in that direction?
Regards, Martin From nhodgson at bigpond.net.au Sun Jan 4 07:09:42 2004 From: nhodgson at bigpond.net.au (Neil Hodgson) Date: Sun Jan 4 07:09:02 2004 Subject: [Python-Dev] Relaxing Unicode error handling References: <5.1.0.14.0.20040103102300.04171880@mail.telecommunity.com> <3FF6FFE3.60608@v.loewis.de> <5.1.0.14.0.20040103221807.03f7b990@mail.telecommunity.com> <3FF7F989.4070601@v.loewis.de> Message-ID: <002501c3d2bb$a2e24140$3da48490@neil> Martin v. Loewis: > I don't know whether Neil was actually aware of that feature: In > site.py, if you uncomment the block > ... At the start of writing the mail, I just thought there should be a codec that failed conversions, by the end of the mail I'd found "undefined", and after sending found the site.py switch. When following unexpected conversion failure discussions on c.l.py, I can't recall this being mentioned as a debugging tool. Turning it on found several uninteresting ASCII literal promotion 'problems' in a small working application so the technique could be too noisy for some users. Neil From martin at v.loewis.de Sun Jan 4 07:28:52 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Sun Jan 4 07:29:09 2004 Subject: [Python-Dev] New version of installer Message-ID: <3FF80704.4020808@v.loewis.de> I have put a new version of the installer at http://www.dcl.hpi.uni-potsdam.de/home/loewis/python2.4.0.12421.msi This fixes all problems I know of, in particular: - pyconfig.h, wininst.exe, and msvcr71.dll are installed - an "Advanced" dialog offers choice of registering extensions, and of all-users-vs-just-me installation If you find further problems, please let me know. I'm particularly interested in whether this can be installed and run successfully on Windows 95 (and other W9x versions). If you don't have installer, get it from http://www.microsoft.com/downloads/release.asp?ReleaseID=17343 Regards, Martin From martin at v.loewis.de Sun Jan 4 07:52:53 2004 From: martin at v.loewis.de (Martin v. 
Loewis) Date: Sun Jan 4 07:53:09 2004 Subject: [Python-Dev] New version of installer In-Reply-To: <3FF80704.4020808@v.loewis.de> References: <3FF80704.4020808@v.loewis.de> Message-ID: <3FF80CA5.7030802@v.loewis.de> Martin v. Loewis wrote: > http://www.microsoft.com/downloads/release.asp?ReleaseID=17343 Actually, you should get it from http://www.microsoft.com/downloads/release.asp?ReleaseID=32831 For NT 4 and Win2k, see http://www.microsoft.com/downloads/details.aspx?FamilyID=4b6140f9-2d36-4977-8fa1-6f8a0f5dca8f Regards, Martin From skip at manatee.mojam.com Sun Jan 4 08:01:08 2004 From: skip at manatee.mojam.com (Skip Montanaro) Date: Sun Jan 4 08:06:15 2004 Subject: [Python-Dev] Weekly Python Bug/Patch Summary Message-ID: <200401041301.i04D18ch013001@manatee.mojam.com> Bug/Patch Summary ----------------- 618 open / 4468 total bugs (+81) 224 open / 2514 total patches (+36) New Bugs -------- html problem in mailbox docs (2003-12-30) http://python.org/sf/867556 Configure error with 2.3.3 on AIX 5.2 using gcc 3.3.2 (2003-12-31) http://python.org/sf/868529 HTTPResponse.read(amt) fails when response length is UNKNOWN (2003-12-31) http://python.org/sf/868571 Calling builtin function 'eval' from C causes seg fault. 
(2003-12-31) http://python.org/sf/868706 os.tmpfile seek(0) fails on cygwin (2004-01-01) http://python.org/sf/868743 Multiple problems with GC in 2.3.3 (2004-01-01) http://python.org/sf/868896 setgroups rejects long integer arguments (2004-01-02) http://python.org/sf/869197 Quicktime missing funcitonality (2004-01-02) http://python.org/sf/869649 segmentation fault in test_re (2004-01-03) http://python.org/sf/870120 New Patches ----------- Bad behavior of email.Charset.Charset when locale is tr_TR (2003-12-29) http://python.org/sf/866982 regrtest.py -T for code coverage (2003-12-31) http://python.org/sf/868499 Allow use of embedded Tcl without requiring Tk (2004-01-02) http://python.org/sf/869468 Building on Mac OS X 10.1 (2004-01-02) http://python.org/sf/869555 Add start and end optional args to array.index (2004-01-02) http://python.org/sf/869688 make Demo/scripts/primes.py usable as a module (2004-01-04) http://python.org/sf/870286 Fix docstring for posixpath.getctime (2004-01-04) http://python.org/sf/870287 Closed Bugs ----------- Document sys.path on Windows (2001-09-05) http://python.org/sf/459007 __cdecl / __stdcall unspecified in *.h (2001-10-15) http://python.org/sf/471432 locale.setlocale is broken (2002-01-15) http://python.org/sf/504219 isdir behavior getting odder on UNC path (2002-02-05) http://python.org/sf/513572 long file name support broken in windows (2002-04-10) http://python.org/sf/542314 Windows getpass bug (2002-04-20) http://python.org/sf/546558 os.tmpfile() can fail on win32 (2002-08-05) http://python.org/sf/591104 os.popen() negative error code IOError (2002-08-29) http://python.org/sf/602245 long list in Pythonwin -> weird text (2002-09-03) http://python.org/sf/604387 registry functions don't handle null characters (2003-01-21) http://python.org/sf/672132 PyErr_Warn may cause import deadlock (2003-02-09) http://python.org/sf/683658 print raises exception when no console available (2003-03-19) http://python.org/sf/706263 test_socketserver: 
Fatal error: Invalid thread state (2003-05-26) http://python.org/sf/743692 ipv6 sockaddr is bad (2003-05-27) http://python.org/sf/744164 Failure in comparing unicode string (2003-06-26) http://python.org/sf/761267 No syntax hilite on new file until saved as *.py (2003-07-21) http://python.org/sf/775012 IDLE: "Save As..." keybind (Ctrl+Shift+S) does not work (2003-07-21) http://python.org/sf/775353 Tooltip-window doesn't vanish if... (2003-07-22) http://python.org/sf/775535 idle.py doesn't accept ( in some cases (2003-07-22) http://python.org/sf/775541 Tab / Space Configuration Does Not Work in IDLE (2003-08-05) http://python.org/sf/783887 IDLE starts with no menus (Cygwin build of python2.3) (2003-08-11) http://python.org/sf/786827 Arguments tooltip wrong if def contains tuple (2003-08-20) http://python.org/sf/791968 IDLE / PyOS_InputHook (2003-08-31) http://python.org/sf/798058 color highlighting not working on EDIT windows (2003-09-04) http://python.org/sf/800432 tarfile violates bufsize (2003-09-25) http://python.org/sf/812325 Fatal Python error: GC object already tracked (2003-10-02) http://python.org/sf/816476 Idle fails on loading .idlerc if Home path changes. 
(2003-10-22) http://python.org/sf/828049 Inconsitent line numbering in traceback (2003-10-29) http://python.org/sf/832535 Ctrl+key combos stop working in IDLE (2003-10-31) http://python.org/sf/833957 os.chmod/os.utime/shutil do not work with unicode filenames (2003-11-20) http://python.org/sf/846133 mbcs encoding ignores errors (2003-11-28) http://python.org/sf/850997 HTMLParser parsers AT&T to AT (2003-12-08) http://python.org/sf/856617 Hangs when opening 2nd shell with subprocess under windows (2003-12-23) http://python.org/sf/865014 Closed Patches -------------- Fix for bug 494589 (2002-12-25) http://python.org/sf/658599 Faster commonprefix (OS independent) (2003-02-06) http://python.org/sf/681780 expose lowlevel setlocale (2003-07-24) http://python.org/sf/776854 Tk Dialog Upon Subprocess Socket Error (2003-07-26) http://python.org/sf/778323 Highlight builtins (2003-09-13) http://python.org/sf/805830 cmath.log optional base argument, fixes #823209 (2003-10-18) http://python.org/sf/826074 Small error in test_format (2003-11-25) http://python.org/sf/849252 Modify Setup.py to Detect Tcl/Tk on BSD (2003-11-28) http://python.org/sf/850977 heapq: A way to change the compare function (2003-12-28) http://python.org/sf/866594 From pf_moore at yahoo.co.uk Sun Jan 4 09:57:57 2004 From: pf_moore at yahoo.co.uk (Paul Moore) Date: Sun Jan 4 09:57:59 2004 Subject: [Python-Dev] Re: Switching to VC.NET 2003 References: <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com> <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com> <3FF059A1.8090202@v.loewis.de> <5.1.1.6.0.20031229123908.023122e0@telecommunity.com> <3FF206A0.3070002@v.loewis.de> Message-ID: <7k07lq16.fsf@yahoo.co.uk> "Martin v. Loewis" writes: > Paul Moore wrote: >> I don't know if Martin has already done this, but it needs doing. I'm >> not a distutils expert, but I am willing to look at it in the longer >> term. > > I haven't changed any Python file at all in the process of compiling > with VC 7.1. 
However, Python (since 2.3) indicates the compiler used > to build it in sys.version; I believe this could be used as an > indication to find out whether it was build with VC6 or VC7.1 (dunno > whether it could also tell apart 7.0 and 7.1). I have a patch which seems to do the right thing with respect to adding "-lmsvcr71" to the GCC link command. The resulting build looks OK, and building a simple extension works fine. Apparently, however, Phillip J. Eby got a working DLL without needing this. And indeed I have no problems omitting the -lmsvcr71. So we don't have a real example of the "potential issue" with mismatched MSVC runtimes. Nevertheless, if you do "objdump -p myext.pyd" and look at the import tables, you do see that runtime symbols are satisfied from msvcrt without the -l flag, and from msvcr71 with it (apart from abort, which is always from msvcrt, which seems to be a peculiarity of how the mingw startup code is linked). I've uploaded it to SourceForge (http://www.python.org/sf/870382) and assigned it to you. I hope that's OK. Paul. From barry at barrys-emacs.org Sun Jan 4 10:16:18 2004 From: barry at barrys-emacs.org (Barry Scott) Date: Sun Jan 4 10:16:32 2004 Subject: [Python-Dev] PEP 324: popen5 - New POSIX process module In-Reply-To: <3FF75AFC.3020204@v.loewis.de> References: <3FF75AFC.3020204@v.loewis.de> Message-ID: <6.0.1.1.2.20040104151236.02209ea0@torment.chelsea.private> At 04-01-2004 00:14, Martin v. Loewis wrote: >>With popen5, you can do it *without* using the shell. > >Why is that a good thing? Because using the shell on windows is causing a DOS box window to appear for every popen2/3/4 use in a windowed python program on Windows. Barry From pje at telecommunity.com Sun Jan 4 11:00:45 2004 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun Jan 4 11:00:57 2004 Subject: [Python-Dev] SUCCESS! 
2.4 extensions w/mingw (was Re: Switching to VC.NET 2003) In-Reply-To: <3FF7FC20.6060406@v.loewis.de> References: <5.1.0.14.0.20040103233904.031e00e0@mail.telecommunity.com> <3FF059A1.8090202@v.loewis.de> <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com> <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com> <5.1.0.14.0.20040103233904.031e00e0@mail.telecommunity.com> Message-ID: <5.1.0.14.0.20040104104243.02671b70@mail.telecommunity.com> At 12:42 PM 1/4/04 +0100, Martin v. Loewis wrote: >Phillip J. Eby wrote: > >>It looks like everything's cool for building 2.4 extensions with mingw. > >Very good! You should still double-check what DLLs your .pyds depend >on; this is best done with depends.exe (if you have that). Nope. Apparently that's part of the Visual C toolkit. > GNU objdump >may also be able to do that. Also nope. But I googled and found there's a 'cygcheck' that does basically the same thing. It shows that the .pyd's are still linking w/msvcrt.dll. :( I have no idea how to stop it, either. I don't know what effects this might have, although clearly it hasn't stopped the extensions I built from working. But none of the extensions I tried do anything with e.g. file descriptors, raw memory allocation, or floating point math, and I'd assume that those are the areas most likely to have issues. >>It'd be nice to throw in the libs/libpython24.a file, too, but it's big >>(603K) and I guess everybody who's been using mingw to this point either >>knows the procedure for making it by heart, or at least has a bookmark to >>one of the webpages that explains how to do it. > >It would also require to have a certain amount of cygwin on the >packaging machine, right? > >If the procedure could be modified to only use tools available on the >target machine, I could look into generating that library on >installation time. Alternatively, distutils could be modified to >generate it on first use. 
Probably distutils would be the place to do it, since you may install Python long before you realize you'd like to make C extensions with mingw32. So, the needed tools (pexports.exe and dlltool.exe) wouldn't be available at install time. Doing it at first use has only the issue that the user would have to have rights to write to the Python24/libs directory, even if they're not planning to install anything there. So, distutils would have to build the library in the local build/ directory if it couldn't write to the libs directory. Hm. Really, it begins to sound as though there should either be a separate distutils command to build the library, or just a script you can run to build and/or install the library. I'll consider writing something -- at least a doc patch, if not a script -- for 2.4. From pje at telecommunity.com Sun Jan 4 11:02:50 2004 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun Jan 4 11:03:00 2004 Subject: [Python-Dev] New version of installer In-Reply-To: <3FF80704.4020808@v.loewis.de> Message-ID: <5.1.0.14.0.20040104110132.03a11890@mail.telecommunity.com> At 01:28 PM 1/4/04 +0100, Martin v. Loewis wrote: >I'm particularly interested in whether this can be installed and >run successfully on Windows 95 (and other W9x versions). If you don't >have installer, get it from > >http://www.microsoft.com/downloads/release.asp?ReleaseID=17343 FYI, I was able to run the original (haven't tried the new one yet) on Windows 98SE. I didn't have to download the installer; I guess I'd done it before for something else. I just double-clicked the msi and it worked. It uninstalled cleanly, too. From pje at telecommunity.com Sun Jan 4 11:11:47 2004 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Sun Jan 4 11:12:01 2004 Subject: [Python-Dev] Re: Switching to VC.NET 2003 In-Reply-To: <7k07lq16.fsf@yahoo.co.uk> References: <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com> <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com> <3FF059A1.8090202@v.loewis.de> <5.1.1.6.0.20031229123908.023122e0@telecommunity.com> <3FF206A0.3070002@v.loewis.de> Message-ID: <5.1.0.14.0.20040104110430.03a11ec0@mail.telecommunity.com>

At 02:57 PM 1/4/04 +0000, Paul Moore wrote:
>I have a patch which seems to do the right thing with respect to
>adding "-lmsvcr71" to the GCC link command.
>
>The resulting build looks OK, and building a simple extension works
>fine. Apparently, however, Phillip J. Eby got a working DLL without
>needing this. And indeed I have no problems omitting the -lmsvcr71. So
>we don't have a real example of the "potential issue" with mismatched
>MSVC runtimes. Nevertheless, if you do "objdump -p myext.pyd" and look

Aha, so *that*'s the option I needed ("private headers" didn't look like something to do with dependencies). Dumping my kjbuckets.pyd, I got:

DLL Name: msvcrt.dll
vma:  Hint/Ord  Member-Name  Bound-To
c1f8        36  __dllonexit
c208       147  _errno
c214       510  abort
c21c       522  calloc
c228       537  fflush
c234       547  fputc
c23c       552  free
c244       560  fwrite
c250       603  malloc
c25c       636  sprintf

I didn't think kjbuckets used that many C functions. Apparently there are lots of debugging prints.

protocols/_speedupds.pyd had far fewer:

DLL Name: msvcrt.dll
vma:  Hint/Ord  Member-Name  Bound-To
9258        36  __dllonexit
9268       147  _errno
9274       510  abort
927c       537  fflush
9288       552  free
9290       603  malloc

and ZODB 4's _persistence.pyd has:

DLL Name: msvcrt.dll
vma:  Hint/Ord  Member-Name  Bound-To
6200        36  __dllonexit
6210       108  _assert
621c       147  _errno
6228       510  abort
6230       537  fflush
623c       552  free
6244       603  malloc
6250       666  time

So it really does appear that the msvcr71 linkage would be a very good idea.
>at the import tables, you do see that runtime symbols are satisfied
>from msvcrt without the -l flag, and from msvcr71 with it (apart from
>abort, which is always from msvcrt, which seems to be a peculiarity of
>how the mingw startup code is linked).

Do you think that's likely to cause any issues?

From guido at python.org Sun Jan 4 11:41:07 2004 From: guido at python.org (Guido van Rossum) Date: Sun Jan 4 11:41:12 2004 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib timeit.py, 1.15, 1.16 In-Reply-To: Your message of "Sun, 04 Jan 2004 00:35:49 EST." <007901c3d284$9c63ad40$bb29a044@oemcomputer> References: <007901c3d284$9c63ad40$bb29a044@oemcomputer> Message-ID: <200401041641.i04Gf7b06436@c-24-5-183-134.client.comcast.net>

> > > Make timings more consistent by temporarily disabling GC.
> >
> > Shouldn't this be an option at best?
>
> I think not. The default behavior should be the one that
> gives the most consistent timings between runs.

If the timed code's malloc behavior is predictable (and I say it usually is!) then the GC behavior should (in theory) also be predictable, at least if you do a full GC before starting. But I haven't experimented with this, and from your reversal I take it that you have.

> > What if I want to time some code
> > that expects GC to be on?
>
> >>> Timer(stmt, 'import gc; gc.enable()').timeit()
>
> If desired, I can add a note to that effect in the docs.

Please do.

> However, GC is such a random event (in terms of when it occurs,
> what it affects, how much time it takes, and the state of the
> system) that you would always want it off unless you're specifically
> measuring GC performance across two sets of code that handle GC
> differently.

If I have a piece of code that allocates so much as to trigger GC regularly, I'd want the GC behavior included in the timings.
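Spelled out, the setup-string approach discussed here looks like the following (a minimal sketch against the public timeit API; the statement being timed is an arbitrary placeholder):

```python
import timeit

# timeit disables the cyclic GC around the timed code by default;
# re-enabling it in the setup string puts GC cost back into the numbers.
timer = timeit.Timer("sum(range(100))", setup="import gc; gc.enable()")
elapsed = timer.timeit(number=1000)
print("%.6f seconds for 1000 passes" % elapsed)
```

timeit restores the collector's prior state when the call completes, so the setup string only affects the measured runs.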
> GC performance does make a difference when measuring the performance > of a whole, memory intensive application; however, that tends to > fall outside the scope and purpose of the module. Usually. That's why I'm asking for a way to change the default behavior. > If you really want an option, I can put it in, but why clutter > the API with something that few people care about or understand. Your setup suggestion is fine. > An alternative is to stick with my original idea of running > a gc.collect() before each timing pass. I would have done that > but Tim's comment left me with doubts about whether it was the > right thing to do. On the surface it makes sense and empirical > tests show that the timings do become more stable. However, > trashing the cache before each run may create timings that don't > reflect a normal operating environment (though the timeit environment > is already anything but normal). Here I agree with Tim. --Guido van Rossum (home page: http://www.python.org/~guido/) From kbk at shore.net Sun Jan 4 12:31:44 2004 From: kbk at shore.net (Kurt B. Kaiser) Date: Sun Jan 4 12:31:47 2004 Subject: [Python-Dev] Weekly Python Bug/Patch Summary In-Reply-To: <200401041301.i04D18ch013001@manatee.mojam.com> (Skip Montanaro's message of "Sun, 4 Jan 2004 07:01:08 -0600") References: <200401041301.i04D18ch013001@manatee.mojam.com> Message-ID: <87n093r56n.fsf@hydra.localdomain> Skip Montanaro writes: > Bug/Patch Summary > ----------------- > > 618 open / 4468 total bugs (+81) ^^^ is this supposed to be the change in the period? > 224 open / 2514 total patches (+36) [...] > Closed Bugs > ----------- [...] > Closed Patches > -------------- What is the algorithm used? Many of the IDLE bugs and patches listed here as closed are still open. I looked at a couple of non-IDLE items and they were open, also. 
-- KBK From eppstein at ics.uci.edu Sun Jan 4 13:18:22 2004 From: eppstein at ics.uci.edu (David Eppstein) Date: Sun Jan 4 13:18:27 2004 Subject: [Python-Dev] Re: Relaxing Unicode error handling References: <5.1.0.14.0.20040103102300.04171880@mail.telecommunity.com> <3FF6FFE3.60608@v.loewis.de> Message-ID: In article <3FF6FFE3.60608@v.loewis.de>, "Martin v. Loewis" wrote: > > Or, am I missing the point entirely, and there's some other circumstance > > where one gets UnicodeErrors besides .decode()? If the use case is > > mixing strings and unicode objects (i.e. adding, joining, searching, > > etc.), then I'd have to say a big fat -1, as opposed to merely a -0 for > > having other ways to spell .decode(codec,"ignore"). > > Yes, it is these use cases: Somebody invokes an SQL method, which > happens to return a Unicode string, and then adds a latin-1 byte > string to it. It works for all ASCII byte strings, but then the > customer happens to enter accented characters, and the application > crashes without offering to safe recent changes. > > So I guess that's -1 from you. I am -1 also on allowing the programmer to let such errors pass silently. The world of unicode encodings and decodings is painful enough without giving people more freedom to create broken files. I have an application which has to guess which encoding to use for certain files (the file format specifies an encoding but nobody pays attention), and one of the sample files I found on the web can't be read correctly because it mixes UTF-8 and I think CP1252. I am very much in favor of anything that prevents more such files from being created. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. 
of California, Irvine, School of Information & Computer Science

From jcarlson at uci.edu Sun Jan 4 14:31:53 2004 From: jcarlson at uci.edu (Josiah Carlson) Date: Sun Jan 4 14:34:45 2004 Subject: [Python-Dev] PEP 326 now online Message-ID: <20040104112807.0E8F.JCARLSON@uci.edu>

I believe I've addressed the majority of concerns raised about the All (previously Some) concept and object (the PEP editor seems to agree), and the PEP has been posted as PEP 326. Give it a look:

http://www.python.org/peps/pep-0326.html

As always, comments, questions, and suggestions are appreciated.

- Josiah

From gerrit at nl.linux.org Sun Jan 4 14:44:03 2004 From: gerrit at nl.linux.org (Gerrit Holl) Date: Sun Jan 4 14:45:10 2004 Subject: [Python-Dev] PEP 326 now online In-Reply-To: <20040104112807.0E8F.JCARLSON@uci.edu> References: <20040104112807.0E8F.JCARLSON@uci.edu> Message-ID: <20040104194403.GA16853@nl.linux.org>

[I already sent this in private mail before this message arose here]

PEP 326 reads:
> Users of Python have had the constant None, which represents a lack of
> value, for quite some time (possibly from the beginning, this is
> unknown to the author at the current time). The author believes that
> ``All`` should be introduced in order to represent a functionally
> infinite value, with similar behavior corresponding to None.

I think the described behaviour of None is one of the rarest uses of None. Actually, I didn't even know of the existence of this behaviour. I think None is mainly used to show the absence of *anything*: when a function does not return a value, it returns None; when some variable does not have a value yet, it may be set to None; when an HTTP connection is not opened yet, its socket is None; etc.

> Introducing an ``All`` built-in that works as described does not take
> much effort. A sample Python implementation is included.

I don't think it would be used very often, and ``All`` is not a very straightforward name either.
The implementation is created very quickly; it can be done even more simply than you did, although it won't be repr'ed as All then:

class Infinity(object):
    def __cmp__(self, other):
        return 1

This is usable everywhere you describe. Because ``All`` is not a very straightforward name, I think the namespace cluttering is worse than the advantages. To convince me, I think you'll need to come up with a better name. It isn't ``All`` really, because that makes sense only with container types; it isn't ``Infinity``, because that makes sense only with numbers (strings?), so this is the biggest problem of your proposal.

just my 0.02 nanoEUR,

yours, Gerrit.

-- 41. If any one fence in the field, garden, and house of a chieftain, man, or one subject to quit-rent, furnishing the palings therefor; if the chieftain, man, or one subject to quit-rent return to field, garden, and house, the palings which were given to him become his property. -- 1780 BC, Hammurabi, Code of Law -- Asperger's Syndrome - a personal approach: http://people.nl.linux.org/~gerrit/english/

From jcarlson at uci.edu Sun Jan 4 15:00:37 2004 From: jcarlson at uci.edu (Josiah Carlson) Date: Sun Jan 4 15:02:46 2004 Subject: [Python-Dev] Re: PEP 326: A Case for All In-Reply-To: <20040104174459.GA16449@nl.linux.org> References: <20040104174459.GA16449@nl.linux.org> Message-ID: <20040104113316.0E92.JCARLSON@uci.edu>

> PEP 326 reads:
> > Users of Python have had the constant None, which represents a lack of
> > value, for quite some time (possibly from the beginning, this is
> > unknown to the author at the current time). The author believes that
> > ``All`` should be introduced in order to represent a functionally
> > infinite value, with similar behavior corresponding to None.

[Gerrit Holl]
> I think the described behaviour of None is one of the rarest uses of
> None. Actually, I didn't even know of the existence of this behaviour.
> I think None is mainly used to show the absence of *anything*: when a
> function does not return a value, it returns None; when some variable
> does not have a value yet, it may be set to None; when an HTTP connection
> is not opened yet, its socket is None; etc.

As previously offered by Tim Peters: http://mail.python.org/pipermail/python-dev/2003-December/041374.html

The behavior of None being smaller than anything else was a conscious decision. The use of All doesn't change the behavior of None. How people end up using All in their algorithms is up to them. Does the Python language restrict the usage of None? Of course not. Just as a function can return None (as a signal to the caller), a function can now return All as a signal to the caller. Further uses of All will be left as an exercise for whoever wants to use it.

> > Introducing an ``All`` built-in that works as described does not take
> > much effort. A sample Python implementation is included.
>
> I don't think it would be used very often. ``All`` is not a very
> straightforward name either. The implementation is created very
> quickly: it can be done much more simply than you did, although it
> won't be called All then:
>
> class Infinity(object):
>     def __cmp__(self, other):
>         return 1

The problem with your ``Infinity`` implementation is that it compares greater than itself and doesn't follow the standard Python object convention of:

a = obj; b = eval(repr(obj)); a == b

>>> class Infinity(object):
...     def __cmp__(self, other):
...         return 1
...
>>> I = Infinity()
>>> I == I
0
>>> I is I
1
>>> I
<__main__.Infinity object at 0x007A5720>

The implementation that I offer in the PEP fixes the problem with your ``Infinity`` implementation.

> This is usable everywhere you describe. Because ``All`` is not a very
> straightforward name, I think the namespace cluttering is worse than the
> advantages. To convince me, I think you'll need to come up with a better
> name.
> It isn't ``All`` really, because that makes sense only with
> container types; it isn't ``Infinity``, because that makes sense only
> with numbers (strings?), so this is the biggest problem of your
> proposal.

All is a reasonably short name for the concept of 'bigger than anything'. This was independently offered by two developers on the python-dev mailing list:

Tim Peters: http://mail.python.org/pipermail/python-dev/2003-December/041337.html
David LeBlanc: http://mail.python.org/pipermail/python-dev/2003-December/041386.html

It also has a naming semantic similar to None's. What is the opposite of None (conceptually)? All, of course.

- Josiah

From python at rcn.com Sun Jan 4 15:34:31 2004 From: python at rcn.com (Raymond Hettinger) Date: Sun Jan 4 15:35:27 2004 Subject: [Python-Dev] PEP 326 now online In-Reply-To: <20040104112807.0E8F.JCARLSON@uci.edu> Message-ID: <002c01c3d302$33b374a0$e841fea9@oemcomputer>

> I believe I've addressed the majority of concerns raised about the All
> (previously Some) concept and object (the PEP editor seems to agree),
> and the PEP has been posted as PEP 326.

The name must be something else, like Infinity or something more suggestive of what it does. All, Some, and No already have meanings in logic and would guarantee naming conflicts with existing code.

The use cases are very thin but have a ring of truth -- I think many people have encountered a situation where they needed an initial value guaranteed to be larger than any element in a set.

It would also be helpful to see how well None has fared as a negative-infinity object. Your case would be much stronger if it could be shown that None was widely used in that capacity and that it had simplified the code where it was used.

The opposition to new builtins is fierce, so it would be wise to bolster the use cases with examples from real code which would be improved by having an Infinity constant. A little user-friendliness testing on newbies couldn't hurt either (try the tutor list).
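One shape such a constant could take -- an illustrative sketch only, not PEP 326's reference implementation -- is a singleton that compares greater than everything else and survives the eval(repr(...)) round trip raised earlier in the thread:

```python
class _Max(object):
    """Sentinel comparing greater than any other object (sketch only).

    The class name and the 'Max' spelling are illustrative, not from the PEP.
    """
    _inst = None

    def __new__(cls):
        # Singleton, so 'Max == Max' and 'eval(repr(Max)) is Max' both hold.
        if cls._inst is None:
            cls._inst = object.__new__(cls)
        return cls._inst

    # Rich comparisons instead of __cmp__, so the object is never
    # greater than itself -- the flaw pointed out above.
    def __eq__(self, other): return other is self
    def __ne__(self, other): return other is not self
    def __gt__(self, other): return other is not self
    def __ge__(self, other): return True
    def __lt__(self, other): return False
    def __le__(self, other): return other is self
    __hash__ = object.__hash__  # defining __eq__ alone would drop hashability

    def __repr__(self):
        return 'Max'

Max = _Max()
```

A matching bottom sentinel (a 'Min') would follow the same pattern with the comparisons flipped.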
Also, develop some strong arguments about why it wouldn't be preferable to have this as a recipe or example; to tuck it into some other namespace; or to attach it to a type (like float.infinity, etc).

Under alternatives, be sure to list Alex Martelli's suggestion to use the key= construct with min() and max(). This could potentially trivialize all the issues with writing clean code for finding the biggest and smallest things around:

min([1,2,3,4]) --> 1
min([(a,1), (b,2), (c,3), (d,4)], key=itemgetter(1)) --> (a,1)
max([1,2,3,4]) --> 4
max([(a,1), (b,2), (c,3), (d,4)], key=itemgetter(1)) --> (d,4)

All of the examples in the PEP are subsumed by Alex's proposal. The only example I can think of that still warrants a constant is for NULL objects stored in a database -- even that example is weak because there are likely better ways to handle missing data values.

Raymond Hettinger

From pje at telecommunity.com Sun Jan 4 15:35:46 2004 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun Jan 4 15:36:27 2004 Subject: [Python-Dev] SUCCESS!
2.4 extensions w/mingw (was Re: Switching to VC.NET 2003) In-Reply-To: <5.1.0.14.0.20040104104243.02671b70@mail.telecommunity.com> References: <3FF7FC20.6060406@v.loewis.de> <5.1.0.14.0.20040103233904.031e00e0@mail.telecommunity.com> <3FF059A1.8090202@v.loewis.de> <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com> <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com> <5.1.0.14.0.20040103233904.031e00e0@mail.telecommunity.com> Message-ID: <5.1.0.14.0.20040104142918.03a28e60@mail.telecommunity.com> At 11:00 AM 1/4/04 -0500, Phillip J. Eby wrote: >Hm. Really, it begins to sound as though there should either be a >separate distutils command to build the library, or just a script you can >run to build and/or install the library. I'll consider writing something >-- at least a doc patch, if not a script -- for 2.4. After a bit more research, I have a short Python script to prepare a Windows Python installation for using the mingw32 compiler option. It does not require a separate download of the 'pexports' program; it requires only that the 'nm' and 'dlltool' executables (installed with the Cygwin 'binutils' package) are available on the system PATH, and that the user running it have write access to the PythonNN/libs directory. Should this, or something like it, be distributed with Windows Python? If so, where in the distribution should it go? Should I create a SF patch for this? #=== mingw_setup.py """Create pythonNN.def and libpythonNN.a in 'PythonNN/libs' directory This script makes it possible to use the MinGW compiler tools to build C extensions that work with the standard Windows Python distribution. Before running, you should have installed the MinGW compiler toolset, and placed its 'bin' directory on your PATH. An easy way to do this is to install Cygwin's "binutils", "gcc", and "mingw-*" packages, then run this script from the Cygwin shell. (Be sure to use *Windows* Python, not Cygwin python!) 
Once this script has been run, you should be able to build extensions using distutils in the standard way, as long as you select the 'mingw32' compiler, and the required tools are on your PATH. See the "Installing Python Modules" manual for more information on selecting a compiler. """ from distutils.spawn import find_executable from distutils.sysconfig import get_config_var from distutils import __file__ as distutils_file import os, re, sys if sys.platform=='cygwin': print "Please run this script using Windows python,", print "not Cygwin python. E.g, try:" print print "/cygdrive/c/PythonNN/python", " ".join(sys.argv) print print "(where NN is the major/minor python version number)" sys.exit() basename = 'python%d%d' % sys.version_info[:2] libs_dir = os.path.join(get_config_var('prefix'),'libs') lib_file = os.path.join(libs_dir,basename+'.lib') def_file = os.path.join(libs_dir,basename+'.def') ming_lib = os.path.join(libs_dir,'lib%s.a' % basename) distutils_cfg = os.path.join(os.path.dirname(distutils_file),'distutils.cfg') export_match = re.compile(r"^_imp__(.*) in python\d+\.dll").match nm = find_executable('nm') dlltool = find_executable('dlltool') if not nm or not dlltool: print "'nm' and/or 'dlltool' were not found;" print "Please make sure they're on your PATH." sys.exit() nm_command = '%s -Cs %s' % (nm, lib_file) print "Building", def_file, "using", nm_command f = open(def_file,'w') print >>f, "LIBRARY %s.dll" % basename print >>f, "EXPORTS" nm_pipe = os.popen(nm_command) for line in nm_pipe.readlines(): m = export_match(line) if m: print >>f, m.group(1) f.close() exit = nm_pipe.close() if exit: print "nm exited with status", exit print "Please check that", lib_file, "exists and is valid." 
    sys.exit()

print "Building",ming_lib,"using",dlltool

dlltool_pipe = os.popen(
    "%s --dllname %s.dll --def %s --output-lib %s" %
    (dlltool, basename, def_file, ming_lib)
)
dlltool_pipe.readlines()

exit = dlltool_pipe.close()
if exit:
    print "dlltool exited with status", exit
    print "Unable to proceed."
    sys.exit()

print
print "Installation complete. You may wish to add the following"
print "lines to", distutils_cfg, ':'
print
print "[build]"
print "compiler = mingw32"
print
print "This will make the distutils use MinGW as the default"
print "compiler, so that you don't need to configure this for"
print "every 'setup.py' you run."

From tim.one at comcast.net Sun Jan 4 15:55:20 2004
From: tim.one at comcast.net (Tim Peters)
Date: Sun Jan 4 15:55:26 2004
Subject: [Python-Dev] New version of installer
In-Reply-To: <3FF80704.4020808@v.loewis.de>
Message-ID: 

[Martin v. Loewis]
> I have put a new version of the installer at
>
> http://www.dcl.hpi.uni-potsdam.de/home/loewis/python2.4.0.12421.msi
>
> ...
> I'm particularly interested in whether this can be installed and
> run successfully on Windows 95 (and other W9x versions).

I don't have a Win95 box anymore. It installed fine on Win98SE; I already
had the Windows Installer. Most things worked fine.

I noticed that Lib/site-packages/ wasn't created, but should be.

A number of tests failed that didn't fail in last night's VC6-built
Python 2.4. I expect most of these are due to failing to install some
test files with oddball extensions (that was a seemingly endless battle
with the Wise installer too):

C:\Python24>python lib/test/regrtest.py -u network,largefile
...
test___all__
test test___all__ failed -- Traceback (most recent call last):
  File "C:\PYTHON24\lib\test\test___all__.py", line 69, in test_all
    self.check_all("base64")
  File "C:\PYTHON24\lib\test\test___all__.py", line 42, in check_all
    exec "from %s import *" % modname in names
  File "", line 1, in ?
AttributeError: 'module' object has no attribute 'freenet_b64encode' Also failed under VC6. ... test test_anydbm failed -- errors occurred in test.test_anydbm.AnyDBMTestCase New failure. ... test test_imageop crashed -- exceptions.IOError: [Errno 2] No such file or directory: 'testrgb.uue' New failure. ... test_minidom Traceback (most recent call last): File "C:\PYTHON24\lib\test\test_minidom.py", line 1357, in ? func() File "C:\PYTHON24\lib\test\test_minidom.py", line 112, in testAppendChild dom = parse(tstfile) File "C:\PYTHON24\lib\xml\dom\minidom.py", line 1915, in parse return expatbuilder.parse(file) File "C:\PYTHON24\lib\xml\dom\expatbuilder.py", line 922, in parse fp = open(file, 'rb') IOError: [Errno 2] No such file or directory: 'C:\\PYTHON24\\lib\\test\\test.xml' Traceback (most recent call last): File "C:\PYTHON24\lib\test\test_minidom.py", line 1357, in ? func() File "C:\PYTHON24\lib\test\test_minidom.py", line 36, in testGetElementsByTagName dom = parse(tstfile) File "C:\PYTHON24\lib\xml\dom\minidom.py", line 1915, in parse return expatbuilder.parse(file) File "C:\PYTHON24\lib\xml\dom\expatbuilder.py", line 922, in parse fp = open(file, 'rb') IOError: [Errno 2] No such file or directory: 'C:\\PYTHON24\\lib\\test\\test.xml' Traceback (most recent call last): File "C:\PYTHON24\lib\test\test_minidom.py", line 1357, in ? func() File "C:\PYTHON24\lib\test\test_minidom.py", line 188, in testNonZero dom = parse(tstfile) File "C:\PYTHON24\lib\xml\dom\minidom.py", line 1915, in parse return expatbuilder.parse(file) File "C:\PYTHON24\lib\xml\dom\expatbuilder.py", line 922, in parse fp = open(file, 'rb') IOError: [Errno 2] No such file or directory: 'C:\\PYTHON24\\lib\\test\\test.xml' Traceback (most recent call last): File "C:\PYTHON24\lib\test\test_minidom.py", line 1357, in ? 
func() File "C:\PYTHON24\lib\test\test_minidom.py", line 31, in testParseFromFile dom = parse(StringIO(open(tstfile).read())) IOError: [Errno 2] No such file or directory: 'C:\\PYTHON24\\lib\\test\\test.xml' Traceback (most recent call last): File "C:\PYTHON24\lib\test\test_minidom.py", line 1357, in ? func() File "C:\PYTHON24\lib\test\test_minidom.py", line 195, in testUnlink dom = parse(tstfile) File "C:\PYTHON24\lib\xml\dom\minidom.py", line 1915, in parse return expatbuilder.parse(file) File "C:\PYTHON24\lib\xml\dom\expatbuilder.py", line 922, in parse fp = open(file, 'rb') IOError: [Errno 2] No such file or directory: 'C:\\PYTHON24\\lib\\test\\test.xml' test test_minidom produced unexpected output: ********************************************************************** *** lines 2-20 of actual output doesn't appear in expected output after line 1: + Test Failed: testAppendChild + + Test Failed: testGetElementsByTagName + + Test Failed: testNonZero + + Test Failed: testParseFromFile + + Test Failed: testUnlink + + + + + **** Check for failures in these tests: + testAppendChild + testGetElementsByTagName + testNonZero + testParseFromFile + testUnlink ********************************************************************** New failure. ... test test_rgbimg crashed -- exceptions.IOError: [Errno 2] No such file or directory: 'testrgb.uue' New failure. ... test test_sax crashed -- exceptions.IOError: [Errno 2] No such file or directory: 'test.xml.out' New failure. ... test test_tarfile crashed -- exceptions.IOError: [Errno 2] No such file or directory: 'c:\\windows\\temp\\testtar.dir\\testtar.tar.gz' New failure. ... 
test test_unicode_file failed -- Traceback (most recent call last): File "C:\PYTHON24\lib\test\test_unicode_file.py", line 142, in test_single_files self._test_single(TESTFN_UNICODE) File "C:\PYTHON24\lib\test\test_unicode_file.py", line 116, in _test_single self._do_single(filename) File "C:\PYTHON24\lib\test\test_unicode_file.py", line 32, in _do_single os.utime(filename, None) UnicodeEncodeError: 'ascii' codec can't encode characters in position 6-7: ordinal not in range(128) Old failure. ... test_urllib2 test test_urllib2 failed -- Traceback (most recent call last): File "C:\PYTHON24\lib\test\test_urllib2.py", line 345, in test_file r = h.file_open(Request(url)) File "C:\PYTHON24\lib\urllib2.py", line 1058, in file_open return self.open_local_file(req) File "C:\PYTHON24\lib\urllib2.py", line 1073, in open_local_file stats = os.stat(localfile) OSError: [Errno 2] No such file or directory: '\\test.txt' Old failure. ... 9 tests failed: test___all__ test_anydbm test_imageop test_minidom test_rgbimg test_sax test_tarfile test_unicode_file test_urllib2 Up from 3 failures last night. From pje at telecommunity.com Sun Jan 4 15:57:48 2004 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun Jan 4 15:58:32 2004 Subject: [Python-Dev] Re: PEP 326: A Case for All In-Reply-To: <20040104113316.0E92.JCARLSON@uci.edu> References: <20040104174459.GA16449@nl.linux.org> <20040104174459.GA16449@nl.linux.org> Message-ID: <5.1.0.14.0.20040104153814.05b05c80@mail.telecommunity.com> At 12:00 PM 1/4/04 -0800, Josiah Carlson wrote: >Further uses of All will be left as an exercise to whomever wants to use >it. Um, then why don't those people just write their own 'All'? It's not like they all need to be using the same definition of 'All', so why put it in Python? The "Motivation" section of PEP 326 does not answer this question. Although it claims "hundreds of algorithms", it demonstrates only *one*: finding the minimum of a sequence. 
The example shows three versions of finding a minimum of a sequence, to show that it's easier with 'All'. But that's a straw man argument: the *easiest* way to get the desired behavior is just to use 'min(seq)' and not write any new functions at all! So, essentially, the "Motivation" section might as well not be in the PEP, because it provides no actual motivation at all. You need to find a better algorithm to cover. And don't bother using "find the pair x,y from a sequence of pairs where 'x' is the lowest item" as a substitute, because again the answer is 'min(seq)'. And even if you say, "yes but we don't want 'y' to be compared", there's always:

def min_key(pairs):
    minX, chosenY = pairs[0]
    for x, y in pairs[1:]:
        if x < minX:
            minX, chosenY = x, y
    return minX, chosenY

<5.1.1.6.0.20031229113056.02e151a0@telecommunity.com> <3FF059A1.8090202@v.loewis.de> <5.1.1.6.0.20031229123908.023122e0@telecommunity.com> <3FF206A0.3070002@v.loewis.de> <7k07lq16.fsf@yahoo.co.uk> <5.1.0.14.0.20040104110430.03a11ec0@mail.telecommunity.com> Message-ID: "Phillip J. Eby" writes: >>at the import tables, you do see that runtime symbols are satisfied >>from msvcrt without the -l flag, and from msvcr71 with it (apart from abort, which is always from msvcrt, which seems to be a peculiarity of >>how the mingw startup code is linked). > > Do you think that's likely to cause any issues? I can't prove that linking *anything* via msvcrt causes a problem. And a quick test shows that if user code calls malloc/free/abort, then those are linked from msvcr71. So it's only if the runtime calls the function, but your code doesn't, that the msvcrt version is used. Which is precisely when it doesn't matter (I think). So no, it won't cause any issues as far as I can see. Paul. -- This signature intentionally left blank From mal at egenix.com Sun Jan 4 18:23:04 2004 From: mal at egenix.com (M.-A.
Lemburg) Date: Sun Jan 4 18:23:06 2004 Subject: [Python-Dev] Relaxing Unicode error handling In-Reply-To: <3FF6AB34.1000502@v.loewis.de> References: <3FF6AB34.1000502@v.loewis.de> Message-ID: <3FF8A058.4020904@egenix.com> Martin v. Loewis wrote: > People keep complaining that they get Unicode errors > when funny characters start showing up in their inputs. > > In some cases, these people would apparantly be happy > if Python would just "keep going", IOW, they want to > get moji-bake (garbled characters), instead of > exceptions that abort their applications. > > I'd like to add a static property unicode.errorhandling, > which defaults to "strict". Applications could set this > to "replace" or "ignore", silencing the errors, and > risking loss of data. > > What do you think? -1. Codecs define what the default error handling should be (setting errors to NULL), not the Unicode implementation. From unicodeobject.h: """ /* === Builtin Codecs ===================================================== Many of these APIs take two arguments encoding and errors. These parameters encoding and errors have the same semantics as the ones of the builtin unicode() API. Setting encoding to NULL causes the default encoding to be used. Error handling is set by errors which may also be set to NULL meaning to use the default handling defined for the codec. Default error handling for all builtin codecs is "strict" (ValueErrors are raised). """ In some cases, "strict" is hard coded as errors argument to override the default value and that is done to prevent invalid data from propagating undetected throughout the application. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 04 2004) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... 
http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From jcarlson at uci.edu Sun Jan 4 19:28:36 2004 From: jcarlson at uci.edu (Josiah Carlson) Date: Sun Jan 4 19:30:51 2004 Subject: [Python-Dev] PEP 326 now online In-Reply-To: <002c01c3d302$33b374a0$e841fea9@oemcomputer> References: <20040104112807.0E8F.JCARLSON@uci.edu> <002c01c3d302$33b374a0$e841fea9@oemcomputer> Message-ID: <20040104132228.0E98.JCARLSON@uci.edu> > > I believe I've addressed the majority of concerns had for the All > > (previously Some) concept and object (the PEP editor seems to agree), > > and the PEP has been posted as PEP 326. > > The name must be something else like Infinity or something more > suggestive of what it does. What does the name 'None' suggest about what it does? Just as documentation and fooling around has helped current Python users understand how 'None' works, they can do the same for "All", albeit with a PEP that further describes how it works. Whether or not users realize that None is smaller than anything else is neither here nor there, they've not done the experimentation to understand the extended features of the None object. A similar thing can be said about the void pointer in C/C++ and new/intermediate users of those languages. Heck, the "All" PEP could be used to educate new Python users as to a useful feature of the None object that they will be using during their Python experience. > All, Some, and No already have meanings in logic and would > guarantee naming conflicts with existing code. 'No' is an answer to a question, not a logical idea. "Is this answer correct?" -> [Yes, No] "This answer is correct." -> [True, False] As of right now, [All, Some, No] produce NameErrors. Checking the standard 2.3.2 distribution, seemingly the only examples of any of the three are in comments, documentation, or strings.
As for naming conflicts, I would like to see code that already has examples of All being used. > The use cases are very thin but have a ring of truth -- I think many > people have encountered a situation where they needed an initial > value guaranteed to be larger than any element in a set. Agreed. > It would also be helpful to see how well None has fared as a > negative infinity object. Your case would be much stronger > if it could be shown that None was widely used in that capacity > and that it had simplified the code where it was used. That seems to be a fairly arbitrary request, as most users don't realize that None is smaller than any other object - I'm sure that quite a few readers of python-dev were surprised to discover this. In my own code, I have produced schedulers in which times of None meant now. Seemingly 0 would make as much sense, but there are situations in which the current time advances from a starting time of 0, some setup routines get negative times, etc, resulting in 0 not being early enough. I'm not sure I really want to go snooping around in the hundreds of thousands of lines of code that is out there. Pity that google doesn't have a regular expression search. > The opposition to new builtins is fierce, so it would be wise to > bolster the use cases with examples from real code which would be > improved by having an Infinity constant. A little user friendliness > testing on newbies couldn't hurt either (try the tutor list). Also, > develop some strong arguments about why it wouldn't be preferable > to have this as a recipe or example; to tuck it into some other > namespace; or to attach it to a type (like float.infinity, etc). float.infinity makes no sense conceptually. In terms of being of type float, and being infinity, there already exists an IEEE 754 representation of floating point infinity, which is no larger than 1e309. 10**309 is not the largest value that can be represented by Python.
Pretending that it is, is not only misleading, it is incorrect. > Under alternatives, be sure to list Alex Martelli's suggestion to > use the key= construct with min() and max(). This could potentially > trivialize all the issues with writing clean code for finding the > biggest and smallest things around: > > min([1,2,3,4]) --> 1 > min([(a,1), (b,2), (c,3), (d,4)], key=itemgetter(1)) --> (a,1) > max([1,2,3,4]) --> 4 > max([(a,1), (b,2), (c,3), (d,4)], key=itemgetter(1)) --> (d,4) Min and max are only useful when you have the entire sequence right now and you only need to find the min or max once. In various optimization problems, finding a minimum (or maximum) is done often, and a heap or heap-like structure is used. For example, in Dijkstra's shortest path problem on weighted edges, one starts out by setting the distances of everything in the graph to infinity, setting the starting point distance to zero, then systematically pulling out the node with smallest distance, adding its neighbors, etc. Min and max are arbitrarily limiting, which is why they aren't used. > All of the examples in the PEP are subsumed by Alex's proposal. Certainly, but I don't offer every use of All in the PEP - doing so would take quite a while. > The only example I can think of that still warrants a constant > is for NULL objects stored in a database -- even that example > is weak because there are likely better ways to handle missing > data values. I don't know, I think that None would be suitable for such cases of 'missing data value'. It fits in terms of concept, but I may have missed significant amounts of discussion on the topic as to whether None makes sense in this case. 
Heck, as an alternative, a keyword argument would make sense in a function:

def getresults(db, query, exceptonnull=0): #partial function prototype

Thank you for your input,
 - Josiah

From jcarlson at uci.edu Sun Jan 4 19:29:22 2004
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sun Jan 4 19:31:49 2004
Subject: [Python-Dev] Re: PEP 326: A Case for All
In-Reply-To: <5.1.0.14.0.20040104153814.05b05c80@mail.telecommunity.com>
References: <20040104113316.0E92.JCARLSON@uci.edu>
 <5.1.0.14.0.20040104153814.05b05c80@mail.telecommunity.com>
Message-ID: <20040104143401.0E9B.JCARLSON@uci.edu>

> >Further uses of All will be left as an exercise to whomever wants to use
> >it.
>
> Um, then why don't those people just write their own 'All'? It's not like
> they all need to be using the same definition of 'All', so why put it in
> Python?

Just because something is simple in idea and implementation, doesn't
mean that it shouldn't be included with Python. The entire itertools
module is filled with very useful functions that are simple in idea and
implementation.

The introduction of All into Python would offer users the opportunity to
use an always-compatible implementation of All, without the
arbitrariness of argument placement during comparisons:

>>> class _AllType(object):
...     def __cmp__(self, other):
...         if isinstance(other, self.__class__):
...             return 0
...         return 1
...     def __repr__(self):
...         return 'All'
...
>>> class _AllType2(object):
...     def __cmp__(self, other):
...         if isinstance(other, self.__class__):
...             return 0
...         return 1
...     def __repr__(self):
...         return 'All2'
...
>>> All = _AllType()
>>> All2 = _AllType2()
>>> All > All2
1
>>> All2 > All
1

> The "Motivation" section of PEP 326 does not answer this
> question. Although it claims "hundreds of algorithms", it demonstrates
> only *one*: finding the minimum of a sequence.

I should have probably made it more explicit.
The most visible change when using All, in cases where a logical infinity was needed, is the removal of some "or (cur is None)" blobs. The next most visible change is the increase in understandability of some algorithms that require an infinity initialization. > The example shows three versions of finding a minimum of a sequence, to > show that it's easier with 'All'. But that's a straw man argument: the > *easiest* way to get the desired behavior is just to use 'min(seq)' and not > write any new functions at all! > > So, essentially, the "Motivation" section might as well not be in the PEP, > because it provides no actual motivation at all. You need to find a better > algorithm to cover. I offered the finding the minimum code in order to offer an easily understood algorithm that is simplified by All's introduction. If you would like, I'll put together some code that solves various graph algorithm problems, uses 'All', and is simplified because of it. I did not post any in the PEP because I believed that the implications of All were straightforward. > And don't bother using "find the pair x,y from a sequence of pairs where > 'x' is the lowest item" as a substitute, because again the answer is > 'min(seq)'. And even if you say, "yes but we don't want 'y' to be > compared", there's always: That is quite unfriendly and insulting. Call me an idiot next time, it'll be straighter to the point. As Raymond Hettinger points out, Alex Martelli has a suggestion to allow the "key" keyword to min and max that could do everything required with itemgetter. Or even:

def m(a, b):
    if a[0] < b[0]:
        return a
    return b

reduce(m, seq)

"All" isn't about how many different ways there are to find the minimum of a sequence. "All" is about having a useful and consistent constant.
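A rough sketch of the kind of graph code in question — Dijkstra's shortest-path algorithm seeded with a logical infinity. (This is written in modern notation, with `float('inf')` standing in for the proposed constant; the function and variable names are illustrative only, not taken from the PEP.)

```python
import heapq

INFINITY = float('inf')  # stand-in for the proposed top value ('All'/'Highest')

def dijkstra(graph, start):
    """graph maps node -> list of (neighbor, edge weight) pairs."""
    # The initialization step is the point: every distance begins at a
    # value guaranteed to compare higher than any real path length.
    dist = {node: INFINITY for node in graph}
    dist[start] = 0
    heap = [(0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry; a shorter path was already found
        for neighbor, weight in graph[node]:
            nd = d + weight
            if nd < dist[neighbor]:
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist
```

Any unreachable node simply keeps its initial infinity, with no "or (cur is None)" special-casing anywhere.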
Thank you for your input, - Josiah From john at hazen.net Sun Jan 4 22:14:47 2004 From: john at hazen.net (John Hazen) Date: Sun Jan 4 22:19:12 2004 Subject: [Python-Dev] Re: PEP 326: A Case for All In-Reply-To: <20040104143401.0E9B.JCARLSON@uci.edu> References: <20040104113316.0E92.JCARLSON@uci.edu> <5.1.0.14.0.20040104153814.05b05c80@mail.telecommunity.com> <20040104143401.0E9B.JCARLSON@uci.edu> Message-ID: <20040105031447.GA4437@gate2.hazen.net> * Josiah Carlson [2004-01-04 16:28]: > > I should have probably made it more explicit. The most visible change > when using All, in cases where a logical infinity was needed, is the > removal of some "or (cur is None)" blobs. The next most visible > change is the increase in understandability of some algorithms who > require an infinity initialization. Hi Josiah- Others have commented on the suitability of your name "All". I wanted to point out that in the paragraph above, you resort to calling it "infinity" when talking about it. I think this is telling as to the readability of the name "All". If it doesn't make sense to use the name in discussion, what makes you think it will make sense in a program? Since we're really talking about "infinity" here (its only use is in comparison), why not just call it that? I've heard you make two points for the name "All" -- that it's easier to type than "Infinity", and that it's (in some way) analogous to "None". To me, one of the strengths of python is its emphasis on readability, even (sometimes) at the expense of ease of typing. Also, "All" only has an analogue to one of the uses of None (which, as others have mentioned, is not the primary use). So, to get around the resistance to this because of the name (Guido begged for a better name than "Some", but I think "All" is just as bad), why not just change it to "Infinity"?
If the asymmetry of None and Infinity bothers you, I would suggest also proposing a NegativeInfinity, and we could consider eventually deprecating comparisons on None. -John P.S. - Hello, py-dev'ers. This is my first post, but I've been lurking for quite a while. Thank you all for creating a great language! I'll introduce myself more completely within a few weeks, when I find a way to begin contributing. From pje at telecommunity.com Sun Jan 4 22:32:51 2004 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun Jan 4 22:34:19 2004 Subject: [Python-Dev] Re: PEP 326: A Case for All In-Reply-To: <20040104143401.0E9B.JCARLSON@uci.edu> References: <5.1.0.14.0.20040104153814.05b05c80@mail.telecommunity.com> <20040104113316.0E92.JCARLSON@uci.edu> <5.1.0.14.0.20040104153814.05b05c80@mail.telecommunity.com> Message-ID: <5.1.0.14.0.20040104215301.026695a0@mail.telecommunity.com> At 04:29 PM 1/4/04 -0800, Josiah Carlson wrote: >Just because something is simple in idea and implementation, doesn't >mean that it shouldn't be included with Python. The entire itertools >module is filled with very useful functions that are simple in idea and >implementation. And they're not built-ins, either. >The introduction of All into Python would offer users the opportunity to >use an always-compatible implementation of All, without the >arbitrariness of argument placement during comparisons: > > >>> class _AllType(object): >... def __cmp__(self, other): >... if isinstance(other, self.__class__): >... return 0 >... return 1 >... def __repr__(self): >... return 'All' >... > >>> class _AllType2(object): >... def __cmp__(self, other): >... if isinstance(other, self.__class__): >... return 0 >... return 1 >... def __repr__(self): >... return 'All2' >... > >>> All = _AllType() > >>> All2 = _AllType2() > >>> All > All2 >1 > >>> All2 > All >1 Can you show how/why this is a problem, or is likely to be a problem? 
The point of not putting simple things into the standard library isn't whether they're simple, it's whether enough people (or standard library modules) need to do it, and if there are relatively few people who need to do it, and they *can* do it, why not let them do it themselves? The examples you keep presenting aren't motivating in themselves. If I agreed with you that it was a widely-needed and useful object, the content of your PEP (apart from the Motivation section) would be helpful and informative. But the PEP provides no real motivation, except a repeated assertion that some set of algorithms would be clearer with its use. > > The "Motivation" section of PEP 326 does not answer this > > question. Although it claims "hundreds of algorithms", it demonstrates > > only *one*: finding the minimum of a sequence. > >I should have probably made it more explicit. The most visible change >when using All, in cases where a logical infinity was needed, is the >removal of some "or (cur is None)" blobs. Perhaps I should've made my point more explicit. I find it clearer and more efficient for "find a minimum" to take a value from the items under consideration, than to construct an artificial starting point. That's the only algorithm you presented, so it's the only one I could reasonably comment on. >The next most visible change >is the increase in understandability of some algorithms who require an >infinity initialization. Please illustrate some that would be commonly used, and not better off as say, part of a graph algorithms library. (In the latter case, such a library could certainly provide its own logical infinity implementation, specifically suited to the use cases involved.) >I offered the finding the minimum code in order to offer an easily >understood algorithm that is simplified by All's introduction. But it's not a *motivating* example, because the problem you presented is trivially solved using min(), slicing, or iterators. 
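Spelled out, the iterator version seeds the search from the data itself and needs no artificial starting point at all. (A sketch in modern notation; the name `find_min` is illustrative, not code from the PEP or the thread.)

```python
def find_min(seq):
    it = iter(seq)
    try:
        best = next(it)  # first element is the starting value; no sentinel needed
    except StopIteration:
        raise ValueError("find_min() arg is an empty sequence")
    for item in it:
        if item < best:
            best = item
    return best
```

The empty-sequence case surfaces as an explicit error instead of silently returning a sentinel that was never in the data.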
>If you >would like, I'll put together some code that solves various graph >algorithm problems, use 'All', and are simplified because of it. I did >not post any in the PEP because I believed that the implications of All >were straightforward. The implications are straightforward enough. The motivation for a builtin is not. Based on the discussion so far, it seems to me that this is something that belongs in e.g. the Python Cookbook, or part of a graph processing library if there are lots of graph algorithms that need it. So far, you haven't mentioned anything but graph algorithms; are there other common applications of this object? What are they? Without *concrete* use cases, it's impossible to validate the design. Over the course of these threads, you've asserted various invariants you expect All to meet. These invariants are not documented in the PEP (except implicitly in the implementation), nor are their reasoning explained. But, since the only use case you've given is reimplementing 'min', how can anyone say if the invariants you've asserted are the correct ones for the intended uses? For example, why must All compare equal to other instances of its type? It's certainly not necessary for the reimplementing 'min'. Should it even be allowable to create more than one instance of the type? How can we know the answers to these questions without actual use cases? We can only answer YAGNI, and wait for libraries to pop up where people actually use a logical infinity object. When they do, we'll either find that: 1) the invariants they use are the same, possibly justifying an addition to Python, or 2) the invariants they use are necessarily different, meaning there's no reason to pick one arbitrarily. >"All" isn't about how many different ways there are to find the minimum >of a sequence. "All" is about having a useful and consistant constant. Useful to whom, for what? 
That is the question that isn't even touched on in your PEP, apart from the "minimum of a sequence" example.

By the way, with regards to your references section, it might be a good idea to include this one:

http://mail.python.org/pipermail/python-dev/2003-December/041332.html

whose last paragraph goes to the point I'm making about choosing invariants based on actual use cases.

From tjreedy at udel.edu Mon Jan 5 00:00:45 2004
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon Jan 5 00:00:45 2004
Subject: [Python-Dev] Re: PEP 326 now online
References: <20040104112807.0E8F.JCARLSON@uci.edu>
Message-ID: 

"Josiah Carlson" wrote in message
> Give it a look: http://www.python.org/peps/pep-0326.html

OBJECTION

"All comparisons including None and another object results in None being the smaller of the two." is not exactly true.

>>> class lo:
...     def __cmp__(s, o): return -1
...
>>> ll = lo()
>>> ll < None
1

> > The name must be something else like Infinity or something more
> > suggestive of what it does.
>
> What does the name 'None' suggest about what it does?

What None is designed to do is to denote 'nothing' and the name 'None' does that very well. This is at least 99% of its current usage. That it compares to other objects at all is irrelevant to this main usage and is at least partly an artifact of the former rule that all objects should be comparable, even with arbitrary results. If Python had started with complex numbers and some non-comparabilities, None might just as well have been made incomparable to anything, as I think it maybe should be and might be in 3.0.

The name 'None' suggests almost nothing about the incidental fact that it usually compares low. [Indeed I find no such guarantee in current LibRef 2.3.3 Comparisons (and 2.3.9.7 The Null Object) and it contradicts "Implementation note: Objects of different types except numbers are ordered by their type names".] I consider this lack of suggestiveness to be a negative feature.
Adding another object with a non-suggestive name would only compound what is arguably a mistake. (The mistake being to overload None as both a null object and as half of a lo/hi comparison pair, which is quite a different function.) SUBSTITUTE SUGGESTION I do believe that it would be useful (and arguably justifiable) to have a built-in coordinated pair of hi/lo, top/bottom objects *separate* from None. To avoid adding builtin names, they could be bound to cmp as the attributes cmp.hi and cmp.lo. It would be almost obvious, and trivial to remember, that these are objects that compare hi and lo respectively. They could be instances of object, NoneType, or maybe a new CompType. Their string representations could be ObjHi/ObjLo, NoneHi/NoneLo (if of type object or NoneType), or maybe Highest and Lowest. Since these two objects would function as identities for comparison, I would make them the default start values for max and min reductions, so that max() = max([]) = Highest and min() = min([]) = Lowest, instead of raising exceptions, just as sum([]) = 0. This is, however, an optional part of this suggestion. I believe this alternative meets the goal of your PEP better and would perhaps be more acceptable (but we'll hear about that ;-). I would certainly support it. MOTIVATION Add the following as an example of one of the "hundreds of algorithms that begin by initializing some set of values to a logical (or numeric) infinity" that is *not* covered by built in min and max. "For example, in Dijkstra's shortest path problem on weighted edges, one starts out by setting the distances of everything in the graph to infinity, setting the starting point distance to zero, then systematically pulling out the node with smallest distance, adding its neighbors, etc. Min and max are arbitrarily limiting, which is why they aren't used." Terry J.
Reedy From jcarlson at uci.edu Mon Jan 5 00:33:17 2004 From: jcarlson at uci.edu (Josiah Carlson) Date: Mon Jan 5 00:35:55 2004 Subject: [Python-Dev] Re: PEP 326 now online (the big reply) In-Reply-To: References: <20040104112807.0E8F.JCARLSON@uci.edu> Message-ID: <20040104212330.0EA8.JCARLSON@uci.edu> > > Give it a look: http://www.python.org/peps/pep-0326.html > > OBJECTION > > "All comparisons including None and another object results in None being > the smaller of the two." is not exactly true. > > >>> class lo: > ... def __cmp__(s,o): return -1 > ... > >>> ll=lo() > >>> ll < None > 1 Ah hah! > > > The name must be something else like Infinity or something more > > > suggestive of what it does. > > > > What does the name 'None' suggest about what it does? > > What None is designed to do is to denote 'nothing' and the name 'None' does > that very well. This is at least 99% of its current usage. That it > compares to other objects at all is irrelevant to this main usage and is at > least partly an artifact of the former rule that all objects should be > comparable, even with arbitrary results. If Python had started with > complex numbers and some non-comparabilities, None might just have well > been made incomparable to anything, as I think it maybe should be and might > be in 3.0. > > The name 'None' suggests almost nothing about the incidental fact that it > usually compares low. [Indeed I find no such guarantee in current LibRef > 2.3.3 Comparisons (and 2.3.9.7 The Null Object) and it contradicts > "Implementation note: Objects of different types except numbers are ordered > by their type names".] I consider this lack of suggestiveness to be a > negative feature. Adding another object with a non-suggestive name would > only compound what is arguably a mistake. (The mistake being to overload > None as both a null object and as half of a lo/hi comparison pair, which is > quite a different function.) 
> > > SUBSTITUTE SUGGESTION > > I do believe that it would be useful (and arguably justifiable) to have a > built-in coordinated pair of hi/lo, top/bottom objects *separate* from > None. To avoid adding builtin names, they could be bound to cmp as the > attributes cmp.hi and cmp.lo. It would be almost obvious, and trivial to > remember, that these are objects that compare hi and lo respectively. They > could be instances of object, NoneType, or maybe a new CompType. Their > string representations could be ObjHi/ObjLo, NoneHi/NoneLo (if of type > object or NoneType), or maybe Highest and Lowest. > > Since these two objects would function as identities for comparison, I > would make them the default start values for max and min reductions, so > that max() = max([]) = Highest and min() = min([]) = Lowest, instead of > raising exceptions, just as sum([]) = 0. This is, however, an optional > part of this suggestion. > > I believe this alternative meets the goal of your PEP better and would > perhaps be more acceptible (but we'll hear about that ;-). I would > certainly support it. > > > MOTIVATION > > Add the following as an example of one of the "hundreds of algorithms that > begin by initializing some set of values to a logical (or numeric) > infinity" that is *not* covered by built in min and max. > > "For example, in Dijkstra's shortest path problem on weighted edges, one > starts out by setting the distances of everything in the graph to > infinity, setting the starting point distance to zero, then > systematically pulling out the node with smallest distance, adding its > neighbors, etc. Min and max are arbitrarily limiting, which is why they > aren't used." Terry, I very much like your suggestions. Your "SUBSTITUTE SUGGESTION" is better than my original proposal. I'm off to modify the PEP to address these suggested changes, as well as various clarifications that are necessary. Mind if I put you down as a co-author? 
- Josiah From lists at hlabs.spb.ru Mon Jan 5 05:34:33 2004 From: lists at hlabs.spb.ru (Dmitry Vasiliev) Date: Mon Jan 5 02:33:55 2004 Subject: [Python-Dev] SF patch 864863: Bisect C implementation In-Reply-To: <001001c3d03c$6894c020$e841fea9@oemcomputer> References: <001001c3d03c$6894c020$e841fea9@oemcomputer> Message-ID: <3FF93DB9.2080502@hlabs.spb.ru> Raymond Hettinger wrote: > * if the code is short, move it to the docs -- the itertools module has > pure python equivalents in the docs -- that makes the code accessible > and makes the docs more informative -- this might work for bisect and > heapq where the amount of code is small. +1 -- Dmitry Vasiliev (dima at hlabs.spb.ru) From jcarlson at uci.edu Mon Jan 5 05:06:19 2004 From: jcarlson at uci.edu (Josiah Carlson) Date: Mon Jan 5 05:09:01 2004 Subject: [Python-Dev] Re: PEP 326: A Case for All Message-ID: <20040105003357.0EAA.JCARLSON@uci.edu> Because All has generated as much disagreement as Some did initially, All is definitely a poor name. None being smaller than all other values was an arbitrary decision, and may not be the case in the future. Relying on it is probably not a good idea - especially considering that most people do not know that it could be used in this way. (Also noted by multiple people) If None should not be relied upon, but the existence of a top and bottom value limit makes sense (a few people believe that it does), then really we need two values. Terry Reedy suggested making those objects attributes of the cmp builtin. This has the notable side-benefit of not polluting the builtins namespace (YAY!). Terry also suggested that the attributes be cmp.hi and cmp.lo, which seem to be obvious attribute names for what they are. String representations for them are certainly up for debate, ObjHi/ObjLo, NoneHi/NoneLo, Highest/Lowest, Infinity/NegativeInfinity were offered.
My only concern is that eval(repr(cmp.hi)) == cmp.hi and eval(repr(cmp.lo)) == cmp.lo be true, but I am leaning toward Highest/Lowest or hi/lo. The names Infinity and NegativeInfinity seem to suggest numbers. Numbers can have math done on them. Certainly there are a certain set of behaviors that make sense, but then there are others that don't (like Infinity - Infinity). Whatever the top and bottom value limit end up being called, whether or not they should allow math to be done on them is probably up for debate. Independent implementations of the top/bottom idea suffer from the ordering of operands to a comparison function making a difference in the result. This may be a problem when an algorithm is written with one top value in mind, yet another comes in and could gum up the works. I've got another draft of my PEP that addresses the above comments and concerns, and provides examples. I'm going to sleep on it before submitting the changes. - Josiah From gerrit at nl.linux.org Mon Jan 5 05:55:37 2004 From: gerrit at nl.linux.org (Gerrit Holl) Date: Mon Jan 5 05:56:31 2004 Subject: [Python-Dev] Re: PEP 326: A Case for All In-Reply-To: <20040105003357.0EAA.JCARLSON@uci.edu> References: <20040105003357.0EAA.JCARLSON@uci.edu> Message-ID: <20040105105537.GA27869@nl.linux.org> Josiah Carlson wrote: > Terry also suggested that the attributes be cmp.hi and cmp.lo, which > seem to be obvious attribute names for what they are. String > representations for them are certainly up for debate, ObjHi/ObjLo, > NoneHi/NoneLo, Highest/Lowest, Infinity/NegativeInfinity were offered. > My only concern is that eval(repr(cmp.hi)) == cmp.hi and > eval(repr(cmp.lo)) == cmp.lo be true, but I am leaning toward > Highest/Lowest or hi/lo. What about: repr(cmp.lo) == "cmp.lo", etc.? hi and lo would be instances of a certain class (cmp.extreme?). How exactly the __repr__ of this class would know the name of cmp I don't know.
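[A sketch, in today's Python, of how such repr-round-tripping extremes could behave. `Extreme` and the `cmp_holder` stand-in are invented names for illustration only; nothing here can actually be attached to the real cmp builtin.]

```python
# Hypothetical sketch of the cmp.hi / cmp.lo idea from this thread.
from functools import total_ordering

@total_ordering
class Extreme:
    """A singleton that compares above (high=True) or below everything else."""
    def __init__(self, high, name):
        self._high = high
        self._name = name
    def __eq__(self, other):
        return other is self
    def __hash__(self):
        return id(self)
    def __lt__(self, other):
        if other is self:
            return False
        return not self._high  # the low extreme is less than everything else
    def __repr__(self):
        # repr is the attribute path, so eval(repr(x)) gets the singleton back
        return self._name

class cmp_holder:
    """Stand-in namespace for the proposed cmp.hi / cmp.lo attributes."""

cmp_holder.hi = Extreme(True, "cmp_holder.hi")
cmp_holder.lo = Extreme(False, "cmp_holder.lo")
```

With these in scope, eval(repr(cmp_holder.hi)) evaluates the attribute path and returns the very same singleton, which is the round-trip property being asked about.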
What do we do with cmp.extreme - or cmp.hi.__class__ - is it allowed to make additional instances, as with bool, or not, as with NoneType? I propose the latter, because of TOOWTDI, and we can be certain that they are cmp attributes and cmp is always available (IIRC shadowing builtins will become forbidden eventually?), so it's got fewer problems. > The names Infinity and NegativeInfinity seem to suggest numbers. I don't think so. IIRC, Infinity is not a number... > Numbers can have math done on them. ...because you can't do math on them. I propose: cmp.high.__class__ == cmp.low.__class__ == cmp.extreme yours, Gerrit. -- 180. If a father give a present to his daughter -- either marriageable or a prostitute (unmarriageable) -- and then die, then she is to receive a portion as a child from the paternal estate, and enjoy its usufruct so long as she lives. Her estate belongs to her brothers. -- 1780 BC, Hammurabi, Code of Law -- Asperger's Syndrome - a personal approach: http://people.nl.linux.org/~gerrit/english/ From mwh at python.net Mon Jan 5 07:56:28 2004 From: mwh at python.net (Michael Hudson) Date: Mon Jan 5 07:56:31 2004 Subject: [Python-Dev] Toowtdi: Datatype conversions In-Reply-To: (Robert Brewer's message of "Sat, 3 Jan 2004 13:37:38 -0800") References: Message-ID: <2mhdza7dvn.fsf@starship.python.net> "Robert Brewer" writes: > Raymond Hettinger wrote: >> Choosing between: >> >> list(d) or d.keys() >> >> Which is the one obvious way of turning a dictionary into a list? >> IMO, list(d) is it. > > Except that, "turning a dictionary into a list" as an English phrase is > indeterminate--do you want the keys or the values or both? Indeed. In particular, I often write for k in d: ... and for k,v in d: ... and seem to subconsciously expect Python to work out the rest :-) (this is not a feature request...) Cheers, mwh -- Unfortunately, nigh the whole world is now duped into thinking that silly fill-in forms on web pages is the way to do user interfaces.
-- Erik Naggum, comp.lang.lisp From FBatista at uniFON.com.ar Mon Jan 5 08:04:02 2004 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Mon Jan 5 08:05:57 2004 Subject: [Python-Dev] prePEP "Decimal data type" v0.2 Message-ID: I finished the second version of the prePEP for the Decimal data type. Sorry for the delay, but I restructured it to a new format (thanks list), and changing to a new house and deploying GSM in Argentina are not very time-freeing tasks, :p I don't post it to the list because it was blocked in c.l.p because of its size. You can reach it here: http://www.taniquetil.com.ar/facundo/python/prePEP-Decimal-0.2.html I'll appreciate any suggestion. Thank you! . Facundo From FBatista at uniFON.com.ar Mon Jan 5 08:40:44 2004 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Mon Jan 5 08:42:39 2004 Subject: [Python-Dev] New version of installer Message-ID: Martin v. Loewis wrote: #- I have put a new version of the installer at #- #- http://www.dcl.hpi.uni-potsdam.de/home/loewis/python2.4.0.12421.msi #- ... #- ... #- I'm particularly interested in whether this can be installed and #- run successfully on Windows 95 (and other W9x versions). If you don't #- have installer, get it from I'll try it on w95. Just to get the test as close as possible to your expectations, should I uninstall Python2.3 first? . Facundo ADVERTENCIA La información contenida en este mensaje y cualquier archivo anexo al mismo, son para uso exclusivo del destinatario y pueden contener información confidencial o propietaria, cuya divulgación es sancionada por la ley. Si Ud. No es uno de los destinatarios consignados o la persona responsable de hacer llegar este mensaje a los destinatarios consignados, no está
autorizado a divulgar, copiar, distribuir o retener información (o parte de ella) contenida en este mensaje. Por favor notifíquenos respondiendo al remitente, borre el mensaje original y borre las copias (impresas o grabadas en cualquier medio magnético) que pueda haber realizado del mismo. Todas las opiniones contenidas en este mail son propias del autor del mensaje y no necesariamente coinciden con las de Telefónica Comunicaciones Personales S.A. o alguna empresa asociada. Los mensajes electrónicos pueden ser alterados, motivo por el cual Telefónica Comunicaciones Personales S.A. no aceptará ninguna obligación cualquiera sea el resultante de este mensaje. Muchas Gracias. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20040105/59446e94/attachment.html From pje at telecommunity.com Mon Jan 5 09:19:43 2004 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon Jan 5 09:21:47 2004 Subject: [Python-Dev] Re: PEP 326 now online In-Reply-To: References: <20040104112807.0E8F.JCARLSON@uci.edu> Message-ID: <5.1.0.14.0.20040105090139.039395c0@mail.telecommunity.com> At 12:00 AM 1/5/04 -0500, Terry Reedy wrote: >I do believe that it would be useful (and arguably justifiable) to have a >built-in coordinated pair of hi/lo, top/bottom objects *separate* from >None. To avoid adding builtin names, they could be bound to cmp as the >attributes cmp.hi and cmp.lo. It would be almost obvious, and trivial to >remember, that these are objects that compare hi and lo respectively. They >could be instances of object, NoneType, or maybe a new CompType. Their >string representations could be ObjHi/ObjLo, NoneHi/NoneLo (if of type >object or NoneType), or maybe Highest and Lowest. I'm somewhere between -0 and +0 on the above, so split the difference and call it 0.
:) >Since these two objects would function as identities for comparison, I >would make them the default start values for max and min reductions, so >that max() = max([]) = Highest and min() = min([]) = Lowest, instead of >raising exceptions, just as sum([]) = 0. This is, however, an optional >part of this suggestion. -1 on changing the exception throwing behavior; it would make bugs in programs that aren't expecting these objects harder to find (i.e., because later operations will fail instead of the min or max call). >"For example, in Dijkstra's shortest path problem on weighted edges, one >starts out by setting the distances of everything in the graph to >infinity, setting the starting point distance to zero, then >systematically pulling out the node with smallest distance, adding its >neighbors, etc. Min and max are arbitrarily limiting, which is why they >aren't used." I always thought the natural way to do this was to have a stack (heap) of nodes to be checked, initialized with the starting node, and an initially empty map of node:distance. Setting distances to infinity seems very artificial to me, given that what's actually meant is that the distance is *unknown*. But maybe that's just me. IMO, as we live in a finite reality, infinity doesn't come up that often as a referent in software. Unless of course you're doing mathematics, which in its most perfected forms has nothing to do with the real world. ;) But if you are doing mathematics, then the infinity ought to have the numeric properties of infinity, shouldn't it? For example, adding, multiplying, subtracting from, or dividing infinity ought to still produce infinity, no? This is why I keep saying that having more concrete use cases would be better, and am only "0" on the cmp.hi/lo or Highest/Lowest objects. 
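[A concrete sketch of the heap-plus-map formulation described above; the graph encoding and names are illustrative, not from the PEP. Unreached nodes are simply absent from the distance map, so no infinity sentinel is ever materialized.]

```python
import heapq

def dijkstra(graph, start):
    """Shortest distances from start.

    graph maps node -> {neighbor: edge_weight}.  Nodes that were never
    reached are simply absent from the result, so nothing needs to be
    pre-populated with an 'infinity' value.
    """
    dist = {}                # node -> finalized shortest distance
    heap = [(0, start)]      # (candidate distance, node)
    while heap:
        d, node = heapq.heappop(heap)
        if node in dist:     # already finalized with a distance <= d
            continue
        dist[node] = d
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in dist:
                heapq.heappush(heap, (d + weight, neighbor))
    return dist
```

The heap hands back the unfinalized node with the smallest known distance, which is exactly the "systematically pulling out the node with smallest distance" step, minus the infinity initialization.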
From guido at python.org Mon Jan 5 09:38:12 2004 From: guido at python.org (Guido van Rossum) Date: Mon Jan 5 09:38:16 2004 Subject: [Python-Dev] Re: PEP 326: A Case for All In-Reply-To: Your message of "Mon, 05 Jan 2004 02:06:19 PST." <20040105003357.0EAA.JCARLSON@uci.edu> References: <20040105003357.0EAA.JCARLSON@uci.edu> Message-ID: <200401051438.i05EcCA08017@c-24-5-183-134.client.comcast.net> > Because All has generated as much disagreement as Some did initially, > All is definitely a poor name. It goes beyond that. I don't think that you can convince me that this is a useful feature under *any* name (since you insist that it be different from Infinity). You can revise the PEP all you will, but I think you are wasting your time. The feature is not a good addition to Python. If you want more reasons, consider all of the negative feedback you've received so far. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Jan 5 09:53:32 2004 From: guido at python.org (Guido van Rossum) Date: Mon Jan 5 09:53:38 2004 Subject: [Python-Dev] prePEP "Decimal data type" v0.2 In-Reply-To: Your message of "Mon, 05 Jan 2004 10:04:02 -0300." References: Message-ID: <200401051453.i05ErXM08135@c-24-5-183-134.client.comcast.net> > I finished the second version of the prePEP for the Decimal data type. > > Sorry for the delay, but I restructured it to new (thanks list), and > changing to a new house and deploying GSM in Argentina are not very > time-freeing tasks, :p > > I don't post it to the list because it was blocked in c.l.p because it's > size. You can reach it here: > > http://www.taniquetil.com.ar/facundo/python/prePEP-Decimal-0.2.html > > I'll appreciate any suggestion. Thank you! Very cool. I haven't read it all (I have no time and I'm no expert on this area), but it seems you have worked diligently with the community, and the idea needs to get closer to implementation. 
I suggest that you register with the PEP editor right away to get a PEP number assigned -- there's no reason to have this be considered a "proto-PEP" still. --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Mon Jan 5 10:06:50 2004 From: skip at pobox.com (Skip Montanaro) Date: Mon Jan 5 10:06:53 2004 Subject: [Python-Dev] Relaxing Unicode error handling In-Reply-To: <200401031641.i03GfLL11733@c-24-5-183-134.client.comcast.net> References: <3FF6AB34.1000502@v.loewis.de> <200401031641.i03GfLL11733@c-24-5-183-134.client.comcast.net> Message-ID: <16377.32138.563619.390078@montanaro.dyndns.org> >> I'd like to add a static property unicode.errorhandling, ... >> What do you think? Guido> This is a global default, right? Seems to work for the socket Guido> timeout; I think the case for a global default is similar. Not so fast, pardner. Note that the socket module's timeout would be most beneficial to users of higher level modules (urllib[2]?, httplib, ftplib, etc), not directly by programmers. To the best of my knowledge, none of these modules have been adapted to this new feature yet, so I think it's still a bit premature to extrapolate from socket.defaulttimeout to unicode.errorhandling. Also, as Phillip Eby pointed out, a UnicodeError generally results from an error in my code, not an intractable problem. (That's my experience, anyway.) Skip From mcherm at mcherm.com Mon Jan 5 10:11:10 2004 From: mcherm at mcherm.com (Michael Chermside) Date: Mon Jan 5 10:11:12 2004 Subject: [Python-Dev] Re: PEP 324: popen5 - New POSIX process module Message-ID: <1073315470.3ff97e8e18f9f@mcherm.com> On Sat, 3 Jan 2004, Peter Åstrand wrote: > PEP 324: popen5 - New POSIX process module [...] > There are some issues wrt Windows support. Currently, popen5 does not > support Windows at all. To be able to do so, we must choose between: > > 1) Rely on Mark Hammond's Win32 extensions. [...] > or > 2) [...]
copy the required process handling functions from win32all I STRONGLY support doing ONE of these so that popen5 (gee... we gotta get a better name) works "on all major platforms". Ideally, it would work "out of the box", ie, on all sane installations. Personally, I think managing an extra copy of the code would be a bit of a pain, so I would probably prefer (1) (rely on win32all). But that means that people who write simple cross-platform scripts and utilities can't really use popen5 and expect things to work. (I'm not worried about people who write whole applications in Python... they can go ahead and require win32all or ensure it gets loaded. But if I'm writing some short utility scripts or, worse yet, writing code to be published in a magazine or journal and typed in / downloaded by random users, then I am strongly urged to use only completely portable code.) So I guess I am forced to put up with (2), and the (hopefully minor) headaches of copying Mark's code UNLESS we can somehow ensure that win32all is available on all sane windows installs. If, for instance, it were released with the python.org releases for the Windows platform, (other packagers already include it), then I think we could dispense with the annoying need to duplicate code. -- Michael Chermside From guido at python.org Mon Jan 5 10:12:37 2004 From: guido at python.org (Guido van Rossum) Date: Mon Jan 5 10:12:41 2004 Subject: [Python-Dev] PEP 324: popen5 - New POSIX process module In-Reply-To: Your message of "Sun, 04 Jan 2004 12:02:59 +0100." References: Message-ID: <200401051512.i05FCbG08182@c-24-5-183-134.client.comcast.net> Peter, After seeing the discussion that your PEP has sparked (without reading all of it) I am torn. I agree that Python's subprocess management facilities can use a lot of work. I agree that your code is by and large an improvement over popen2 and friends. But I'm not sure that it is *enough* of an improvement to be accepted into the standard library. 
What I'm looking for in standard library additions is "category-killers". IMO good recent examples are optparse and the logging module. Both were in active use and had independent distributions with happy users before they were accepted into the standard library. Note that in my definition, a category-killer doesn't have to have all conceivable features -- it has to strike the right balance between feature completeness and a set of properties that are usually referred to by words like elegance, simplicity, ease-of-use, ease-of-learning. For example, optparse specifically does not support certain styles of option parsing -- it strives to encourage uniformity in option syntax, and that rules out certain features. (I'm not saying that your popen5 has too many features -- I'm just warning that I'm not asking you to make it a category-killer by adding tons more features. I don't want more features, I want the satisfying feeling that this is the "best" solution, for my very personal definition of "best".) I note that your source code at http://cvs.lysator.liu.se/viewcvs/viewcvs.cgi/popen5/?cvsroot=python-popen5 has some useful comments that were missing from the PEP, e.g. how to replace various older APIs with calls to popen5. It also has some motivation for the API choices you made that are missing from the PEP (e.g. why there are options for setting cwd). So what's missing? For the PEP, I would say that a lot of explanatory and motivational text from the source code comments and doc strings should be added. (PEPs don't have to be terse formal documents that describe the feature in as few words as possible; that can be one section of the PEP, but in general a PEP needs to provide motivation and insight as well as specification.) For the code, I think that a Windows version is essential. It would be okay if a small amount of code (e.g. making some basic Windows APIs available, like CreateProcess) had to be coded in C, as long as the bulk could still be in Python. 
I really like having this kind of code in Python -- it helps understanding the fine details, and it allows subclassing more easily. I also wonder if more support for managing a whole flock of subprocesses might not be a useful addition; this is a common need in some applications. And with this, I'd like to see explicit support (somehow -- maybe through a separate class) for managing "daemon" processes. This certainly is a common need! I would like this to replace all other "high-level" subprocess management facilities, in particular os.system() and os.popen() and the popen2 module, on all platforms. Oh, and also os.spawn*(). But not os.fork() and os.exec*(), since those are the building blocks (on Unix, anyway). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Jan 5 10:15:38 2004 From: guido at python.org (Guido van Rossum) Date: Mon Jan 5 10:15:44 2004 Subject: [Python-Dev] Re: [Python-checkins] python/nondist/peps pep-0325.txt, NONE, 1.1 In-Reply-To: Your message of "Sat, 03 Jan 2004 08:18:13 PST." References: Message-ID: <200401051515.i05FFdr08216@c-24-5-183-134.client.comcast.net> Should I comment on PEP 325 yet? I haven't seen an official announcement. 
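[A historical aside on the popen5 thread above: PEP 324's module was eventually accepted into the standard library as subprocess, and the os.system()/os.popen() replacements Guido asks for came to look roughly like this. These are the modern spellings, not the 2004 draft API; the command run here is just an illustrative print.]

```python
import subprocess
import sys

# os.system("cmd arg") replacement: run a command without involving a
# shell (so no quoting pitfalls) and get its exit status back.
status = subprocess.call([sys.executable, "-c", "print('hello')"])

# os.popen("cmd").read() replacement: capture the child's stdout.
proc = subprocess.Popen(
    [sys.executable, "-c", "print('hello')"],
    stdout=subprocess.PIPE,
)
out, _ = proc.communicate()
```

Because the child is a Popen object rather than a bare file, the caller can also poll it, wait for it, or inspect its return code, which is the kind of whole-flock process management discussed above.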
--Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Mon Jan 5 10:38:56 2004 From: skip at pobox.com (Skip Montanaro) Date: Mon Jan 5 10:40:20 2004 Subject: [Python-Dev] trying to build 2.3.3 with db-4.2.52 In-Reply-To: <16375.62406.740857.546472@gargle.gargle.HOWL> References: <16375.62406.740857.546472@gargle.gargle.HOWL> Message-ID: <16377.34064.576330.366808@montanaro.dyndns.org> Matthias> Trying to build python2.3.3 with db-4.2.52, I get many Matthias> failures in various tests: Matthias> 5 tests failed: Matthias> test_anydbm test_bsddb test_dircache test_shelve test_whichdb Matthias> all of the same type: Matthias> test test_whichdb failed -- Traceback (most recent call last): Matthias> File "/home/packages/python/2.3/python2.3-2.3.3/Lib/test/test_whichdb.py", line 49, in test_whichdb_name Matthias> f = mod.open(_fname, 'c') Matthias> File "/home/packages/python/2.3/python2.3-2.3.3/Lib/dbhash.py", line 16, in open Matthias> return bsddb.hashopen(file, flag, mode) Matthias> File "/home/packages/python/2.3/python2.3-2.3.3/Lib/bsddb/__init__.py", line 192, in hashopen Matthias> d.open(file, db.DB_HASH, flags, mode) Matthias> DBError: (38, 'Function not implemented -- process-private: unable to initialize Matthias> environment lock: Function not implemented') Here's an extra data point. I built Python 2.3.3 with db-4.2.52 last Friday on Solaris 8 and didn't encounter any problems like this. Here's a quick check of test_anydbm: $ python /usr/local/lib/python2.3/test/test_anydbm.py test_anydbm_creation (__main__.AnyDBMTestCase) ... ok test_anydbm_keys (__main__.AnyDBMTestCase) ... ok test_anydbm_modification (__main__.AnyDBMTestCase) ... ok test_anydbm_read (__main__.AnyDBMTestCase) ...
ok ---------------------------------------------------------------------- Ran 4 tests in 0.521s OK Skip From pedronis at bluewin.ch Mon Jan 5 10:45:14 2004 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Mon Jan 5 10:41:36 2004 Subject: [Python-Dev] Re: [Python-checkins] python/nondist/peps pep-0325.txt, NONE, 1.1 In-Reply-To: <200401051515.i05FFdr08216@c-24-5-183-134.client.comcast.net> References: Message-ID: <5.2.1.1.0.20040105163515.030e7c38@pop.bluewin.ch> At 07:15 05.01.2004 -0800, Guido van Rossum wrote: >Should I comment on PEP 325 yet? I haven't seen an official >announcement. I sent that version in October, but David was moving house and, because of an ISP fault on his end, he never got it (though I didn't know that at the time); then I was busy, and I resubmitted it just now because I was puzzled: it was the new year and I didn't like having it sit on my HD any longer. But to be honest, right now I have no time for lengthy discussions. regards. From skip at pobox.com Mon Jan 5 11:44:37 2004 From: skip at pobox.com (Skip Montanaro) Date: Mon Jan 5 11:44:47 2004 Subject: [Python-Dev] Weekly Python Bug/Patch Summary In-Reply-To: <87n093r56n.fsf@hydra.localdomain> References: <200401041301.i04D18ch013001@manatee.mojam.com> <87n093r56n.fsf@hydra.localdomain> Message-ID: <16377.38005.301812.616913@montanaro.dyndns.org> >> Bug/Patch Summary >> ----------------- >> >> 618 open / 4468 total bugs (+81) Kurt> ^^^ is this supposed to be the Kurt> change in the period? Known problem. Someone (I think) volunteered to look at this awhile ago but hasn't yet. >> 224 open / 2514 total patches (+36) Kurt> [...] >> Closed Bugs >> ----------- Kurt> [...] >> Closed Patches >> -------------- Kurt> What is the algorithm used? Many of the IDLE bugs and patches Kurt> listed here as closed are still open. I looked at a couple of Kurt> non-IDLE items and they were open, also. I'm not entirely surprised there are problems. I haven't touched the script in ages.
It's just a little screen scraper I cobbled together from other pieces I had lying about, so if there have been any changes to SF's page layout, this stuff might well break. I don't suppose SF has developed a SOAP or XML-RPC interface by now? I'll take a quick look at the code. If you'd like to contribute your eyeballs as well, have a look at http://www.musi-cal.com/~skip/python/getpybugs.py Skip From guido at python.org Mon Jan 5 11:44:43 2004 From: guido at python.org (Guido van Rossum) Date: Mon Jan 5 11:44:54 2004 Subject: [Python-Dev] Relaxing Unicode error handling In-Reply-To: Your message of "Mon, 05 Jan 2004 09:06:50 CST." <16377.32138.563619.390078@montanaro.dyndns.org> References: <3FF6AB34.1000502@v.loewis.de> <200401031641.i03GfLL11733@c-24-5-183-134.client.comcast.net> <16377.32138.563619.390078@montanaro.dyndns.org> Message-ID: <200401051644.i05GihZ08438@c-24-5-183-134.client.comcast.net> > >> I'd like to add a static property unicode.errorhandling, ... > >> What do you think? > > Guido> This is a global default, right? Seems to work for the socket > Guido> timeout; I think the case for a global default is similar. > > Not so fast, pardner. Note that the socket module's timeout would be > most beneficial to users of higher level modules (urllib[2]?, httplib, > ftplib, etc), not directly by programmers. To the best of my knowledge, > none of these modules have been adapted to this new feature yet, so I think > it's still a bit premature extrapolate from socket.defaulttimeout to > unicode.errorhandling. Also, as Phillip Eby pointed out, a UnicodeError > generally results from an error in my code, not an intractable problem. > (That's my experience, anyway.) Sure. So in your programs, keep the default. But if someone has written a program and decides that they don't want to deal with their (*their*!) 
errors by fixing the code, but rather would continue to go with possibly broken data (and they may well know enough about their application to know that that is harmless), why should we not give them a way to do that? AFAIK all the proposal does is give a way to change the default error handling -- any code that sets an explicit error handling policy will continue to receive exceptions. --Guido van Rossum (home page: http://www.python.org/~guido/) From mcherm at mcherm.com Mon Jan 5 11:52:13 2004 From: mcherm at mcherm.com (Michael Chermside) Date: Mon Jan 5 11:52:15 2004 Subject: [Python-Dev] PEP 326 now online Message-ID: <1073321533.3ff9963ddc65c@mcherm.com> Josiah Carlson writes: > I believe I've addressed the majority of concerns had for the All > (previously Some) concept and object (the PEP editor seems to agree), > and the PEP has been posted as PEP 326. > > Give it a look: http://www.python.org/peps/pep-0326.html > > As always, comments, questions, and suggestions are always appreciated. Thanks Josiah, it looks to me like you did a good job of boiling things down to a single, specific proposal and collecting and clearly stating the arguments in favor. That's what PEP writing is all about. Unfortunately, after reading PEP 326, I find myself convinced that it is NOT a good idea, and I would encourage the BDFL to reject the PEP. My reasons are quite simple: (1) As the PEP demonstrates, creating a value that sorts larger than any other is extremely trivial. (2) The PEP's motivation successfully demonstrates that having a "largest" value for pre-population or sentinel purposes is useful for certain algorithms. But it provides NO reason why each author of such an algorithm could not simply create such an "All" object themselves. (3) The bar is quite high for adding new built-ins. This PEP, for instance, would either break or render quite confusing any program which used the identifier "All".
So, unless someone shows me a convincing argument for why this is more useful when used as a built-in constant rather than being coded by each user, I think this PEP should not be included in Python. Please feel free to include this viewpoint in the "Dissenting Opinion" section of the PEP. -- Michael Chermside From guido at python.org Mon Jan 5 12:44:19 2004 From: guido at python.org (Guido van Rossum) Date: Mon Jan 5 12:44:31 2004 Subject: [Python-Dev] PyCon Reminder: Proposal deadline 1/15 Message-ID: <200401051744.i05HiJN08590@c-24-5-183-134.client.comcast.net> This is a reminder that the deadline for sending proposals for presentations at PyCon DC 2004 is January 15, 2004. We are interested in any and all submissions about uses of Python and the development of the language. Since there is expected to be a strong educational community presence for the next PyCon, teaching materials of various kinds are also encouraged. You can submit your proposal at: http://submit.pycon.org/ For more information about proposals, see: http://www.pycon.org/dc2004/cfp/ If you have further questions about the submission web interface or the format of submissions, please write to: pycon-organizers@python.org We would like to publish all accepted papers on the web. If your paper is accepted and you prepare an electronic presentation (in PDF, PythonPoint or PowerPoint) we will also happily publish that on the web site once PyCon is over. If you don't want to make a formal presentation, there will be a significant amount of Open Space to allow for informal and spur-of-the-moment presentations for which no formal submission is required. There will also be several Lightning Talk sessions (five minutes or less). About PyCon: PyCon is a community-oriented conference targeting developers (both those using Python and those working on the Python project). 
It gives you opportunities to learn about significant advances in the Python development community, to participate in a programming sprint with some of the leading minds in the Open Source community, and to meet fellow developers from around the world. The organizers work to make the conference affordable and accessible to all. PyCon DC 2004 will be held March 24-26, 2004 in Washington, D.C. The keynote speaker is Mitch Kapor of the Open Source Applications Foundation (http://www.osafoundation.org/). There will be a four-day development sprint before the conference. We're looking for volunteers to help run PyCon. If you're interested, subscribe to http://mail.python.org/mailman/listinfo/pycon-organizers Don't miss any PyCon announcements! Subscribe to http://mail.python.org/mailman/listinfo/pycon-announce You can discuss PyCon with other interested people by subscribing to http://mail.python.org/mailman/listinfo/pycon-interest The central resource for PyCon DC 2004 is http://www.pycon.org/ Pictures from last year's PyCon: http://www.python.org/cgi-bin/moinmoin/PyConPhotos I'm looking forward to seeing you all in DC in March!!! --Guido van Rossum (home page: http://www.python.org/~guido/) From trentm at ActiveState.com Mon Jan 5 13:13:15 2004 From: trentm at ActiveState.com (Trent Mick) Date: Mon Jan 5 13:19:26 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: ; from claird@lairds.com on Sat, Jan 03, 2004 at 09:15:48AM -0500 References: Message-ID: <20040105101315.G22900@ActiveState.com> [Cameron Laird wrote] > Python generation for HP-UX is broken. I want to fix it. I am able to build Python for HP-UX 11.00 (on parisc). Mind you, for that work I am on Python 2.1, if you can believe it. :) I _do_ have to move to Python 2.3 at some point so I am interested in fixing any problems here as well. I generally don't care about curses so don't know about support for that. Threading and Tkinter work fine though, IIRC.
Quickly looking through my diffs I have some configure, Makefile, and environment tweaks to get my builds to work. Trent -- Trent Mick TrentM@ActiveState.com From trentm at ActiveState.com Mon Jan 5 14:05:33 2004 From: trentm at ActiveState.com (Trent Mick) Date: Mon Jan 5 14:12:09 2004 Subject: [Python-Dev] Re: Switching to VC.NET 2003 In-Reply-To: ; from pf_moore@yahoo.co.uk on Sat, Jan 03, 2004 at 08:45:04PM +0000 References: <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com> <3FF059A1.8090202@v.loewis.de> <3cb2xhgn.fsf@yahoo.co.uk> <3FF20589.2030702@v.loewis.de> Message-ID: <20040105110533.H22900@ActiveState.com> [Paul Moore wrote] > > msiexec.exe python2.4.0.msi DEFAULTPYTHON=1 > > That sounds good enough - I know little about MSI, so when you say > "public property" it doesn't mean much to me. But if it translates to > "command line option with no GUI interface" then I have no problem at > all with this - it's for advanced users only, so that's not a problem. The install notes for ActivePython (which uses an MSI installer) show some of the things you can do with an MSI installer. The discussion for the ActivePython installer might shed some light, if you care: http://aspn.activestate.com/ASPN/docs/ActivePython/2.3/UserGuide/install.html#install_dos Cheers, Trent -- Trent Mick TrentM@ActiveState.com From trentm at ActiveState.com Mon Jan 5 14:08:17 2004 From: trentm at ActiveState.com (Trent Mick) Date: Mon Jan 5 14:12:46 2004 Subject: [Python-Dev] Switch to VC.NET 7.1 completed In-Reply-To: <3FF5E550.3030006@v.loewis.de>; from martin@v.loewis.de on Fri, Jan 02, 2004 at 10:40:32PM +0100 References: <3FF5E550.3030006@v.loewis.de> Message-ID: <20040105110817.I22900@ActiveState.com> [Martin v. Loewis wrote] > I now have committed the bits for building Python with VC.NET. Was there general agreement that this move should be made now? I would still like to hear from Mark on potential issues with win32all.
Trent -- Trent Mick TrentM@ActiveState.com From jcarlson at uci.edu Mon Jan 5 15:42:18 2004 From: jcarlson at uci.edu (Josiah Carlson) Date: Mon Jan 5 15:45:22 2004 Subject: [Python-Dev] PEP 326 now online In-Reply-To: <1073321533.3ff9963ddc65c@mcherm.com> References: <1073321533.3ff9963ddc65c@mcherm.com> Message-ID: <20040105123921.F749.JCARLSON@uci.edu> I've submitted my latest changes that I believe address the concerns brought up since I last emailed the list. Until they are in CVS and on the standard PEP page, it can be found here: http://josiahcarlson.no-ip.org/pep-0326.html The major change of note: it is no longer a builtin. - Josiah From mal at egenix.com Mon Jan 5 16:04:31 2004 From: mal at egenix.com (M.-A. Lemburg) Date: Mon Jan 5 16:06:57 2004 Subject: [Python-Dev] pybench.py stats Message-ID: <3FF9D15F.9040900@egenix.com> I thought you might find this interesting. For some time I've been running pybench.py as nightly cronjob against fresh CVS checkouts and recorded the findings. Attached is a PNG diagram showing the findings. The results are simple to interpret: there was only a very slight performance improvement over the last year. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 05 2004) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: pybench.gif Type: image/gif Size: 16456 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20040105/e655164f/pybench-0001.gif From fumanchu at amor.org Mon Jan 5 16:10:23 2004 From: fumanchu at amor.org (Robert Brewer) Date: Mon Jan 5 16:16:24 2004 Subject: [Python-Dev] PEP 326 now online Message-ID: Josiah Carlson wrote: > I've submitted my latest changes that I believe address the concerns > brought up since I last emailed the list. Until they are in > CVS and on > the standard PEP page, it can be found here: > http://josiahcarlson.no-ip.org/pep-0326.html > > The major change of note: it is no longer a builtin. My only comment on the changes: I definitely do not approve of the use of the abbreviations "hi" and "lo" in place of standard English "high" and "low"--human parsing of "hi" and "lo" depends upon a familiarity with English which many people in this world do not possess. In my book, three characters of extra typing are more than offset by the improved accessibility. As far as the whole PEP, I'm +0. I'd never use it. Fighting unnecessary abbreviations wherever they are found, Robert Brewer MIS Amor Ministries fumanchu@amor.org From martin at v.loewis.de Mon Jan 5 16:25:16 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Mon Jan 5 16:25:40 2004 Subject: [Python-Dev] SUCCESS! 2.4 extensions w/mingw (was Re: Switching to VC.NET 2003) In-Reply-To: <5.1.0.14.0.20040104142918.03a28e60@mail.telecommunity.com> References: <3FF7FC20.6060406@v.loewis.de> <5.1.0.14.0.20040103233904.031e00e0@mail.telecommunity.com> <3FF059A1.8090202@v.loewis.de> <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com> <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com> <5.1.0.14.0.20040103233904.031e00e0@mail.telecommunity.com> <5.1.0.14.0.20040104142918.03a28e60@mail.telecommunity.com> Message-ID: <3FF9D63C.8020006@v.loewis.de> Phillip J. Eby wrote: > Should this, or something like it, be distributed with Windows Python? 
> If so, where in the distribution should it go? Should I create a SF > patch for this? I would put it into Tools/scripts; for that, no SF patch would be necessary. However, I would think that distutils should find out whether the essential libraries are present, and suggest running a script if they are not - so please do submit a patch for that (which then should include your script). Regards, Martin From martin at v.loewis.de Mon Jan 5 16:32:32 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Mon Jan 5 16:32:51 2004 Subject: [Python-Dev] Relaxing Unicode error handling In-Reply-To: <3FF8A058.4020904@egenix.com> References: <3FF6AB34.1000502@v.loewis.de> <3FF8A058.4020904@egenix.com> Message-ID: <3FF9D7F0.4080802@v.loewis.de> M.-A. Lemburg wrote: > Codecs define what the default error handling should be (setting > errors to NULL), not the Unicode implementation. I'm not proposing to change this. Codecs would continue to define their error handling. The builtin codecs would base their decision on unicode.errorhandling, though. Regards, Martin From mal at egenix.com Mon Jan 5 16:36:05 2004 From: mal at egenix.com (M.-A. Lemburg) Date: Mon Jan 5 16:35:57 2004 Subject: [Python-Dev] Relaxing Unicode error handling In-Reply-To: <200401051644.i05GihZ08438@c-24-5-183-134.client.comcast.net> References: <3FF6AB34.1000502@v.loewis.de> <200401031641.i03GfLL11733@c-24-5-183-134.client.comcast.net> <16377.32138.563619.390078@montanaro.dyndns.org> <200401051644.i05GihZ08438@c-24-5-183-134.client.comcast.net> Message-ID: <3FF9D8C5.4030907@egenix.com> Guido van Rossum wrote: > Sure. So in your programs, keep the default. > > But if someone has written a program and decides that they don't want > to deal with their (*their*!) errors by fixing the code, but rather > would continue to go with possibly broken data (and they may well know > enough about their application to know that that is harmless), why > should we not give them a way to do that? 
> > AFAIK all the proposal does is give a way to change the default error > handling -- any code that sets an explicit error handling policy will > continue to receive exceptions. Note that the default error handling is defined by the codec, not the Unicode implementation. In most cases, the implementation passes NULL as the errors argument to the codec, which means that the codec's default error handling should be used (for builtin codecs this happens to be the same as "strict", but it is explicitly allowed for codecs to decide what to do, so external codecs might want to be less restrictive or do something clever instead, e.g. use approximations of the characters in question). There are only a few places where the implementation does use "strict" to prevent cases where the automagic coercion code tries to convert an 8-bit string into Unicode, or Unicode into a default encoded 8-bit string. The only "default error handling" the proposed option could change is this usage of "strict". IMO, an application which relies on silencing these kinds of errors is simply broken. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 05 2004) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From martin at v.loewis.de Mon Jan 5 16:39:23 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Mon Jan 5 16:39:54 2004 Subject: [Python-Dev] Switch to VC.NET 7.1 completed In-Reply-To: <20040105110817.I22900@ActiveState.com> References: <3FF5E550.3030006@v.loewis.de> <20040105110817.I22900@ActiveState.com> Message-ID: <3FF9D98B.4000906@v.loewis.de> Trent Mick wrote: > Was there general agreement that this move should be made now? Yes.
None of the major contributors (Guido, Tim, Thomas) objected, even though Mark's lack of objection may be due to his vacation. There was some concern that mingw building may not work, which we found was fortunately not true. There was some concern that using .NET SDK compiler may not be possible, however a) using the SDK compiler was not possible out of the box before, either, and b) people interested in this can contribute a solution later still, and c) the VC6 files are still available to those that want to generate makefiles. Apart from that, there was no objection. Regards, Martin From python at rcn.com Mon Jan 5 16:44:01 2004 From: python at rcn.com (Raymond Hettinger) Date: Mon Jan 5 16:44:39 2004 Subject: [Python-Dev] PEP 326 now online In-Reply-To: <20040105123921.F749.JCARLSON@uci.edu> Message-ID: <005a01c3d3d5$08c30840$e841fea9@oemcomputer> > I've submitted my latest changes that I believe address the concerns > brought up since I last emailed the list. Until they are in CVS and on > the standard PEP page, it can be found here: > http://josiahcarlson.no-ip.org/pep-0326.html +1 It would occasionally be useful to have ready access to an object whose sole purpose is to compare higher or lower than everything else. Robert Brewer's idea to use spelled-out names, cmp.high and cmp.low, is easy to remember, better for non-native English speakers, and reasonably compact. As for the open issues section, I am -1 on changing the behavior of min/max to return the new objects when given an empty list. Likewise, I'm -1 on having the operators do any math except for comparisons. The idea is most attractive in its simplest form -- there is no reason to make something hard out of something simple. 
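Raymond's "simplest form" -- an object whose only job is to compare higher than everything else -- can be sketched in a few lines. (High is a hypothetical spelling, and rich-comparison methods are used here instead of the era's __cmp__ protocol, so this is a sketch of the idea rather than the PEP's actual specification.)

```python
class _High:
    """Hypothetical sentinel that compares greater than everything else."""
    def __gt__(self, other):
        return other is not self
    def __lt__(self, other):
        return False
    def __ge__(self, other):
        return True
    def __le__(self, other):
        return other is self

High = _High()

# Typical use: a running "best so far" in a minimum search,
# with no special case for the first iteration.
best = High
for score in [42, 17, 99]:
    if score < best:
        best = score
print(best)  # 17
```

Because `score < best` falls back to the sentinel's reflected `__gt__` when the left operand doesn't know how to compare, the loop body needs no None check.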
Raymond Hettinger From stephen.pheiffer at nist.gov Mon Jan 5 16:56:30 2004 From: stephen.pheiffer at nist.gov (Stephen Pheiffer) Date: Mon Jan 5 17:00:06 2004 Subject: [Python-Dev] Visual Studio .NET build of python Message-ID: <1073339790.13703.1.camel@h122034.ncnr.nist.gov> Does a Visual Studio .NET build of python exist? distutils will not work to build python because it was created using Visual C 6.0, while I have Visual Studio .NET installed on my machine. From martin at v.loewis.de Mon Jan 5 17:12:09 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Mon Jan 5 17:12:28 2004 Subject: [Python-Dev] Visual Studio .NET build of python In-Reply-To: <1073339790.13703.1.camel@h122034.ncnr.nist.gov> References: <1073339790.13703.1.camel@h122034.ncnr.nist.gov> Message-ID: <3FF9E139.1030702@v.loewis.de> Stephen Pheiffer wrote: > Does a Visual Studio .NET build of python exist? distutils will not > work to build python because it was created using Visual C 6.0, while I > have Visual Studio .NET installed on my machine. Yes, an alpha release exists, at http://www.dcl.hpi.uni-potsdam.de/home/loewis/python2.4.0.12421.msi If you need a build for Python 2.3, you will have to create one yourself. Regards, Martin From tjreedy at udel.edu Mon Jan 5 17:18:57 2004 From: tjreedy at udel.edu (Terry Reedy) Date: Mon Jan 5 17:18:55 2004 Subject: [Python-Dev] Re: Re: PEP 326 now online (the big reply) References: <20040104112807.0E8F.JCARLSON@uci.edu> <20040104212330.0EA8.JCARLSON@uci.edu> Message-ID: "Josiah Carlson" wrote in message news:20040104212330.0EA8.JCARLSON@uci.edu... > Terry, I very much like your suggestions. Nice to know that you are not afflicted with NIH syndrome ;-) One other point: given that the extreme values are as far from 0 as possible (when compared to numbers), I think it should be that bool(Highest) = bool(Lowest) = True. bool(None) = False is to me another strike against using it as an 'extreme' value. > Mind if I put you down as a co-author?
Either way OK with me. Terry J. Reedy From guido at python.org Mon Jan 5 23:17:45 2004 From: guido at python.org (Guido van Rossum) Date: Mon Jan 5 23:17:52 2004 Subject: [Python-Dev] pybench.py stats In-Reply-To: Your message of "Mon, 05 Jan 2004 22:04:31 +0100." <3FF9D15F.9040900@egenix.com> References: <3FF9D15F.9040900@egenix.com> Message-ID: <200401060417.i064Hjs09438@c-24-5-183-134.client.comcast.net> > I thought you might find this interesting. For some time I've been > running pybench.py as nightly cronjob against fresh CVS checkouts > and recorded the findings. Attached is a PNG diagram showing the > findings. > > The results are simple to interpret: there was only a very slight > performance improvement over the last year. Indeed. I don't think there was a concerted effort to speed up basic operations last year. Several library modules were sped up dramatically by C reimplementation, but of course that wouldn't be measured by pybench. But we may have to work harder -- after all I really wouldn't like to receive pie in my face in the Parrot contest next July at OSCON! --Guido van Rossum (home page: http://www.python.org/~guido/) From mhammond at skippinet.com.au Tue Jan 6 01:24:40 2004 From: mhammond at skippinet.com.au (Mark Hammond) Date: Tue Jan 6 01:25:30 2004 Subject: [Python-Dev] Switching to VC.NET 2003 In-Reply-To: Message-ID: <192001c3d41d$c5a71f30$2c00a8c0@eden> > [Martin v. Loewis] > > ... > > > > In the sample installer, I put python24.dll into TARGETDIR, > > not into System32Folder. I'd like to keep it that way, > > as Microsoft discourages deployment into system32. Of course, > > if there is a good reason to install python24.dll into > > system32, changing the installer generator to do so would > > be only a small effort. > > We need to hear from Mark Hammond about this, but he's on > vacation now. The > current installer's "non-admin install" option leaves system > folders alone, > and works fine for everything I routinely do on Windows. 
The point to stuffing the Python DLL into a system directory had (IIRC) something to do with a Windows Service embedding a Python component via COM, in which case "the virtual command line" that started the service has no idea where Python might live on the box. Or something like that -- or something not like that at all. That's why we need Mark . The underlying issue has to do with embedding Python. It is most visible with COM, but exists for any application that attempts to embed a 'stand-alone' Python installation. If they also ship Python with their app, the problem generally is trivial to work around by placing everything in the same directory, which is indeed how py2exe works without putting anything in system32. Consider a Python-implemented COM object. Some arbitrary process (e.g., Outlook loading SpamBayes) finds the COM object is supported by loading pythoncom24.dll (as this DLL has the necessary COM entry points). Obviously, pythoncom24.dll references symbols in python24.dll. The problem is that the loading of pythoncom24.dll fails, due to failure to find python24.dll. COM itself has done the LoadLibrary() for pythoncom24, as part of the process outlook.exe. The Windows documentation for LoadLibrary() indicates the directory of the DLL being loaded is *not* searched for dependent DLLs. As it is not our code doing the LoadLibrary(), we can not move to LoadLibraryEx(). Thus, the only reasonable solution I see is for python24.dll to be on the global path. The easiest way to achieve that is to copy it to system32. Another real example: The win32all installer also embeds Python, via a WISE extension DLL. Again, it is WISE itself that loads its extension DLL, and this DLL has references to Python24.dll. I could find no way to make this work, short of insisting that people launch the installation .EXE from the main Python directory.
(This installer is going to die very soon, but the underlying issue is not.) Even a simple .EXE that tries to directly embed Python will have the same issue - the executable will need to live in the same directory as the Python DLL, or will fail to load. In most cases, people will just run it from that directory, but it doesn't seem a reasonable solution. XP is supposed to have some new features regarding this, but I have not looked into it at all. In summary, we need to ensure that an arbitrary DLL with symbol references to python24.dll can be loaded by an arbitrary process. Arbitrarily yrs, Mark. From mhammond at skippinet.com.au Tue Jan 6 03:22:52 2004 From: mhammond at skippinet.com.au (Mark Hammond) Date: Tue Jan 6 03:23:11 2004 Subject: [Python-Dev] test_unicode_file fails on Linux In-Reply-To: <1072470171.31901.74.camel@localhost.localdomain> Message-ID: <193b01c3d42e$48083f30$2c00a8c0@eden> > The utime() call is failing for one of the Unicode file names. > > build> ./python ../Lib/test/test_unicode_file.py > test_directories (__main__.TestUnicodeFiles) ... ok > test_equivalent_files (__main__.TestUnicodeFiles) ... ok > test_single_files (__main__.TestUnicodeFiles) ... > '@test-\xc3\xa0\xc3\xb2' > '@test-\xc3\xa0\xc3\xb2' > u'@test-\xe0\xf2' > ERROR By default, this test is working for me on Linux. I suspect it has to do with the fact that: [skip@bobcat build]$ ./python -c 'import sys;print sys.getfilesystemencoding()' UTF-8 By way of testing, I tried: [skip@bobcat build]$ export LANG=de_DE [skip@bobcat build]$ ./python -c 'import sys;print sys.getfilesystemencoding()' ISO-8859-1 And the tests still succeeded. Trying to work with "ascii" as the encoding results in a 'TestSkipped' exception: [skip@bobcat build]$ export LANG=C [skip@bobcat build]$ ./python -c 'import sys;print sys.getfilesystemencoding()' ANSI_X3.4-1968 [skip@bobcat build]$ ./python ../Lib/test/test_unicode_file.py ...
test.test_support.TestSkipped: No Unicode filesystem semantics on this platform. I even managed to get my Windows 98 box on the network again, and this also seems to work for me from current CVS. I'm really not sure what I am missing.... Mark. From perky at i18n.org Tue Jan 6 03:42:18 2004 From: perky at i18n.org (Hye-Shik Chang) Date: Tue Jan 6 03:42:47 2004 Subject: [Python-Dev] test_unicode_file fails on Linux In-Reply-To: <193b01c3d42e$48083f30$2c00a8c0@eden> References: <1072470171.31901.74.camel@localhost.localdomain> <193b01c3d42e$48083f30$2c00a8c0@eden> Message-ID: <20040106084218.GA14241@i18n.org> On Tue, Jan 06, 2004 at 07:22:52PM +1100, Mark Hammond wrote: > > The utime() call is failing for one of the Unicode file names. > > > > build> ./python ../Lib/test/test_unicode_file.py > > test_directories (__main__.TestUnicodeFiles) ... ok > > test_equivalent_files (__main__.TestUnicodeFiles) ... ok > > test_single_files (__main__.TestUnicodeFiles) ... > > '@test-\xc3\xa0\xc3\xb2' > > '@test-\xc3\xa0\xc3\xb2' > > u'@test-\xe0\xf2' > > ERROR > > By default, this test is working for me on Linux. I suspect it has to do > with the fact that: > > [skip@bobcat build]$ ./python -c 'import sys;print > sys.getfilesystemencoding()' > UTF-8 > > By way of testing, I tried: > [skip@bobcat build]$ export LANG=de_DE > [skip@bobcat build]$ ./python -c 'import sys;print > sys.getfilesystemencoding()' > ISO-8859-1 > > And the tests still succeeded. Trying to work with "ascii" as the encoding > results in a 'TestSkipped' exception: > > [skip@bobcat build]$ export LANG=C > [skip@bobcat build]$ ./python -c 'import sys;print > sys.getfilesystemencoding()' > ANSI_X3.4-1968 > [skip@bobcat build]$ ./python ../Lib/test/test_unicode_file.py > ... > test.test_support.TestSkipped: No Unicode filesystem semantics on this > platform. > > I even managed to get my Windows 98 box on the network again, and this also > seems to work for me from current CVS. 
I'm really not sure what I am > missing.... > I guess that I fixed it on Modules/posixmodule.c rev 2.310. Hye-Shik From mhammond at skippinet.com.au Tue Jan 6 04:45:12 2004 From: mhammond at skippinet.com.au (Mark Hammond) Date: Tue Jan 6 04:45:29 2004 Subject: [Python-Dev] test_unicode_file fails on Linux In-Reply-To: <20040106084218.GA14241@i18n.org> Message-ID: <195101c3d439$c8b734a0$2c00a8c0@eden> > I guess that I fixed it on Modules/posixmodule.c rev 2.310. That would be it! Thanks! Mark. From mal at egenix.com Tue Jan 6 05:27:00 2004 From: mal at egenix.com (M.-A. Lemburg) Date: Tue Jan 6 05:27:02 2004 Subject: [Python-Dev] Relaxing Unicode error handling In-Reply-To: <3FF9D7F0.4080802@v.loewis.de> References: <3FF6AB34.1000502@v.loewis.de> <3FF8A058.4020904@egenix.com> <3FF9D7F0.4080802@v.loewis.de> Message-ID: <3FFA8D74.2060809@egenix.com> Martin v. Loewis wrote: > M.-A. Lemburg wrote: > >> Codecs define what the default error handling should be (setting >> errors to NULL), not the Unicode implementation. > > I'm not proposing to change this. Codecs would continue to define > their error handling. The builtin codecs would base their decision > on unicode.errorhandling, though. But if you change the default behaviour of the builtin codecs only, how would this help a user of a broken application ? The Unicode implementation would continue to use "strict" for things like low-level parser marker conversions and external codecs would also not work as expected (e.g. the Asian codecs). In the end, I don't think we'd gain anything much except another global to carry around. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 06 2004) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... 
http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From mwh at python.net Tue Jan 6 07:40:06 2004 From: mwh at python.net (Michael Hudson) Date: Tue Jan 6 07:40:13 2004 Subject: [Python-Dev] pybench.py stats In-Reply-To: <3FF9D15F.9040900@egenix.com> (M.'s message of "Mon, 05 Jan 2004 22:04:31 +0100") References: <3FF9D15F.9040900@egenix.com> Message-ID: <2mu1392qu1.fsf@starship.python.net> "M.-A. Lemburg" writes: > I thought you might find this interesting. For some time I've been > running pybench.py as nightly cronjob against fresh CVS checkouts > and recorded the findings. Attached is a PNG diagram showing the > findings. > > The results are simple to interpret: there was only a very slight > performance improvement over the last year. Indeed, I think most of the 2.3 speedups came fairly early on in its development time, more than a year ago. I don't suppose you have the data for 2002? Cheers, mwh -- In general, I'd recommend injecting LSD directly into your temples, Syd-Barret-style, before mucking with Motif's resource framework. The former has far lower odds of leading directly to terminal insanity. -- Dan Martinez From bsder at allcaps.org Tue Jan 6 08:17:52 2004 From: bsder at allcaps.org (Andrew P. Lentvorski, Jr.) Date: Tue Jan 6 08:12:49 2004 Subject: [Python-Dev] PEP 326 now online In-Reply-To: <005a01c3d3d5$08c30840$e841fea9@oemcomputer> References: <005a01c3d3d5$08c30840$e841fea9@oemcomputer> Message-ID: <20040106044418.E59742@mail.allcaps.org> On Mon, 5 Jan 2004, Raymond Hettinger wrote: > +1 It would occasionally be useful to have ready access to an object > whose sole purpose is to compare higher or lower than everything else. > > Robert Brewer's idea to use spelled-out names, cmp.high and cmp.low, is > easy to remember, better for non-native English speakers, and reasonably > compact. What about cmp.Max and cmp.Min? 
Non-native English speakers already have to know max() and min() and what they do, so this might be easier to understand. However, it might make things a little more confusing for native English speakers. As for use cases, this stuff often comes up in the multitude of spatial data structures used for range searching. Segment trees, R-trees, k-d trees, database keys, etc. would all make use of this. The issue is that a range can be open on one side and does not always have an initialized case. The solutions I have seen are to either overload None as the extremum or use an arbitrary large magnitude number. Overloading None means that the builtins can't really be used without special case checks to work around the undefined (or "wrongly defined") ordering of None. These checks tend to swamp the nice performance of builtins like max() and min(). Choosing a large magnitude number throws away the ability of Python to cope with arbitrarily large integers and introduces a potential source of overrun/underrun bugs. -a From astrand at lysator.liu.se Tue Jan 6 09:30:40 2004 From: astrand at lysator.liu.se (Peter Astrand) Date: Tue Jan 6 09:30:45 2004 Subject: [Python-Dev] PEP 324: popen5 - New POSIX process module In-Reply-To: <200401051512.i05FCbG08182@c-24-5-183-134.client.comcast.net> Message-ID: On Mon, 5 Jan 2004, Guido van Rossum wrote: > What I'm looking for in standard library additions is > "category-killers". IMO good recent examples are optparse and the > logging module. Both were in active use and had independent > distributions with happy users before they were accepted into the > standard library. Thanks, this is very useful feedback. > For the code, I think that a Windows version is essential. It would > be okay if a small amount of code (e.g. making some basic Windows APIs > available, like CreateProcess) had to be coded in C, as long as the > bulk could still be in Python.
I really like having this kind of code > in Python -- it helps understanding the fine details, and it allows > subclassing more easily. Sounds good to me. > I also wonder if more support for managing a whole flock of > subprocesses might not be a useful addition; this is a common need in > some applications. And with this, I'd like to see explicit support > (somehow -- maybe through a separate class) for managing "daemon" > processes. This certainly is a common need! I'll put this on the TODO list. -- /Peter Åstrand From jrw at pobox.com Tue Jan 6 10:22:45 2004 From: jrw at pobox.com (John Williams) Date: Tue Jan 6 10:22:49 2004 Subject: [Python-Dev] PEP 324: popen5 - New POSIX process module In-Reply-To: References: Message-ID: <1073402565.3417.38.camel@lcc54.languagecomputer.com> This PEP looks fantastic to me; I've often wanted a better set of primitives for working with external processes, and your PEP has just about everything I've ever wanted. I do have some minor nits to pick, though: - The preexec_args argument is extraneous considering how easy it is to use a lambda construct to bind arguments to a function. Consider: Popen(..., preexec_fn=foo, preexec_args=(bar,baz)) vs. Popen(..., preexec_fn=lambda: foo(bar,baz)) - Rather than passing argv0 as a named argument, what about passing in the program to execute instead? It would simplify code like this: Popen(child_program, *child_args[1:], argv0=child_args[0]) into this Popen(*child_args, program=child_program) Of course I'm not suggesting you change the default behavior of taking argv0 *and* the program to execute from the first positional argument. - The defaults for close_fds and universal_newlines should be False rather than 0; this would make it clear from reading the synopsis that these arguments are boolean-valued. - The poll() method and returncode attribute aren't both needed. I like returncode's semantics better since it gives more information and it avoids using -1 as a magic number.
- Rather than raising a PopenException for incorrect arguments, I would think that raising ValueError or TypeError would be more in line with users' expectations. - How does this communicate() method interact with I/O redirection? - As you and others have mentioned, this API needs a better name than popen5; I like "process" a lot better. - Would you consider adding an example of how to chain processes together into a pipeline? You say this is possible, and I'm assuming that's what the PIPE constant is for, but I'd like to see it written out to make sure I'm understanding it correctly. From Paul.Moore at atosorigin.com Tue Jan 6 11:10:25 2004 From: Paul.Moore at atosorigin.com (Moore, Paul) Date: Tue Jan 6 11:10:28 2004 Subject: [Python-Dev] PEP 324: popen5 - New POSIX process module Message-ID: <16E1010E4581B049ABC51D4975CEDB8803060D8D@UKDCX001.uk.int.atosorigin.com> From: John Williams > - Would you consider adding an example of how to chain processes > together into a pipeline? You say this is possible, and I'm assuming > that's what the PIPE constant is for, but I'd like to see it written out > to make sure I'm understanding it correctly. There's actually an example in the source code. Looking at it, I have come to the conclusion that I don't like PIPE, but I can't see an obviously better answer. The example from the code is: p1 = Popen(["dmesg"], stdout=PIPE) p2 = Popen(["grep", "hda"], stdin=p1.fromchild) result = p2.communicate()[0] Some things immediately spring from this: 1. stdout=PIPE doesn't feel like it adds anything. It's "obvious" that I want a pipe. But of course there's no way the code realises that unless I say so. 2. fromchild doesn't look too nice. I'd prefer to use "stdin=p1.stdout". 3. communicate() doesn't really describe what it does very well. 
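For reference, popen5 was eventually renamed subprocess when it landed in the standard library, and the dmesg | grep pipeline above runs essentially unchanged under that module. The sketch below substitutes portable Python one-liners for dmesg and grep hda (hypothetical stand-ins, used only so the example is self-contained on any platform):

```python
import subprocess
import sys

# Portable stand-ins for "dmesg" and "grep hda".
dmesg = [sys.executable, "-c", "print('hda: ide disk'); print('eth0: nic')"]
grep_hda = [sys.executable, "-c",
            "import sys; sys.stdout.write(''.join(l for l in sys.stdin if 'hda' in l))"]

# Chain the two processes: p1's stdout pipe becomes p2's stdin.
p1 = subprocess.Popen(dmesg, stdout=subprocess.PIPE)
p2 = subprocess.Popen(grep_hda, stdin=p1.stdout, stdout=subprocess.PIPE)
p1.stdout.close()  # let p1 receive SIGPIPE if p2 exits first
result = p2.communicate()[0]
print(result.decode())  # hda: ide disk
```

Note that observation 2 held up: the final API exposes the pipe end as p1.stdout rather than fromchild.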
As far as PIPE goes, the issue is that there are (to me) two, almost equally common, options - to simply inherit the parent's stdXXX (like os.system does) and to set up pipes to (some or all of) the child's stdXXX, which the parent then talks to. PIPE says to set up a pipe, where the default is to inherit. I'd like to see a better option, but I can't immediately think of one. This cries out for a "just do what I mean" option :-) Maybe inverting the sense of the argument, and saying pipe=STDOUT, or in the case of multiple pipes, pipe=(STDIN, STDOUT), is better. What do people think? But remember that *all* of these look worse if you don't use from...import style: stdout=popen5.PIPE or pipe=popen5.STDOUT. Part of me wonders whether the creation of the child process should be split out of the constructor, so that we do p1 = Popen(...); p1.start(). Then setting up pipes could happen between the two. But I can't think of a syntax I like for setting up pipes even then, and I see no other advantage to a separate start() method. I'll see what this feels like when I have practical experience implementing the Windows version under my belt... Paul. PS If this becomes a design discussion, is python-dev (or even python-list) appropriate? Would it be better to set up a SIG for this, much like optparse had the getopt-sig for a while? From tjreedy at udel.edu Tue Jan 6 12:50:13 2004 From: tjreedy at udel.edu (Terry Reedy) Date: Tue Jan 6 12:50:13 2004 Subject: [Python-Dev] Re: PEP 326 now online References: <005a01c3d3d5$08c30840$e841fea9@oemcomputer> <20040106044418.E59742@mail.allcaps.org> Message-ID: "Andrew P. Lentvorski, Jr." wrote in message news:20040106044418.E59742@mail.allcaps.org... > On Mon, 5 Jan 2004, Raymond Hettinger wrote: > > > +1 It would occasionally be useful to have ready access to an object > > whose sole purpose is to compare higher or lower than everything else. 
I am too far from being a Python newbie to know whether such a user would expect such a thing or find it surprising or even puzzling. > > Robert Brewer's idea to use spelled-out names, cmp.high and cmp.low, is > > easy to remember, better for non-native English speakers, and reasonably > > compact. > > What about cmp.Max and cmp.Min? Non-native English speakers already have > to know max() and min() and what they do, this might be easier to > understand. I proposed cmp.lo and cmp.hi as a trial balloon for the idea of using specific public function attributes (as opposed to generic implementation attributes) as a 'low-cost' alternative to builtin names for something that is useful but not so central as None, True, and False. I had no thought then of non-native-English users or much attachment to the specifics except for the shortness. Capitalizing the constants' names would follow precedent. Using Min and Max reuses existing knowledge, so this would be OK by me. > However, it might make things a little more confusing for > native English speakers. ?? I have no idea how. > As for use cases, this stuff often comes up in the multitude of spatial > data structures used for range searching. Segment trees, R-trees, k-d > trees, database keys, etc. would all make use of this. The issue is that > a range can be open on one side and does not always have an initialized > case. > > The solutions I have seen are to either overload None as the extremum or > use an arbitrary large magnitude number. Overloading None means that the > builtins can't really be used without special case checks to work around > the undefined (or "wrongly defined") ordering of None. These checks tend > to swamp the nice performance of builtins like max() and min(). > > Choosing a large magnitude number throws away the ability of Python to > cope with arbitrarily large integers and introduces a potential source of > overrun/underrun bugs. More grist for the Motivation section. Thanks. Terry J.
Reedy From guido at python.org Tue Jan 6 13:07:57 2004 From: guido at python.org (Guido van Rossum) Date: Tue Jan 6 13:08:01 2004 Subject: [Python-Dev] Re: PEP 326 now online In-Reply-To: Your message of "Tue, 06 Jan 2004 12:50:13 EST." References: <005a01c3d3d5$08c30840$e841fea9@oemcomputer> <20040106044418.E59742@mail.allcaps.org> Message-ID: <200401061807.i06I7vE11059@c-24-5-183-134.client.comcast.net> > I proposed cmp.lo and cmp.hi as a trial balloon for the idea of > using specific public function attributes (as opposed to generic > implementation attributes) as a 'low-cost' alternative to builtin > names for something that is useful but not so central as None, True, > and False. I had no thought then of non-native-English users or > much attachment to the specifics except for the shortness. > Capitalizing the constants' names would follow precedence. Using > Min and Max reuses existing knowledge, so this would be OK by me. Hm. cmp is a *builtin function*. That seems an exceedingly odd place to stick arbitrary constants -- much more so than type objects (like Martin's recently proposed unicode property for controlling error handling). --Guido van Rossum (home page: http://www.python.org/~guido/) From eppstein at ics.uci.edu Tue Jan 6 13:16:47 2004 From: eppstein at ics.uci.edu (David Eppstein) Date: Tue Jan 6 13:16:46 2004 Subject: [Python-Dev] Re: PEP 326 now online References: <005a01c3d3d5$08c30840$e841fea9@oemcomputer> <20040106044418.E59742@mail.allcaps.org> Message-ID: On Mon, 5 Jan 2004, Raymond Hettinger wrote: > > > +1 It would occasionally be useful to have ready access to an object > > > whose sole purpose is to compare higher or lower than everything else. I would certainly use such objects. I don't think "None" and "All" are sufficiently explicit names for the objects, though -- the only context in which None makes sense to me as a name for a minimal object is in terms of set inclusion, but Set() != None and there is no universal set. 
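[Editorial note: the extremum objects debated in this thread are straightforward to sketch with rich comparisons. This is illustrative only, not PEP 326's actual proposed implementation; the names Max and Min are the ones under discussion above:]

```python
import functools

@functools.total_ordering
class _Extreme:
    """Sketch of an object that compares above (or below) everything else."""
    def __init__(self, top):
        self._top = top  # True -> compares above everything
    def __eq__(self, other):
        return self is other
    def __lt__(self, other):
        # Max is never less than anything; Min is less than everything else.
        return not self._top and self is not other

Max = _Extreme(True)
Min = _Extreme(False)

# Min and Max bracket any other values, so they serve as safe
# initializers for open-ended range searches without overloading None.
print(sorted([3, Max, -1, Min]))
```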
-- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. of California, Irvine, School of Information & Computer Science From theller at python.net Tue Jan 6 13:44:28 2004 From: theller at python.net (Thomas Heller) Date: Tue Jan 6 13:44:32 2004 Subject: [Python-Dev] Re: [Python-checkins] python/nondist/sandbox/msi msilib.py, 1.2, 1.3 In-Reply-To: (loewis@users.sourceforge.net's message of "Fri, 02 Jan 2004 12:42:26 -0800") References: Message-ID: loewis@users.sourceforge.net writes: > Update of /cvsroot/python/python/nondist/sandbox/msi > In directory sc8-pr-cvs1:/tmp/cvs-serv17402 > > Modified Files: > msilib.py > Log Message: > Locate cabarc in registry. > except OSError: > pass > ! for k, v in [(r"Software\Microsoft\VisualStudio\7.1\Setup\VS", "VS7CommonBinDir"), > ! (r"Software\Microsoft\Win32SDK\Directories", "Install Dir")]: > ! try: > ! key = _winreg.OpenKey(_winreg.HKEY_LOCAL_MACHINE, k) > ! except WindowsError: > ! continue > ! cabarc = os.path.join(_winreg.QueryValueEx(key, v)[0], r"Bin", "cabarc.exe") > ! _winreg.CloseKey(key) > ! assert os.path.exists(cabarc), cabarc The above assert fails for me, there's no cabarc.exe in the VS7CommonBinDir: Do I have to install additional software, apart from VS.NET 2003? Thomas From tim.one at comcast.net Tue Jan 6 13:51:44 2004 From: tim.one at comcast.net (Tim Peters) Date: Tue Jan 6 13:51:56 2004 Subject: [Python-Dev] test_unicode_file fails on Linux In-Reply-To: <20040106084218.GA14241@i18n.org> Message-ID: [Mark Hammond] >> I even managed to get my Windows 98 box on the network again, and >> this also seems to work for me from current CVS. I'm really not >> sure what I am missing.... [Hye-Shik Chang] > I guess that I fixed it on Modules/posixmodule.c rev 2.310. I think you did (and thanks!): test_unicode_file passes for me today on my Win98SE box too. 
Speaking of which, it wouldn't have been possible for me to test this if I hadn't "rehabilitated" the VC6 project files over the weekend: VC7 can't be installed on a Win9x box. I sure hope I was the only one routinely building Python on a Win9x box, though . From mike at nospam.com Tue Jan 6 13:56:03 2004 From: mike at nospam.com (Mike Rovner) Date: Tue Jan 6 13:56:08 2004 Subject: [Python-Dev] Re: Switching to VC.NET 2003 References: <192001c3d41d$c5a71f30$2c00a8c0@eden> Message-ID: Mark Hammond wrote: > XP is supposed to have some new features regarding this, but I have > not looked into it at all. JFYI, relevant link "Windows XP: Kernel Improvements" http://msdn.microsoft.com/msdnmag/issues/01/12/xpkernel/default.aspx (part "Application Compatibility-Side-by-side Assemblies") addresses new features of Win2k and XP regarding DLL. Mike From theller at python.net Tue Jan 6 14:01:23 2004 From: theller at python.net (Thomas Heller) Date: Tue Jan 6 14:01:27 2004 Subject: [Python-Dev] test_unicode_file fails on Linux In-Reply-To: (Tim Peters's message of "Tue, 6 Jan 2004 13:51:44 -0500") References: Message-ID: "Tim Peters" writes: > Speaking of which, it wouldn't have been possible for me to test this if I > hadn't "rehabilitated" the VC6 project files over the weekend: VC7 can't be > installed on a Win9x box. I haven't thought about this, but I'm not surprised. > I sure hope I was the only one routinely building > Python on a Win9x box, though . IIRC, Guido has upgraded. Thomas From aleaxit at yahoo.com Tue Jan 6 14:14:41 2004 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Jan 6 14:14:46 2004 Subject: [Python-Dev] test_unicode_file fails on Linux In-Reply-To: References: Message-ID: <941B1780-407C-11D8-884F-000A95EFAE9E@yahoo.com> On Jan 6, 2004, at 8:01 PM, Thomas Heller wrote: ... >> I sure hope I was the only one routinely building >> Python on a Win9x box, though . > > IIRC, Guido has upgraded. 
> I used to do it pretty often on tartina, my old win98 laptop (which is why I didn't even ask to be considered as one of the MSVC 7 giftees back when MS offered -- I knew VS 7 doesn't run on win98). But since July I haven't really worked on any Windows contracts -- it's been mostly Linux (with some prospect of Mac OS X now on the horizon) -- and tartina's batteries have been nearly dead (RIP) so it's not very usable anymore on trips &c... Alex From jcarlson at uci.edu Tue Jan 6 14:23:29 2004 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue Jan 6 14:26:51 2004 Subject: [Python-Dev] Re: PEP 326 now online In-Reply-To: <200401061807.i06I7vE11059@c-24-5-183-134.client.comcast.net> References: <200401061807.i06I7vE11059@c-24-5-183-134.client.comcast.net> Message-ID: <20040106110341.FC63.JCARLSON@uci.edu> [Guido] > Hm. cmp is a *builtin function*. That seems an exceedingly odd place > to stick arbitrary constants -- much more so than type objects (like > Martin's recently proposed unicode property for controlling error > handling). If the high/low (or hi/lo, Max/Min, etc.) objects themselves are subclasses of object, it may make sense to just place them at object.high/object.low. One idea was to create a type called 'extreme', bind it to cmp.extreme, and subclass high/low from extreme. Of course that is just one more arbitrary object attached to cmp, which is even more odd. Another option would be for min.Min and max.Max, but I'm pretty sure that would be confusing. The convenient part about putting them as attributes of cmp is that it is obvious that they are most useful when it comes to comparing against other objects. - Josiah From guido at python.org Tue Jan 6 14:45:42 2004 From: guido at python.org (Guido van Rossum) Date: Tue Jan 6 14:45:47 2004 Subject: [Python-Dev] Re: PEP 326 now online In-Reply-To: Your message of "Tue, 06 Jan 2004 11:23:29 PST." 
<20040106110341.FC63.JCARLSON@uci.edu> References: <200401061807.i06I7vE11059@c-24-5-183-134.client.comcast.net> <20040106110341.FC63.JCARLSON@uci.edu> Message-ID: <200401061945.i06JjgP11366@c-24-5-183-134.client.comcast.net> > [Guido] > > Hm. cmp is a *builtin function*. That seems an exceedingly odd place > > to stick arbitrary constants -- much more so than type objects (like > > Martin's recently proposed unicode property for controlling error > > handling). > > If the high/low (or hi/lo, Max/Min, etc.) objects themselves are > subclasses of object, it may make sense to just place them at > object.high/object.low. Why two subclasses? Shouldn't one with two instances suffice? > One idea was to create a type called 'extreme', bind it to cmp.extreme, > and subclass high/low from extreme. Of course that is just one more > arbitrary object attached to cmp, which is even more odd. > > Another option would be for min.Min and max.Max, but I'm pretty sure > that would be confusing. > > The convenient part about putting them as attributes of cmp is that it > is obvious that they are most useful when it comes to comparing against > other objects. I'm not convinced. --Guido van Rossum (home page: http://www.python.org/~guido/) From claird at lairds.com Tue Jan 6 15:43:18 2004 From: claird at lairds.com (Cameron Laird) Date: Tue Jan 6 15:44:15 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: Message-ID: > From tim.one@comcast.net Sat Jan 03 19:31:55 2004 > > ... > > Has anyone been in contact with the HP-UX Porting Center > > folks recently? I wonder if I should enlist their help ... > If and only if they know something about HP-UX <0.5 wink>. Nobody here > really seems to. Have you tried their port? 2.3.1 appears to be the most > recent: > http://hpux.cs.utah.edu/hppd/hpux/Languages/python-2.3.1/ > By memory, in the past we've gotten complaints from HP-UX users about > previous Porting Center Pythons in two areas: > 1. Apparently some (all?) 
of their Python ports were built with > threading disabled entirely. > 2. Flaky floating-point behavior. > An example of the latter complaint is here (not a particularly good example, > just the first I bumped into via Googling): > http://www.ifost.org.au/Software/ > The python for HP-UX 11 that is found at the HP public domain > software porting centre has buggy floating point problems. I > recompiled it with gcc 3 and while there are some round-off > issues, at least it's not totally deranged... [a link to > python-211-hpux11-gcc.gz] I'm far enough along to begin reporting to python-dev and soliciting the group as what Guido called a "crutch". Issues: A. Threading: first, congratulations to Guido et al., for bringing it along this far. I think you've done the best possible. Lots of HP-UX people seem to be building --without-threads. I can imagine that the Porting Center will continue to do so. We can help them do better, or at least more; I don't know yet when that'll be the most important chore. I want to make changes in threading's configuration. I want first to collect more information, so I can explain myself with precision. B. Version: I'd like good support to go back at least to 10.20. If someone thinks, "Nah, 11.01 and higher is good enough," I'd like to know now. C. curses: I find this hairy. Is there a clean way to build without curses? The only way I know to advance without it is to hack up setup.py, specifically to comment out the blocks that involve curses. Here are the symptoms: curses involves a
cc -Ae -DNDEBUG -O +z -I. \
   -I$SRC_ROOT/./Include \
   -I/tmp/test4/include -I/usr/local/include \
   -I$SRC_ROOT/Include \
   -I$SRC_ROOT -c \
   $SRC_ROOT/Modules/_cursesmodule.c
This tosses
cc: "$SRC_ROOT/Modules/_cursesmodule.c", line 306: error 1000: \
    Unexpected symbol: "}".
cc: error 2017: Cannot recover from earlier errors, terminating.
The source in _cursesmodule.c in that vicinity is ...
#define Window_NoArg2TupleReturnFunction(X, TYPE, ERGSTR) \
static PyObject * PyCursesWindow_ ## X (PyCursesWindowObject *self) \
{ \
  TYPE arg1, arg2; \
  X(self->win,arg1,arg2); return Py_BuildValue(ERGSTR, arg1, arg2); }
...
Window_NoArg2TupleReturnFunction(getyx, int, "ii")
Window_NoArg2TupleReturnFunction(getbegyx, int, "ii")
Window_NoArg2TupleReturnFunction(getmaxyx, int, "ii")
Window_NoArg2TupleReturnFunction(getparyx, int, "ii")
...
/usr/include/curses.h defines
#define getyx(__win,__y,__x) { WINDOW *__wi; \
        __wi = __win; \
        ((__y) = __getcury(__wi), \
        (__x) = __getcurx(__wi)) }
so the pre-processed form of the source is
/* Re-formatted for ease in reading white space. */
static PyObject * PyCursesWindow_getyx (PyCursesWindowObject *self)
{
  int arg1, arg2;
  { WINDOW *__wi;
    __wi = self->win;
    ((arg1) = __getcury(__wi),
     (arg2) = __getcurx(__wi)) };
  return Py_BuildValue("ii", arg1, arg2);
}
which is, indeed, a fatal error. Does anyone see the right fix? I sure haven't, yet. D. Other: there are others. I don't yet have good diagnoses. From martin at v.loewis.de Tue Jan 6 15:59:53 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Tue Jan 6 16:00:00 2004 Subject: [Python-Dev] Relaxing Unicode error handling In-Reply-To: <3FFA8D74.2060809@egenix.com> References: <3FF6AB34.1000502@v.loewis.de> <3FF8A058.4020904@egenix.com> <3FF9D7F0.4080802@v.loewis.de> <3FFA8D74.2060809@egenix.com> Message-ID: <3FFB21C9.5080800@v.loewis.de> M.-A. Lemburg wrote: > But if you change the default behaviour of the builtin codecs > only, how would this help a user of a broken application ? It would be even sufficient to change the behaviour of the us-ascii codec, as this is the default codec. The user of the broken implementation would get question marks in places that would previously cause an exception. That may be acceptable, if it is only debugging output or log files; it may be tolerable if this is only for text seen by users.
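[Editorial note: the "question marks instead of exceptions" behaviour Martin describes is what the codec machinery's "replace" error handler does, shown here in explicit modern Python 3 spelling; the 2004 proposal concerned making Python 2's *implicit* str/unicode conversions behave this way:]

```python
text = "r\u00e9sum\u00e9"  # 'résumé', contains non-ASCII characters

# "strict" (the default) raises on the first unencodable character...
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    print("strict handler raised:", exc.reason)

# ...while "replace" degrades to question marks instead of failing.
print(text.encode("ascii", "replace"))  # b'r?sum?'
```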
> The Unicode implementation would continue to use "strict" > for things like low-level parser marker conversions and external > codecs would also not work as expected (e.g. the Asian codecs). Sure, but they would also not be the default codecs. Not sure what "low-level parser marker conversions" are: what is a "parser marker"? > In the end, I don't think we'd gain anything much except another > global to carry around. It will make developers happier, as they need less urgently fix applications that blow up at the customer. Regards, Martin From martin at v.loewis.de Tue Jan 6 16:13:14 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Tue Jan 6 16:13:48 2004 Subject: [Python-Dev] Re: [Python-checkins] python/nondist/sandbox/msi msilib.py, 1.2, 1.3 In-Reply-To: References: Message-ID: <3FFB24EA.2090300@v.loewis.de> Thomas Heller wrote: >>! for k, v in [(r"Software\Microsoft\VisualStudio\7.1\Setup\VS", "VS7CommonBinDir"), >>! (r"Software\Microsoft\Win32SDK\Directories", "Install Dir")]: >>! try: >>! key = _winreg.OpenKey(_winreg.HKEY_LOCAL_MACHINE, k) >>! except WindowsError: >>! continue >>! cabarc = os.path.join(_winreg.QueryValueEx(key, v)[0], r"Bin", "cabarc.exe") >>! _winreg.CloseKey(key) >>! assert os.path.exists(cabarc), cabarc > > > The above assert fails for me, there's no cabarc.exe in the > VS7CommonBinDir: Do I have to install additional software, apart from > VS.NET 2003? I don't think so. Where is cabarc.exe located, in your VS7.1 installation tree? It should be in common/bin, which in turn should be listed in the first of the two registry keys. Can you find out what is different in your installation? Also, what is the resulting cabarc variable in above assertion? Regards, Martin From guido at python.org Tue Jan 6 16:18:32 2004 From: guido at python.org (Guido van Rossum) Date: Tue Jan 6 16:18:42 2004 Subject: [Python-Dev] Re: PEP 326 now online In-Reply-To: Your message of "Tue, 06 Jan 2004 13:09:17 PST." 
<20040106120607.FC66.JCARLSON@uci.edu> References: <20040106110341.FC63.JCARLSON@uci.edu> <200401061945.i06JjgP11366@c-24-5-183-134.client.comcast.net> <20040106120607.FC66.JCARLSON@uci.edu> Message-ID: <200401062118.i06LIWF11616@c-24-5-183-134.client.comcast.net> > > > One idea was to create a type called 'extreme', bind it to cmp.extreme, > > > and subclass high/low from extreme. Of course that is just one more > > > arbitrary object attached to cmp, which is even more odd. > > > > > > Another option would be for min.Min and max.Max, but I'm pretty sure > > > that would be confusing. > > > > > > The convenient part about putting them as attributes of cmp is that it > > > is obvious that they are most useful when it comes to comparing against > > > other objects. > > > > I'm not convinced. > > Seemingly you are not convinced that creating attributes for cmp is > reasonable. Seemingly, attributes for min and max are equivalently as > unreasonable (and may be confusing). > > How about them being attributes of object, or even placed in the math > module? > > - Josiah Module attributes make sense; making them attributes of object has the unfortunate side effect that they will be attributes of *all* objects, and that doesn't seem a good idea. The math module is only appropriate if this is primarily about float numbers. And see PEP 754 in that case. --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Tue Jan 6 16:18:13 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Tue Jan 6 16:20:03 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: References: Message-ID: <3FFB2615.1050403@v.loewis.de> Cameron Laird wrote: > C. curses: I find this hairy. > > Is there a clean way to build without curses? Why do you want one? If it fails to build, it fails to build. It will still continue trying to build the other extension modules.
The real problem here is the old decision to use distutils for building extensions - in the Modules/Setup approach, you *had* to edit the file to determine which extension modules to build. Of course, the same can apply to setup.py: Just delete the lines mentioning curses, and it will stop building the curses extensions. I'm confident though that one can manage to adjust the curses code to build on HP-UX, once you have figured out that there are multiple (three ?!?) curses implementations on the system, and that you should select the "right" one. Regards, Martin From astrand at lysator.liu.se Tue Jan 6 16:25:51 2004 From: astrand at lysator.liu.se (Peter Astrand) Date: Tue Jan 6 16:25:59 2004 Subject: [Python-Dev] PEP 324: popen5 - New POSIX process module In-Reply-To: <16E1010E4581B049ABC51D4975CEDB8803060D8D@UKDCX001.uk.int.atosorigin.com> Message-ID: > The example from the code is: > > p1 = Popen(["dmesg"], stdout=PIPE) > p2 = Popen(["grep", "hda"], stdin=p1.fromchild) > result = p2.communicate()[0] > > Some things immediately spring from this: > > 1. stdout=PIPE doesn't feel like it adds anything. It's "obvious" that > I want a pipe. But of course there's no way the code realises that > unless I say so. > 2. fromchild doesn't look too nice. I'd prefer to use "stdin=p1.stdout". You too? :-) I'm almost convinced now that we should really drop this popen2 syntax... > 3. communicate() doesn't really describe what it does very well. Give me some other alternatives... > Maybe inverting the sense of the argument, and saying pipe=STDOUT, or in > the case of multiple pipes, pipe=(STDIN, STDOUT), is better. What do > people think? I think this is worse than the current syntax. > PS If this becomes a design discussion, is python-dev (or even python-list) > appropriate? Would it be better to set up a SIG for this, much like > optparse had the getopt-sig for a while? If there's interest, I can arrange a mailing list in a matter of minutes.
-- /Peter Åstrand From mal at egenix.com Tue Jan 6 16:30:22 2004 From: mal at egenix.com (M.-A. Lemburg) Date: Tue Jan 6 16:30:32 2004 Subject: [Python-Dev] Relaxing Unicode error handling In-Reply-To: <3FFB21C9.5080800@v.loewis.de> References: <3FF6AB34.1000502@v.loewis.de> <3FF8A058.4020904@egenix.com> <3FF9D7F0.4080802@v.loewis.de> <3FFA8D74.2060809@egenix.com> <3FFB21C9.5080800@v.loewis.de> Message-ID: <3FFB28EE.2090208@egenix.com> Martin v. Loewis wrote: > M.-A. Lemburg wrote: > >> But if you change the default behaviour of the builtin codecs >> only, how would this help a user of a broken application ? > > It would be even sufficient to change the behaviour of the us-ascii > codec, as this is the default codec. So you are only talking about the case where the application uses the standard default encoding (ASCII) and does not make use of any other codecs ? > The user of the broken implementation would get question marks > in places that would previously cause an exception. That may > be acceptable, if it is only debugging output or log files; > it may be tolerable if this is only for text seen by users. > >> The Unicode implementation would continue to use "strict" >> for things like low-level parser marker conversions and external >> codecs would also not work as expected (e.g. the Asian codecs). > > Sure, but they would also not be the default codecs. Why not ? What about Asian users who set the default encoding to one of the encodings supported by e.g. the JapaneseCodecs package ? > Not sure what "low-level parser marker conversions" are: > what is a "parser marker"? Oh, sorry, that's the term I use for PyArg_ParseTuple() format arguments. >> In the end, I don't think we'd gain anything much except another >> global to carry around. > > It will make developers happier, as they need less urgently fix > applications that blow up at the customer.
Given the scenario you mention above, wouldn't that also be possible by providing a customized codec for "ascii" under a new name "all-things-ascii" and then setting the default encoding to "all-things-ascii" ? In general I don't like these global settings. Just think of the issues this could cause in multi-user systems such as Zope that are not prepared for these changes: a script could easily change the settings to have the server execute code under different user ids if threads executing their requests generate codec errors (the errors parameter can be set to a callback now that we have the new logic in place...). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 06 2004) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From claird at lairds.com Tue Jan 6 16:32:36 2004 From: claird at lairds.com (Cameron Laird) Date: Tue Jan 6 16:32:45 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: <3FFB2615.1050403@v.loewis.de> Message-ID: > From martin@v.loewis.de Tue Jan 06 16:18:36 2004 > . > . > . > > Is there a clean way to build without curses? > Why do you want one? If it fails to build, it fails to build. > It will still continue trying to build the other extension > modules. > The real problem here is the old decision to use distutils > for building exceptions - in the Modules/Setup approach, > you *had* to edit the file to determine which extension > modules to build. > Of course, the same can apply to setup.py: Just delete > the lines mentioning curses, and it will stop building > the curses extensions. 
> I'm confident though that one can manage to adjust the > curses code to build on HP-UX, once you have figured out > that there are multiple (three ?!?) curses implementations > on the system, and that you should select the "right" > one. > . > . > . ? You might have to go through some of this slowly for me. I'm stumbling over the "It will still continue trying to build the other extension modules" part. I did "make; make install", and no one ever tried to compile Modules/_tkinter.c (for example); what am I missing? I agree that curses is a finite problem, one that can be overcome. Does anyone happen to know his way around this module, even if only enough to say that, perhaps, systematic trial-and-error is more efficient than rational comprehension? From evan at 4-am.com Tue Jan 6 16:34:41 2004 From: evan at 4-am.com (Evan Simpson) Date: Tue Jan 6 16:35:38 2004 Subject: [Python-Dev] Dots in __name__ break import -- fixable? Message-ID: <3FFB29F1.3050201@4-am.com> There must be some code in the import machinery that assumes (with excellent reason) that the __name__ of the module from which it is called will never contain periods. Consider the following: >>> __name__ = 'a.b.c' >>> import string >>> string The peculiar behavior is *only* triggered by periods: >>> __name__ = ''.join(map(chr, range(1, 256))).replace('.', '') >>> import string >>> string Would this be easy to fix? It would fall into the realm of "so don't do that!" except that Zope's Python Scripts set their __name__ to their Zope Id, which can contain periods. Cheers, Evan @ 4-am From martin at v.loewis.de Tue Jan 6 16:49:01 2004 From: martin at v.loewis.de (Martin v. 
Loewis) Date: Tue Jan 6 16:49:07 2004 Subject: [Python-Dev] Relaxing Unicode error handling In-Reply-To: <3FFB28EE.2090208@egenix.com> References: <3FF6AB34.1000502@v.loewis.de> <3FF8A058.4020904@egenix.com> <3FF9D7F0.4080802@v.loewis.de> <3FFA8D74.2060809@egenix.com> <3FFB21C9.5080800@v.loewis.de> <3FFB28EE.2090208@egenix.com> Message-ID: <3FFB2D4D.7040606@v.loewis.de> M.-A. Lemburg wrote: > So you are only talking about the case where the application > uses the standard default encoding (ASCII) and does not > make use of any other codecs ? Yes, I'd like to change the error handling in the case of an implicit conversion - in particular for the unicode-to-ascii case, but also for the (non-)ascii to unicode case. Application authors *think* they got rid of all non-trivial instances of such conversions, only to find out that their customers can produce endless series of application crashes by entering funny characters in all imaginable places. >> Sure, but they would also not be the default codecs. > > > Why not ? What about Asian users who set the default encoding > to one of the encodings supported by e.g. the JapaneseCodecs > package ? If that was a proper Python feature, I'm sure the cjkcodecs would support it instantly. However, perhaps people also take our advise and avoid changing the default encoding (because that *doesn't* work); so even in applications where cjkcodecs are heavily used, I would hope that the system encoding remains at us-ascii. But if it doesn't, cjkcodecs should also implement the change I'm proposing. This is a red herring. > Oh, sorry, that's the term I use for PyArg_ParseTuple() format > arguments. Ah, right: They should make use of the default error handling as well. > Given the scenario you mention above, wouldn't that also be > possible by providing a customized codec for "ascii" under > a new name "all-things-ascii" and then setting the default > encoding to "all-things-ascii" ? No. 
Changing the system default encoding is not possible for applications - it is the system administrator that needs to make this change (in site.py). I'm proposing a change that applications can make at run-time. > Just think of the issues this could cause in multi-user systems > such as Zope that are not prepared for these changes: > a script could easily change the settings to have > the server execute code under different user ids if threads > executing their requests generate codec errors (the errors > parameter can be set to a callback now that we have the new > logic in place...). This is also a red herring. Zope can give very controlled access to builtins, and could just dis-allow scripts to change the setting - Zope applications would need to find a different way. Remember, it is a work-around - so it clearly has limitations. I'm proposing it anyway, and I'm fully aware of the limitations. Regards, Martin From martin at v.loewis.de Tue Jan 6 16:52:31 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Tue Jan 6 16:54:42 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: References: Message-ID: <3FFB2E1F.4010801@v.loewis.de> Cameron Laird wrote: > ? You might have to go through some of this slowly for me. > I'm stumbling over the "It will still continue trying to > build the other extension modules" part. I did "make; make > install", and no one ever tried to compile Modules/_tkinter.c > (for example); what am I missing? After setup.py fails to build one extension module, it *should* continue with the next one. Are you sure _tkinter.sl was not build??? However, it might be that the 'make install' part fails, due to the build error. > I agree that curses is a finite problem, one that can be > overcome. Does anyone happen to know his way around this > module, even if only enough to say that, perhaps, systematic > trial-and-error is more efficient than rational comprehension? 
I am strongly convinced that rational comprehension is the only way to approach this problem. Regards, Martin From claird at lairds.com Tue Jan 6 17:05:48 2004 From: claird at lairds.com (Cameron Laird) Date: Tue Jan 6 17:05:56 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: <3FFB2E1F.4010801@v.loewis.de> Message-ID: > From martin@v.loewis.de Tue Jan 06 16:52:48 2004 > . > . > . > After setup.py fails to build one extension module, it *should* > continue with the next one. Are you sure _tkinter.sl was not > build??? > However, it might be that the 'make install' part fails, due > to the build error. > . > . > . Nowhere in sight--and I can't see that it attempted, either. I feel terribly clumsy at this stuff. I thought it was going to be easier just to scan setup.py, and say, "ah, yes, right there; it'll compile X right after Y and just before Z, and ..." That you, too, are other-than-entirely-certain about how errors are handled--well, I don't know if that's a good thing, or bad. So is there no one who's expert with the Makefile? Shall I dig in myself and see what it'll need to work correctly and comprehensibly? From martin at v.loewis.de Tue Jan 6 17:24:04 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Tue Jan 6 17:24:17 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: References: Message-ID: <3FFB3584.2040805@v.loewis.de> Cameron Laird wrote: > Nowhere in sight--and I can't see that it attempted, either. Ah, that is then caused by prerequisites missing. It is definitely, certainly, absolutely not because earlier modules failed to build. > So is there no one who's expert with the Makefile? It has nothing to do with the Makefile, either - all happens in a single setup.py invocation. > Shall I dig > in myself and see what it'll need to work correctly and comprehensibly? You should add print statements 1. at the beginning of detect_tkinter, to see whether it is invoked, and 2.
Just before the line self.extensions.append(ext) in detect_tkinter If 1 is executed, and 2 is not, then some prerequisites are missing. Trace through the function. Regards, Martin From guido at python.org Tue Jan 6 18:18:00 2004 From: guido at python.org (Guido van Rossum) Date: Tue Jan 6 18:18:06 2004 Subject: [Python-Dev] Dots in __name__ break import -- fixable? In-Reply-To: Your message of "Tue, 06 Jan 2004 15:34:41 CST." <3FFB29F1.3050201@4-am.com> References: <3FFB29F1.3050201@4-am.com> Message-ID: <200401062318.i06NI0n11807@c-24-5-183-134.client.comcast.net> > There must be some code in the import machinery that assumes (with > excellent reason) that the __name__ of the module from which it is > called will never contain periods. Right. Periods in module names are used for the package namespace. (Import some code that uses packages and then print sys.modules.keys() to see how.) > Consider the following: > > >>> __name__ = 'a.b.c' > >>> import string > >>> string > > > The peculiar behavior is *only* triggered by periods: > > >>> __name__ = ''.join(map(chr, range(1, 256))).replace('.', '') > >>> import string > >>> string > > > Would this be easy to fix? No. > It would fall into the realm of "so don't do that!" except that Zope's > Python Scripts set their __name__ to their Zope Id, which can contain > periods. Too bad. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Jan 6 18:21:44 2004 From: guido at python.org (Guido van Rossum) Date: Tue Jan 6 18:21:50 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: Your message of "Tue, 06 Jan 2004 22:52:31 +0100." <3FFB2E1F.4010801@v.loewis.de> References: <3FFB2E1F.4010801@v.loewis.de> Message-ID: <200401062321.i06NLiY11832@c-24-5-183-134.client.comcast.net> > Cameron Laird wrote: > > ? You might have to go through some of this slowly for me. > > I'm stumbling over the "It will still continue trying to > > build the other extension modules" part. 
I did "make; make > > install", and no one ever tried to compile Modules/_tkinter.c > > (for example); what am I missing? [MvL] > After setup.py fails to build one extension module, it *should* > continue with the next one. Are you sure _tkinter.sl was not > build??? setup.py skips some modules if the required .h files cannot be found. If you never installed Tcl/Tk, it will skip _tkinter.c silently. For curses, if it can find ncurses.h, it will assume that it can build the curses module. It could be that the ncurses.h found doesn't actually work with _cursesmodule.c; or it could be that the C compiler's search finds a different header file than what setup.py finds (a native C compiler typically has more platform knowledge than distutils). --Guido van Rossum (home page: http://www.python.org/~guido/) From mal at egenix.com Tue Jan 6 18:49:53 2004 From: mal at egenix.com (M.-A. Lemburg) Date: Tue Jan 6 18:50:03 2004 Subject: [Python-Dev] Relaxing Unicode error handling In-Reply-To: <3FFB2D4D.7040606@v.loewis.de> References: <3FF6AB34.1000502@v.loewis.de> <3FF8A058.4020904@egenix.com> <3FF9D7F0.4080802@v.loewis.de> <3FFA8D74.2060809@egenix.com> <3FFB21C9.5080800@v.loewis.de> <3FFB28EE.2090208@egenix.com> <3FFB2D4D.7040606@v.loewis.de> Message-ID: <3FFB49A1.3030001@egenix.com> Martin v. Loewis wrote: > M.-A. Lemburg wrote: > >> So you are only talking about the case where the application >> uses the standard default encoding (ASCII) and does not >> make use of any other codecs ? > > Yes, I'd like to change the error handling in the case of > an implicit conversion - in particular for the unicode-to-ascii > case, but also for the (non-)ascii to unicode case. > > Application authors *think* they got rid of all non-trivial > instances of such conversions, only to find out that their > customers can produce endless series of application crashes > by entering funny characters in all imaginable places. 
Hmm, I would assume that application authors do have the proper permissions to use the work-around I mentioned below, that is, add sitecustomize.py to their application as top-level module and change the default encoding to use their custom codec instead. Even better: that codec could also write out a warning log together with a traceback to make debugging for the application developers easier. >>> Sure, but they would also not be the default codecs. >> >> Why not ? What about Asian users who set the default encoding >> to one of the encodings supported by e.g. the JapaneseCodecs >> package ? > > If that was a proper Python feature, I'm sure the cjkcodecs > would support it instantly. However, perhaps people also take > our advice and avoid changing the default encoding (because that > *doesn't* work); so even in applications where cjkcodecs are > heavily used, I would hope that the system encoding remains > at us-ascii. But if it doesn't, cjkcodecs should also implement > the change I'm proposing. This is a red herring. > >> Oh, sorry, that's the term I use for PyArg_ParseTuple() format >> arguments. > > Ah, right: They should make use of the default error handling > as well. They currently hard code "strict" for error handling and I don't want to change that (because codecs might not default to "strict" in case the default error handling for the codec is chosen). Then again, if you only change the ASCII codec, you wouldn't have to change these hard coded "strict" values. >> Given the scenario you mention above, wouldn't that also be >> possible by providing a customized codec for "ascii" under >> a new name "all-things-ascii" and then setting the default >> encoding to "all-things-ascii" ? > > No. Changing the system default encoding is not possible for > applications - it is the system administrator that needs > to make this change (in site.py). I'm proposing a change that > applications can make at run-time.
See above: All you have to do is let Python pick up a sitecustomize module included in application's Python path. You don't need to be root to accomplish that and I would assume that an application developer knows how to implement this work-around. We could even provide such a codec as standard Python encoding, if that's too much hassle for the application developer, something like 'ascii-debug'. >> Just think of the issues this could cause in multi-user systems >> such as Zope that are not prepared for these changes: >> a script could easily change the settings to have >> the server execute code under different user ids if threads >> executing their requests generate codec errors (the errors >> parameter can be set to a callback now that we have the new >> logic in place...). > > This is also a red herring. Zope can give very controlled > access to builtins, and could just dis-allow scripts to change > the setting - Zope applications would need to find a different > way. Zope was just an example of such an application. The problem is that you would be creating a possible vulnerability that you'd have to teach such applications first in order to protect them against it. Of course, you could restrict the global error handling default for the ASCII codec to only accept strings as arguments. > Remember, it is a work-around - so it clearly has limitations. > I'm proposing it anyway, and I'm fully aware of the limitations. And I'm trying to convince you that there are other ways to achieve the same thing without introducing yet more of these application scope globals :-) Boiling down to what we've found, I think you only need to add a switch to turn the ASCII codec errors arguments "strict" and NULL into behaving like "replace". I'd be +0 on that :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 06 2004) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... 
http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From tjreedy at udel.edu Tue Jan 6 19:39:47 2004 From: tjreedy at udel.edu (Terry Reedy) Date: Tue Jan 6 19:39:44 2004 Subject: [Python-Dev] Re: Re: PEP 326 now online References: <005a01c3d3d5$08c30840$e841fea9@oemcomputer><20040106044418.E59742@mail.allcaps.org> <200401061807.i06I7vE11059@c-24-5-183-134.client.comcast.net> Message-ID: "Guido van Rossum" wrote in various messages (I wrote) > > I proposed cmp.lo and cmp.hi as a trial balloon for the idea of > > using specific public function attributes (as opposed to generic > > implementation attributes) as a 'low-cost' alternative to builtin > > names for something that is useful but not so central as None, True, > > and False. .... Using > > Min and Max reuses existing knowledge, so this would be OK by me. > > Hm. cmp is a *builtin function*. I know ;-) Which means that it is immediately available without an import. > That seems an exceedingly odd place to stick arbitrary constants Yes, this is an 'innovative' suggestion in that regard (hence 'trial balloon'). Cmp does, of course, already have the standard default set of special attributes and methods, so giving it two public attributes is not completely odd. My position: Extreme values are useful/needed for certain types of algorithms. By current standards, they are not generally useful enough to go in builtins. They *are* plausibly useful enough to be included in the standard language if such can be done unobstrusively. Doing so will aid writing and teaching the algorithms that need them. If one accepts point three, at least hypothetically, the question is where to 'unobtrusively' put them. After considering other possibilities, I came up with the cmp suggestion, which seems easy enough both to document and remember. 
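Terry's "extreme values" are straightforward to prototype. The sketch below is purely illustrative — it uses modern rich comparisons rather than the __cmp__ protocol the thread's Python 2.3 would have required, and it borrows the UniversalMaximum/UniversalMinimum names that Guido suggests later in the thread:

```python
import functools

# Illustrative sketch only -- not PEP 326's reference implementation.
# A pair of singletons: one compares greater than every other object,
# the other compares less than every other object.
@functools.total_ordering
class _Extreme:
    def __init__(self, is_top):
        self._is_top = is_top

    def __eq__(self, other):
        return self is other  # equal only to itself

    def __hash__(self):
        return id(self)

    def __lt__(self, other):
        # The bottom object is less than everything except itself;
        # the top object is less than nothing.
        return (not self._is_top) and self is not other

    def __repr__(self):
        return "UniversalMaximum" if self._is_top else "UniversalMinimum"

UniversalMaximum = _Extreme(True)
UniversalMinimum = _Extreme(False)

print(sorted([3, UniversalMaximum, 1, UniversalMinimum]))
# -> [UniversalMinimum, 1, 3, UniversalMaximum]
```

Because each object is equal only to itself, two independently created "maximums" would not compare equal to each other — which is precisely Bob Ippolito's argument elsewhere in the thread for shipping one blessed pair in the library.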
> -- much more so than type objects some of which are now callable factories, which as deterministic mappings are functions in the mathematical sense. In an alternate universe in which entry to builtins was even more restricted, True and False might have been introduced as bool.True and bool.False. And 'cmp' might be a callable int subtype, like bool is now, returning Less, Equal, More == -1, 0 1. > make[ing] them attributes of object has the unfortunate side effect > that they will be attributes of *all* objects and that doesn't seem a good idea. While I considered and mentioned this possibility, I rejected it and looked further for precisely that reason. > Module attributes make sense The negatives being the need to import (minor) and the need to pick one (but which?) >The math module is only appropriate if this is primarily about float numbers. Which is why I keep looking. The extreme values should work with anything except a user class which defines competing extremes. >Why two subclasses? Shouldn't one with two instances suffice? Definitely, and that is my proposal. Or, if a new class were not desired, perhaps even two new instances of NoneType if that would work and not break anything. I considered the specific implementation unimportant as long as it is unobtrusive. Terry J. Reedy From kbk at shore.net Tue Jan 6 19:40:31 2004 From: kbk at shore.net (Kurt B. Kaiser) Date: Tue Jan 6 19:40:35 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: <200401062321.i06NLiY11832@c-24-5-183-134.client.comcast.net> (Guido van Rossum's message of "Tue, 06 Jan 2004 15:21:44 -0800") References: <3FFB2E1F.4010801@v.loewis.de> <200401062321.i06NLiY11832@c-24-5-183-134.client.comcast.net> Message-ID: <871xqcioao.fsf@hydra.localdomain> Guido van Rossum writes: > [MvL] >> After setup.py fails to build one extension module, it *should* >> continue with the next one. Are you sure _tkinter.sl was not >> build??? 
> > setup.py skips some modules if the required .h files cannot be found. > If you never installed Tcl/Tk, it will skip _tkinter.c silently. I have a patch to setup.py in the queue for review (#850977). This handles a problem on Free/OpenBSD where the tk/tcl headers are named in an unusual way. You may have a problem as simple as that. http://sourceforge.net/tracker/index.php?func=detail&aid=850977&group_id=5470&atid=305470 -- KBK From jcarlson at uci.edu Tue Jan 6 19:45:22 2004 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue Jan 6 19:48:03 2004 Subject: [Python-Dev] Re: PEP 326 now online In-Reply-To: <200401062118.i06LIWF11616@c-24-5-183-134.client.comcast.net> References: <20040106120607.FC66.JCARLSON@uci.edu> <200401062118.i06LIWF11616@c-24-5-183-134.client.comcast.net> Message-ID: <20040106164512.FC6C.JCARLSON@uci.edu> > Module attributes make sense; make them attributes of object has > the unfortunate side effect that they will be attributes of *all* > objects and that doesn't seem a good idea. > > The math module is only appropriate if this is primarily about float > numbers. And see PEP 747 in that case. You mean PEP 754, right? Just like creating an arbitrarily large integer, floating point infinity is also arbitrary, and may not be big enough. I just got a comment from another user suggesting modifying the min/max.__cmp__ so that they are the actual minimum and maximum. An interesting approach, which makes some sense to me. - Josiah P.S. Right now: >>> min > max 1 From DavidA at ActiveState.com Tue Jan 6 20:08:39 2004 From: DavidA at ActiveState.com (David Ascher) Date: Tue Jan 6 20:12:54 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: <200401062321.i06NLiY11832@c-24-5-183-134.client.comcast.net> References: <3FFB2E1F.4010801@v.loewis.de> <200401062321.i06NLiY11832@c-24-5-183-134.client.comcast.net> Message-ID: <3FFB5C17.6060607@ActiveState.com> Guido van Rossum wrote: >setup.py skips some modules if the required .h files cannot be found. 
>If you never installed Tcl/Tk, it will skip _tkinter.c silently. > > Suggestion w/o a patch: I think that for 2.4, it'd be nice if distutils used the logging module extensively so that the 'silently' above can be changed easily. A good project for a snowy evening. --david From kbk at shore.net Tue Jan 6 21:06:53 2004 From: kbk at shore.net (Kurt B. Kaiser) Date: Tue Jan 6 21:06:58 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: <3FFB5C17.6060607@ActiveState.com> (David Ascher's message of "Tue, 06 Jan 2004 17:08:39 -0800") References: <3FFB2E1F.4010801@v.loewis.de> <200401062321.i06NLiY11832@c-24-5-183-134.client.comcast.net> <3FFB5C17.6060607@ActiveState.com> Message-ID: <87isjoh5qa.fsf@hydra.localdomain> David Ascher writes: > Suggestion w/o a patch: I think that for 2.4, it'd be nice if distutils > used the logging module extensively so that the 'silently' above can be > changed easily. A good project for a snowy evening. One of the things in the patch is the line: ! self.announce("INFO: Can't locate Tcl/Tk libs and/or headers", 2) so a notice will appear in the configure output, whereas previously it gave up silently. -- KBK From stuart at stuartbishop.net Tue Jan 6 21:14:46 2004 From: stuart at stuartbishop.net (Stuart Bishop) Date: Tue Jan 6 21:15:35 2004 Subject: [Python-Dev] Relaxing Unicode error handling In-Reply-To: <3FF6AB34.1000502@v.loewis.de> References: <3FF6AB34.1000502@v.loewis.de> Message-ID: <4380DA98-40B7-11D8-8353-000A95A06FC6@stuartbishop.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 03/01/2004, at 10:44 PM, Martin v. Loewis wrote: > People keep complaining that they get Unicode errors > when funny characters start showing up in their inputs. > > In some cases, these people would apparantly be happy > if Python would just "keep going", IOW, they want to > get moji-bake (garbled characters), instead of > exceptions that abort their applications. I think the key word here is 'apparently'. 
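What Martin proposes is only a change of *default*: every behaviour under discussion already exists as an explicit per-call error handler, which makes the data-loss trade-off easy to demonstrate (shown here in modern str/bytes spelling; the thread's Python used unicode/str):

```python
text = "h\u00e9llo"  # one non-ASCII character, U+00E9

# Explicit per-call error handling, as the codec machinery already allows:
print(text.encode("ascii", "replace"))            # b'h?llo'  -- data degraded
print(text.encode("ascii", "ignore"))             # b'hllo'   -- data dropped
print(text.encode("ascii", "xmlcharrefreplace"))  # b'h&#233;llo' -- the 'xmlcharref' idea

try:
    text.encode("ascii")  # the implicit default: "strict"
except UnicodeEncodeError as exc:
    print("strict raises:", exc.reason)
```

The objection is not that these handlers exist, but what happens when one of the lossy ones becomes the silent application-wide default.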
I also think that giving people this knob to tweak will simply shuffle their problem from one place to another. This will be a net loss, since we will end up with these users trying to repair their damaged data, as well as developers having to maintain systems or integrate with libraries where this knob was twiddled either through ignorance or as a quick fix. If you are writing an XML processing application, it might appear quite sane to set unicode.errorhandling to 'xmlcharref' but it will bite you on the bum eventually when that project grows. > I'd like to add a static property unicode.errorhandling, > which defaults to "strict". Applications could set this > to "replace" or "ignore", silencing the errors, and > risking loss of data. I'd be happier with a command line argument for this sort of thing. This way, it is explicit that 'I want to run this application with lossy Unicode conversions' without having to worry about something you didn't write flipping the switch. - -- Stuart Bishop http://www.stuartbishop.net/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.3 (Darwin) iD8DBQE/+2uZAfqZj7rGN0oRAsyJAJ4yyO1/VwF8131Ev8IN44KtpAH1MQCfRlhI 6e65Oclei4FonvO5gHBbEjc= =ikNP -----END PGP SIGNATURE----- From evan at 4-am.com Tue Jan 6 22:20:32 2004 From: evan at 4-am.com (Evan Simpson) Date: Tue Jan 6 22:20:13 2004 Subject: [Python-Dev] Dots in __name__ break import -- fixable? In-Reply-To: <200401062318.i06NI0n11807@c-24-5-183-134.client.comcast.net> References: <3FFB29F1.3050201@4-am.com> <200401062318.i06NI0n11807@c-24-5-183-134.client.comcast.net> Message-ID: <3FFB7B00.90709@4-am.com> Guido van Rossum wrote: >>Would this be easy to fix? > > > No. Oh well. Setting __name__ to None seems to work fine for Script purposes, so I'll probably go with that.
Cheers, Evan @ 4-am From bob at redivi.com Tue Jan 6 23:15:13 2004 From: bob at redivi.com (Bob Ippolito) Date: Tue Jan 6 23:13:37 2004 Subject: [Python-Dev] Re: PEP 326 now online Message-ID: <1772B65D-40C8-11D8-B0D7-000A95686CD8@redivi.com> > > Module attributes make sense; make them attributes of object has > > the unfortunate side effect that they will be attributes of *all* > > objects and that doesn't seem a good idea. > > > > The math module is only appropriate if this is primarily about float > > numbers. And see PEP 747 in that case. > > You mean PEP 754, right? Just like creating an arbitrarily large > integer, floating point infinity is also arbitrary, and may not be big > enough. > > I just got a comment from another user suggesting modifying the > min/max.__cmp__ so that they are the actual minimum and maximum. > > An interesting approach, which makes some sense to me. [[ I apologize in advance for not carrying along the In-Reply-To and References headers, I just finally got around to subscribing to this list. ]] This was my suggestion. I am for this PEP and am willing to write the reference implementation for the new min and max builtins if there's enough interest. I have, like some others here, used my own One True Large Object. I think the best reason to have One True Large Object is because you can't really compare two implementations of the One True Large Object and expect to get a meaningful result out of it. For the record, my use case had to do with a giant sorted list of tuples and the bisect module. The first element of a tuple was a timestamp, the rest of the tuple isn't worth explaining but I never wanted to compare against it. The "database" had two primary operations, inserting records *after* a timestamp, and finding every record between two timestamps. 
Let's take a look: >>> from random import randrange >>> from pprint import pprint >>> from bisect import bisect_left, bisect_right >>> lst = [(randrange(10), randrange(10)) for i in xrange(10)] >>> lst.sort() >>> pprint(lst) [(0, 2), (0, 7), (0, 9), (4, 6), (6, 5), (6, 8), (6, 9), (7, 7), (8, 6), (9, 5)] >>> bisect_left(lst, (6,)) 4 >>> bisect_right(lst, (6,)) 4 Well, that doesn't look too useful does it? In order for bisect_right to have a meaningful return value in this case, I need to build an object such that it compares greater than any (6, *foo). >>> class MaxObject: ... def __cmp__(self, other): ... return int(not self is other) ... >>> maxobject = MaxObject() >>> bisect_right(lst, (6, maxobject)) 7 >>> lst[4:7] [(6, 5), (6, 8), (6, 9)] In this particular contrived case, sys.maxint would have worked, but the general case needs One True Large Object. I think One True Large Object should be in builtins and should be called 'max'. Similarly, there are probably use cases for One True Small Object, but none that I've personally ran into. However, since 'min' is already in builtins, might as well do it for symmetry. -bob -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 2357 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20040106/be0f57a5/smime.bin From martin at v.loewis.de Tue Jan 6 23:47:16 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Tue Jan 6 23:47:21 2004 Subject: [Python-Dev] Relaxing Unicode error handling In-Reply-To: <3FFB49A1.3030001@egenix.com> References: <3FF6AB34.1000502@v.loewis.de> <3FF8A058.4020904@egenix.com> <3FF9D7F0.4080802@v.loewis.de> <3FFA8D74.2060809@egenix.com> <3FFB21C9.5080800@v.loewis.de> <3FFB28EE.2090208@egenix.com> <3FFB2D4D.7040606@v.loewis.de> <3FFB49A1.3030001@egenix.com> Message-ID: <3FFB8F54.3040601@v.loewis.de> M.-A. 
Lemburg wrote: > Hmm, I would assume that application authors do have the > proper permissions to use the work-around I mentioned > below, that is, add sitecustomize.py to their application > as top-level module and change the default encoding to > use their custom codec instead. But that would interfere with a sitecustomize.py already present on the system. > Then again, if you only change the ASCII codec, you wouldn't > have to change these hard coded "strict" values. Why not? If the argument parser invokes the default encoding, and passes "strict", then this would override the default error handling of the codec, no? Regards, Martin From guido at python.org Wed Jan 7 01:06:04 2004 From: guido at python.org (Guido van Rossum) Date: Wed Jan 7 01:06:21 2004 Subject: [Python-Dev] Re: PEP 326 now online In-Reply-To: Your message of "Tue, 06 Jan 2004 23:15:13 EST." <1772B65D-40C8-11D8-B0D7-000A95686CD8@redivi.com> References: <1772B65D-40C8-11D8-B0D7-000A95686CD8@redivi.com> Message-ID: <200401070606.i07664Y12487@c-24-5-183-134.client.comcast.net> > > I just got a comment from another user suggesting modifying the > > min/max.__cmp__ so that they are the actual minimum and maximum. > > > > An interesting approach, which makes some sense to me. Not to me. Random reuses like this would make Python into a mysterious language. > This was my suggestion. I am for this PEP and am willing to write the > reference implementation for the new min and max builtins if there's > enough interest. Not from me -- don't waste your time. > I have, like some others here, used my own One True Large Object. I > think the best reason to have One True Large Object is because you > can't really compare two implementations of the One True Large Object > and expect to get a meaningful result out of it. > > For the record, my use case had to do with a giant sorted list of > tuples and the bisect module. 
The first element of a tuple was a > timestamp, the rest of the tuple isn't worth explaining but I never > wanted to compare against it. The "database" had two primary > operations, inserting records *after* a timestamp, and finding every > record between two timestamps. Let's take a look: Your example (snipped here) seems to ask for a different kind of data structure, rather than an object larger than everything else. --Guido van Rossum (home page: http://www.python.org/~guido/) From theller at python.net Wed Jan 7 03:36:44 2004 From: theller at python.net (Thomas Heller) Date: Wed Jan 7 03:36:49 2004 Subject: [Python-Dev] Re: [Python-checkins] python/nondist/sandbox/msi msilib.py, 1.2, 1.3 In-Reply-To: <3FFB24EA.2090300@v.loewis.de> (Martin v. Loewis's message of "Tue, 06 Jan 2004 22:13:14 +0100") References: <3FFB24EA.2090300@v.loewis.de> Message-ID: "Martin v. Loewis" writes: > Thomas Heller wrote: > >>>! for k, v in [(r"Software\Microsoft\VisualStudio\7.1\Setup\VS", "VS7CommonBinDir"), >>>! (r"Software\Microsoft\Win32SDK\Directories", "Install Dir")]: >>>! try: >>>! key = _winreg.OpenKey(_winreg.HKEY_LOCAL_MACHINE, k) >>>! except WindowsError: >>>! continue >>>! cabarc = os.path.join(_winreg.QueryValueEx(key, v)[0], r"Bin", "cabarc.exe") >>>! _winreg.CloseKey(key) >>>! assert os.path.exists(cabarc), cabarc >> The above assert fails for me, there's no cabarc.exe in the >> VS7CommonBinDir: Do I have to install additional software, apart from >> VS.NET 2003? > > I don't think so. Where is cabarc.exe located, in your VS7.1 > installation tree? It should be in common/bin, which in turn > should be listed in the first of the two registry keys. 
Searching for cabarc.exe finds this: c:\BC5\BIN\cabarc.exe c:\cabsdk\BIN\CABARC.EXE c:\Dokumente und Einstellungen\thomas\Desktop\downloads\Cabsdk\BIN\CABARC.EXE c:\Programme\MicrosoftSDK\Bin\CabArc.Exe c:\WINDOWS\Prefetch\CABARC.EXE-04C2A86A.pf The first one in c:\bc5\bin is an old version, included in an old borland compiler I have installed. The one in c:\cabskd is from MS' cabinet sdk, also the one in downloads. c:\Programme\MicrosoftSDK is the Feb 2003 Platform SDK, which I had installed before installing VS 7.1. So, it seems there is no cabsdk.exe in my VS 7.1 installation tree. > > Can you find out what is different in your installation? > Also, what is the resulting cabarc variable in above assertion? c:\sf\python\nondist\sandbox\msi>py24 msi.py WARNING: Missing extension _bsddb.pyd Traceback (most recent call last): File "msi.py", line 753, in ? add_files(db) File "msi.py", line 627, in add_files cab.commit(db) File "c:\sf\python\nondist\sandbox\msi\msilib.py", line 367, in commit assert os.path.exists(cabarc), cabarc AssertionError: C:\Programme\Microsoft Visual Studio .NET 2003\Common7\Tools\Bin\cabarc.exe > c:\sf\python\nondist\sandbox\msi\msilib.py(367)commit() -> assert os.path.exists(cabarc), cabarc (Pdb) Ah, now I see. Changing the code to this: for k, v in [(r"Software\Microsoft\VisualStudio\7.1\Setup\VS", "VS7CommonBinDir"), (r"Software\Microsoft\Win32SDK\Directories", "Install Dir")]: try: key = _winreg.OpenKey(_winreg.HKEY_LOCAL_MACHINE, k) except WindowsError: continue cabarc = os.path.join(_winreg.QueryValueEx(key, v)[0], r"Bin", "cabarc.exe") _winreg.CloseKey(key) if not os.path.exists(cabarc): continue break else: print "WARNING: cabarc.exe not found in registry" cabarc = "cabarc.exe" makes it work, since it now finds the cabarc.exe in c:\Programme\MicrosoftSDK. Where is cabarc.exe on *your* system? 
Thomas From jcarlson at uci.edu Wed Jan 7 04:30:23 2004 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed Jan 7 04:33:18 2004 Subject: [Python-Dev] Re: PEP 326 now online In-Reply-To: <200401070606.i07664Y12487@c-24-5-183-134.client.comcast.net> References: <1772B65D-40C8-11D8-B0D7-000A95686CD8@redivi.com> <200401070606.i07664Y12487@c-24-5-183-134.client.comcast.net> Message-ID: <20040106224357.13CC.JCARLSON@uci.edu> > Not to me. Random reuses like this would make Python into a > mysterious language. The reuse isn't random. It would be random if int compared smaller than everything and float compared larger than everything, or even if the strings "little" and "big" did the same. Using min and max to be the smallest and largest objects, as well as being functions that can be called to find the smallest or largest value in a sequence, seems to be intuitive. But let's get past that for a moment. Let us just pretend, for sake of argument, that min and max were to become the smallest and largest values. The only place this would affect pre-existing code is if someone were ordering functions based on '<' or '>' comparisons, IE: they would have to be sorting functions, specifically, min and max. If it is desirable to warn programmers who have been doing this, it would be relatively easy to produce a warning when this happens (to warn those that had been previously sorting functions). Making the warning removable by using a __future__ import for those of us who know about this behavior: from __future__ import minmax And subsequently removing the warnings in later Python versions. > Not from me -- don't waste your time. Are you against the *idea* of a top and bottom value, its location, or both? - Josiah From mal at egenix.com Wed Jan 7 05:35:42 2004 From: mal at egenix.com (M.-A.
Lemburg) Date: Wed Jan 7 05:35:53 2004 Subject: [Python-Dev] Relaxing Unicode error handling In-Reply-To: <3FFB8F54.3040601@v.loewis.de> References: <3FF6AB34.1000502@v.loewis.de> <3FF8A058.4020904@egenix.com> <3FF9D7F0.4080802@v.loewis.de> <3FFA8D74.2060809@egenix.com> <3FFB21C9.5080800@v.loewis.de> <3FFB28EE.2090208@egenix.com> <3FFB2D4D.7040606@v.loewis.de> <3FFB49A1.3030001@egenix.com> <3FFB8F54.3040601@v.loewis.de> Message-ID: <3FFBE0FE.6000209@egenix.com> Martin v. Loewis wrote: > M.-A. Lemburg wrote: > >> Hmm, I would assume that application authors do have the >> proper permissions to use the work-around I mentioned >> below, that is, add sitecustomize.py to their application >> as top-level module and change the default encoding to >> use their custom codec instead. > > But that would interfere with a sitecustomize.py already > present on the system. Possibly, yes, provided the site uses sitecutomize.py. >> Then again, if you only change the ASCII codec, you wouldn't >> have to change these hard coded "strict" values. > > Why not? If the argument parser invokes the default encoding, > and passes "strict", then this would override the default > error handling of the codec, no? No, because if you override the errors argument in the ASCII codec alone (and depending on the switch you introduce), there's no need to change anything in the system at all to have an enabled switch silence the errors. What do you think ? Wouldn't that be a solution to what you are trying to achieve ? -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 06 2004) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! 
:::: From mchermside at ingdirect.com Wed Jan 7 09:43:14 2004 From: mchermside at ingdirect.com (Chermside, Michael) Date: Wed Jan 7 09:43:37 2004 Subject: [Python-Dev] PEP 326 now online Message-ID: <7F171EB5E155544CAC4035F0182093F04212F1@INGDEXCHSANC1.ingdirect.com> Josiah Carlson writes: > I've submitted my latest changes that I believe address the concerns > brought up since I last emailed the list. [...] > The major change of note: it is no longer a builtin. Nice work! With this current proposal, you've gotten my vote to a solid +0. I find your current motivation far more convincing -- although I don't think this is something *I* would need much, I can see the advantages to having standard High and Low objects. The only open issue that I think presents any significant barrier is the question of where to put them. No answer proposed so far seems like the "one obvious place", and most have been, well, a bit peculiar (function attributes; sorting properties of function objects). Of these, I think using min and max (the functions) is my favorite solution, judging by the simple metric of what would I be most likely to remember if I had seen it mentioned in a manual once but hadn't actually used it for a long time. But Guido seems to find this a bit "too clever". -- Michael Chermside This email may contain confidential or privileged information. If you believe you have received the message in error, please notify the sender and delete the message without copying or disclosing it. From guido at python.org Wed Jan 7 10:01:35 2004 From: guido at python.org (Guido van Rossum) Date: Wed Jan 7 10:01:40 2004 Subject: [Python-Dev] Re: PEP 326 now online In-Reply-To: Your message of "Wed, 07 Jan 2004 01:30:23 PST." 
<20040106224357.13CC.JCARLSON@uci.edu> References: <1772B65D-40C8-11D8-B0D7-000A95686CD8@redivi.com> <200401070606.i07664Y12487@c-24-5-183-134.client.comcast.net> <20040106224357.13CC.JCARLSON@uci.edu> Message-ID: <200401071501.i07F1ZL13165@c-24-5-183-134.client.comcast.net> > > Not to me. Random reuses like this would make Python into a > > mysterious language. > > The reuse isn't random. It would be random if int compared smaller than > everything and float compared larger than everything, or even if the > strings "little" and "big" did the same. > > > Using min and max to be the smallest and largest objects, as well as > being functions that can be called to find the smallest or largest value > in a sequence, seems to be intuitive. But let's get past that for a > moment. If that's "intuitive" to you, our ideas about language design must be so different that I have low hopes for something useful coming out of this. > Let us just pretend, for sake of argument, that min and max were to > become the smallest and largest values. The only place this would > affect pre-existing code is if someone were ordering functions based on > '<' or '>' comparisons, IE: they would have to be sorting functions, > specifically, min and max. That is *not* the point at all. The point is that using min() and max() as functions is several orders of magnitude more common than using an infinity value. Most people quickly forget features or details they don't need. So it is likely that many Python users would only be familiar with the function usage, and if they saw code that passed max in the meaning of infinity, they'd scratch their head and wonder whether it was used as a callback, or whether there was a local variable max, or whether it was simply a bug. > If it is desirable to warn programmers who have been doing this, it > would be relatively easy to produce a warning when this happens (to warn > those that had been previously sorting functions).
Making the warning > removable by using a __future__ import for those of us who know about > this behavior: > from __future__ import minmax > And subsequently removing the warnings in later Python versions. > > > > Not from me -- don't waste your time. > > Are you against the *idea* of a top and bottom value, its location, or > both? So far *all* of the names that have been proposed for the concept suck, starting with "Some" and "Any", and now various ways to abuse builtins. I am not against the concept of a universal extreme by itself, but IMO their use is fairly infrequent (except in your mind perhaps, because you've clearly become obsessed with it), so it should not be a builtin, nor disguised as a builtin. How about you write the Python code to implement proper universal extremes, put it in a module, and submit that module for inclusion in the standard library. Then my resistance would be a lot less (until Raymond Hettinger offers to reimplement it in C :-). Hey, universalextremes.py seems a fine name for that module, and UniversalMaximum and UniversalMinimum seem fine names for the two objects in that module. If you find those names too long, you can always write from universalextremes import UniversalMaximum as UMax --Guido van Rossum (home page: http://www.python.org/~guido/) From astrand at lysator.liu.se Wed Jan 7 10:21:15 2004 From: astrand at lysator.liu.se (Peter Åstrand) Date: Wed Jan 7 10:21:25 2004 Subject: [Python-Dev] PEP 324: popen5 - New POSIX process module Message-ID: > This PEP looks fantastic to me; I've often wanted a better set of > primitives for working with external processes, and your PEP has just > about everything I've ever wanted. Great :-) > I do have some minor nits to pick, though: > > - The preexec_args argument is extraneous considering how easy it is to > use a lambda construct to bind arguments to a function. Consider: > > Popen(..., preexec_fn=foo, preexec_args=(bar,baz)) > > vs.
> > Popen(..., preexec_fn=lambda: foo(bar,baz)) Nice. The only drawback with this approach that I can think of right now is that this will be hard for newbies to understand, but if we give a few examples in the doc this is probably not a problem. I'll make this change, unless I hear protests. > - Rather than passing argv0 as a named argument, what about passing in > the program to execute instead? It would simplify code like this: > > Popen(child_program, *child_args[1:], argv0=child_args[0]) > > into this > > Popen(*child_args, program=child_program) > > Of course I'm not suggesting you change the default behavior of taking > argv0 *and* the program to execute from the first positional argument. I don't understand exactly what you mean here; could you provide a working example? > - The defaults for close_fds and universal_newlines should be False > rather than 0; this would make it clear from reading the synopsis that > these arguments are boolean-valued. Yes, I've fixed this now. > - The poll() method and returncode attribute aren't both needed. I like > returncode's semantics better since it gives more information and it > avoids using -1 as a magic number. Well, as I described in another mail, it's actually a big advantage to provide both a returncode attribute and an unmodified status value, since it makes it easy to migrate from the "old" API. Perhaps we'll need to make some changes to the poll() method wrt Windows, though. > - Rather than raising a PopenException for incorrect arguments, I would > think that raising ValueError or TypeError would be more in line with > users' expectations. Sounds good to me. Anyone else have comments? > - How does this communicate() method interact with I/O redirection? Now I'm lost again. Which kind of redirection are you thinking about?
/Peter From Andrew.MacKeith at ABAQUS.com Wed Jan 7 10:32:28 2004 From: Andrew.MacKeith at ABAQUS.com (Andrew MacKeith) Date: Wed Jan 7 10:32:27 2004 Subject: [Python-Dev] HP-UX clean-up References: Message-ID: <3FFC268C.9030702@ABAQUS.com> > Who *are* the folks with an HP-UX interest? At ABAQUS Inc. we use Python on a variety of platforms, including HP-UX, and we are in the process of upgrading to Python-2.3.3. We build in our own specialized build environment, using the pyconfig.h files created by configure on each platform. In the past we have had to make several changes for HP-UX. An important feature of the HP-UX build is that it should be binary compatible for all HP-UX processors (there are at least 10, maybe more, varieties). We have noticed that Python optimizes for the native processor on which it is built, and this can give problems on a different processor. We therefore have to have flags that will force HP to compile for some common generic processor. This affects everyone who has to distribute Python as pre-built binaries for customers, like we do (rather than building it on customers' machines). We are currently targeting:

    HP-UX 11.00 on PA-RISC 32-bit, all processors
    HP-UX 11.00 on PA-RISC 64-bit, all processors
    HP-UX 11.22 on Intel/Itanium 64-bit (both big- and little-endian setups)
    HP-UX 11.11, all processors
    HP-UX 11.i, all processors

Andrew MacKeith Cameron Laird wrote: > Python generation for HP-UX is broken. I want to fix it. > > Do I need to make the case for the first proposition? I think > the threading problems are widely recognized ... This isn't > an accusation of moral turpitude, incidentally; it just hasn't > worked out to solve HP-UX difficulties, for plenty of legitimate > reasons. I understand that. > > I happened to try a vanilla generation of 2.3.3 under HP-UX > 10.20 yesterday, and encountered multiple non-trivial faults > (curses, threading, ...).
Experience tells me I'd find roughly > the same level of problems with other releases of Python and > HP-UX. > > Do I need to make the case that fixing HP-UX (and eventually > Irix and so on) is a Good Thing? I'll assume not, at least for > now. > > I recognize that there's been a lot of churn in HP-UX fixes-- > tinkering with threading libraries, compilation options, and > so on. I'm sensitive to such. My plan is to make conservative > changes, ones unlikely to degrade the situation for other HP-UX > users. I have plenty of porting background; I'm sure we can > improve on what we have now. > > Here's what I need: > 1. Is it best to carry this on here, or should > those of us with HP-UX interests go off by > ourselves for a while, and return once we've > established consensus? I've been away, and > don't feel I know the culture here well. > > Who *are* the folks with an HP-UX interest? > 2. Does anyone happen to have an executable > built that can "import Tkinter"? That's my > most immediate need. If you can share that > with me, it'd help my own situation, and I'll > continue to push for corrections in the > standard distribution, anyway. We do not use Tkinter. > > I can compile _tkinter.c "by hand", but > (more details, later--the current point is > just that stuff doesn't work automatically > ... > 3. I'm very rusty with setup.py. While I was > around for its birth, as I recall, I feel > clumsy with it now. Is there a write-up > anyone would recommend on good setup.py > style? 
> > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/mackeith%40acm.org > From skip at pobox.com Wed Jan 7 10:43:24 2004 From: skip at pobox.com (Skip Montanaro) Date: Wed Jan 7 10:43:41 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: <3FFC268C.9030702@ABAQUS.com> References: <3FFC268C.9030702@ABAQUS.com> Message-ID: <16380.10524.703424.981390@montanaro.dyndns.org> Andrew> We have noticed that Python optimizes for the native processor Andrew> on which it is built, and this can give problems on a different Andrew> processor. We therefore have to have flags that will force HP Andrew> to compile for some common generic processor. This affects Andrew> everyone who has to distribute Python as pre-built binaries for Andrew> customers, like we do (rather than building it on customers' Andrew> machines). Andrew, Can you submit a patch with these changes? Skip From fumanchu at amor.org Wed Jan 7 10:41:29 2004 From: fumanchu at amor.org (Robert Brewer) Date: Wed Jan 7 10:47:29 2004 Subject: [Python-Dev] Re: PEP 326 now online Message-ID: Bob Ippolito wrote: > I just got a comment from another user suggesting modifying the > min/max.__cmp__ so that they are the actual minimum and maximum. > > An interesting approach, which makes some sense to me. and Guido van Rossum replied: > Not to me. Random reuses like this would make Python into a > mysterious language. More mysterious than these comparisons?
;)

>>> 'a' > 3
True
>>> '3' > 4
True

Robert Brewer MIS Amor Ministries fumanchu@amor.org From Paul.Moore at atosorigin.com Wed Jan 7 10:52:17 2004 From: Paul.Moore at atosorigin.com (Moore, Paul) Date: Wed Jan 7 10:52:22 2004 Subject: [Python-Dev] PEP 324: popen5 - New POSIX process module Message-ID: <16E1010E4581B049ABC51D4975CEDB8803060D97@UKDCX001.uk.int.atosorigin.com> From: Peter Åstrand >> - The poll() method and returncode attribute aren't both needed. I like >> returncode's semantics better since it gives more information and it >> avoids using -1 as a magic number. > Well, as I described in another mail, it's actually a big advantage to > provide both a returncode attribute and an unmodified status value, since > it makes it easy to migrate from the "old" API. Perhaps we'll need to > make some changes to the poll() method wrt Windows, though. The peculiar "combined" nature of a Unix status value has no analogue in Windows. And in the context of an object with multiple attributes, I don't see the need anyway. I'd suggest

    Popen.returncode - Process return code, valid on both Unix and Windows
    Popen.signal - Signal which caused the program to exit, Unix only.

Code which is meant to be portable (to whatever extent that is possible) can use returncode without any problems. Code which is Unix-specific can worry about signal. (I'd suggest signal=0 means no signal occurred, and on Windows, signal is always 0. This means that it's still possible to write portable code which tests signal.) I'm not sure I see "migration from the old API" as an issue. Certainly not a "big advantage". This module should attempt to be "the best possible design", and as such should take the chance to ignore backwards compatibility. If it's good enough, people will be glad to change code to match. If it's *not* good enough, it won't succeed even if it tries to be compatible.
This applies both to compatibility with existing Python modules (popenXXX, os, etc) and to compatibility with C APIs (which is where the overloaded returncode comes from). Let's get a good Pythonic design. Paul. From claird at lairds.com Wed Jan 7 10:53:02 2004 From: claird at lairds.com (Cameron Laird) Date: Wed Jan 7 10:53:10 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: <16380.10524.703424.981390@montanaro.dyndns.org> Message-ID: > From skip@mojam.com Wed Jan 07 10:43:39 2004 > . > . > . > Andrew> We have noticed that Python optimizes for the native processor > Andrew> on which it is built, and this can give problems on a different > Andrew> processor. We therefore have to have flags that will force HP > Andrew> to compile for some common generic processor. This affects > Andrew> everyone who has to distribute Python as pre-built binaries for > Andrew> customers, like we do (rather than building it on customers' > Andrew> machines). > Andrew, > Can you submit a patch with these changes? > Skip You're losing me, Skip. What, to you, would constitute "a patch"? I'll tell you *my* idea: we teach configure a new flag, maybe --compile-generic that means, "rather than optimize for the native processor, force compilation that's trustworthy on all processors likely to host this operating system." No, wait; since autoconf and I dislike each other so much, let's make it a *makefile* option, so that ... No, wait; we can ... My conclusion: I really would benefit from hearing how others see these issues. I know enough about how big a subject "portability" is to want to learn more. It would suit me fine, incidentally, to move more and more into setup.py and away from configure (which word, "configure", I can barely say without prefixing an invective or two). 
From mwh at python.net Wed Jan 7 10:57:40 2004 From: mwh at python.net (Michael Hudson) Date: Wed Jan 7 10:57:42 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: (Cameron Laird's message of "Wed, 07 Jan 2004 10:53:02 -0500") References: Message-ID: <2mr7ybzr7v.fsf@starship.python.net> Cameron Laird writes: > It would suit me fine, incidentally, to move more and more into > setup.py and away from configure (which word, "configure", I can > barely say without prefixing an invective or two). Hey, could be worse -- we could be using automake, or even libtool! unhelpfully-ly y'rs mwh -- If you have too much free time and can't think of a better way to spend it than reading Slashdot, you need a hobby, a job, or both. -- http://www.cs.washington.edu/homes/klee/misc/slashdot.html#faq From fdrake at acm.org Wed Jan 7 11:02:57 2004 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed Jan 7 11:03:15 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: <2mr7ybzr7v.fsf@starship.python.net> References: <2mr7ybzr7v.fsf@starship.python.net> Message-ID: <16380.11697.314484.866132@sftp.fdrake.net> Michael Hudson writes: > Hey, could be worse -- we could be using automake, or even libtool! libtool is evil. Let's not go there. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From pje at telecommunity.com Wed Jan 7 11:31:06 2004 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed Jan 7 11:34:20 2004 Subject: [Python-Dev] Re: PEP 326 now online In-Reply-To: Message-ID: <5.1.0.14.0.20040107112910.03073120@mail.telecommunity.com> At 07:41 AM 1/7/04 -0800, Robert Brewer wrote: >and Guido van Rossum replied: > > Not to me. Random reuses like this would make Python into a > > mysterious language. > >More mysterious than these comparisons? ;) > > >>> 'a' > 3 >True > >>> '3' > 4 >True Code that relies on those comparisons is broken, per the reference manual. 
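[Editorial note: the comparisons Robert Brewer cites relied on Python 2's fallback ordering for mismatched types (roughly: numbers sort before other objects, everything else by type name), which the reference manual explicitly called implementation-defined — hence "broken" code. Python 3 later removed the fallback entirely, so under a modern interpreter the mystery becomes an error:]

```python
# In Python 2, 'a' > 3 and '3' > 4 were both True thanks to the
# arbitrary mixed-type ordering. Python 3 dropped that fallback,
# so the same comparisons now raise TypeError:
try:
    'a' > 3
except TypeError as exc:
    print('refused:', exc)
```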
From skip at pobox.com Wed Jan 7 11:53:07 2004 From: skip at pobox.com (Skip Montanaro) Date: Wed Jan 7 11:56:47 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: References: <16380.10524.703424.981390@montanaro.dyndns.org> Message-ID: <16380.14707.303691.996868@montanaro.dyndns.org> Andrew> We therefore have to have flags that will force HP to compile Andrew> for some common generic processor. Skip> Can you submit a patch with these changes? Cameron> You're losing me, Skip. What, to you, would constitute "a Cameron> patch"? Probably a patch to configure.in. We set lots of platform- or OS-specific compilation flags there. Skip From claird at lairds.com Wed Jan 7 12:01:30 2004 From: claird at lairds.com (Cameron Laird) Date: Wed Jan 7 12:01:36 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: <16380.14707.303691.996868@montanaro.dyndns.org> Message-ID: > From skip@mojam.com Wed Jan 07 11:53:13 2004 > Envelope-to: claird@phaseit.net > From: Skip Montanaro > . > . > . > Andrew> We therefore have to have flags that will force HP to compile > Andrew> for some common generic processor. > Skip> Can you submit a patch with these changes? > Cameron> You're losing me, Skip. What, to you, would constitute "a > Cameron> patch"? > Probably a patch to configure.in. We set lots of platform- or OS-specific > compilation flags there. > Skip Clearly. I want to take this opportunity, though, to understand attitude-intent-design, so that any patches in which I'm involved will not merely imitate what (perhaps mistakenly) exists, but will promote long-term usability and maintainability. So: when Mr. MacKeith describes his situation, do those with the most configure expertise regard it as a platform-specific instance of a cross-platform condition, or ... well, I just think there's a lot more to know to get configure.in right. Uh-oh; I just realized there's another possibility.
Maybe we're all more-or-less equally expert with the Python build process, because all that anyone does is local imitation, without aspiration to a higher design. If so, that's no criticism of any one; it would reflect well on Python's robustness that it hasn't needed more deliberation or structure. I think it's a good time for it, now, however. From peter at cendio.se Wed Jan 7 12:31:59 2004 From: peter at cendio.se (Peter Åstrand) Date: Wed Jan 7 12:32:04 2004 Subject: [Python-Dev] PEP 324: popen5 - New POSIX process module In-Reply-To: <16E1010E4581B049ABC51D4975CEDB8803060D97@UKDCX001.uk.int.atosorigin.com> Message-ID: > > Well, as I described in another mail, it's actually a big advantage to > > provide both a returncode attribute and an unmodified status value, since > > it makes it easy to migrate from the "old" API. Perhaps we'll need to > > make some changes to the poll() method wrt Windows, though. > > The peculiar "combined" nature of a Unix status value has no analogue in > Windows. And in the context of an object with multiple attributes, I don't > see the need anyway. I'd suggest > > Popen.returncode - Process return code, valid on both Unix and Windows > Popen.signal - Signal which caused the program to exit, Unix only. With .returncode being None, if the process died due to a signal? Looks good to me. > Code which is meant to be portable (to whatever extent that is possible) can > use returncode without any problems. Code which is Unix-specific can worry > about signal. (I'd suggest signal=0 means no signal occurred, and on Windows, > signal is always 0. This means that it's still possible to write portable code > which tests signal.) Why not signal=None, if no signal occurred?
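[Editorial note: a sketch of how the returncode/signal split under discussion could be derived from a raw POSIX wait status, using the None conventions Peter and Paul converge on. The function name decode_status is illustrative, not from the PEP; the subprocess module that eventually grew out of PEP 324 instead folded both into one returncode, negative when a signal killed the process.]

```python
import os

def decode_status(status):
    """Split a raw Unix wait status into (returncode, signal).

    Per this thread's convention: signal is None when the process
    exited normally, and returncode is None when a signal killed it.
    """
    if os.WIFSIGNALED(status):
        return None, os.WTERMSIG(status)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status), None
    return None, None  # stopped/continued: not a final state

if __name__ == '__main__' and hasattr(os, 'fork'):
    pid = os.fork()
    if pid == 0:
        os._exit(7)                 # child: normal exit with code 7
    _, status = os.waitpid(pid, 0)
    print(decode_status(status))    # -> (7, None)
```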
/Peter From evan at 4-am.com Wed Jan 7 12:52:03 2004 From: evan at 4-am.com (Evan Simpson) Date: Wed Jan 7 13:00:20 2004 Subject: [Python-Dev] Re: PEP 326 now online In-Reply-To: <20040106224357.13CC.JCARLSON@uci.edu> References: <1772B65D-40C8-11D8-B0D7-000A95686CD8@redivi.com> <200401070606.i07664Y12487@c-24-5-183-134.client.comcast.net> <20040106224357.13CC.JCARLSON@uci.edu> Message-ID: <3FFC4743.2080304@4-am.com> Josiah Carlson wrote: > Using min and max to be the smallest and largest objects, as well as > being functions that can be called to find the smallest or largest value > in a sequence, seems to be intuitive. Only in the sense that "min" and "max" are intuitive *names* for these objects, to English speakers. On the other hand, I wouldn't mind having "min()" and "max()" (currently TypeErrors) return the minimal and maximal objects. Cheers, Evan @ 4-am From martin at v.loewis.de Wed Jan 7 13:00:28 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Wed Jan 7 13:00:33 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: References: Message-ID: <3FFC493C.9020309@v.loewis.de> Cameron Laird wrote: > Clearly. I want to take this opportunity, though, to understand > attitude-intent-design, so that any patches in which I'm involved > will not merely imitate what (perhaps mistakenly) exists, but will > promote long-term usability and maintainability. So: when Mr. > MacKeith describes his situation, do those with the most configure > expertise regard it as a platform-specific instance of a cross- > platform condition, or ... well, I just think there's a lot more > to know to get configure.in right. First, ask what the intent is, then evaluate the intent, then ask for the mechanism to achieve this intent. Intent: Generate binaries that work on all processors Evaluation: Sounds like a good idea Mechanism to achieve: He said "We therefore have to have flags that will force HP to compile for some common generic processor." 
So apparently, he proposes to use compiler flags, which probably means "command line arguments". Then, go in cycles:

Intent: Add new command line options
Evaluation:
  Pro: resulting binary will get more portable
  Con: resulting binary might lose speed - might make this customizable
  Con: Long lists of command line options are unmaintainable - should reduce list to the absolute minimum to achieve intent
  Con: options are clearly compiler-specific - need to determine options depending on platform/compiler
Mechanism to achieve: Must use configure to select options depending on compiler and platform.

There is nothing we can do about the long list of compiler options, except for documenting each one so we later know what option was added for what reason. This leaves the potential loss of speed.

Intent: Make cross-platform executables a user option
Evaluation:
  Pro: User can make a choice.
  Con: User can make a choice, anyway, by explicitly setting all options
  Con: User has to know about option
  Con: Option needs to be documented and maintained.

Given that making an option specifically for that purpose adds no value and gives only problems, this intent should not be achieved.

> Uh-oh; I just realized there's another possibility. Maybe we're
> all more-or-less equally expert with the Python build process,
> because all that anyone does is local imitation, without aspiration
> to a higher design. If so, that's no criticism of any one;
> it would reflect well on Python's robustness that it hasn't needed
> more deliberation or structure.

You are certainly wrong here. Some of us do care about the build process, and want to have it work reliably and maintainably, and improve on reliability and maintainability. I personally have a very clear view of what changes I would like to see for what reason; in particular, I strongly dislike changes that are based on imitation. Instead, each change should be accompanied with a rationale, and with documentation.
For example, adding new command line options to the build process is rarely a good thing, IMO, because users don't want to specify endless lists of command line options. They want the defaults to get it all right the first time. Likewise, I despise attempts to throw away the build process and replace it with something new: anybody attempting should first prove that the new solution is indeed capable of providing advantages over the current process. Regards, Martin From pf_moore at yahoo.co.uk Wed Jan 7 13:18:30 2004 From: pf_moore at yahoo.co.uk (Paul Moore) Date: Wed Jan 7 13:18:23 2004 Subject: [Python-Dev] Re: PEP 324: popen5 - New POSIX process module References: <16E1010E4581B049ABC51D4975CEDB8803060D97@UKDCX001.uk.int.atosorigin.com> Message-ID: Peter Åstrand writes: > With .returncode being None, if the process died due to a signal? Looks > good to me. [...] > Why not signal=None, if no signal occurred? Yes. Both of these sound right. Paul -- This signature intentionally left blank From martin at v.loewis.de Wed Jan 7 15:07:34 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Wed Jan 7 15:12:32 2004 Subject: [Python-Dev] Re: [Python-checkins] python/nondist/sandbox/msi msilib.py, 1.2, 1.3 In-Reply-To: References: <3FFB24EA.2090300@v.loewis.de> Message-ID: <3FFC6706.2030207@v.loewis.de> Thomas Heller wrote: > Where is cabarc.exe on *your* system? I have one in the platform SDK directory, but msilib.py finds the one in E:\Program Files\Microsoft Visual Studio .NET 2003\Common7\Tools\Bin I now looked at the .NET MSI file, and found that CabArc.exe is only installed if the feature "Win32 SDK Tools" is selected, which is a subfeature of VC++. As it is optional, I have now removed the assertion that VC++ provides it, and implemented your proposed change.
Regards, Martin From jcarlson at uci.edu Wed Jan 7 16:11:05 2004 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed Jan 7 16:23:44 2004 Subject: [Python-Dev] Re: PEP 326 now online In-Reply-To: <200401071501.i07F1ZL13165@c-24-5-183-134.client.comcast.net> References: <20040106224357.13CC.JCARLSON@uci.edu> <200401071501.i07F1ZL13165@c-24-5-183-134.client.comcast.net> Message-ID: <20040107104457.0F01.JCARLSON@uci.edu> > If that's "intuitive" to you, our ideas about language design must be > so different that I have low hopes for something useful coming out of > this. I don't know about that; until this thread, I've basically agreed with every direction Python has gone in the 5 years since I started using it, but that is a one-way relationship. Overloading min and max was one option, among quite a few, that would have resulted in min and max being what they do. You seem to hate it, so those of us who desire the functionality will seek other alternatives... > > Are you against the *idea* of a top and bottom value, its location, or > > both? > > So far *all* of the names that have been proposed for the concept > suck, starting with "Some" and "Any", and now various ways to abuse > builtins. > > I am not against the concept of a universal extreme by itself, but > IMO their use is fairly infrequent (except in your mind perhaps, > because you've clearly become obsessed with it), so it should not be a > builtin, nor disguised as a builtin. Just so you know, I'm not obsessed with it. As my wife just pointed out, anything that I believe in, I go "balls-to-the-walls" arguing for regardless of consequences. Sometimes it alienates people; seemingly I have alienated you. I'm sorry if this is the case. > How about you write the Python code to implement proper universal > extremes, put it in a module, and submit that module for inclusion > of the standard library. Then my resistence would be a lot less > (until Raymond Hettinger offers to reimplement it in C :-). 
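[Editorial note: the core trick such a universalextremes.py needs is just a pair of objects whose comparison methods ignore the other operand. A rough sketch — not Josiah's actual implementation, and written with rich comparison methods rather than the __cmp__ protocol of the era:]

```python
class _Extreme(object):
    """An object that compares greater (or less) than everything else."""

    def __init__(self, top):
        self._top = top  # True -> UniversalMaximum, False -> UniversalMinimum

    def __gt__(self, other):
        return self._top and self is not other

    def __lt__(self, other):
        return (not self._top) and self is not other

    def __ge__(self, other):
        return self._top or self is other

    def __le__(self, other):
        return (not self._top) or self is other

    def __eq__(self, other):
        return self is other

    # __eq__ is identity-based, so the default identity hash stays valid.
    __hash__ = object.__hash__

    def __repr__(self):
        return 'UniversalMaximum' if self._top else 'UniversalMinimum'

UniversalMaximum = _Extreme(True)
UniversalMinimum = _Extreme(False)
```

Comparisons work from either side (e.g. 5 < UniversalMaximum) because when int's own comparison returns NotImplemented, Python falls back to the extreme's reflected method.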
> > Hey, universalextremes.py seems a fine name for that module, and > UniversalMaximum and UniversalMinimum seem fine names for the two > objects in that module. Sounds reasonable. Until my changes are available from CVS, you can catch the latest revision here http://josiahcarlson.no-ip.org/pep-0326.html which includes a sample implementation that could be easily placed into a universalextremes.py. Off list, Andrew Lentvorski has suggested operator.Min/Max as a location, which also sounds reasonable, but I don't know others feel about locating it in operators. - Josiah From skip at pobox.com Wed Jan 7 17:13:25 2004 From: skip at pobox.com (Skip Montanaro) Date: Wed Jan 7 17:13:39 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: References: <16380.14707.303691.996868@montanaro.dyndns.org> Message-ID: <16380.33925.790310.892064@montanaro.dyndns.org> Cameron> Clearly. I want to take this opportunity, though, to Cameron> understand attitude-intent-design, so that any patches in which Cameron> I'm involved will not merely imitate what (perhaps mistakenly) Cameron> exists, but will promote long-term usability and Cameron> maintainability. Cameron, We're talking about configure and you're talking about usability and maintainability. What rock have you been hiding under? On a more serious note, if there is a way to break up configure.in into separate scripts (I don't care if configure is generated as a single script) I'd be happy to take a stab at it. That would make it easier for potential editors to restrict their damage^H^H^H^H^H^Henhancements to a smaller chunk of the overall configure script. If I can get a list of options currently generated for the HP compiler (not GCC, right?) and what would produce a processor-independent executable, I'll take a crack at tweaking configure.in and adding a --native-hpux flag or something similar. 
I think the basic problem here is that while many compilers can generate processor-dependent code, the default is usually to create processor-independent code. It appears HP flipped the switch on that, so we have to do a little more work to ensure processor independence. Skip From zack at codesourcery.com Wed Jan 7 17:39:07 2004 From: zack at codesourcery.com (Zack Weinberg) Date: Wed Jan 7 17:39:27 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: (Cameron Laird's message of "Wed, 07 Jan 2004 12:01:30 -0500") References: Message-ID: <87oetfgz90.fsf@egil.codesourcery.com> I'm piggybacking on this thread to point out that there are some pretty drastic simplifications possible in configure.in. To give examples: it contains its own logic for determining a 'canonical name' for the operating system, rather than use AC_CANONICAL_HOST; almost all the compile checks are using lower-level macros than they ought to; there are several places where AC_DEFUN should be used to encapsulate repetitive code, such as the custom checks for functions like ctermid_r; there are tests that are completely unnecessary, such as "AC_CHECK_SIZEOF (char, 1)" [sizeof(char)==1 is guaranteed by the C standard]. Another cleanup that would be nice is to get rid of Modules/Setup and related; as I understand it, this is only around because the top level setup.py hasn't been finished yet. (And for a pipe dream, could we have a way to set the disabled_module_list from the configure command line?) zw
To paraphrase Tim, Guido, MvL and many others: "Patches are welcome". You knew that would be the canned reply, right? The Python configure script has been around for a long time. I suspect much of the cruft that's there is simply because it hasn't kept up with newer versions of autoconf. The sizeof check for the char type probably predates the decision to require an ANSI C compiler. I can probably elide that without breaking too much. Zack> Another cleanup that would be nice is to get rid of Modules/Setup Zack> and related; as I understand it, this is only around because the Zack> top level setup.py hasn't been finished yet. (And for a pipe Zack> dream, could we have a way to set the disabled_module_list from Zack> the configure command line?) Nothing's really ever finished, least of all distutils... I believe Modules/Setup is still the best (only?) way to create a statically linked interpreter (all extension modules builtin) or add your own extension modules without fiddling setup.py. Skip From claird at lairds.com Wed Jan 7 18:09:57 2004 From: claird at lairds.com (Cameron Laird) Date: Wed Jan 7 18:10:08 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: <16380.36354.394842.220080@montanaro.dyndns.org> Message-ID: > From skip@mojam.com Wed Jan 07 17:54:01 2004 > Envelope-to: claird@phaseit.net > From: Skip Montanaro > . > . > . > To paraphrase Tim, Guido, MvL and many others: "Patches are welcome". You > knew that would be the canned reply, right? > The Python configure script has been around for a long time. I suspect much > of the cruft that's there is simply because it hasn't kept up with newer > versions of autoconf. The sizeof check for the char type probably predates > the decision to require an ANSI C compiler. I can probably elide that > without breaking too much. 
> Zack> Another cleanup that would be nice is to get rid of Modules/Setup > Zack> and related; as I understand it, this is only around because the > Zack> top level setup.py hasn't been finished yet. (And for a pipe > Zack> dream, could we have a way to set the disabled_module_list from > Zack> the configure command line?) > Nothing's really ever finished, least of all distutils... I believe > Modules/Setup is still the best (only?) way to create a statically linked > interpreter (all extension modules builtin) or add your own extension > modules without fiddling setup.py. > Skip I'm the kind of developer who thinks it very desirable to embed in configure.in such lines as # January 2004: Skip Montanaro explains that he # "believe[s] Modules/Setup is still the best ... # ... In the absence of objections, I'm likely to start submitting patches to the effect later this month. From skip at pobox.com Wed Jan 7 18:15:19 2004 From: skip at pobox.com (Skip Montanaro) Date: Wed Jan 7 18:15:27 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: References: <16380.36354.394842.220080@montanaro.dyndns.org> Message-ID: <16380.37639.304353.850952@montanaro.dyndns.org> Cameron> I'm the kind of developer who thinks it very desirable Cameron> to embed in configure.in such lines as Cameron> # January 2004: Skip Montanaro explains that he Cameron> # "believe[s] Modules/Setup is still the best ... Cameron> # ... Cameron> In the absence of objections, I'm likely to start submitting Cameron> patches to the effect later this month. Sure, hang me out to dry. The wind chill here in Chicago is only what? -10? I should be good and stiff by morning... 
:-) Skip From skip at pobox.com Wed Jan 7 18:27:53 2004 From: skip at pobox.com (Skip Montanaro) Date: Wed Jan 7 18:28:01 2004 Subject: [Python-Dev] pyconfig.h macro usage Message-ID: <16380.38393.450171.642972@montanaro.dyndns.org> After Zack's message about configure.in cruft, I zapped the SIZEOF_CHAR macro and noticed it was never used in any .c files. So that got me to thinking (dangerous, I know)... What about:

    for macro in `egrep '#undef [_A-Z]' pyconfig.h.in \
                  | awk '{print $2}' \
                  | sort`
    do
        echo -n "$macro: "
        find . -name '*.[ch]' \
        | egrep -v pyconfig \
        | xargs egrep $macro \
        | wc -l
    done \
    | sort -n -k 2,2

Turns out there are quite a few macros defined in pyconfig.h that are unused:

    HAVE_DUP2            SIZEOF_FLOAT
    HAVE_GETPID          SIZEOF_UINTPTR_T
    HAVE_LIBDL           SIZEOF_WCHAR_T
    HAVE_LIBDLD          SYS_SELECT_WITH_SYS_TIME
    HAVE_LIBIEEE         TM_IN_SYS_TIME
    HAVE_LIBRESOLV       WITH_DL_DLD
    HAVE_PTHREAD_INIT    WITH_DYLD
    HAVE_STDARG_H        WITH_LIBINTL
    HAVE_STRDUP          _FILE_OFFSET_BITS
    HAVE_STRPTIME        _LARGEFILE_SOURCE
    HAVE_ST_BLOCKS       _MINIX
    HAVE_SYS_SOCKET_H    _NETBSD_SOURCE
    HAVE_TERMIOS_H       _OSF_SOURCE
    HAVE_TIMEGM          _POSIX_1_SOURCE
    HAVE_TM_ZONE         _POSIX_C_SOURCE
    HAVE_TRUNCATE        _POSIX_SOURCE
    HAVE_UCS4_TCL        _REENTRANT
    RETSIGTYPE           __BSD_VISIBLE
    SIZEOF_DOUBLE        __EXTENSIONS__

Any votes for getting rid of them? Skip From martin at v.loewis.de Wed Jan 7 18:44:20 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Wed Jan 7 18:44:38 2004 Subject: [Python-Dev] pyconfig.h macro usage In-Reply-To: <16380.38393.450171.642972@montanaro.dyndns.org> References: <16380.38393.450171.642972@montanaro.dyndns.org> Message-ID: <3FFC99D4.1070803@v.loewis.de> Skip Montanaro wrote: > So that got me to thinking (dangerous, I know)... What about: I did the same thing when converting to autoconf 2.5x, and zapped all which I was certain had no use (at that time).
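Skip's shell pipeline can be sketched in Python as well. The sketch below is illustrative only (the function name and arguments are invented, not part of the thread): it collects every macro #undef'd in pyconfig.h.in and reports those that no .c or .h file in the tree mentions.

```python
# Rough Python sketch of the shell pipeline above; illustrative only.
import os
import re

def unused_macros(pyconfig_path, source_root):
    """Return macros #undef'd in pyconfig.h.in that no .c/.h file mentions."""
    undef = re.compile(r'#undef\s+([_A-Z][A-Za-z0-9_]*)')
    with open(pyconfig_path) as f:
        macros = set(undef.findall(f.read()))
    used = set()
    for dirpath, _, filenames in os.walk(source_root):
        for name in filenames:
            # Skip pyconfig itself, just like the `egrep -v pyconfig` above.
            if not name.endswith(('.c', '.h')) or 'pyconfig' in name:
                continue
            with open(os.path.join(dirpath, name), errors='replace') as f:
                text = f.read()
            used.update(m for m in macros if m in text)
    return sorted(macros - used)
```

Like the shell version, this is only a textual scan: a macro mentioned in a comment still counts as "used", and macros consumed only by system headers (Martin's point below about __EXTENSIONS__ and friends) will show up as false positives.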
> Turns out there are quite a few macros defined in pyconfig.h that are > unused: > > HAVE_DUP2 SIZEOF_FLOAT > HAVE_GETPID SIZEOF_UINTPTR_T > HAVE_LIBDL SIZEOF_WCHAR_T > HAVE_LIBDLD SYS_SELECT_WITH_SYS_TIME > HAVE_LIBIEEE TM_IN_SYS_TIME > HAVE_LIBRESOLV WITH_DL_DLD > HAVE_PTHREAD_INIT WITH_DYLD > HAVE_STDARG_H WITH_LIBINTL > HAVE_STRDUP _FILE_OFFSET_BITS > HAVE_STRPTIME _LARGEFILE_SOURCE > HAVE_ST_BLOCKS _MINIX > HAVE_SYS_SOCKET_H _NETBSD_SOURCE > HAVE_TERMIOS_H _OSF_SOURCE > HAVE_TIMEGM _POSIX_1_SOURCE > HAVE_TM_ZONE _POSIX_C_SOURCE > HAVE_TRUNCATE _POSIX_SOURCE > HAVE_UCS4_TCL _REENTRANT > RETSIGTYPE __BSD_VISIBLE > SIZEOF_DOUBLE __EXTENSIONS__ > > Any votes for getting rid of them? __EXTENSIONS__, *_SOURCE, _REENTRANT, _FILE_OFFSET_BITS are all used in system headers - so they need to be defined on the systems where they need to be defined... HAVE_UCS_TCL should probably be used. A number of the macros are not used because they are associated with replacement implementations, such as dup2 (all AC_REPLACE_FUNCS I think). We (probably) cannot remove the test whether a replacement function must be defined; that defines the macro as a side effect. _MINIX can be removed as part of the PEP 11 activities; if you remove that, please implement all scheduled removals. The other ones would need a more detailed investigation. Regards, Martin From martin at v.loewis.de Wed Jan 7 18:48:20 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Wed Jan 7 18:48:31 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: <87oetfgz90.fsf@egil.codesourcery.com> References: <87oetfgz90.fsf@egil.codesourcery.com> Message-ID: <3FFC9AC4.5080301@v.loewis.de> Zack Weinberg wrote: > I'm piggybacking on this thread to point out that there are some > pretty drastic simplifications possible in configure.in. 
To give > examples: it contains its own logic for determining a 'canonical name' > for the operating system, rather than use AC_CANONICAL_HOST; almost > all the compile checks are using lower-level macros than they ought > to; there are several places where AC_DEFUN should be used to > encapsulate repetitive code, such as the custom checks for functions > like ctermid_r; there are tests that are completely unnecessary, such > as "AC_CHECK_SIZEOF (char, 1)" [sizeof(char)==1 is guarantted by the C > standard]. As Skip says: contributions are welcome. Using a different computation of the canonical name may not be possible, though, without performing major changes e.g. to the Lib/plat-* subdirectories. > Another cleanup that would be nice is to get rid of Modules/Setup and > related; as I understand it, this is only around because the top level > setup.py hasn't been finished yet. (And for a pipe dream, could we > have a way to set the disabled_module_list from the configure command line?) Again, as Skip explains: setup.py will never be able to do what Modules/Setup does. So Modules/Setup must stay, and may survive setup.py. Regards, Martin From zack at codesourcery.com Wed Jan 7 19:50:07 2004 From: zack at codesourcery.com (Zack Weinberg) Date: Wed Jan 7 19:50:28 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: <16380.36354.394842.220080@montanaro.dyndns.org> (Skip Montanaro's message of "Wed, 7 Jan 2004 16:53:54 -0600") References: <87oetfgz90.fsf@egil.codesourcery.com> <16380.36354.394842.220080@montanaro.dyndns.org> Message-ID: <87d69vgt6o.fsf@egil.codesourcery.com> Skip Montanaro writes: > Zack> I'm piggybacking on this thread to point out that there are some > Zack> pretty drastic simplifications possible in configure.in. > > To paraphrase Tim, Guido, MvL and many others: "Patches are welcome". You > knew that would be the canned reply, right? Yeah, well, lack of time. Sorry. 
zw From skip at pobox.com Wed Jan 7 20:53:23 2004 From: skip at pobox.com (Skip Montanaro) Date: Wed Jan 7 20:53:31 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: <87d69vgt6o.fsf@egil.codesourcery.com> References: <87oetfgz90.fsf@egil.codesourcery.com> <16380.36354.394842.220080@montanaro.dyndns.org> <87d69vgt6o.fsf@egil.codesourcery.com> Message-ID: <16380.47123.132690.918767@montanaro.dyndns.org> Zack> Skip Montanaro writes: Zack> I'm piggybacking on this thread to point out that there are some Zack> pretty drastic simplifications possible in configure.in. >> >> To paraphrase Tim, Guido, MvL and many others: "Patches are welcome". You >> knew that would be the canned reply, right? Zack> Yeah, well, lack of time. Sorry. Understood. At least by raising the issue, you've brought it to the surface. I for one would have never thought to look at that stuff. Skip From barry at python.org Thu Jan 8 01:04:19 2004 From: barry at python.org (Barry Warsaw) Date: Thu Jan 8 01:04:27 2004 Subject: [Python-Dev] Followup to Head broken when --enable-unicode=ucs4 is used Message-ID: <1073541859.9106.19.camel@anthem> http://mail.python.org/pipermail/python-dev/2003-December/041488.html I just had this happen again, and I figured out what I did wrong. I actually typed "--enable-unicode=usc4". Must have been all that recent US college football on the brain! The typo in the option wreaks havoc, but fixing that to ucs4 naturally works correctly. I'm not sure there's anything worth changing, but at least now it's in the archives. -Barry From miki.tebeka at zoran.com Thu Jan 8 04:08:56 2004 From: miki.tebeka at zoran.com (Miki Tebeka) Date: Thu Jan 8 04:12:44 2004 Subject: [Python-Dev] argmax? Message-ID: <20040108090856.GK824@zoran.com> Hello All, Do you think there is a place for an "argmax" function (which I use a lot) in the standard library? 
I'm thinking along the lines of:

    def argmax(items, func, cmp=cmp):
        '''argmax(items, func) -> max_item, max_value'''
        it = iter(items)
        # Initial values
        try:
            item = it.next()
        except StopIteration:
            raise ValueError("can't run over empty sequence")
        val = func(item)

        for i in it:
            v = func(i)
            if cmp(v, val) > 0:
                item = i
                val = v
        return item, val

    def argmin(items, func, cmp=cmp):
        '''argmin(items, func) -> min_item, min_value'''
        return argmax(items, func, lambda x,y : -cmp(x,y))

Thanks. -- ----------------------------------------------------------- Smile, damn it, smile. lambda msg: { "name" : "Miki Tebeka", "email" : "miki.tebeka@zoran.com", "url" : "http://www.cs.bgu.ac.il/~tebeka", "quote" : "The only difference between children and adults " \ "is the price of the toys" }[msg] From jepler at unpythonic.net Thu Jan 8 09:00:22 2004 From: jepler at unpythonic.net (Jeff Epler) Date: Thu Jan 8 09:00:35 2004 Subject: [Python-Dev] pyconfig.h macro usage In-Reply-To: <3FFC99D4.1070803@v.loewis.de> References: <16380.38393.450171.642972@montanaro.dyndns.org> <3FFC99D4.1070803@v.loewis.de> Message-ID: <20040108140022.GA14102@unpythonic.net> On Thu, Jan 08, 2004 at 12:44:20AM +0100, Martin v. Loewis wrote: > HAVE_UCS_TCL should probably be used. the ucs tcl test is only used within configure to choose a default for --enable-unicode. Probably its related AC_DEFINE() can be removed. Jeff From fdrake at acm.org Thu Jan 8 09:44:37 2004 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Thu Jan 8 09:44:58 2004 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Doc/lib libstdtypes.tex, 1.149, 1.150 In-Reply-To: References: Message-ID: <16381.27861.673436.371082@sftp.fdrake.net> rhettinger@users.sourceforge.net writes: > Modified Files: > libstdtypes.tex > Log Message: > SF bug #872461: list.extend() described as experimental Thanks Raymond! -Fred -- Fred L. Drake, Jr.
PythonLabs at Zope Corporation From jcarlson at uci.edu Thu Jan 8 12:59:14 2004 From: jcarlson at uci.edu (Josiah Carlson) Date: Thu Jan 8 13:02:17 2004 Subject: [Python-Dev] argmax? In-Reply-To: <20040108090856.GK824@zoran.com> References: <20040108090856.GK824@zoran.com> Message-ID: <20040108092543.1363.JCARLSON@uci.edu> Miki, That sounds quite a bit like:

    from itertools import imap, izip
    argmax = lambda funct, items: max(izip(imap(funct, items), items))
    argmin = lambda funct, items: min(izip(imap(funct, items), items))

Which breaks ties on vals using a comparison with items (I don't know if this is desirable to you). Running a quick test on large sequences (range(100000)) with a relatively simple function (lambda a:a*a) produces the following timings on my machine (best of 7):

    argmax:   5.953 seconds (with std. dev. of .01455 seconds)
    izipimap: 5.844 seconds (with std. dev. of .03009 seconds)
    zipmap:   6.407 seconds (with std. dev. of .072356 seconds)

Seems to be a small win for itertools, a close second for your original argmax, and a harrowing loss for map and zip on large sequences. I don't know how anyone would feel about adding it to the standard library, - Josiah > Do you think there is a place for an "argmax" function (which I use a lot) > in the standard library?
> > I'm thinking in the lines of: > > def argmax(items, func, cmp=cmp): > '''argmax(items, func) -> max_item, max_value''' > it = iter(items) > # Initial values > try: > item = it.next() > except StopIteration: > raise ValueError("can't run over empty sequence") > val = func(item) > > for i in it: > v = func(i) > if cmp(v, val) == 1: > item = i > val = v > return item, val > > def argmin(items, func, cmp=cmp): > '''argmin(items, func) -> min_item, min_value''' > return argmax(items, func, lambda x,y : -cmp(x,y)) From jeremy at alum.mit.edu Thu Jan 8 15:47:06 2004 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Thu Jan 8 15:51:42 2004 Subject: [Python-Dev] upgrading to zlib 1.2.1 for Windows Message-ID: <1073594825.6341.180.camel@localhost.localdomain> I noticed that a new version of zlib was released last year. 1.2.0 came out in March through 1.2.1 in late November. There are a lot of API changes and new DLL support, but the new code seems to work fine with a Python build. That is, it compiles with minor changes to the build and the tests all pass. Even if we don't change our code at all, inflate is about 20% faster and crc32 is about 50% faster. Shall we upgrade the Windows build to use this new version? I have the changes made locally, but don't want to commit until people have had a chance to grab the new source. Jeremy From tim.one at comcast.net Thu Jan 8 15:55:38 2004 From: tim.one at comcast.net (Tim Peters) Date: Thu Jan 8 15:57:23 2004 Subject: [Python-Dev] upgrading to zlib 1.2.1 for Windows In-Reply-To: <1073594825.6341.180.camel@localhost.localdomain> Message-ID: [Jeremy] > I noticed that a new version of zlib was released last year. 1.2.0 > came out in March through 1.2.1 in late November. > ... > Shall we upgrade the Windows build to use this new version? I have > the changes made locally, but don't want to commit until people have > had a chance to grab the new source. Which Windows build(s)? If you mean the 2.4 VC7 Windows build, sure. 
The VC6 build appears to be officially unsupported now, so unsure about that one. If there's another kind of "security fix" gimmick in 1.2.0 or 1.2.1, then a backport to 2.3 maint would also be appropriate. From jeremy at alum.mit.edu Thu Jan 8 15:52:36 2004 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Thu Jan 8 16:14:28 2004 Subject: [Python-Dev] upgrading to zlib 1.2.1 for Windows In-Reply-To: References: Message-ID: <1073595156.6341.182.camel@localhost.localdomain> On Thu, 2004-01-08 at 15:55, Tim Peters wrote: > [Jeremy] > > I noticed that a new version of zlib was released last year. 1.2.0 > > came out in March through 1.2.1 in late November. > > ... > > Shall we upgrade the Windows build to use this new version? I have > > the changes made locally, but don't want to commit until people have > > had a chance to grab the new source. > > Which Windows build(s)? If you mean the 2.4 VC7 Windows build, sure. The > VC6 build appears to be officially unsupported now, so unsure about that > one. If there's another kind of "security fix" gimmick in 1.2.0 or 1.2.1, > then a backport to 2.3 maint would also be appropriate. I meant Python 2.4. There isn't any mention of security fixes in the 1.2.x line, so I assume 1.1.4 is still safe to use. Jeremy From aleaxit at yahoo.com Thu Jan 8 15:50:00 2004 From: aleaxit at yahoo.com (Alex Martelli) Date: Thu Jan 8 16:15:38 2004 Subject: [Python-Dev] argmax? In-Reply-To: <20040108092543.1363.JCARLSON@uci.edu> References: <20040108090856.GK824@zoran.com> <20040108092543.1363.JCARLSON@uci.edu> Message-ID: <200401082150.00037.aleaxit@yahoo.com> On Thursday 08 January 2004 06:59 pm, Josiah Carlson wrote: ... > I don't know how anyone would feel about adding it to the standard > library, Personally, I have thought for quite a while that max and min should grow a key= optional keyword argument, just like the sort method of list objects now has in 2.4. 
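The key= argument Alex describes did later land in the language (min() and max() accept key= from Python 2.5 on). As an illustrative sketch of the usage he proposes, written against those later Pythons rather than the 2.4-era code in this thread:

```python
# Sketch of the key= usage Alex proposes; illustrative only, for
# Pythons where min()/max() accept a key= argument (2.5 and later).
words = ['python', 'zlib', 'configure', 'os']

longest = max(words, key=len)    # the item x with maximal len(x)
shortest = min(words, key=len)   # the item x with minimal len(x)

# The (item, value) pair an argmax() would return is easy to recover:
pair = (longest, len(longest))
```

This covers the common case Alex mentions, where only the extremal item is wanted; when both the item and its value are needed for every element, the explicit decorate-sort-undecorate style he refers to still applies.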
Admittedly not quite equivalent to the argmax function proposed (or to your itertools-based solution), but IMHO preferable -- because somelist.sort will now allow just that shortcut to the typical "D-S-U" idiom (if one wants the values as well as the underlying items, then an explicit D-S-U [or however you want to name it when applied to min and max...] can of course still perfectly well be performed, for somelist.sort as well as for min and max). When one has grown used to somelist.sort(key=foobar), having e.g. min(somelist, key=foobar) [for those cases in which one only needs to know the item x of somelist which has minimal foobar(x), rather than the whole somelist-ordered-by-foobar()] is just SUCH natural usage, IMHO...! Alex From theller at python.net Thu Jan 8 16:21:08 2004 From: theller at python.net (Thomas Heller) Date: Thu Jan 8 16:21:16 2004 Subject: [Python-Dev] upgrading to zlib 1.2.1 for Windows In-Reply-To: <1073594825.6341.180.camel@localhost.localdomain> (Jeremy Hylton's message of "Thu, 08 Jan 2004 15:47:06 -0500") References: <1073594825.6341.180.camel@localhost.localdomain> Message-ID: Jeremy Hylton writes: > I noticed that a new version of zlib was released last year. 1.2.0 came > out in March through 1.2.1 in late November. There are a lot of API > changes and new DLL support, but the new code seems to work fine with a > Python build. That is, it compiles with minor changes to the build and > the tests all pass. Even if we don't change our code at all, inflate is > about 20% faster and crc32 is about 50% faster. Sounds good. Note that we're not using the DLL. > Shall we upgrade the Windows build to use this new version? I have the > changes made locally, but don't want to commit until people have had a > chance to grab the new source. > "Tim Peters" writes: > [Jeremy] >> I noticed that a new version of zlib was released last year. 1.2.0 >> came out in March through 1.2.1 in late November. >> ... 
>> Shall we upgrade the Windows build to use this new version? I have >> the changes made locally, but don't want to commit until people have >> had a chance to grab the new source. > > Which Windows build(s)? If you mean the 2.4 VC7 Windows build, sure. The > VC6 build appears to be officially unsupported now, so unsure about that > one. If there's another kind of "security fix" gimmick in 1.2.0 or 1.2.1, > then a backport to 2.3 maint would also be appropriate. I've skimmed through the changelog, and noticed no such security fix. So I suggest changing it for 2.4, but leaving 2.3 alone. Thomas From barry at barrys-emacs.org Thu Jan 8 16:48:52 2004 From: barry at barrys-emacs.org (Barry Scott) Date: Thu Jan 8 16:49:03 2004 Subject: [Python-Dev] Re: PEP 324: popen5 - New POSIX process module In-Reply-To: <1073315470.3ff97e8e18f9f@mcherm.com> References: <1073315470.3ff97e8e18f9f@mcherm.com> Message-ID: <6.0.1.1.2.20040108211331.02291eb0@torment.chelsea.private> At 05-01-2004 15:11, Michael Chermside wrote: >So I guess I am forced to put up with (2), and the (hopefully minor) >headaches of copying Mark's code UNLESS we can somehow ensure that >win32all is available on all sane windows installs. If, for instance, >it were released with the python.org releases for the Windows platform, >(other packagers already include it), then I think we could dispense >with the annoying need to duplicate code. A number of people seem to be under the impression that win32all contains a lot of useful value-added code. It does not. It is a very thin SWIG wrapper over the windows API that it exposes. There is nothing that you will need to copy from it to implement popen5 in C. Just as you would not copy code from the python posix module to implement popen5 in C on Unix. The value of using win32all would be that you could implement popen5 in python rather than C. This might be a good way to develop and test the win32 popen5 algorithms before conversion to C.
Developing the algorithm is going to be the hard part. The issue is a packaging one, not a code-cloning one. 1) include win32all in the standard windows python and implement popen5 in python 2) implement popen5 in C on windows. As I understand it (2) is the suggested way forwards. Barry From Andrew.MacKeith at ABAQUS.com Thu Jan 8 16:56:46 2004 From: Andrew.MacKeith at ABAQUS.com (Andrew MacKeith) Date: Thu Jan 8 16:56:45 2004 Subject: [Python-Dev] HP-UX clean-up References: <16380.10524.703424.981390@montanaro.dyndns.org> <16380.14707.303691.996868@montanaro.dyndns.org> Message-ID: <3FFDD21E.6070002@ABAQUS.com> Skip Montanaro wrote: > Andrew> We therefore have to have flags that will force HP to compile > Andrew> for some common generic processor. > > Skip> Can you submit a patch with these changes? > > Cameron> You're losing me, Skip. What, to you, would constitute "a > Cameron> patch"? > > Probably a patch to configure.in. We set lots of platform- or OS-specific > compilation flags there. > > Skip > My lack of experience with configure makes it hard to submit a patch, so I'll ask python-dev first. If that's too much I'll have a go, but it will take a while. These are the settings that we use:

    HP/RISC 32-bit:
        c89 -c +DA2.0 +DS2.0 +Z -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -D_REENTRANT
        aCC -c +DA2.0 +DS2.0 +Z -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -D_REENTRANT
        aCC +DA2.0 +DS2.0 +Z -Wl,+s -Wl,+k ... -lpthread

    HP/RISC 64-bit:
        c89 -c +DA2.0W +DS2.0a +Z -D_REENTRANT
        aCC -c +DA2.0W +DS2.0a +Z -D_REENTRANT
        aCC +DA2.0W +DS2.0a +Z -Wl,+s -Wl,+k ... -lpthread

    HP/ITANIUM 64-bit:
        cc -c +DD64 +DSitanium2 -Ae -mt -D_POSIX_C_SOURCE=199506L
        aCC -c +DD64 +DSitanium2 -AP -mt
        aCC +DD64 +DSitanium2 -AP -mt -Wl,+k -Wl,+s

NOTE: the multithreading flags (-D_REENTRANT -mt) should only be applied to the files where there is actually threaded code. When applied to all files indiscriminately, they cause overall performance degradation.
AFAIK the files with threaded code are: thread.c threadmodule.c We also found that if the build is linked without specifying the thread library, the build appears fine, but threading doesn't happen at run-time. Andrew MacKeith > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/mackeith%40acm.org > From guido at python.org Thu Jan 8 16:58:25 2004 From: guido at python.org (Guido van Rossum) Date: Thu Jan 8 16:58:31 2004 Subject: [Python-Dev] Re: PEP 324: popen5 - New POSIX process module In-Reply-To: Your message of "Thu, 08 Jan 2004 21:48:52 GMT." <6.0.1.1.2.20040108211331.02291eb0@torment.chelsea.private> References: <1073315470.3ff97e8e18f9f@mcherm.com> <6.0.1.1.2.20040108211331.02291eb0@torment.chelsea.private> Message-ID: <200401082158.i08LwPS21812@c-24-5-183-134.client.comcast.net> > At 05-01-2004 15:11, Michael Chermside wrote: > >So I guess I am forced to put up with (2), and the (hopefully minor) > >headaches of copying Mark's code UNLESS we can somehow ensure that > >win32all is available on all sane windows installs. If, for instance, > >it were released with the python.org releases for the Windows platform, > >(other packagers already include it), then I think we could dispense > >with the annoying need to duplicate code. > > A number of people seem to under the impression that win32all contains > a lots of useful value added code. It does not. It is a very thin SWIG > wrapper over the windows API that it exposes. There is nothing that you > will need to copy from it to implement popen5 in C. Just as you would not > copy code from the python posix module to implement popen5 in C on Unix. > > The value of using win32all would be that you could implement popen5 in > python rather then C. This might be a good way to develop and test the > win32 popen5 algorithms before conversion to C. 
Developing the algorithm > is going to be the hard part. > > The issue is a packaging one not a code cloning one. > > 1) include win32all in the standard windows python and implement popen5 in > python > 2) implement popen5 in C on windows. > > As I understand it (2) is the suggested way forwards. > > Barry Hm. I'd much rather see a pure Python implementation. This doesn't seem to be something that's particularly performance critical. Starting a process takes so many OS resources that the overhead of doing it in Python is negligible. I expect only a very small number of win32all entry points to be needed in order to implement popen5, and those we can implement in a separate C module (just as we did for _winreg, which roughly duplicates the registry access API from win32all). --Guido van Rossum (home page: http://www.python.org/~guido/) From barry at barrys-emacs.org Thu Jan 8 17:06:07 2004 From: barry at barrys-emacs.org (Barry Scott) Date: Thu Jan 8 17:07:04 2004 Subject: [Python-Dev] The os module, unix and win32 Message-ID: <6.0.1.1.2.20040108214934.0221ac98@torment.chelsea.private> Why does popen5 need a C implementation on windows whereas on unix it can be implemented in python? Answer: because the unix API is in standard python whereas the windows one is not. win32all covers a huge number of API functions, more than would be sane to add to os. But would there be any mileage in adding enough from win32all to allow problems like popen5 to be implemented? There is already the _reg module that has some win32 functions in it on the standard install. Barry From guido at python.org Thu Jan 8 17:11:16 2004 From: guido at python.org (Guido van Rossum) Date: Thu Jan 8 17:11:22 2004 Subject: [Python-Dev] The os module, unix and win32 In-Reply-To: Your message of "Thu, 08 Jan 2004 22:06:07 GMT."
<6.0.1.1.2.20040108214934.0221ac98@torment.chelsea.private> References: <6.0.1.1.2.20040108214934.0221ac98@torment.chelsea.private> Message-ID: <200401082211.i08MBGx21879@c-24-5-183-134.client.comcast.net> > Why does popen5 needs a C implementation on windows where as > on unix it can be implemented in python? > > Answer: because the unix API is in standard python where as > the windows one is not. > > win32all covers a huge number of API functions, more then would > be sane to add to os. But would there be any mileage in added > enough from win32all to allow problems like popen5 to be > implemented? > > There is already the _reg module that has some win32 functions in > it on the standard install. (Not sure if this was in response to my "let's do it in Python" post.) How many APIs would have to be copied from win32all to enable implementing popen5 in Python? --Guido van Rossum (home page: http://www.python.org/~guido/) From dave at pythonapocrypha.com Thu Jan 8 17:22:28 2004 From: dave at pythonapocrypha.com (Dave Brueck) Date: Thu Jan 8 17:22:33 2004 Subject: [Python-Dev] The os module, unix and win32 References: <6.0.1.1.2.20040108214934.0221ac98@torment.chelsea.private> Message-ID: <048a01c3d635$e6e08370$6401fea9@YODA> Barry Scott wrote: > Why does popen5 needs a C implementation on windows where as > on unix it can be implemented in python? > > Answer: because the unix API is in standard python where as > the windows one is not. > > win32all covers a huge number of API functions, more then would > be sane to add to os. But would there be any mileage in added > enough from win32all to allow problems like popen5 to be > implemented? > > There is already the _reg module that has some win32 functions in > it on the standard install. Another possibility: we'd get a lot more mileage out of simply adding ctypes to the standard install. 
That would solve the immediate popen problem, let us get rid of _winreg entirely (and replace it with a pure Python version), and make future Win32 problems easier to solve. -Dave From guido at python.org Thu Jan 8 17:27:21 2004 From: guido at python.org (Guido van Rossum) Date: Thu Jan 8 17:27:28 2004 Subject: [Python-Dev] The os module, unix and win32 In-Reply-To: Your message of "Thu, 08 Jan 2004 15:22:28 MST." <048a01c3d635$e6e08370$6401fea9@YODA> References: <6.0.1.1.2.20040108214934.0221ac98@torment.chelsea.private> <048a01c3d635$e6e08370$6401fea9@YODA> Message-ID: <200401082227.i08MRL621946@c-24-5-183-134.client.comcast.net> > Another possibility: we'd get a lot more mileage out of simply > adding ctypes to the standard install. That would solve the > immediate popen problem, let us get rid of _winreg entirely (and > replace it with a pure Python version), and make future Win32 > problems easier to solve. I don't know that a Python version of _winreg using ctypes would be preferable over a C version. I'd expect it to be slower, less readable than the C version, and more susceptible to the possibility of causing segfaults. (As for speed, I actually have a windows registry application where speed of access is important.) That doesn't mean I don't think ctypes is a good idea -- just that I don't think applying it to _winreg would be useful. --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Thu Jan 8 17:29:54 2004 From: skip at pobox.com (Skip Montanaro) Date: Thu Jan 8 17:30:18 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: <3FFDD21E.6070002@ABAQUS.com> References: <16380.10524.703424.981390@montanaro.dyndns.org> <16380.14707.303691.996868@montanaro.dyndns.org> <3FFDD21E.6070002@ABAQUS.com> Message-ID: <16381.55778.208434.489763@montanaro.dyndns.org> Andrew> My lack of experience with configure make it hard to submit a Andrew> patch, so I'll ask python-dev first. That's okay. What you gave is a good start. 
I have a couple questions though:

1. How can I distinguish the three processors you described (HP/RISC 32-bit, HP/RISC 64-bit, HP/ITANIUM 64-bit)? I suspect uname might be the culprit. If so, can you provide uname output for the three processors?

2. Is there no way to actually create a truly processor-independent executable (perhaps by creating a "fat" binary)?

Thx, Skip From astrand at lysator.liu.se Thu Jan 8 17:31:53 2004 From: astrand at lysator.liu.se (Peter Astrand) Date: Thu Jan 8 17:31:57 2004 Subject: [Python-Dev] The os module, unix and win32 In-Reply-To: <200401082211.i08MBGx21879@c-24-5-183-134.client.comcast.net> Message-ID: > > win32all covers a huge number of API functions, more then would > > be sane to add to os. But would there be any mileage in added > > enough from win32all to allow problems like popen5 to be > > implemented? > > > > There is already the _reg module that has some win32 functions in > > it on the standard install. > > (Not sure if this was in response to my "let's do it in Python" post.) > > How many APIs would have to be copied from win32all to enable > implementing popen5 in Python? I'm not sure since I haven't implemented much yet, but we'll need at least:

    win32pipe.CreatePipe
    win32api.DuplicateHandle
    win32api.GetCurrentProcess
    win32process.CreateProcess
    win32process.STARTUPINFO
    win32gui.EnumThreadWindows
    win32event.WaitForSingleObject
    win32process.TerminateProcess

-- /Peter Åstrand From skip at pobox.com Thu Jan 8 17:40:15 2004 From: skip at pobox.com (Skip Montanaro) Date: Thu Jan 8 17:40:25 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: <3FFDD21E.6070002@ABAQUS.com> References: <16380.10524.703424.981390@montanaro.dyndns.org> <16380.14707.303691.996868@montanaro.dyndns.org> Message-ID: <16381.56399.329186.126709@montanaro.dyndns.org> Andrew> My lack of experience with configure make it hard to submit a Andrew> patch, so I'll ask python-dev first. One other question.
Are these flags OS version-specific or will they work with 10.x and 11.x systems? Skip From pf_moore at yahoo.co.uk Thu Jan 8 17:41:19 2004 From: pf_moore at yahoo.co.uk (Paul Moore) Date: Thu Jan 8 17:41:10 2004 Subject: [Python-Dev] Re: The os module, unix and win32 References: <6.0.1.1.2.20040108214934.0221ac98@torment.chelsea.private> <200401082211.i08MBGx21879@c-24-5-183-134.client.comcast.net> Message-ID: <7k02kqr4.fsf@yahoo.co.uk> Guido van Rossum writes: > How many APIs would have to be copied from win32all to enable > implementing popen5 in Python? My (so far relatively trivial) prototype implementation uses 7, but they are a fairly mixed bunch. In many cases, only one of a "group" is needed (GetStdHandle but not SetStdHandle, and WaitForSingleObject, but none of the other WaitForXXX functions). I'm not sure what to conclude from this... Paul. -- This signature intentionally left blank From dave at pythonapocrypha.com Thu Jan 8 17:46:42 2004 From: dave at pythonapocrypha.com (Dave Brueck) Date: Thu Jan 8 17:46:48 2004 Subject: [Python-Dev] The os module, unix and win32 References: <6.0.1.1.2.20040108214934.0221ac98@torment.chelsea.private> <048a01c3d635$e6e08370$6401fea9@YODA> <200401082227.i08MRL621946@c-24-5-183-134.client.comcast.net> Message-ID: <04bb01c3d639$4993aa80$6401fea9@YODA> Guido wrote: > > Another possibility: we'd get a lot more mileage out of simply > > adding ctypes to the standard install. That would solve the > > immediate popen problem, let us get rid of _winreg entirely (and > > replace it with a pure Python version), and make future Win32 > > problems easier to solve. > > I don't know that a Python version of _winreg using ctypes would be > preferable over a C version. I'd expect it to be slower, less > readable than the C version, Really? I don't see how it could be less readable. 
For example: http://www.pythonapocrypha.com/WinReg.py (gets used like: import WinReg as WR key = r'HKEY_CLASSES_ROOT\CLSID\%s' % clsid WR.QSetValue(key, '', desc) WR.QSetValue(r'%s\ProgID' % key, '', progid) ... WR.DeleteKey(WR.HKEY_CLASSES_ROOT, r'CLSID\%s' % clsid) WR.DeleteKey(WR.HKEY_CLASSES_ROOT, r'AppID\%s' % clsid) ) It's not a completely fair comparison to _winreg.c since it implements only a few routines I use all the time and probably does less error checking, but it at least shows how simple it would be to create such a module (if nothing else, the 100 lines of Python versus 1000+ for the C version encourages contribution). > and more susceptible to the possibility of causing segfaults. I don't see how, but maybe I don't know enough about the registry. > (As for speed, I actually have a windows registry application where speed of access is important.) But surely that's the exception rather than the rule. Anyway, I'd be curious to see how much of a difference there is - in the end the same Windows APIs get called, so it boils down to the difference between ctypes API overhead versus Python/C API overhead. -Dave From mon at ABAQUS.com Thu Jan 8 17:48:32 2004 From: mon at ABAQUS.com (Nick Monyatovsky) Date: Thu Jan 8 17:48:37 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: Skip Montanaro "Re: [Python-Dev] HP-UX clean-up" (Jan 8, 16:29) References: <16380.10524.703424.981390@montanaro.dyndns.org> <16380.14707.303691.996868@montanaro.dyndns.org> <3FFDD21E.6070002@ABAQUS.com> <16381.55778.208434.489763@montanaro.dyndns.org> Message-ID: <1040108174832.ZM11304@reliant.hks.com> >------------------- Message from Skip Montanaro -------------------- > > Andrew> My lack of experience with configure make it hard to submit a > Andrew> patch, so I'll ask python-dev first. > > That's okay. What you gave is a good start. I have a couple questions > though: > > 1. 
How can I distinguish the three processors you described (HP/RISC > 32-bit, HP/RISC 64-bit, HP/ITANIUM 64-bit)? I suspect uname might > be the culprit. If so, can you provide uname output for the three > processors? > > 2. Is there no way to actually create a truly processor-independent > executable (perhaps by creating a "fat" binary)? > > Thx, > > Skip > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/mon%40abaqus.com >-------------------------------------------------------- Would not it be better if you let the users (people who compile Python) decide what they want? There are several reasons for this: 1) It is hard (== unreliable) to try to distinguish them using 'model', etc.; 2) Reliance on such techniques will inevitably break over time when the new processor denominations come into play; 3) Most importantly, it is possible to be on the 64-bit machine and be purposefully compiling for a 32-bit processor (to cover the broadest customer base). It is also possible to be on RISC and compile for ITANIUM and vice versa, because it is the same compiler. Moreover, ITANIUM machines will run RISC binaries. In other words, the confusion never stops ... People, on the other hand, usually know what machine they are working on now, and what machine they want Python to work on after they built it. There will be fewer surprises for them of the sort "Oh, that's what happened during the compilation!... I had no idea ..." after you have built this, distributed it to somebody, and had reports back that it all failed. Maybe there should also be some sort of option for experts who will not want Python to be compiled for a generic processor; who might purposefully want to compile Python on their particular hardware and have it fully optimized for it to gain maximum performance.
Thank you, -- Nick Monyatovsky -- mon@abaqus.com From pf_moore at yahoo.co.uk Thu Jan 8 17:54:43 2004 From: pf_moore at yahoo.co.uk (Paul Moore) Date: Thu Jan 8 17:54:31 2004 Subject: [Python-Dev] Re: The os module, unix and win32 References: <6.0.1.1.2.20040108214934.0221ac98@torment.chelsea.private> <048a01c3d635$e6e08370$6401fea9@YODA> <200401082227.i08MRL621946@c-24-5-183-134.client.comcast.net> Message-ID: <3caqkq4s.fsf@yahoo.co.uk> Guido van Rossum writes: > I don't know that a Python version of _winreg using ctypes would be > preferable over a C version. I'd expect it to be slower, less > readable than the C version, and more susceptible to the possibility > of causing segfaults. (As for speed, I actually have a windows > registry application where speed of access is important.) > > That doesn't mean I don't think ctypes is a good idea -- just that I > don't think applying it to _winreg would be useful. Agreed on both counts. The main advantage of having ctypes in the core would be to bypass these "but we can't do a Windows version without writing a C wrapper round some API calls" issues. I prefer the idea of including ctypes in the core and writing popen5 (or whatever it gets called) using it, over writing a support extension which just wraps "some" API calls. There's no way of making such an extension self-consistent in the way that _winreg is. The mix of APIs needed for popen5 is too varied. If popen5 is going to have a C extension behind it, I'd say that extension should expose higher level functions coded specifically for popen5, and be named something like _popen5.pyd. Don't try to make it generic - it won't be. But these are implementation and packaging issues. Let's get the design right first (and prototype using whatever seems convenient). Paul. 
-- This signature intentionally left blank From pf_moore at yahoo.co.uk Thu Jan 8 18:03:00 2004 From: pf_moore at yahoo.co.uk (Paul Moore) Date: Thu Jan 8 18:02:46 2004 Subject: [Python-Dev] Re: HP-UX clean-up References: <16380.10524.703424.981390@montanaro.dyndns.org> <16380.14707.303691.996868@montanaro.dyndns.org> <3FFDD21E.6070002@ABAQUS.com> <16381.55778.208434.489763@montanaro.dyndns.org> <1040108174832.ZM11304@reliant.hks.com> Message-ID: "Nick Monyatovsky" writes: > Would not it be better if you let the users (people who compile Python) decide > what they want? I believe the idea is to cater for people wanting to produce binary distributions of Python. For such people, the fewer variations they need, the better. I would be very happy to get a binary distribution of Python for HPUX, as I regularly work on HPUX machines which don't have compilers. I've a particular bee in my bonnet about HP machines, though, as I tend to work via telnet/SSH only, no X, on boxes with almost no optional extras installed. I'm sick of finding distributions of Vim which require X libraries installed, even though I'm not going to use the X features. And need to be root to install, although I only want the *(&^&*@#$ thing for my own use! Sorry about that... Paul. -- This signature intentionally left blank From pf_moore at yahoo.co.uk Thu Jan 8 18:03:51 2004 From: pf_moore at yahoo.co.uk (Paul Moore) Date: Thu Jan 8 18:10:34 2004 Subject: [Python-Dev] Re: The os module, unix and win32 References: <200401082211.i08MBGx21879@c-24-5-183-134.client.comcast.net> Message-ID: Peter Astrand writes: > I'm not sure since I haven't implemented much yet, but we'll need at > least: > > win32gui.EnumThreadWindows ??? Why this one? Paul. 
-- This signature intentionally left blank From trentm at ActiveState.com Thu Jan 8 21:20:34 2004 From: trentm at ActiveState.com (Trent Mick) Date: Thu Jan 8 21:22:34 2004 Subject: [Python-Dev] The os module, unix and win32 In-Reply-To: <200401082211.i08MBGx21879@c-24-5-183-134.client.comcast.net>; from guido@python.org on Thu, Jan 08, 2004 at 02:11:16PM -0800 References: <6.0.1.1.2.20040108214934.0221ac98@torment.chelsea.private> <200401082211.i08MBGx21879@c-24-5-183-134.client.comcast.net> Message-ID: <20040108182033.A17993@ActiveState.com> [Guido van Rossum wrote] > How many APIs would have to be copied from win32all to enable > implementing popen5 in Python? I am sorry that I am not keeping up with this right now, but my process.py uses the following: ['win32gui.PostMessage', 'win32process.CREATE_NEW_CONSOLE', 'win32pipe.CreatePipe', 'win32event.WAIT_FAILED', 'win32process.TerminateProcess', 'win32process.STARTF_USESTDHANDLES', 'win32file.CloseHandle', 'win32process.STARTF_USESHOWWINDOW', 'win32event.INFINITE', 'win32event.WAIT_TIMEOUT', 'win32process.CREATE_NEW_PROCESS_GROUP', 'win32api.GenerateConsoleCtrlEvent', 'win32api.GetModuleFileName', 'win32process.GetWindowThreadProcessId', 'win32process.CreateProcess', 'win32api.DuplicateHandle', 'wintypes.SECURITY_ATTRIBUTES', 'win32api.CloseHandle', 'win32api.GetCurrentProcess', 'win32process.GetExitCodeProcess', 'win32file.WriteFile', 'win32process.CREATE_', 'win32event.WaitForSingleObject', 'win32file.ReadFile', 'wintypes.error', 'win32api.Sleep', 'win32api.GetVersionEx', 'win32api.error', 'win32api.GetVersion', 'win32process.STARTUPINFO', 'win32gui.EnumWindows'] 31 Trent -- Trent Mick TrentM@ActiveState.com From jeremy at alum.mit.edu Thu Jan 8 21:28:06 2004 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Thu Jan 8 21:32:43 2004 Subject: [Python-Dev] The os module, unix and win32 In-Reply-To: <6.0.1.1.2.20040108214934.0221ac98@torment.chelsea.private> References: 
<6.0.1.1.2.20040108214934.0221ac98@torment.chelsea.private> Message-ID: <1073615286.6341.205.camel@localhost.localdomain> On Thu, 2004-01-08 at 17:06, Barry Scott wrote: > Why does popen5 needs a C implementation on windows where as > on unix it can be implemented in python? > > Answer: because the unix API is in standard python where as > the windows one is not. > > win32all covers a huge number of API functions, more then would > be sane to add to os. But would there be any mileage in added > enough from win32all to allow problems like popen5 to be > implemented? > > There is already the _reg module that has some win32 functions in > it on the standard install. In principle, this sounds like a nice idea. If the majority of Python's users are on win32 systems, it seems only reasonable to have some support for native OS functionality for win32. We've got it for Mac and Linux after all. I suppose a primary concern would be how hard it is to maintain the code and whether it would be tied to particular versions of Windows. The fact that you can get win32all for Python 1.5.2 through 2.3 suggests that these aren't insurmountable problems. Jeremy From mfb at lotusland.dyndns.org Thu Jan 8 22:52:21 2004 From: mfb at lotusland.dyndns.org (Matthew F. Barnes) Date: Thu Jan 8 22:52:25 2004 Subject: [Python-Dev] The os module, unix and win32 In-Reply-To: <20040108182033.A17993@ActiveState.com> References: <6.0.1.1.2.20040108214934.0221ac98@torment.chelsea.private><2004010822 11.i08MBGx21879@c-24-5-183-134.client.comcast.net> <20040108182033.A17993@ActiveState.com> Message-ID: <33628.192.168.1.101.1073620341.squirrel@server.lotusland.dyndns.org> Guido van Rossum said: > How many APIs would have to be copied from win32all to enable > implementing popen5 in Python? As long as the possibility of copying APIs from win32all is being entertained, would it be feasible to make os.kill do something meaningful on Windows? 
I'm no expert on Windows APIs, but it may be as simple as calling win32api.GenerateConsoleCtrlEvent or win32api.TerminateProcess. (Although I seem to remember reading that killing processes externally on Windows is something of a thorny issue. It may corrupt the state of DLLs, or something along those lines.) Matthew Barnes From ajsiegel at optonline.net Thu Jan 8 23:02:02 2004 From: ajsiegel at optonline.net (Arthur) Date: Thu Jan 8 23:02:07 2004 Subject: [Python-Dev] re: The os module, unix and win32 Message-ID: <0HR7006O9F7GKD@mta1.srv.hcvlny.cv.net> Barry writes - >win32all covers a huge number of API functions, more then would >be sane to add to os. But would there be any mileage in added >enough from win32all to allow problems like popen5 to be >implemented? I know I am wandering a bit off topic. But tangentially on topic - there is the possibility of getting enough of the win32all API functions into the core distribution to give more Windows standard functionality to distutils - start menu entries in particular. I won't get wordy on why I think this would be a very good thing. But if my guess (which is all it is) that it would only require small bits of win32all is correct, the return on investment would be, I believe, tremendous. Art From python at rcn.com Fri Jan 9 00:32:52 2004 From: python at rcn.com (Raymond Hettinger) Date: Fri Jan 9 00:33:46 2004 Subject: [Python-Dev] collections module In-Reply-To: <200401042241.i04MfqE06932@c-24-5-183-134.client.comcast.net> Message-ID: <000a01c3d672$097e6500$95b52c81@oemcomputer> I would like to establish a new module for some collection classes. The first type would be a bag, modeled after one of Smalltalk's key collection classes (similar to MultiSets in C++ and bags in Objective C). The second type would be a high speed queue implemented using linked list blocks like we used for itertools.tee(). The interface could be as simple as push, pop, and len.
Each operation ought to run about as fast as a list assignment (meaning that it beats the heck out of the list.pop(0), list.append(data) version). This module could also serve as the "one obvious place to put it" for other high performance datatypes that might be added in the future (such as fibheaps or some such). Guido wanted this to be discussed by a number of people. While a rich set of collection classes have proven their worth in other contexts, it could be that more is not better for Python. One of the language's elegances is the ability to do just about anything with just lists and dicts. If you have too many mutable containers to choose from, the choice of which to use becomes less obvious. My thoughts are that the existence of the array module, for example, has never impaired anyone's ability to learn or use lists and dicts. Also, no one reaches for a bag unless they are counting things. Likewise, queues are in a mental concept space distinct from dicts, sets, and such. My only worry with queues is differentiating them from Queue objects which have builtin synchronization for use in threads. Besides, having bag and queue in a separate module means that they needn't enter anyone's consciousness until a need arises. What do you guys think? Raymond Hettinger >>> from collections import bag >>> b = bag('banana') >>> b.sortedbycount() [('a', 3), ('n', 2), ('b', 1)] >>> list(b) ['a', 'a', 'a', 'b', 'n', 'n'] >>> b.asSet() set(['a', 'b', 'n']) From jcarlson at uci.edu Fri Jan 9 01:07:30 2004 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri Jan 9 01:09:40 2004 Subject: [Python-Dev] collections module In-Reply-To: <000a01c3d672$097e6500$95b52c81@oemcomputer> References: <200401042241.i04MfqE06932@c-24-5-183-134.client.comcast.net> <000a01c3d672$097e6500$95b52c81@oemcomputer> Message-ID: <20040108215224.1374.JCARLSON@uci.edu> > The second type would be a high speed queue implemented using > linked list blocks like we used for itertools.tee().
The > interface could be as simple as push, pop, and len. Each > operation ought to run about as fast a list assignment (meaning > that it beats the heck out of the list.pop(0), list.append(data) > version). There's been a python-based FIFO that is O(1) on {push, pop} and uses lists in the ASPN Python cookbook (adding len is easy): http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/210459 You'd probably be more interested in the variant I posted as a comment; it is significantly faster when the output queue is exhausted. Speaking of which...the Queue module REALLY should be re-done with the Fifo above and a single conditional, the multiple mutexes in the current Queue module are ugly as hell, and very difficult to understand. If desired, I'll write the new one. > Guido wanted this to be discussed by a number of people. While > a rich set of collection classes have proven their worth in other > contexts, it could be that more is not better for python. One the > language's elegances is the ability to do just about anything with > just lists and dicts. If you have too many mutable containers to > choose from, the choice of which to use becomes less obvious. The real trick is documenting them so that users can compare and contrast the features of each, in order to choose properly. > Likewise, queues are in a mental concept space distinct from dicts, > sets, and such. My only worry with queues is differentiating > them from Queue objects which have builtin synchronization for > use in threads. It all really depends on the overhead of the underlying mutexes or conditionals, which can be tested. - Josiah From martin at v.loewis.de Fri Jan 9 01:49:55 2004 From: martin at v.loewis.de (Martin v. 
Loewis) Date: Fri Jan 9 01:50:07 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: <1040108174832.ZM11304@reliant.hks.com> References: <16380.10524.703424.981390@montanaro.dyndns.org> <16380.14707.303691.996868@montanaro.dyndns.org> <3FFDD21E.6070002@ABAQUS.com> <16381.55778.208434.489763@montanaro.dyndns.org> <1040108174832.ZM11304@reliant.hks.com> Message-ID: <3FFE4F13.5000702@v.loewis.de> Nick Monyatovsky wrote: > Would not it be better if you let the users (people who compile Python) decide > what they want? No, not at all. configure should always give a working installation if invoked without options. If we have to make decisions for people to obtain such a working installation - fine. We then have to ask whether people might have made different decisions, and whether we want to support them in that decision. In many cases, we found that not giving choices is the right thing. So, for your set of options: 1. configure should find out automatically how to enable large file support on a 32-bit machine. In fact, there is code in configure.in to automatically detect LFS. I'm surprised that you are specifying LFS on the command line - this should not be necessary. 2. Assuming -Wl,+s -Wl,+k are warnings: We would turn them on if they are always acceptable, and leave them off if there are cases where they are not applicable. Figuring out whether warning options are needed/possible is too much effort, letting the user make the choice is pointless - our other compilers give plenty of warnings, so I believe that the HP compiler's complaints are likely false positives or otherwise irrelevant. 3. We would compile/link with threads always if possible, unless --disable-threads was given. If compiled with threads, configure should know to pass -lpthread. 4. On the model selection options, I don't quite understand why you say that this set of options gives portable binaries. I would have thought that +DA1.1 +DS2.0 is the set of options that gives portable binaries.
Of course, if we can assume all hardware is PA 2.0 meanwhile, this would be fine. In that case, we would always add the PA 2.0 option set, leaving the user no choice. > 1) It is hard (== unreliable) to try to distinguish them using 'model', > etc.; Then there should be no choice at all. > 2) Reliance on such techniques will inevitably break overtime when the new > processor denominations come into play; Then there should be as few options as possible, assuming that the compiler defaults will do the right thing, anyway (which appears to be the case with respect to model selection). > 3) Most importantly, it is possible to be on the 64-bit machine and be > purposefully compiling for a 32-bit processor (to cover the broadest customer > base). It is also possible to be on RISC and compile for ITANIUM and vice > versa, because it is the same compiler. Moreover, ITANIUM machines will run > RISC binaries. I don't think it should be directly supported. By default, configure should force PA binaries on PA hardware, and Itanium binaries on Itanium hardware. One issue is whether we should produce 32-bit or 64-bit binaries on PA 2.0 machines by default; I don't know which one is more reasonable. Switching to the other one should be as simple as setting CC before invoking configure (this is how it works on Solaris). > People, on the other hand, usually know what machine they are working on > now, and what machine they want Python to work on after they built it. People should be able to specify things that they reasonably should be able to specify. If they don't specify anything, something reasonable should still happen. > Maybe there should also be some sort of option for expects who will not want > Python to be compiled for generic processor; who might purposefully want to > compile Python on their particular hardware and have it fully optimized for it > to gain maximum performance. No, no, no. If you want to highly tune your build, you are expected to edit the makefile. 
Providing an explicit option for that will place a burden on users who are not able to answer difficult build questions (and, trust me, this is the majority of the users). Regards, Martin From martin at v.loewis.de Fri Jan 9 01:59:06 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Fri Jan 9 01:59:11 2004 Subject: [Python-Dev] collections module In-Reply-To: <000a01c3d672$097e6500$95b52c81@oemcomputer> References: <000a01c3d672$097e6500$95b52c81@oemcomputer> Message-ID: <3FFE513A.3020201@v.loewis.de> Raymond Hettinger wrote: > What do you guys think? -1. There is no evidence that these types are missing. The fact that they are absent does not imply that anybody would want to use them if they were available. It might be better to develop such a module as a separate package, wait for user feedback, and perhaps incorporate this into Python in a few years. Regards, Martin From jcarlson at uci.edu Fri Jan 9 02:15:05 2004 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri Jan 9 02:17:41 2004 Subject: [Python-Dev] collections module In-Reply-To: <002101c3d67c$9556fa60$95b52c81@oemcomputer> References: <20040108215224.1374.JCARLSON@uci.edu> <002101c3d67c$9556fa60$95b52c81@oemcomputer> Message-ID: <20040108231104.1377.JCARLSON@uci.edu> > None of those can compete with my proposed C implementation > which pre-allocates around 50 cells at a time and does it's > read/writes through array lookups. It avoids all the expensive > resizing of list and uses C ints instead of PyInts. That's pretty crazy. If I had a say, I'd be +1 just because I've seen so many people using their own {append, pop(0)} queues before. - Josiah P.S. I know Knuth had it first, but one never knows how well read anyone else is. It is better to err on the side of giving too much information than too little.
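(For those following along: a minimal sketch of the two-list FIFO idea from the ASPN recipe mentioned earlier -- amortized O(1) push/pop. The class and method names here are illustrative, not the recipe's code or the proposed C implementation.)

```python
class Fifo:
    """Amortized O(1) FIFO built from two plain Python lists.

    Pushes append to ``back``; pops come off ``front``, which holds
    the oldest items in reversed order and is refilled from ``back``
    only when it runs dry.
    """

    def __init__(self):
        self.back = []    # newest items, in arrival order
        self.front = []   # oldest items, reversed, ready for list.pop()

    def push(self, item):
        self.back.append(item)

    def pop(self):
        if not self.front:
            # Refill: reverse back into front so that list.pop(),
            # which is O(1) at the tail, removes the oldest item.
            self.back.reverse()
            self.front = self.back
            self.back = []
        return self.front.pop()

    def __len__(self):
        return len(self.front) + len(self.back)
```

Each element is appended once, reversed once, and popped once, so a long run of pushes and pops costs constant time per operation on average -- without list.pop(0)'s O(n) shuffle.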
From zack at codesourcery.com Fri Jan 9 02:23:24 2004 From: zack at codesourcery.com (Zack Weinberg) Date: Fri Jan 9 02:23:40 2004 Subject: [Python-Dev] collections module In-Reply-To: <20040108231104.1377.JCARLSON@uci.edu> (Josiah Carlson's message of "Thu, 08 Jan 2004 23:15:05 -0800") References: <20040108215224.1374.JCARLSON@uci.edu> <002101c3d67c$9556fa60$95b52c81@oemcomputer> <20040108231104.1377.JCARLSON@uci.edu> Message-ID: <877k01d1qr.fsf@egil.codesourcery.com> Josiah Carlson writes: >> None of those can compete with my proposed C implementation >> which pre-allocates around 50 cells at a time and does it's >> read/writes through array lookups. It avoids all the expensive >> resizing of list and uses C ints instead of PyInts. > > That's pretty crazy. If I had a say, I'd be +1 just because I've seen > so many people using their own {append, pop(0)} queues before. As a user of Python, I am +1 on making there be a *really really fast* queue data structure somewhere, but -0 on its being something other than the existing Queue class. Can't we just optimize the one we've got? Ideally without losing the thread safety guarantees (but making it usable in an unthreaded program would be nice)? zw From aleaxit at yahoo.com Fri Jan 9 02:32:12 2004 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Jan 9 02:32:17 2004 Subject: [Python-Dev] collections module In-Reply-To: <000a01c3d672$097e6500$95b52c81@oemcomputer> References: <000a01c3d672$097e6500$95b52c81@oemcomputer> Message-ID: <200401090832.12495.aleaxit@yahoo.com> On Friday 09 January 2004 06:32 am, Raymond Hettinger wrote: > I would like to establish a new module for some collection classes. Great idea. > a rich set of collection classes have proven their worth in other > contexts, it could be that more is not better for python. One the > language's elegances is the ability to do just about anything with > just lists and dicts. 
If you have too many mutable containers to > choose from, the choice of which to use becomes less obvious. Most uses will be excellently well met by lists and dicts, just like today. But just like today you sometimes find tuples or array.array's or sets.Set's serve your needs better, so might tomorrow a collections.Bag do so. The existence of sets.Set doesn't impede people from still using lists or dicts to represent sets as they did in the past, it just offers what may be a better way to programmers who _are_ familiar with the new facilities (and the concept of sets in the first place); collections.Bag will be just the same IMHO. In other words, I agree with you. > Likewise, queues are in a mental concept space distinct from dicts, > sets, and such. My only worry with queues is differentiating > them from Queue objects which have builtin synchronization for > use in threads. Maybe naming them Fifo rather than Queue, while less elegant, will make the distinction so sharp that no fear of confusion remains. Alex From perky at i18n.org Fri Jan 9 03:30:21 2004 From: perky at i18n.org (Hye-Shik Chang) Date: Fri Jan 9 03:30:29 2004 Subject: [Python-Dev] CJKCodecs integration into Python Message-ID: <20040109083021.GA4128@i18n.org> Hello! I just submitted a patch to integrate CJKCodecs into Python. (http://www.python.org/sf/873597) As I remember, the problems mentioned on a previous discussion about this are these three: 1) Size Python+CJKCodecs is just 102% of the original by source size and 104% by source line counts. Installed binary size is about 2.5MB. I think it's not a big size for today's hard disks. 2) Backward Compatibility with C/J/K Codecs with ChineseCodecs: it's perfectly compatible except ChineseCodecs' trivial bugs. with JapaneseCodecs: it's now compatible enough to use. The only difference is ISO-2022-JP's error handling for bytes with the MSB set, which is very unusual and invalid. with KoreanCodecs: it's perfectly compatible.
3) Maintenance I can maintain it on Python CVS by myself now. :) And I put cjkcodecs files into Modules/cjkcodecs/, but I can't be sure this is the right place to put them. Any opinions will be appreciated. Thank you! Hye-Shik From sjoerd at acm.org Fri Jan 9 04:01:35 2004 From: sjoerd at acm.org (Sjoerd Mullender) Date: Fri Jan 9 04:01:40 2004 Subject: [Python-Dev] Fixes for imageop module In-Reply-To: <200312272241.hBRMfHb01333@c-24-5-183-134.client.comcast.net> References: <3FEC9C64.5080805@acm.org> <200312262249.hBQMnhf24306@c-24-5-183-134.client.comcast.net> <3FECBFB3.7090906@acm.org> <200312271749.hBRHnkN25210@c-24-5-183-134.client.comcast.net> <3FEDCBA2.3030607@acm.org> <200312272241.hBRMfHb01333@c-24-5-183-134.client.comcast.net> Message-ID: <3FFE6DEF.9070209@acm.org> Guido van Rossum wrote: >>>Hm. Anybody who uses the imageop module currently on Linux will find >>>their programs broken. The strings manipulated by imageop all end up >>>either being written to a file or handed to some drawing code, and >>>changing the byte order would definitely break things! >> >>That's why I asked. >> >> >>>So I don't think this is an acceptable change. I take it that for >>>IRIX, the byte order implied by the old code is simply wrong? Maybe >>>the module can be given a global (shudder?) byte order setting that >>>you can change but that defaults to the old setting? >> >>The problem is, the documentation says: "This is the same format as used >>by gl.lrectwrite() and the imgfile module." This implies the byte order >>that you get on the SGI which is opposite of what you get on Intel. The >>code produced the correct results on the SGI, but not on Intel. > > > Your fix would be okay if until now it was *only* used on IRIX with > gl.lrectwrite() and/or the imgfile module. But how can we prove that? I don't know. >>(By the way, I'm away from computers for a week starting tomorrow >>extremely early.) > > > It's okay to wait a week before making a decision.
I'm back (and have been this week). Any thoughts about a decision? -- Sjoerd Mullender From Paul.Moore at atosorigin.com Fri Jan 9 06:42:34 2004 From: Paul.Moore at atosorigin.com (Moore, Paul) Date: Fri Jan 9 06:42:42 2004 Subject: [Python-Dev] collections module Message-ID: <16E1010E4581B049ABC51D4975CEDB8803060D9F@UKDCX001.uk.int.atosorigin.com> From: Raymond Hettinger > I would like to establish a new module for some collection classes. In general, I approve of this idea. However, one point you don't make clear - are you proposing that this be coded in Python, or in C? I'd be against it being coded in C, but for a Python implementation. A Python module serves not only to provide the feature, but also to demonstrate good Python code, to show how to code such basic data types in Python, and to offer explicit, "use the source"-style documentation of corner cases. As an example of this, I once used the heapq Python implementation as a basis for developing my own variant, which allowed updating of the priority. I even went as far as importing some of the module's private functions to help my implementation. With a C coded version, I couldn't do that - I'd have had to write the whole thing for myself. (Yes, I know that for Python 2.4 you recoded heapq in C - this isn't a dig at that as such, although it will be a problem for me. It's just a note that coding in C is not always a good thing...) The only reason I can see for coding in C is performance. And performance just isn't that big a deal to me. And we don't want to give a message that "dictionaries and lists are slow" by offering "fast" versions of things which can be coded in terms of them. > The first type would be a bag I don't, personally, have much use for this data type. In practice, I would naturally choose to use a dictionary mapping item to count.
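(To make that concrete, here is the plain-dict counting idiom, with a sort that approximates the sortedbycount() output from Raymond's bag example -- a sketch of the idiom, not a proposed API.)

```python
# The plain-dict counting idiom -- no new container type required.
counts = {}
for ch in "banana":
    counts[ch] = counts.get(ch, 0) + 1

# Roughly what a hypothetical bag.sortedbycount() would return:
# (item, count) pairs, most frequent first.
bycount = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
print(bycount)  # [('a', 3), ('n', 2), ('b', 1)]
```

Three lines of dict code cover the common case; the question for a bag type is whether wrapping this buys enough readability to justify another container.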
It would be important in the documentation to provide clear use cases, showing when this type is more appropriate than a simple dictionary, as I describe, and when it is not. In terms of "making dictionaries look slow", how would you answer the question "Why use this rather than just dict.get(k, 0) + 1" without giving the impression that dictionaries "aren't fast enough"? > The second type would be a high speed queue This seems like a good idea. I have found myself on occasion wanting to use Queue.Queue in non-threaded programs, simply because a "queue" fits my mental model of what I am doing. But it's for code readability reasons - not performance. And the same issue here over performance impressions - how do you make this seem like a good alternative to list.pop, without making lists look slow? > This module could also serve as the "one obvious place to put it" > for other high performance datatypes that might be added in the > future (such as fibheaps or some such). That is the biggest benefit I see to this proposal. There have recently been a number of discussions of useful library code, where much of the problem seemed to be about where to put the code, rather than about semantics, etc. > Guido wanted this to be discussed by a number of people. While > a rich set of collection classes have proven their worth in other > contexts, it could be that more is not better for python. One the > language's elegances is the ability to do just about anything with > just lists and dicts. If you have too many mutable containers to > choose from, the choice of which to use becomes less obvious. > My thoughts are that the existence of the array module, for example, > has never impaired anyone's ability to learn or use lists and dicts. That is true, but equally the array module has a feeling of being "specialised". I'm not sure I can quantify this, but your description of the collection module doesn't feel similarly "specialised". 
> Also, no one reaches for a bag unless they are counting things. ... and when they are, they reach for a dictionary - d.get(k, 0) + 1 feels like a very natural Python counting idiom. > Likewise, queues are in a mental concept space distinct from dicts, > sets, and such. My only worry with queues is differentiating > them from Queue objects which have builtin synchronization for > use in threads. Queues currently "feel like" a threading concept in Python (as opposed to what they feel like in a more general data structures concept). Having a queue in a collections module would clash with that. I'm not against it, but I believe your worry is important - you're fiddling with people's mental models, something bound to generate controversy... Apologies for going on about performance. I am in favour of the idea of a module containing implementations of standard container data structures, but I believe it should be in Python, and should be presented as the "best of breed" implementation in terms of handling corner cases, and covering subtle algorithmic issues, and providing good design, and offering subclassability where needed. Basically, "everyone can, and often does, reinvent this stuff in Python - but getting the details right is tricky, so we've done it for you." If performance of a Python implementation is a problem, speed up the bottlenecks in the Python VM (likely to be function calls, I'd guess) rather than spending effort reimplementing in C... Sorry if that was a bit rambling. I hope it's of use. Paul. From theller at python.net Fri Jan 9 07:24:44 2004 From: theller at python.net (Thomas Heller) Date: Fri Jan 9 07:24:55 2004 Subject: [Python-Dev] re: The os module, unix and win32 In-Reply-To: <0HR7006O9F7GKD@mta1.srv.hcvlny.cv.net> (Arthur's message of "Thu, 08 Jan 2004 23:02:02 -0500") References: <0HR7006O9F7GKD@mta1.srv.hcvlny.cv.net> Message-ID: <8ykhux6b.fsf@python.net> Arthur writes: > I know I am wandering a bit off topic. 
> > But tangentially on topic - is there the possibility of getting enough of the > win32all API functions into the core distribution to give more Windows > standard functionality to distutils - start menu entries in particular. bdist_wininst can create Windows shortcuts in its postinstall script. Thomas From theller at python.net Fri Jan 9 07:32:08 2004 From: theller at python.net (Thomas Heller) Date: Fri Jan 9 07:32:16 2004 Subject: [Python-Dev] The os module, unix and win32 In-Reply-To: <200401082227.i08MRL621946@c-24-5-183-134.client.comcast.net> (Guido van Rossum's message of "Thu, 08 Jan 2004 14:27:21 -0800") References: <6.0.1.1.2.20040108214934.0221ac98@torment.chelsea.private> <048a01c3d635$e6e08370$6401fea9@YODA> <200401082227.i08MRL621946@c-24-5-183-134.client.comcast.net> Message-ID: <4qv5uwtz.fsf@python.net> [Dave] >> Another possibility: we'd get a lot more mileage out of simply >> adding ctypes to the standard install. That would solve the >> immediate popen problem, let us get rid of _winreg entirely (and >> replace it with a pure Python version), and make future Win32 >> problems easier to solve. [Guido] > I don't know that a Python version of _winreg using ctypes would be > preferable over a C version. Since there is a C version, there's no need to code it in Python again. > I'd expect it to be slower, less readable than the C version, and more > susceptible to the possibility of causing segfaults. Slower: yes, but I don't know how much. Less readable: no, more readable. Susceptible to segfaults: If written correctly, it should be bullet proof - exactly the same as a C version. > That doesn't mean I don't think ctypes is a good idea -- just that I > don't think applying it to _winreg would be useful. Cool. Should this be a Windows-only version, or cross-platform?
Thomas From gmccaughan at synaptics-uk.com Fri Jan 9 07:33:42 2004 From: gmccaughan at synaptics-uk.com (Gareth McCaughan) Date: Fri Jan 9 07:33:46 2004 Subject: [Python-Dev] collections module Message-ID: <200401091233.42875.gmccaughan@synaptics-uk.com> Paul Moore wrote: > The only reason I can see for coding in C is performance. And performance > just isn't that big a deal to me. And we don't want to give a message > that "dictionaries and lists are slow" by offering "fast" versions of > things which can be coded in terms of them. I think we should be more concerned about how useful a new feature is than about what "message" it gives about existing features. Anyway, the message that will actually be given is that for some purposes lists and dictionaries are not the best data structures. That's obviously true, so why try to avoid saying it? [Bags:] > In terms of "making dictionaries look slow", how would you answer > the question "Why use this rather than just dict.get(k, 0) + 1" > without giving the impression that dictionaries "aren't fast enough"? If the advantage *is* that bags are faster, then I don't think there's any reason not to say so. If it gives the impression that dictionaries "aren't fast enough" for the particular tasks bags are designed for, what's so terrible? The result would be that when people want bags they use bags instead of emulating them with dictionaries; is that bad? (I'm not convinced that there would, in fact, be much speed advantage for bags over dictionaries. I'd guess that the real advantage would just be that some things can be expressed more clearly in terms of bags than in terms of dictionaries. I suspect there isn't much clarity advantage either...) >> The second type would be a high speed queue > > This seems like a good idea. I have found myself on occasion > wanting to use Queue.Queue in non-threaded programs, simply because > a "queue" fits my mental model of what I am doing. 
But it's for > code readability reasons - not performance. I've often built my own queue using lists, with x.append(item) and x.pop(0) as the enqueue and dequeue operations. That's horribly inefficient, but makes for reasonably neat code. If Python grew nice fast queues with a nice interface, I'd use those instead. It wouldn't make much difference to the clarity of my code, but it would make it a lot faster. That seems like a good thing to me. > And the same issue here over performance impressions - how do you > make this seem like a good alternative to list.pop, without making > lists look slow? Lists *are* slow when you use them for queues. What's wrong with admitting that? >> This module could also serve as the "one obvious place to put it" >> for other high performance datatypes that might be added in the >> future (such as fibheaps or some such). > > That is the biggest benefit I see to this proposal. There have > recently been a number of discussions of useful library code, where > much of the problem seemed to be about where to put the code, > rather than about semantics, etc. I concur. >> Guido wanted this to be discussed by a number of people. While >> a rich set of collection classes has proven its worth in other >> contexts, it could be that more is not better for Python. One of the >> language's elegances is the ability to do just about anything with >> just lists and dicts. If you have too many mutable containers to >> choose from, the choice of which to use becomes less obvious. >> >> My thoughts are that the existence of the array module, for example, >> has never impaired anyone's ability to learn or use lists and dicts. > > That is true, but equally the array module has a feeling of being > "specialised". I'm not sure I can quantify this, but your description > of the collection module doesn't feel similarly "specialised". Being less "specialized" is a matter of being more broadly useful, no? That seems like a good thing to me.
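[The list-as-queue pattern Gareth describes above is easy to put side by side with the deque type that eventually landed in the collections module in Python 2.4. This comparison is an editorial addition, not part of the original mail:]

```python
# Both containers below give the same FIFO behaviour; the difference is
# cost. list.pop(0) must shift every remaining element left (O(n) per
# dequeue), while deque.popleft() is O(1).
from collections import deque

lst, dq = [], deque()
for i in range(5):
    lst.append(i)   # enqueue at the tail
    dq.append(i)

out_lst = [lst.pop(0) for _ in range(5)]   # dequeue from the head
out_dq = [dq.popleft() for _ in range(5)]
print(out_lst)  # [0, 1, 2, 3, 4]
print(out_dq)   # [0, 1, 2, 3, 4]
```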
>> Likewise, queues are in a mental concept space distinct from dicts, >> sets, and such. My only worry with queues is differentiating >> them from Queue objects which have builtin synchronization for >> use in threads. > > Queues currently "feel like" a threading concept in Python (as opposed > to what they feel like in a more general data structures concept). > Having a queue in a collections module would clash with that. I'm not > against it, but I believe your worry is important - you're fiddling > with people's mental models, something bound to generate controversy... If Python's existing library has warped users' mental models so that they think of a queue primarily as "something you use with threads" rather than as "a data structure where you can add things to the front and pull them off the rear efficiently" then that's a *bad* thing about Python's existing library, and decoupling the two concepts by adding a fast queue implementation will be a win. > Apologies for going on about performance. It makes a refreshing change to hear someone saying "No, don't do that, it would be too fast" :-). > I am in favour of the idea of > a module containing implementations of standard container data structures, > but I believe it should be in Python, and should be presented as the > "best of breed" implementation in terms of handling corner cases, and > covering subtle algorithmic issues, and providing good design, and > offering subclassability where needed. Basically, "everyone can, and > often does, reinvent this stuff in Python - but getting the details > right is tricky, so we've done it for you." > > If performance of a Python implementation is a problem, speed up the > bottlenecks in the Python VM (likely to be function calls, I'd guess) > rather than spending effort reimplementing in C... If performance of a Python implementation is a problem, then it is unlikely that any plausible modifications to the VM will stop it being a problem. 
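[For what it's worth, the "lists are slow when you use them for queues" claim above is easy to check with timeit. This is a rough editorial sketch; absolute numbers vary by machine, but the ordering should not:]

```python
# Timing pop-from-head against pop-from-tail on a list. pop(0) shifts the
# whole remaining tail on every call, so it should come out clearly slower.
import timeit

setup = "lst = list(range(10000))"
t_head = timeit.timeit("lst.pop(0); lst.append(0)", setup=setup, number=5000)
t_tail = timeit.timeit("lst.pop(); lst.append(0)", setup=setup, number=5000)

print(t_head > t_tail)  # head-popping, i.e. the list-as-queue, loses
```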
-- g From Paul.Moore at atosorigin.com Fri Jan 9 08:25:11 2004 From: Paul.Moore at atosorigin.com (Moore, Paul) Date: Fri Jan 9 08:25:17 2004 Subject: [Python-Dev] collections module Message-ID: <16E1010E4581B049ABC51D4975CEDB8803060DA1@UKDCX001.uk.int.atosorigin.com> From: Gareth McCaughan > I think we should be more concerned about how useful a new > feature is than about what "message" it gives about existing > features. Anyway, the message that will actually be given > is that for some purposes lists and dictionaries are not > the best data structures. That's obviously true, so why > try to avoid saying it? You're right. But on reading your response, I think that I didn't manage to express my concern very well (I shouldn't compose emails while getting continually interrupted). What I was trying to say was that there are benefits to having an implementation in Python [1] (if that is possible). The key benefit of a C implementation seems to me to be that it is faster. That's fine, but it's important to have a balanced view - performance isn't overwhelmingly more important than other qualities. [1] For example: - Maintainability - Usefulness as an example of good coding practices (on the assumption that we're after good *Python* practices, rather than good C ones...) - Accessibility for reference/documentation - Reusability (maybe less so if the C version is designed to allow subclassing) >> That is true, but equally the array module has a feeling of being >> "specialised". I'm not sure I can quantify this, but your >> description of the collection module doesn't feel similarly >> "specialised". > > Being less "specialized" is a matter of being more broadly > useful, no? That seems like a good thing to me. I expressed that badly - if I had 100 numbers to store, I'd use a list and never even think of the array module. 
In other words, the array module feels "specialised" in the sense that the tasks it is good at (i.e. optimal - better than a generic list - whatever) are more restricted than "just" storing homogeneous data. The implication is that it sacrifices generality for other useful features (not just performance, conversion to a binary format also comes to mind), *and the trade-off is worth it*. My characterisation of "less specialised" was meant in the same sense as is implied by the phrase "jack of all trades, master of none". On reflection, I'm not sure it applies here. So I'll concede that point. > If Python's existing library has warped users' mental models > so that they think of a queue primarily as "something you use > with threads" rather than as "a data structure where you can > add things to the front and pull them off the rear efficiently" > then that's a *bad* thing about Python's existing library, and > decoupling the two concepts by adding a fast queue implementation > will be a win. Good point. On that basis, I'd support using the name queue in the collections module, and maybe even going so far as to suggest that Queue.Queue be renamed (with the old name retained for compatibility) to something like Threading.FIFO (or maybe Threading.channel, like Occam/CSP, but that'd clash with Stackless...). >> Apologies for going on about performance. > > It makes a refreshing change to hear someone saying "No, > don't do that, it would be too fast" :-). :-) > If performance of a Python implementation is a problem, > then it is unlikely that any plausible modifications to > the VM will stop it being a problem. You may well be right. I just dislike the "code it in C, because Python isn't fast enough" attitude. I overreacted here, based on an assumption (Raymond hasn't even said that he intends this module to be in C!). Sorry. Paul.
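[A hedged sketch of the split Paul suggests above: a plain, unsynchronized FIFO for single-threaded code, clearly separated from the blocking, thread-oriented Queue.Queue. The class name PlainFIFO is made up for illustration and was not proposed in the thread.]

```python
# An unsynchronized FIFO for single-threaded code: it raises on empty
# instead of blocking, and carries no locks. PlainFIFO is an illustrative
# name only.
from collections import deque

class PlainFIFO:
    def __init__(self):
        self._items = deque()

    def put(self, item):
        self._items.append(item)

    def get(self):
        # Unlike Queue.Queue.get(), an empty FIFO is an error, not a wait.
        if not self._items:
            raise IndexError("get from an empty FIFO")
        return self._items.popleft()

    def __len__(self):
        return len(self._items)

f = PlainFIFO()
f.put("a")
f.put("b")
print(f.get())  # a -- first in, first out
```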
From lists at hlabs.spb.ru Fri Jan 9 11:26:56 2004 From: lists at hlabs.spb.ru (Dmitry Vasiliev) Date: Fri Jan 9 08:26:18 2004 Subject: [Python-Dev] collections module In-Reply-To: <000a01c3d672$097e6500$95b52c81@oemcomputer> References: <000a01c3d672$097e6500$95b52c81@oemcomputer> Message-ID: <3FFED650.6000808@hlabs.spb.ru> Raymond Hettinger wrote: > I would like to establish a new module for some collection classes. Cool. > The first type would be a bag, modeled after one of Smalltalk's > key collection classes (similar to MultiSets in C++ and bags in > Objective C). > > The second type would be a high speed queue implemented using > linked list blocks like we used for itertools.tee(). The > interface could be as simple as push, pop, and len. Each > operation ought to run about as fast as a list assignment (meaning > that it beats the heck out of the list.pop(0), list.append(data) > version). In some of my projects I've really needed a sorted dict (a dict sorted by value and indexed by key or index). > This module could also serve as the "one obvious place to put it" > for other high performance datatypes that might be added in the > future (such as fibheaps or some such). Maybe set and frozenset need to be in this module too? -- Dmitry Vasiliev (dima at hlabs.spb.ru) From pje at telecommunity.com Fri Jan 9 09:55:49 2004 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Jan 9 09:52:45 2004 Subject: [Python-Dev] collections module In-Reply-To: <16E1010E4581B049ABC51D4975CEDB8803060DA1@UKDCX001.uk.int.atosorigin.com> Message-ID: <5.1.0.14.0.20040109095013.02c93570@mail.telecommunity.com> At 01:25 PM 1/9/04 +0000, Moore, Paul wrote: >What I was trying to say was that there are benefits to having an >implementation in Python [1] (if that is possible). The key benefit >of a C implementation seems to me to be that it is faster. That's >fine, but it's important to have a balanced view - performance >isn't overwhelmingly more important than other qualities.
> >[1] For example: > > - Maintainability > - Usefulness as an example of good coding practices (on > the assumption that we're after good *Python* practices, > rather than good C ones...) > - Accessibility for reference/documentation > - Reusability (maybe less so if the C version is designed > to allow subclassing) Perhaps future debates on this issue could be put to bed by using Pyrex. Granted, Pyrex is not actually Python. But it's awfully close, and would certainly address the issue of making algorithms clearer than would be the case in a pure C version. (It also automatically produces subclassable types.) From skip at pobox.com Fri Jan 9 10:13:34 2004 From: skip at pobox.com (Skip Montanaro) Date: Fri Jan 9 10:13:47 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: <3FFE4F13.5000702@v.loewis.de> References: <16380.10524.703424.981390@montanaro.dyndns.org> <16380.14707.303691.996868@montanaro.dyndns.org> <3FFDD21E.6070002@ABAQUS.com> <16381.55778.208434.489763@montanaro.dyndns.org> <1040108174832.ZM11304@reliant.hks.com> <3FFE4F13.5000702@v.loewis.de> Message-ID: <16382.50462.118522.58707@montanaro.dyndns.org> Martin> 2. Assuming -Wl,+s -Wl,+k are warnings: ... They cause the +s and +k options to be passed to the linker. They could also be written as -Wl,+s,+k. Martin> 4. On the model selection options, I don't quite understand Martin> why you say that this set of options gives portable binaries. Martin> I would have thought that +DA1.1 +DS2.0 is the set of options Martin> that gives portable binaries. I found an HP-UX machine I could log into which has aCC installed. It's just a 10.20 machine. HPs are on the way out here. I didn't find any running HP-UX 11. About +DA and +DS the aCC man page has this to say: +DAarchitecture Generate code for a particular version of the PA- RISC architecture specified. Also specifies which version of the HP-UX math library to link when you have specified -lm.
Note Object code generated for PA-RISC 2.0 will not execute on PA-RISC 1.1 systems. To generate code compatible across PA-RISC 1.1 and 2.0 workstations and servers, use the +DAportable option. For best performance use +DA with the model number or architecture where you plan to execute the program. See the file /opt/langtools/lib/sched.models for a list of model numbers and their PA-RISC architecture designations. If you do not specify this option, the default object code generated is determined automatically as that of the machine on which you compile. Examples +DA1.1 +DA867 +DA2.0 +DAportable The first two examples generate code for the PA- RISC 1.1 architecture. The third example generates code for the PA-RISC 2.0 architecture. The fourth example generates code compatible across PA-RISC 1.1 and 2.0 workstations and servers. +DSmodel Use the instruction scheduler tuned to the model specified. If this option is not used, the compiler uses the instruction scheduler for the architecture on which the program is compiled. The architecture is determined by uname() (see uname(2)). model can be a model number, PA-RISC architecture designation or PA-RISC processor name. See the file /opt/langtools/lib/sched.models for a list of model numbers and processor names. Obviously, due to the age of the OS on this machine, there's no mention of Itanium. >> Maybe there should also be some sort of option for expects who will >> not want Python to be compiled for generic processor; who might >> purposefully want to compile Python on their particular hardware and >> have it fully optimized for it to gain maximum performance. Martin> No, no, no. If you want to highly tune your build, you are Martin> expected to edit the makefile. Providing an explicit option for Martin> that will place a burden on users who are not able to answer Martin> difficult build questions (and, trust me, this is the majority Martin> of the users). It would also make the configure script that much more fragile. 
Skip From skip at pobox.com Fri Jan 9 10:20:16 2004 From: skip at pobox.com (Skip Montanaro) Date: Fri Jan 9 10:20:31 2004 Subject: [Python-Dev] collections module In-Reply-To: <000a01c3d672$097e6500$95b52c81@oemcomputer> References: <200401042241.i04MfqE06932@c-24-5-183-134.client.comcast.net> <000a01c3d672$097e6500$95b52c81@oemcomputer> Message-ID: <16382.50864.699810.476915@montanaro.dyndns.org> Raymond> I would like to establish a new module for some collection Raymond> classes. Raymond> The first type would be a bag, modeled after one of Smalltalk's Raymond> key collection classes (similar to MultiSets in C++ and bags in Raymond> Objective C). I'm neither a Smalltalk, C++ or Objective C programmer. How is a bag different from a set? In my GE days (pre-Python) we had a collection class in a proprietary language which was used as a proxy. You'd send a message to the collection and it would forward it to all the elements it contained. This was rather convenient. You could tell all the actors to turn and face the camera, for example. Oddly enough, although we used collections heavily, I never really missed that functionality in Python. I implemented it once or twice, but never used it much. Different style of programming I guess. Skip From mon at abaqus.com Fri Jan 9 10:44:44 2004 From: mon at abaqus.com (Nick Monyatovsky) Date: Fri Jan 9 10:44:49 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: "Martin v. Loewis" "Re: [Python-Dev] HP-UX clean-up" (Jan 9, 7:49) References: <16380.10524.703424.981390@montanaro.dyndns.org> <16380.14707.303691.996868@montanaro.dyndns.org> <3FFDD21E.6070002@ABAQUS.com> <16381.55778.208434.489763@montanaro.dyndns.org> <1040108174832.ZM11304@reliant.hks.com> <3FFE4F13.5000702@v.loewis.de> Message-ID: <1040109104444.ZM12180@reliant.hks.com> >------------------- Message from Martin v. 
Loewis -------------------- > Nick Monyatovsky wrote: > > Wouldn't it be better if you let the users (people who compile Python) decide > > what they want? > > No, not at all. configure should always give a working installation if > invoked without options. If we have to make decisions for people to > obtain such a working installation - fine. We then have to ask whether > people might have made different decisions, and whether we want to > support them in that decision. In many cases, we found that not > giving choices is the right thing. > > So, for your set of options: > 1. configure should find out automatically how to enable large > file support on a 32-bit machine. In fact, there is code in > configure.in to automatically detect LFS. I'm surprised that > you are specifying LFS on the command line - this should > not be necessary. > > 2. Assuming -Wl,+s -Wl,+k are warnings: We would turn them on > if they are always acceptable, and leave them off if there > are cases where they are not applicable. Figuring out whether > warning options are needed/possible is too much effort, > letting the user make the choice is pointless - our other > compilers give plenty of warnings, so I believe that the > HP compiler's complaints are likely false positives or > otherwise irrelevant. > > 3. We would compile/link with threads always if possible, > unless --disable-threads was given. If compiled with threads, > configure should know to pass -lpthread. > > 4. On the model selection options, I don't quite understand > why you say that this set of options gives portable binaries. > I would have thought that +DA1.1 +DS2.0 is the set of options > that gives portable binaries. Of course, if we can assume > all hardware is PA 2.0 meanwhile, this would be fine. In > that case, we would always add the PA 2.0 option set, > leaving the user no choice. > > > 1) It is hard (== unreliable) to try to distinguish them using 'model', > > etc.; > > Then there should be no choice at all.
> > 2) Reliance on such techniques will inevitably break over time when the new > > processor denominations come into play; > > Then there should be as few options as possible, assuming that the > compiler defaults will do the right thing, anyway (which appears > to be the case with respect to model selection). > > > 3) Most importantly, it is possible to be on the 64-bit machine and be > > purposefully compiling for a 32-bit processor (to cover the broadest customer > > base). It is also possible to be on RISC and compile for ITANIUM and vice > > versa, because it is the same compiler. Moreover, ITANIUM machines will run > > RISC binaries. > > I don't think it should be directly supported. By default, configure > should force PA binaries on PA hardware, and Itanium binaries on Itanium > hardware. > > One issue is whether we should produce 32-bit or 64-bit binaries on > PA 2.0 machines by default; I don't know which one is more reasonable. > Switching to the other one should be as simple as setting CC before > invoking configure (this is how it works on Solaris). > > > People, on the other hand, usually know what machine they are working on > > now, and what machine they want Python to work on after they build it. > > People should be able to specify things that they reasonably should > be able to specify. If they don't specify anything, something reasonable > should still happen. > > > Maybe there should also be some sort of option for experts who will not want > > Python to be compiled for a generic processor; who might purposefully want to > > compile Python on their particular hardware and have it fully optimized for it > > to gain maximum performance. > > No, no, no. If you want to highly tune your build, you are expected to > edit the makefile. Providing an explicit option for that will place > a burden on users who are not able to answer difficult build questions > (and, trust me, this is the majority of the users).
> > Regards, > Martin >-------------------------------------------------------- Those are all correct observations: we are using an older (2.0) version of Python, and had to add large file support and threads explicitly. I will answer the architectures issue in another message, where people brought more details of this to the surface. Thank you, -- Nick Monyatovsky -- mon@abaqus.com From aleaxit at yahoo.com Fri Jan 9 10:56:40 2004 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Jan 9 10:56:45 2004 Subject: [Python-Dev] collections module In-Reply-To: <16382.50864.699810.476915@montanaro.dyndns.org> References: <200401042241.i04MfqE06932@c-24-5-183-134.client.comcast.net> <000a01c3d672$097e6500$95b52c81@oemcomputer> <16382.50864.699810.476915@montanaro.dyndns.org> Message-ID: <200401091656.40994.aleaxit@yahoo.com> On Friday 09 January 2004 04:20 pm, Skip Montanaro wrote: > Raymond> I would like to establish a new module for some collection > Raymond> classes. > > Raymond> The first type would be a bag, modeled after one of > Smalltalk's Raymond> key collection classes (similar to MultiSets in C++ > and bags in Raymond> Objective C). > > I'm neither a Smalltalk, C++ or Objective C programmer. How is a bag > different from a set? In a bag, or multiset, a certain item can be in the collection a given _number_ of times, rather than just "either being there or not" as in a set. > In my GE days (pre-Python) we had a collection class > in a proprietary language which was used as a proxy. You'd send a message > to the collection and it would forward it to all the elements it contained. > This was rather convenient. You could tell all the actors to turn and face > the camera, for example. > > Oddly enough, although we used collections heavily, I never really missed > that functionality in Python. I implemented it once or twice, but never > used it much. Different style of programming I guess. 
Yes, there are "proxying-containers" like that in the Python Cookbook, too -- and I've noticed I also don't use them. "for item in collection: item.turn()" is just too nice and general an idiom (with the per-item processing being expressible inline, which is often a big clarity win in Python) to easily yield to replacement by collection.each(lambda x: x.turn()) or the like... Alex From guido at python.org Fri Jan 9 11:10:07 2004 From: guido at python.org (Guido van Rossum) Date: Fri Jan 9 11:10:35 2004 Subject: [Python-Dev] collections module In-Reply-To: Your message of "Thu, 08 Jan 2004 23:23:24 PST." <877k01d1qr.fsf@egil.codesourcery.com> References: <20040108215224.1374.JCARLSON@uci.edu> <002101c3d67c$9556fa60$95b52c81@oemcomputer> <20040108231104.1377.JCARLSON@uci.edu> <877k01d1qr.fsf@egil.codesourcery.com> Message-ID: <200401091610.i09GA8j23016@c-24-5-183-134.client.comcast.net> [Raymond] > >> None of those can compete with my proposed C implementation > >> which pre-allocates around 50 cells at a time and does its > >> reads/writes through array lookups. It avoids all the expensive > >> resizing of lists and uses C ints instead of PyInts. > Josiah Carlson writes: > > That's pretty crazy. If I had a say, I'd be +1 just because I've seen > > so many people using their own {append, pop(0)} queues before. [Zack] > As a user of Python, I am +1 on making there be a *really really fast* > queue data structure somewhere, but -0 on its being something other > than the existing Queue class. Can't we just optimize the one we've > got? Ideally without losing the thread safety guarantees (but making > it usable in an unthreaded program would be nice)? You miss one thing -- the Queue class exists *specifically* for thread communication. It has very different semantics than I'd want for a generic queue. E.g. with Queue, if you try to get something out of an empty queue, your thread *blocks*. That's only appropriate for applications that expect this.
(Imagine the newbie questions to c.l.py. :-) My own ideal solution would be a subtle hack to the list implementation to make pop(0) and insert(0, x) go a lot faster -- this can be done by adding one or two extra fields to the list head and allowing empty space at the front of the list as well as at the end. --Guido van Rossum (home page: http://www.python.org/~guido/) From mon at ABAQUS.com Fri Jan 9 11:10:49 2004 From: mon at ABAQUS.com (Nick Monyatovsky) Date: Fri Jan 9 11:10:56 2004 Subject: [Python-Dev] HP-UX clean-up In-Reply-To: Skip Montanaro "Re: [Python-Dev] HP-UX clean-up" (Jan 9, 9:13) References: <16380.10524.703424.981390@montanaro.dyndns.org> <16380.14707.303691.996868@montanaro.dyndns.org> <3FFDD21E.6070002@ABAQUS.com> <16381.55778.208434.489763@montanaro.dyndns.org> <1040108174832.ZM11304@reliant.hks.com> <3FFE4F13.5000702@v.loewis.de> <16382.50462.118522.58707@montanaro.dyndns.org> Message-ID: <1040109111049.ZM12760@reliant.hks.com> >------------------- Message from Skip Montanaro -------------------- > > Martin> 2. Assuming -Wl,+s -Wl,+k are warnings: ... > > They cause the +s and +k options to be passed to the linker. They could > also be written as -Wl,+s,+k. > > Martin> 4. On the model selection options, I don't quite understand > Martin> why you say that this set of options gives portable binaries. > Martin> I would have thought that +DA1.1 +DS2.0 is the set of option > Martin> that gives portable binaries. > > I found an HP-UX machine I could log into which has aCC installed. It's > just a 10.20 machine. HP's are on the way out here. I didn't found any > running HP-UX 11. > > About +DA and +DS the aCC man page has this to say: > > +DAarchitecture > Generate code for a particular version of the PA- > RISC architecture specified. Also specifies which > version of the HP-UX math library to link when you > have specified -lm. > > Note Object code generated for PA-RISC 2.0 will > not execute on PA-RISC 1.1 systems. 
> > To generate code compatible across PA-RISC 1.1 and > 2.0 workstations and servers, use the +DAportable > option. For best performance use +DA with the > model number or architecture where you plan to > execute the program. See the file > /opt/langtools/lib/sched.models for a list of > model numbers and their PA-RISC architecture > designations. > > If you do not specify this option, the default > object code generated is determined automatically > as that of the machine on which you compile. > > Examples > +DA1.1 > +DA867 > +DA2.0 > +DAportable > > The first two examples generate code for the PA- > RISC 1.1 architecture. The third example generates > code for the PA-RISC 2.0 architecture. The fourth > example generates code compatible across PA-RISC > 1.1 and 2.0 workstations and servers. > > +DSmodel Use the instruction scheduler tuned to the model > specified. If this option is not used, the > compiler uses the instruction scheduler for the > architecture on which the program is compiled. > The architecture is determined by uname() (see > uname(2)). model can be a model number, PA-RISC > architecture designation or PA-RISC processor > name. See the file > /opt/langtools/lib/sched.models for a list of > model numbers and processor names. > > Obviously, due to the age of the OS on this machine, there's no mention of > Itanium. > > >> Maybe there should also be some sort of option for expects who will > >> not want Python to be compiled for generic processor; who might > >> purposefully want to compile Python on their particular hardware and > >> have it fully optimized for it to gain maximum performance. > > Martin> No, no, no. If you want to highly tune your build, you are > Martin> expected to edit the makefile. Providing an explicit option for > Martin> that will place a burden on users who are not able to answer > Martin> difficult build questions (and, trust me, this is the majority > Martin> of the users). 
> > It would also make the configure script that much more fragile.
> >
> > Skip
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/mon%40abaqus.com
>--------------------------------------------------------

As far as architectures are concerned, we can no longer find any PA 1.1
machines.  All machines that we deal with happen to be PA 2.0 machines.
So far we have had no trouble with binaries compiled for PA 2.0.  (We
re-distribute Python to users in binary form.)  We had initially also
tried to compile for 1.1 and Portable, but found that those binaries ran
much slower: there was about a 20% performance improvement for switching
from PA 1.1 to PA 2.0.  But switching from PA-RISC 32-bit to 64-bit
brought no performance improvements at all.  Binaries compiled for PA 2.0
in 32-bit run successfully on all current processors: 32-bit RISC, 64-bit
RISC, as well as Itanium machines.  So PA-RISC 2.0 32-bit seems like a
reasonable default.  Could other users please weigh in here if this is
not your typical setup?

---

I also accept the point that configure should have a reasonable default
when used with no arguments.  And what that particular default is, is not
particularly relevant.  What's important is that it should pick one of
the broad/generic setups we are discussing right now, and not tie itself
to the very particular hardware of a single machine.

What's also good about this is that at least it will display which
architecture it has chosen (if you look at the +DA +DS options it
picked), and not just silently compile something that might be very
unportable.
Thank you, -- Nick Monyatovsky -- mon@abaqus.com From skip at pobox.com Fri Jan 9 11:17:28 2004 From: skip at pobox.com (Skip Montanaro) Date: Fri Jan 9 11:17:39 2004 Subject: [Python-Dev] collections module In-Reply-To: <200401091656.40994.aleaxit@yahoo.com> References: <200401042241.i04MfqE06932@c-24-5-183-134.client.comcast.net> <000a01c3d672$097e6500$95b52c81@oemcomputer> <16382.50864.699810.476915@montanaro.dyndns.org> <200401091656.40994.aleaxit@yahoo.com> Message-ID: <16382.54296.290728.87680@montanaro.dyndns.org> >> I'm neither a Smalltalk, C++ or Objective C programmer. How is a bag >> different from a set? Alex> In a bag, or multiset, a certain item can be in the collection a Alex> given _number_ of times, rather than just "either being there or Alex> not" as in a set. Thx. >> Oddly enough, although we used collections heavily, I never really >> missed that functionality in Python. I implemented it once or twice, >> but never used it much. Different style of programming I guess. Alex> for item in collection: item.turn() or Alex> collection.each(lambda x: x.turn()) In LYMB (using Python syntax) it would have been something more like all_actors.look_at(active_camera.position()) Hard to get any clearer than that. ;-) It's pretty easily done with __getattr__. Skip From guido at python.org Fri Jan 9 11:19:51 2004 From: guido at python.org (Guido van Rossum) Date: Fri Jan 9 11:19:58 2004 Subject: [Python-Dev] The os module, unix and win32 In-Reply-To: Your message of "Fri, 09 Jan 2004 13:32:08 +0100." <4qv5uwtz.fsf@python.net> References: <6.0.1.1.2.20040108214934.0221ac98@torment.chelsea.private> <048a01c3d635$e6e08370$6401fea9@YODA> <200401082227.i08MRL621946@c-24-5-183-134.client.comcast.net> <4qv5uwtz.fsf@python.net> Message-ID: <200401091619.i09GJq723065@c-24-5-183-134.client.comcast.net> > [Dave] > > >> Another possibility: we'd get a lot more mileage out of simply > >> adding ctypes to the standard install. 
That would solve the > >> immediate popen problem, let us get rid of _winreg entirely (and > >> replace it with a pure Python version), and make future Win32 > >> problems easier to solve. > > [Guido] > > I don't know that a Python version of _winreg using ctypes would be > > preferable over a C version. > > Since there is a C version, there's no need to code it in Python again. > > > I'd expect it to be slower, less readable than the C version, and more > > susceptible to the possibility of causing segfaults. > > Slower: yes, but I don't know how much. > Less readable: no, more readable. > Susceptible to segfaults: If written correctly, it should be bullet > proof - exactly the same as a C version. But the version that was referenced here before isn't bulletproof, right? Wouldn't the bullet-proofing reduce the readability? > > That doesn't mean I don't think ctypes is a good idea -- just that I > > don't think applying it to _winreg would be useful. > > Cool. > > Should this be a Windows only version, or cross-platform? I don't know enough about ctypes and its user community to answer that (I doubt I'd have much direct need for it myself). But in general I'm biased towards cross-platform tools. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Jan 9 11:13:32 2004 From: guido at python.org (Guido van Rossum) Date: Fri Jan 9 11:32:31 2004 Subject: [Python-Dev] Fixes for imageop module In-Reply-To: Your message of "Fri, 09 Jan 2004 10:01:35 +0100." <3FFE6DEF.9070209@acm.org> References: <3FEC9C64.5080805@acm.org> <200312262249.hBQMnhf24306@c-24-5-183-134.client.comcast.net> <3FECBFB3.7090906@acm.org> <200312271749.hBRHnkN25210@c-24-5-183-134.client.comcast.net> <3FEDCBA2.3030607@acm.org> <200312272241.hBRMfHb01333@c-24-5-183-134.client.comcast.net> <3FFE6DEF.9070209@acm.org> Message-ID: <200401091613.i09GDWO23032@c-24-5-183-134.client.comcast.net> > >>>Hm. 
Anybody who uses the imageop module currently on Linux will find
> >>>their programs broken.  The strings manipulated by imageop all end up
> >>>either being written to a file or handed to some drawing code, and
> >>>changing the byte order would definitely break things!
> >>
> >>That's why I asked.
> >>
> >>
> >>>So I don't think this is an acceptable change.  I take it that for
> >>>IRIX, the byte order implied by the old code is simply wrong?  Maybe
> >>>the module can be given a global (shudder?) byte order setting that
> >>>you can change but that defaults to the old setting?
> >>
> >>The problem is, the documentation says: "This is the same format as used
> >>by gl.lrectwrite() and the imgfile module."  This implies the byte order
> >>that you get on the SGI which is opposite of what you get on Intel.  The
> >>code produced the correct results on the SGI, but not on Intel.
> >
> > Your fix would be okay if until now it was *only* used on IRIX with
> > gl.lrectwrite() and/or the imgfile module.  But how can we prove that?
>
> I don't know.
>
> >>(By the way, I'm away from computers for a week starting tomorrow
> >>extremely early.)
> >
> > It's okay to wait a week before making a decision.
>
> I'm back (and have been this week).  Any thoughts about a decision?

I suggest that you implement a way to change the endianness of the
operations.  A global flag in the module is fine.  It should default to
the historic behavior, and there should be a call to switch it to the
behavior you want.
--Guido van Rossum (home page: http://www.python.org/~guido/) From theller at python.net Fri Jan 9 12:04:35 2004 From: theller at python.net (Thomas Heller) Date: Fri Jan 9 12:04:53 2004 Subject: [Python-Dev] The os module, unix and win32 In-Reply-To: <200401091619.i09GJq723065@c-24-5-183-134.client.comcast.net> (Guido van Rossum's message of "Fri, 09 Jan 2004 08:19:51 -0800") References: <6.0.1.1.2.20040108214934.0221ac98@torment.chelsea.private> <048a01c3d635$e6e08370$6401fea9@YODA> <200401082227.i08MRL621946@c-24-5-183-134.client.comcast.net> <4qv5uwtz.fsf@python.net> <200401091619.i09GJq723065@c-24-5-183-134.client.comcast.net> Message-ID: <65flt5ng.fsf@python.net> Guido van Rossum writes: >> [Dave] >> >> >> Another possibility: we'd get a lot more mileage out of simply >> >> adding ctypes to the standard install. That would solve the >> >> immediate popen problem, let us get rid of _winreg entirely (and >> >> replace it with a pure Python version), and make future Win32 >> >> problems easier to solve. >> >> [Guido] >> > I don't know that a Python version of _winreg using ctypes would be >> > preferable over a C version. >> >> Since there is a C version, there's no need to code it in Python again. >> >> > I'd expect it to be slower, less readable than the C version, and more >> > susceptible to the possibility of causing segfaults. >> >> Slower: yes, but I don't know how much. >> Less readable: no, more readable. >> Susceptible to segfaults: If written correctly, it should be bullet >> proof - exactly the same as a C version. > > But the version that was referenced here before isn't bulletproof, > right? Wouldn't the bullet-proofing reduce the readability? 
Here is a small part of Dave's code:

    def CreateKey(baseKey, subKey):
        'Creates/opens the given key and returns it'
        RCK = windll.advapi32.RegCreateKeyExA
        key = c_int(0)
        RCK(baseKey, subKey, 0, 0,
            REG_OPTION_NON_VOLATILE,
            KEY_ALL_ACCESS, 0, byref(key), 0)
        return key.value

He doesn't check that baseKey and subKey are strings, and he doesn't check
the return value of the function call.

There are several ways to fix these problems.  A simple one would be to
write this instead:

    def CreateKey(baseKey, subKey):
        'Creates/opens the given key and returns it'
        RCK = windll.advapi32.RegCreateKeyExA
        key = c_int(0)
        result = RCK(str(baseKey), str(subKey), 0, 0,
                     REG_OPTION_NON_VOLATILE,
                     KEY_ALL_ACCESS, 0, byref(key), 0)
        if result:
            raise WindowsError(result)
        return key.value

But ctypes supports a kind of 'function prototypes' (which do automatic
argument type checking and/or conversion) as well as automatic result
checking:

    def CheckNonNull(errcode):
        # if errcode is nonzero, raise a WindowsError
        if errcode:
            raise WindowsError(errcode)

    RCK = windll.advapi32.RegCreateKeyExA
    # specify the argument types
    RCK.argtypes = (HKEY, LPCTSTR, DWORD, REGSAM, LPSECURITY_ATTRIBUTES,
                    PHKEY, LPDWORD)
    # RegCreateKeyExA returns 0 on success, or a windows error code
    RCK.restype = CheckNonNull

    def CreateKey(baseKey, subKey):
        'Creates/opens the given key and returns it'
        key = c_int(0)
        RCK(baseKey, subKey, 0, 0,
            REG_OPTION_NON_VOLATILE,
            KEY_ALL_ACCESS, 0, byref(key), 0)
        return key.value

>
>> > That doesn't mean I don't think ctypes is a good idea -- just that I
>> > don't think applying it to _winreg would be useful.
>>
>> Cool.
>>
>> Should this be a Windows only version, or cross-platform?
>
> I don't know enough about ctypes and its user community to answer that
> (I doubt I'd have much direct need for it myself).  But in general I'm
> biased towards cross-platform tools.

I had reports that it works on Solaris, Linux, MacOS, BSD.  Maybe more
systems.
The problem is that the non-windows version uses libffi, which is difficult to find and install - it seems to be maintained now as part of gcc, although the license is more BSD like. I would like to get rid of libffi - but the only multi-architecture alternative I know of is Bruno Haible's ffcall (which is GPL). There *may* be other options (including writing assembly code). Sam Rushing's calldll did this, AFAIK. And, remember: you can write bulletproof code with ctypes, but you can also easily crash Python. Or other weird things. Thomas From tanzer at swing.co.at Fri Jan 9 12:12:56 2004 From: tanzer at swing.co.at (Christian Tanzer) Date: Fri Jan 9 12:15:39 2004 Subject: [Python-Dev] collections module In-Reply-To: Your message of "Fri, 09 Jan 2004 00:32:52 EST." <000a01c3d672$097e6500$95b52c81@oemcomputer> Message-ID: "Raymond Hettinger" wrote: > I would like to establish a new module for some collection classes. I'd prefer a package. The number of collection classes is bound to increase over time and a module with too many classes sucks (been there, done that :-) > Guido wanted this to be discussed by a number of people. While > a rich set of collection classes have proven their worth in other > contexts, it could be that more is not better for python. One the > language's elegances is the ability to do just about anything with > just lists and dicts. If you have too many mutable containers to > choose from, the choice of which to use becomes less obvious. I used Python for more than five years before I really needed something else than lists and dicts (that includes using lists for stacks and queues). But quite recently I had to implement a doubly-linked list. Using that made the code in question substantially easier. I'd like to see things like a queue in the standard library. 
Maybe even a stack would be nice (maybe I've just seen too many attempts
pushing/popping at the front of a list -:)

-- 
Christian Tanzer                                    http://www.c-tanzer.at/

From guido at python.org Fri Jan 9 12:28:36 2004
From: guido at python.org (Guido van Rossum)
Date: Fri Jan 9 12:28:41 2004
Subject: [Python-Dev] The os module, unix and win32
In-Reply-To: Your message of "Fri, 09 Jan 2004 18:04:35 +0100." <65flt5ng.fsf@python.net>
References: <6.0.1.1.2.20040108214934.0221ac98@torment.chelsea.private> <048a01c3d635$e6e08370$6401fea9@YODA> <200401082227.i08MRL621946@c-24-5-183-134.client.comcast.net> <4qv5uwtz.fsf@python.net> <200401091619.i09GJq723065@c-24-5-183-134.client.comcast.net> <65flt5ng.fsf@python.net>
Message-ID: <200401091728.i09HSaa23177@c-24-5-183-134.client.comcast.net>

> There are several ways to fix these problems.  A simple one would be to
> write this instead:
>
>     def CreateKey(baseKey, subKey):
>         'Creates/opens the given key and returns it'
>         RCK = windll.advapi32.RegCreateKeyExA
>         key = c_int(0)
>         result = RCK(str(baseKey), str(subKey), 0, 0,
>                      REG_OPTION_NON_VOLATILE,
>                      KEY_ALL_ACCESS, 0, byref(key), 0)
>         if result:
>             raise WindowsError(result)
>         return key.value
>
> But ctypes supports a kind of 'function prototypes' (which do automatic
> argument type checking and/or conversion) as well as automatic result
> checking:
>
>     def CheckNonNull(errcode):
>         # if errcode is nonzero, raise a WindowsError
>         if errcode:
>             raise WindowsError(errcode)
>
>     RCK = windll.advapi32.RegCreateKeyExA
>     # specify the argument types
>     RCK.argtypes = (HKEY, LPCTSTR, DWORD, REGSAM, LPSECURITY_ATTRIBUTES,
>                     PHKEY, LPDWORD)
>     # RegCreateKeyExA returns 0 on success, or a windows error code
>     RCK.restype = CheckNonNull
>
>     def CreateKey(baseKey, subKey):
>         'Creates/opens the given key and returns it'
>         key = c_int(0)
>         RCK(baseKey, subKey, 0, 0,
>             REG_OPTION_NON_VOLATILE,
>             KEY_ALL_ACCESS, 0, byref(key), 0)
>         return key.value

If no C code existed, this all sounds great -- but don't
underestimate its obscurity. For _winreg.c, I don't see a reason to switch. If it's missing something you'd like to see, please submit a patch to the C code. > > I don't know enough about ctypes and its user community to answer that > > (I doubt I'd have much direct need for it myself). But in general I'm > > biased towards cross-platform tools. > > I had reports that it works on Solaris, Linux, MacOS, BSD. Maybe more > systems. > > The problem is that the non-windows version uses libffi, which is > difficult to find and install - it seems to be maintained now as part of > gcc, although the license is more BSD like. > > I would like to get rid of libffi - but the only multi-architecture > alternative I know of is Bruno Haible's ffcall (which is GPL). > > There *may* be other options (including writing assembly code). > Sam Rushing's calldll did this, AFAIK. > > And, remember: you can write bulletproof code with ctypes, but you can > also easily crash Python. Or other weird things. OK, never mind. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Jan 9 12:30:54 2004 From: guido at python.org (Guido van Rossum) Date: Fri Jan 9 12:30:59 2004 Subject: [Python-Dev] The os module, unix and win32 In-Reply-To: Your message of "Thu, 08 Jan 2004 21:52:21 CST." <33628.192.168.1.101.1073620341.squirrel@server.lotusland.dyndns.org> References: <6.0.1.1.2.20040108214934.0221ac98@torment.chelsea.private><2004010822 11.i08MBGx21879@c-24-5-183-134.client.comcast.net> <20040108182033.A17993@ActiveState.com> <33628.192.168.1.101.1073620341.squirrel@server.lotusland.dyndns.org> Message-ID: <200401091730.i09HUsK23211@c-24-5-183-134.client.comcast.net> > As long as the possibility of copying APIs from win32all is being > entertained, would it be feasible to make os.kill do something meaningful > on Windows? > > I'm no expert on Windows APIs, but it may be as simple as calling > win32api.GenerateConsoleCtrlEvent or win32api.TerminateProcess. 
(Although
> I seem to remember reading that killing processes externally on Windows is
> something of a thorny issue.  It may corrupt the state of DLLs, or
> something along those lines.)

Clearly, until someone shows up with more expertise, nothing will happen.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From niemeyer at conectiva.com Fri Jan 9 12:40:51 2004
From: niemeyer at conectiva.com (Gustavo Niemeyer)
Date: Fri Jan 9 12:46:15 2004
Subject: [Python-Dev] Re: PEP 326 now online
In-Reply-To: <20040107104457.0F01.JCARLSON@uci.edu>
References: <20040106224357.13CC.JCARLSON@uci.edu> <200401071501.i07F1ZL13165@c-24-5-183-134.client.comcast.net> <20040107104457.0F01.JCARLSON@uci.edu>
Message-ID: <20040109174051.GA6480@burma.localdomain>

> I don't know about that; until this thread, I've basically agreed with
> every direction Python has gone in the 5 years since I started using it,
> but that is a one-way relationship.

I think this shouldn't be added to the standard library at all, for a few
reasons:

- All given examples in the PEP are easily rewritable with existing
  logic (and I disagree that the new method would be easier to
  understand);

- I can't think of any real use cases for such an object which
  couldn't easily be done without it;

- The "One True Large Object" isn't "True Large" at all, since depending
  on the comparison order, another object might believe itself to be
  larger than this object.  If this were to be implemented as a supported
  feature, Python should special-case it internally to support the
  "True" in the given name.

- It's possible to implement that object with a couple of lines, as
  you've shown;

- Any string is already a maximum object for any int/long comparison
  (IOW, do "cmp.high = 'My One True Large Object'" and you're done).

- Your Dijkstra example is a case of abuse of tuple's sorting behavior.
If you want readable code as you suggest, try implementing a custom object
with a comparison operator, instead of writing "foo = (a,b,c,d)", and
"(a,b,c,d) = foo", and "foo[x][y]" all over the place.

-- 
Gustavo Niemeyer
http://niemeyer.net

From theller at python.net Fri Jan 9 13:02:22 2004
From: theller at python.net (Thomas Heller)
Date: Fri Jan 9 13:02:31 2004
Subject: [Python-Dev] The os module, unix and win32
In-Reply-To: <200401091728.i09HSaa23177@c-24-5-183-134.client.comcast.net> (Guido van Rossum's message of "Fri, 09 Jan 2004 09:28:36 -0800")
References: <6.0.1.1.2.20040108214934.0221ac98@torment.chelsea.private> <048a01c3d635$e6e08370$6401fea9@YODA> <200401082227.i08MRL621946@c-24-5-183-134.client.comcast.net> <4qv5uwtz.fsf@python.net> <200401091619.i09GJq723065@c-24-5-183-134.client.comcast.net> <65flt5ng.fsf@python.net> <200401091728.i09HSaa23177@c-24-5-183-134.client.comcast.net>
Message-ID: 

Guido van Rossum writes:

> If no C code existed, this all sounds great -- but don't underestimate
> its obscurity.  For _winreg.c, I don't see a reason to switch.  If
> it's missing something you'd like to see, please submit a patch to the
> C code.

I wasn't suggesting replacing the existing _winreg.c with ctypes, I only
wanted to demonstrate what ctypes code looks like, and took the example
that was posted before.

>> > I don't know enough about ctypes and its user community to answer that
>> > (I doubt I'd have much direct need for it myself).  But in general I'm
>> > biased towards cross-platform tools.
>>
>> I had reports that it works on Solaris, Linux, MacOS, BSD.  Maybe more
>> systems.
>>
>> The problem is that the non-windows version uses libffi, which is
>> difficult to find and install - it seems to be maintained now as part of
>> gcc, although the license is more BSD like.
>>
>> I would like to get rid of libffi - but the only multi-architecture
>> alternative I know of is Bruno Haible's ffcall (which is GPL).
>>
>> There *may* be other options (including writing assembly code).
>> Sam Rushing's calldll did this, AFAIK.
>>
>> And, remember: you can write bulletproof code with ctypes, but you can
>> also easily crash Python.  Or other weird things.
>
> OK, never mind.

I don't understand this comment - what do you mean?

Thomas

From fincher.8 at osu.edu Fri Jan 9 13:09:07 2004
From: fincher.8 at osu.edu (Jeremy Fincher)
Date: Fri Jan 9 13:09:00 2004
Subject: [Python-Dev] collections module
In-Reply-To: <000a01c3d672$097e6500$95b52c81@oemcomputer>
References: <000a01c3d672$097e6500$95b52c81@oemcomputer>
Message-ID: 

On Jan 9, 2004, at 12:32 AM, Raymond Hettinger wrote:
> The second type would be a high speed queue implemented using
> linked list blocks like we used for itertools.tee().  The
> interface could be as simple as push, pop, and len.  Each
> operation ought to run about as fast as a list assignment (meaning
> that it beats the heck out of the list.pop(0), list.append(data)
> version).

That's great, though I'd beg that the methods push/pop be named
enqueue/dequeue.  I can imagine the confusion that would result from
reading code that used .pop() to clear out a queue -- people very
possibly wouldn't be able to tell whether the data structure being used
is a queue or a stack.

Jeremy

From tim.one at comcast.net Fri Jan 9 13:33:51 2004
From: tim.one at comcast.net (Tim Peters)
Date: Fri Jan 9 13:33:56 2004
Subject: [Python-Dev] HP-UX clean-up
In-Reply-To: <3FFDD21E.6070002@ABAQUS.com>
Message-ID: 

[Andrew MacKeith]
> ...
> NOTE: the multithreading flags (-D_REENTRANT -mt) should only
> be applied to the files where there is actually threaded code.
> When applied to all files indiscriminately, they cause overall
> performance degradation.
> AFAIK the files with threaded code are:
>     thread.c
>     threadmodule.c

I'm not sure that makes sense, or maybe it's that I don't understand what
sense it makes.
For concrete examples, any number of threads may simultaneously be
executing fileobject.c's file_close() function, or zlibmodule.c's
PyZlib_decompress() function.  If the compiler doesn't generate
thread-correct code for those, they can fail in arbitrarily horrid ways.
Every place the source code releases the GIL is a place any number of
threads may be active simultaneously.

From niemeyer at conectiva.com Fri Jan 9 14:14:01 2004
From: niemeyer at conectiva.com (Gustavo Niemeyer)
Date: Fri Jan 9 14:19:25 2004
Subject: [Python-Dev] Re: sre warnings
In-Reply-To: <001401c3d6e3$8a4c91e0$ceb99d8d@oemcomputer>
References: <20040109174051.GA6480@burma.localdomain> <001401c3d6e3$8a4c91e0$ceb99d8d@oemcomputer>
Message-ID: <20040109191401.GA8086@burma.localdomain>

[forwarding to python-dev for general ack]

> The changes to re are still throwing off signed/unsigned warnings.  Can
> you please get that fixed up?

My changes to sre have nothing to do with these warnings.  These are
being caused by the fact that gcc throws a warning even with casts in the
following situation:

    char i = 0;
    if ((int)i < 256)
        ...

Contrary to what one might think, this is a valid test because the type
of 'i' is not always char.  I won't fix these warnings until I find an
elegant solution to work around the gcc stupidity.

-- 
Gustavo Niemeyer
http://niemeyer.net

From barry at barrys-emacs.org Fri Jan 9 14:27:57 2004
From: barry at barrys-emacs.org (Barry Scott)
Date: Fri Jan 9 14:28:08 2004
Subject: [Python-Dev] The os module, unix and win32
In-Reply-To: <200401082211.i08MBGx21879@c-24-5-183-134.client.comcast.ne t>
References: <6.0.1.1.2.20040108214934.0221ac98@torment.chelsea.private> <200401082211.i08MBGx21879@c-24-5-183-134.client.comcast.net>
Message-ID: <6.0.1.1.2.20040109191843.0213cb08@torment.chelsea.private>

At 08-01-2004 22:11, Guido van Rossum wrote:
>(Not sure if this was in response to my "let's do it in Python" post.)

Sent before reading your message.  An idea whose time has arrived, I guess.
Barry

From tim.one at comcast.net Fri Jan 9 14:36:05 2004
From: tim.one at comcast.net (Tim Peters)
Date: Fri Jan 9 14:36:13 2004
Subject: [Python-Dev] The os module, unix and win32
In-Reply-To: <200401091730.i09HUsK23211@c-24-5-183-134.client.comcast.net>
Message-ID: 

[Guido, on os.kill() for Windows]
> Clearly, until someone shows up with more expertise, nothing will
> happen.

We hash this out periodically here; e.g.,

    http://mail.python.org/pipermail/python-dev/2002-December/030627.html

Windows wasn't designed to support clean termination of one process from
another, and there are lots of dangers.  For Zope's ZRS, I wrapped
TerminateProcess() in a teensy C extension so the ZRS test suite could
kill the various Python processes it starts (hundreds before the test
suite finishes).  These are console-mode processes (no windows), and make
no use of Windows features, so the dangers are minimal in doing
TerminateProcess() on one of those.  The general case is a bottomless pit
(see the link referenced in the article referenced above for a taste).

Recent versions of Windows (not Win9x) have a new API,
CreateRemoteThread(), which allows process A to create a thread that runs
in some other process B's virtual address space, as if B had started the
thread.  I read once somewhere that this API can be used by A to execute
the safer ExitProcess() "from within" B.

From barry at barrys-emacs.org Fri Jan 9 15:12:46 2004
From: barry at barrys-emacs.org (Barry Scott)
Date: Fri Jan 9 15:15:42 2004
Subject: [Python-Dev] The os module, unix and win32
In-Reply-To: 
References: <200401082211.i08MBGx21879@c-24-5-183-134.client.comcast.net>
Message-ID: <6.0.1.1.2.20040109195218.021a3d28@torment.chelsea.private>

I've got code that implements the popen5-like functionality in my Emacs;
do you want to look it over?  It's based on sample code from Microsoft.
In http://www.barrys-emacs.org/bemacs_7.2-230.win32_proc_code.zip you
will find the C++ modules.
I can guess from your API list that what you have will probably not work
from pythonw; I recall that you need to call AllocConsole, for example,
and play games with std handles.

Barry

At 08-01-2004 22:31, Peter Astrand wrote:
> > > win32all covers a huge number of API functions, more than would
> > > be sane to add to os.  But would there be any mileage in adding
> > > enough from win32all to allow problems like popen5 to be
> > > implemented?
> > >
> > > There is already the _reg module that has some win32 functions in
> > > it on the standard install.
> >
> > (Not sure if this was in response to my "let's do it in Python" post.)
> >
> > How many APIs would have to be copied from win32all to enable
> > implementing popen5 in Python?
>
>I'm not sure since I haven't implemented much yet, but we'll need at
>least:
>
>win32.CreatePipe
>win32api.DuplicateHandle
>win32api.GetCurrentProcess
>win32process.CreateProcess
>win32process.STARTUPINFO
>win32gui.EnumThreadWindows
>win32event.WaitForSingleObject
>win32process.TerminateProcess
>
>
>-- 
>/Peter Åstrand
>
>
>_______________________________________________
>Python-Dev mailing list
>Python-Dev@python.org
>http://mail.python.org/mailman/listinfo/python-dev
>Unsubscribe:
>http://mail.python.org/mailman/options/python-dev/nospam%40barrys-emacs.org

From schizo at debian.org Fri Jan 9 15:24:53 2004
From: schizo at debian.org (Clint Adams)
Date: Fri Jan 9 15:24:59 2004
Subject: [Python-Dev] Re: trying to build 2.3.3 with db-4.2.52
In-Reply-To: <16375.62406.740857.546472@gargle.gargle.HOWL>
References: <16375.62406.740857.546472@gargle.gargle.HOWL>
Message-ID: <20040109202453.GA14292@scowler.net>

> Trying to build python2.3.3 with db-4.2.52, I get many failures in
> various tests:

Seems that db4.2 was being built for i386 on a machine with a 2.6 kernel,
and libpthread wasn't being linked in.  Does it work with 4.2.52-5?
From jcarlson at uci.edu Fri Jan 9 15:29:38 2004
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri Jan 9 15:33:12 2004
Subject: [Python-Dev] Re: PEP 326 now online
In-Reply-To: <20040109174051.GA6480@burma.localdomain>
References: <20040107104457.0F01.JCARLSON@uci.edu> <20040109174051.GA6480@burma.localdomain>
Message-ID: <20040109114039.FB88.JCARLSON@uci.edu>

> - All given examples in the PEP are easily rewritable with existing
>   logic (and I disagree that the new method would be easier to
>   understand);

You are free to disagree; I don't see how two objects which clearly state
that they are the largest and smallest objects could be unclear, so I
don't know how else to address your concern.  In terms of the names for
such objects, as well as their locations, being good and descriptive,
that is still an open issue, and is listed as such in the latest version
of the PEP, which is available at
http://www.python.org/peps/pep-0326.html

> - I can't think of any real use cases for such an object which
>   couldn't easily be done without it;

It is not whether "if we don't have it we can do it easy", it is about
"if we have it, we can do it easier".  The Queue module, just recently
brought up, is one of those examples where having it for interthread
communication is invaluable.  Sure, everyone could write their own
threaded Queue class when needed, but having it there to use is nice.

Arguably, there aren't any usage cases where None is necessary.  But yet
we have None.  Which is preferable:

    _mynone = []
    def foo(arg=_mynone):
        if id(arg) == id(_mynone):  # equivalent to None
            pass

    def goo(arg=None):
        if arg is None:
            pass

Again, it is not about "can we do it easy without", it is "can we do it
easier with".

> - The "One True Large Object" isn't "True Large" at all, since depending
>   on the comparison order, another object might believe itself to be
>   larger than this object.
If this were to be implemented as a supported
> feature, Python should special-case it internally to support the
> "True" in the given name.

I don't know how difficult it would be to modify the comparison function
to test whether or not one object in the comparison is the universal max
or min, so I cannot comment.  I also don't know if defining a __rcmp__
method would be sufficient.

> - It's possible to implement that object with a couple of lines, as
>   you've shown;

I don't see how the length of the implementation has anything to do with
how useful it is.  I (and others) have provided examples of where they
would be useful.  If you care to see more, we can discuss this further
off-list.

> - Any string is already a maximum object for any int/long comparison
>   (IOW, do "cmp.high = 'My One True Large Object'" and you're done).

That also ignores the idea of an absolute minimum.  There are two parts
of the proposal, a Maximum and a Minimum, the existence of which has
already been agreed upon as being useful.  Whether they are in the
standard library, standard language, or otherwise, and what name they
will have, are all currently open issues.  A few options are listed in
the `Open Issues` portion of the PEP, in not so many words.

> - Your Dijkstra example is a case of abuse of tuple's sorting
>   behavior.  If you want readable code as you suggest, try implementing
>   a custom object with a comparison operator, instead of writing
>   "foo = (a,b,c,d)", and "(a,b,c,d) = foo", and "foo[x][y]" all over
>   the place.

I don't see how using the features of a data structure already in the
standard language is "abuse".  Perhaps I should have created my own
custom class that held all the relevant information, included a __cmp__
method, added attribute access and set routines ...OR... maybe it was
reasonable that I used a tuple and saved people the pain of looking
through twice as much source, just to see a class that has nothing to do
with the PEP.
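For reference, the couple-of-lines implementation under discussion might look roughly like this; the names `Max` and `Min` are mine, and this sketch uses rich comparison methods rather than the `__cmp__`/`__rcmp__` machinery being debated, since the reflected methods are what lets the extreme win even when the other operand is on the left:

```python
class _Extreme(object):
    """Sketch of a universal extreme: compares as larger (or smaller)
    than everything except itself.  Illustration only, not PEP 326's text."""
    def __init__(self, largest):
        self._largest = largest
    def __gt__(self, other):
        return self._largest and self is not other
    def __lt__(self, other):
        return (not self._largest) and self is not other
    def __ge__(self, other):
        return not self.__lt__(other)
    def __le__(self, other):
        return not self.__gt__(other)
    def __eq__(self, other):
        return self is other
    def __ne__(self, other):
        return self is not other

Max = _Extreme(True)   # compares greater than any other object
Min = _Extreme(False)  # compares less than any other object
```

For built-in types, `5 < Max` works because `int.__lt__` returns NotImplemented and Python falls back to the reflected `Max.__gt__(5)`; Gustavo's comparison-order objection still applies to third-party objects whose comparison methods claim victory over arbitrary operands.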
- Josiah

From jcarlson at uci.edu Fri Jan 9 15:51:54 2004
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri Jan 9 15:55:32 2004
Subject: [Python-Dev] collections module
In-Reply-To: <200401091610.i09GA8j23016@c-24-5-183-134.client.comcast.net>
References: <877k01d1qr.fsf@egil.codesourcery.com> <200401091610.i09GA8j23016@c-24-5-183-134.client.comcast.net>
Message-ID: <20040109111053.FB85.JCARLSON@uci.edu>

> You miss one thing -- the Queue class exists *specifically* for thread
> communication.  It has very different semantics than I'd want for a
> generic queue.  E.g. with Queue, if you try to get something out of an
> empty queue, your thread *blocks*.  That's only appropriate for
> applications that expect this.  (Imagine the newbie questions to
> c.l.py. :-)

Agreed.  About the only question is what the new fifo queue should be
called, and whether it should be located in Queue (probably not, given
the module description), Raymond's proposed high speed data structure
module, or somewhere else.

> My own ideal solution would be a subtle hack to the list
> implementation to make pop(0) and insert(0, x) go a lot faster -- this
> can be done by adding one or two extra fields to the list head and
> allowing empty space at the front of the list as well as at the end.

I'm sure you know this, but just for sake of argument, how many extra
fields?  A couple?  A few?  20?  30?  I'm not sure there really is a
good hard and fast number.  I think it makes as much sense to preallocate
the same number of entries at the front of a list as is currently
allocated at the back.  At that point, you can guarantee O(1) {pop(0),
pop(), append() and insert(0)} behavior, at the cost of slightly more
memory.  Then it really wouldn't matter how people implement their stacks
or queues (front, back, double-list), it would still be fast.
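Until the list head grows that front slack, the classic pure-Python workaround is the two-list trick, which gives amortized O(1) pushes and pops without touching the list implementation (my sketch, in the spirit of the cookbook recipe being discussed in this thread, not its exact code):

```python
class Fifo:
    """Amortized O(1) FIFO queue built from two Python lists (not thread-safe)."""
    def __init__(self):
        self._in = []    # newly pushed items, newest last
        self._out = []   # items ready to pop, next-to-pop last
    def push(self, item):
        self._in.append(item)
    def pop(self):
        if not self._out:
            # Reverse the input list once; each item is moved at most
            # twice over its lifetime, so the cost amortizes to O(1).
            self._in.reverse()
            self._out = self._in
            self._in = []
        return self._out.pop()
    def __len__(self):
        return len(self._in) + len(self._out)
```

Both `push` and `pop` only ever append to or pop from the *end* of a list, which is exactly the operation lists are already fast at.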
- Josiah From tim.one at comcast.net Fri Jan 9 16:07:04 2004 From: tim.one at comcast.net (Tim Peters) Date: Fri Jan 9 16:07:30 2004 Subject: [Python-Dev] collections module In-Reply-To: <20040108215224.1374.JCARLSON@uci.edu> Message-ID: [Josiah Carlson] > There's been a python-based FIFO that is O(1) on {push, pop} and uses > lists in the ASPN Python cookbook (adding len is easy): > http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/210459 > You'd probably be more interested in the variant I posted as a > comment; it is significantly faster when the output queue is > exhausted. Yup! Yours is the nicest way to build an efficient (but thread-unsafe) queue in Python. > Speaking of which...the Queue module REALLY should be re-done with the > Fifo above Maybe; it probably doesn't matter; serious threaded apps pass maxsize to Queue, generally equal to the # of worker threads (or a small constant multiple thereof), and the bounded queue does double-duty then as a flow-control throttle too; list.pop(0) isn't really slow when len(list) is that small. > and a single conditional, Sorry, I don't know what that might mean. > the multiple mutexes in the current Queue module are ugly as > hell, and very difficult to understand. self.mutex is dead easy to understand. esema and fsema would be clearer if renamed to not_empty and not_full (respectively), and clearer still (but slower) if they were changed to threading.Conditions (which didn't exist at the time Queue was written). 
Regardless, there are 3 of them for good reasons, although that may be beyond the understanding you've reached so far: mutex has to do with threading correctness, and esema and fsema have to do with threading efficiency via reducing lock contention (so long as the queue isn't empty, a would-be get'er can acquire esema instantly, and doesn't care about fsema regardless; so long as the queue isn't full, a would-be put'er can acquire fsema instantly, and doesn't care about esema regardless; and in both of those "instantly" means "maybe"). Note that you cannot, for example, start each method by acquiring self.mutex, and have that be the only lock. If you try that, then, e.g., someone doing a get on an empty queue-- and so needing to block --will hold on to self.mutex forever, preventing any other thread from adding something to the queue (and the whole shebang deadlocks). > ... > It all really depends on the overhead of the underlying mutexes or > conditionals, which can be tested. I still don't know what "conditionals" means to you (you're counting "if" statements, perhaps?), but mucking with mutexes isn't cheap. You don't want to bother with locks unless thread-safety is an important issue. From martin at v.loewis.de Fri Jan 9 16:25:58 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Fri Jan 9 16:27:31 2004 Subject: [Python-Dev] CJKCodecs integration into Python In-Reply-To: <20040109083021.GA4128@i18n.org> References: <20040109083021.GA4128@i18n.org> Message-ID: <3FFF1C66.5030902@v.loewis.de> Hye-Shik Chang wrote: > Any opinions will be appreciated. I like the changes, and have accepted the patch. Please wait a few days before committing them so strong objections can be voiced.
Regards, Martin From jcarlson at uci.edu Fri Jan 9 16:42:59 2004 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri Jan 9 16:46:35 2004 Subject: [Python-Dev] collections module In-Reply-To: References: <20040108215224.1374.JCARLSON@uci.edu> Message-ID: <20040109131500.FB8E.JCARLSON@uci.edu> > Maybe; it probably doesn't matter; serious threaded apps pass maxsize to > Queue, generally equal to the # of worker threads (or a small constant > multiple thereof), and the bounded queue does double-duty then as a > flow-control throttle too; list.pop(0) isn't really slow when len(list) is > that small. It really all depends on how fast the producers and consumers are. In the cases where I've used Queue.Queue, I didn't have small Queues. > > and a single conditional, > > Sorry, I don't know what that might mean. Oops, I misspelled it: threading.Condition(), a lock object with three additional methods: wait(), notify(), and notifyAll(). When you have the lock, you can wait() for a notification, notify() an arbitrary waiter, or notifyAll() waiters. Terribly nice. > > the multiple mutexes in the current Queue module are ugly as > > hell, and very difficult to understand. > > self.mutex is dead easy to understand. esema and fsema would be clearer if > renamed to not_empty and not_full (respectively), and clearer still (but > slower) if they were changed to threading.Conditions (which didn't exist at > the time Queue was written). Regardless, there are 3 of them for good > reasons, although that may be beyond the understanding you've reached so > far: I figured it out when I decided it wasn't fast enough for my use. The below is quite fast for most reasonable numbers of threads - it does wake all threads up that were waiting, but I find it considerably clearer, and can be done with a single synchronization primitive.
from threading import Condition
#assuming fifo exists and is fast

class CQueue:
    def __init__(self, maxsize):
        if maxsize > 0:
            self._m = maxsize
        else:
            #big enough, but Max would be better
            self._m = 2**64
        self._q = fifo()
        self._l = Condition()
    def qsize(self):
        self._l.acquire()
        l = len(self._q)
        self._l.release()
        return l
    def empty(self):
        return self.qsize() == 0
    def full(self):
        return self.qsize() == self._m
    def put(self, item, block=1):
        try:
            self._l.acquire()
            while block and len(self._q) == self._m:
                self._l.wait()
                self._l.release()
                self._l.acquire()
            if len(self._q) < self._m:
                self._q.put(item)
                self._l.notifyAll()
            else:
                raise Full
        finally:
            self._l.release()
    def put_nowait(self, item):
        self.put(item, 0)
    def get(self, block=1):
        try:
            self._l.acquire()
            while block and len(self._q) == 0:
                self._l.wait()
                self._l.release()
                self._l.acquire()
            if len(self._q) > 0:
                a = self._q.get()
                self._l.notifyAll()
                return a
            raise Empty
        finally:
            self._l.release()
    def get_nowait(self):
        return self.get(0)

> > ... > > It all really depends on the overhead of the underlying mutexes or > > conditionals, which can be tested. > > I still don't know what "conditionals" means to you (you're counting "if" > statements, perhaps?), but mucking with mutexes isn't cheap. You don't want > to bother with locks unless thread-safety is an important issue. I agree completely, most locking mechanisms are slow as hell. There are some tricks that can be done with critical sections (or its equivalent on multiple platforms); an article was written about multithreaded locking performance on IBM at least a year ago. I wonder if any of those performance-tuning semantics can be used with the underlying locks in Python. - Josiah From martin at v.loewis.de Fri Jan 9 16:47:18 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Fri Jan 9 16:47:42 2004 Subject: [Python-Dev] Re: sre warnings In-Reply-To: <20040109191401.GA8086@burma.localdomain> References: <20040109174051.GA6480@burma.localdomain> <001401c3d6e3$8a4c91e0$ceb99d8d@oemcomputer> <20040109191401.GA8086@burma.localdomain> Message-ID: <3FFF2166.90201@v.loewis.de> Gustavo Niemeyer wrote: >>The changes to re are still throwing off signed/unsigned warnings. Can >>you please get that fixed up? > > > My changes to sre have nothing to do with these warnings. This is not true. I get

Modules/_sre.c:811: warning: comparison between signed and unsigned

Now, line 811 is

    DATA_ALLOC(SRE_MATCH_CONTEXT, ctx);

This expands to

    alloc_pos = state->data_stack_base; \
    TRACE(("allocating %s in %d (%d)\n", \
           SFY(type), alloc_pos, sizeof(type))); \
    if (state->data_stack_size < alloc_pos+sizeof(type)) { \
        int j = data_stack_grow(state, sizeof(type)); \
        if (j < 0) return j; \
        if (ctx_pos != -1) \
            DATA_STACK_LOOKUP_AT(state, SRE_MATCH_CONTEXT, ctx, ctx_pos); \
    } \
    ptr = (type*)(state->data_stack+alloc_pos); \
    state->data_stack_base += sizeof(type); \
    } while (0)

The culprit is the line

    if (state->data_stack_size < alloc_pos+sizeof(type)) { \

because data_stack_size is signed, and sizeof(type) is size_t. This line is yours

2.101 (niemeyer 17-Oct-03): if (state->data_stack_size < alloc_pos+sizeof (type)) {

Changing the type of data_stack_size to unsigned makes the warning go away. Please change all uses of sizes/positions to "size_t", and change the special -1 marker to (size_t)-1.
Regards, Martin From tim.one at comcast.net Fri Jan 9 17:03:24 2004 From: tim.one at comcast.net (Tim Peters) Date: Fri Jan 9 17:03:47 2004 Subject: [Python-Dev] collections module In-Reply-To: <20040109111053.FB85.JCARLSON@uci.edu> Message-ID: [Guido] >> My own ideal solution would be a subtle hack to the list >> implementation to make pop(0) and insert(0, x) go a lot faster -- >> this can be done by adding one or two extra fields to the list head >> and allowing empty space at the front of the list as well as at the >> end. [Josiah Carlson] > I'm sure you know this, but just for sake of argument, how many extra > fields? A couple? A few? 20? 30? As Guido said, "one or two" would be enough. I think you're talking about different things, though: Guido is talking about the number of new named struct members that would have to be added to Python's list object header. I think you're talking about how many array slots to leave unused on both ends. Just as now, on a reallocation the number of free slots asked for has to be proportional to the total number of slots, so that growth via a huge number of one-at-a-time appends takes (amortized) constant-time per append. Picking any fixed number of free slots leads to overall quadratic-time (in the number of appends) behavior. From jcarlson at uci.edu Fri Jan 9 17:02:55 2004 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri Jan 9 17:05:23 2004 Subject: [Python-Dev] collections module In-Reply-To: <20040109131500.FB8E.JCARLSON@uci.edu> References: <20040109131500.FB8E.JCARLSON@uci.edu> Message-ID: <20040109135537.FB94.JCARLSON@uci.edu> > below is quite fast for most reasonable numbers of threads - it does > wake all threads up that were waiting, but I find it considerably clearer, > and can be done with a single synchronization primitive. Oh, by adding a second Condition(), you don't need to notify everyone and can replace the 'while' uses with 'if'. Even faster. 
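As a concrete sketch of the two-Condition idea: one mutex shared by a not_full and a not_empty condition, with a plain list standing in for the fast fifo. (This is illustrative code, not Queue.py. The 'while' guards are deliberately kept, since notify() may wake more than one waiter.)

```python
import threading

class BoundedQueue:
    """Bounded FIFO: a single mutex shared by two condition variables."""
    def __init__(self, maxsize):
        self._maxsize = maxsize
        self._items = []                      # stand-in for a fast fifo
        mutex = threading.Lock()
        self._not_full = threading.Condition(mutex)
        self._not_empty = threading.Condition(mutex)
    def put(self, item):
        self._not_full.acquire()
        try:
            while len(self._items) >= self._maxsize:
                self._not_full.wait()         # releases the mutex while waiting
            self._items.append(item)
            self._not_empty.notify()          # wake one blocked getter
        finally:
            self._not_full.release()
    def get(self):
        self._not_empty.acquire()
        try:
            while not self._items:
                self._not_empty.wait()
            item = self._items.pop(0)
            self._not_full.notify()           # wake one blocked putter
            return item
        finally:
            self._not_empty.release()
```

Because both Conditions wrap the same Lock, put() and get() still exclude each other, but a putter only ever wakes a getter and vice versa, so there is no thundering herd of notifyAll().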
One thing that bothered me about using the three mutexes as given in Queue.Queue is that it relies on the fact that a thread can unlock an underlying thread.lock that it didn't acquire. To me, this is strange behavior. - Josiah From guido at python.org Fri Jan 9 17:10:45 2004 From: guido at python.org (Guido van Rossum) Date: Fri Jan 9 17:11:11 2004 Subject: [Python-Dev] collections module In-Reply-To: Your message of "Fri, 09 Jan 2004 12:51:54 PST." <20040109111053.FB85.JCARLSON@uci.edu> References: <877k01d1qr.fsf@egil.codesourcery.com> <200401091610.i09GA8j23016@c-24-5-183-134.client.comcast.net> <20040109111053.FB85.JCARLSON@uci.edu> Message-ID: <200401092210.i09MAjP23610@c-24-5-183-134.client.comcast.net> > > My own ideal solution would be a subtle hack to the list > > implementation to make pop(0) and insert(0, x) go a lot faster -- this > > can be done by adding one or two extra fields to the list head and > > allowing empty space at the front of the list as well as at the end. > > I'm sure you know this, but just for sake of argument, how many extra > fields? You misunderstood -- the *fields* I was talking about are counters or pointers that keep track of the overallocation, and 1 or at most 2 is enough. > A couple? A few? 20? 30? I'm not sure there really is a > good hard and fast number. You're talking about the amount of overallocation. I would propose to overallocate exactly as much as the list implementation currently does, which is a very subtle scheme that overallocates more when the list grows larger to maintain logarithmic behavior. > I think it makes as much sense to > preallocate the same number of entries to the front of a list, as to > what is currently allocated to the back. At that point, you can > guarantee O(1) {pop(0), pop(), append() and insert(0)} behavior, at the > cost of slightly more memory. Then it really wouldn't matter how people > implement their stacks or queues (front, back, double-list), it would > still be fast. Right. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From aahz at pythoncraft.com Fri Jan 9 17:35:26 2004 From: aahz at pythoncraft.com (Aahz) Date: Fri Jan 9 17:35:54 2004 Subject: [Python-Dev] collections module In-Reply-To: <20040109135537.FB94.JCARLSON@uci.edu> References: <20040109131500.FB8E.JCARLSON@uci.edu> <20040109135537.FB94.JCARLSON@uci.edu> Message-ID: <20040109223526.GA11185@panix.com> On Fri, Jan 09, 2004, Josiah Carlson wrote: > > One thing that bothered me about using the three mutexes as given in > Queue.Queue is that it relies on the fact that a thread can unlock an > underlying thread.lock that it didn't acquire. To me, this is strange > behavior. If that bothers you, don't use threading.Condition, either. All the synchronizing capabilities in Python rely on the ability of threads to release a lock acquired by another thread. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ A: No. Q: Is top-posting okay? From mal at egenix.com Fri Jan 9 17:43:34 2004 From: mal at egenix.com (M.-A. Lemburg) Date: Fri Jan 9 17:44:11 2004 Subject: [Python-Dev] CJKCodecs integration into Python In-Reply-To: <20040109083021.GA4128@i18n.org> References: <20040109083021.GA4128@i18n.org> Message-ID: <3FFF2E96.9000505@egenix.com> Hye-Shik Chang wrote: > Hello! > > I just submitted a patch to integrate CJKCodecs into python. > (http://www.python.org/sf/873597) > > As I remember, the problems mentioned on a previous discussion about > this are these three: > > 1) Size > > Python+CJKCodecs is just 102% of the original by source size and > 104% by source line counts. Installed binary size is about 2.5MB. > I think it's not a big size for todays harddisks. > > 2) Backward Compatibility with C/J/K Codecs > > with ChineseCodecs: it's perfectly compatible except ChineseCodecs' > trivial bugs. > > with JapaneseCodecs: it's now compatible enough to use. 
the only > difference is ISO-2022-JP's error handling for bytes with the MSB set, which > is very unusual and invalid. > > with KoreanCodecs: it's perfectly compatible. > > 3) Maintenance > > I can maintain it on Python CVS by myself now. :) > > And I put cjkcodecs files into Modules/cjkcodecs/, but I can't be sure > this is the right place to put them. > > Any opinions will be appreciated. I like it except for the module names of the underlying C modules. Could you group them under a common prefix, e.g. _cjk_?! Even better would be to put them into a package, but I'm not sure whether that would complicate the setup. > Thank you! Thank you for contributing these! Since this is a major contribution, we will probably have to ask you to sign one of the contribution agreements that we are currently having in the legal review queue. Would that pose a problem? -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 06 2004) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From mal at egenix.com Fri Jan 9 17:59:30 2004 From: mal at egenix.com (M.-A. Lemburg) Date: Fri Jan 9 18:00:03 2004 Subject: [Python-Dev] collections module In-Reply-To: <3FFE513A.3020201@v.loewis.de> References: <000a01c3d672$097e6500$95b52c81@oemcomputer> <3FFE513A.3020201@v.loewis.de> Message-ID: <3FFF3252.1050509@egenix.com> Martin v. Loewis wrote: > Raymond Hettinger wrote: > >> What do you guys think? > > -1. There is no evidence that these types are missing. > The fact that they are absent does not imply that anybody > would want to use them if they were available.
> > It might be better to develop such a module as a separate > package, wait for user feedback, and perhaps incorporate > this into Python in a few years. While I second your -1, the kjbuckets package has been around for ages and seems to provide at least a subset (or superset depending on how you see it) of what Raymond wants to do. I think these things are better kept and maintained outside the Python core. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 06 2004) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From barry at python.org Fri Jan 9 18:06:32 2004 From: barry at python.org (Barry Warsaw) Date: Fri Jan 9 18:06:54 2004 Subject: [Python-Dev] CJKCodecs integration into Python In-Reply-To: <3FFF1C66.5030902@v.loewis.de> References: <20040109083021.GA4128@i18n.org> <3FFF1C66.5030902@v.loewis.de> Message-ID: <1073689591.3926.62.camel@anthem> On Fri, 2004-01-09 at 16:25, Martin v. Loewis wrote: > I like the changes, and have accepted the patch. Please wait a few days > before committing them so strong objections can be voiced. I have only one reservation. I tried to replace JapaneseCodecs and KoreanCodecs with cjkcodecs in Mailman 2.1.4, but met with resistance. Part of the problem was that we'd have to update the email package to handle these changes and there wasn't time given the freeze on the Python 2.3 tree (still in effect if I'm not mistaken). 
But also it seems like there isn't universal preference for cjkcodecs over JapaneseCodecs: http://mail.python.org/pipermail/mailman-i18n/2003-December/001084.html I don't even pretend to know what issues Japanese users have, but it bothers me that there still seems to be some controversy. Can we, and should we, resolve that before committing cjkcodecs to Python 2.4 cvs? -Barry From mal at egenix.com Fri Jan 9 18:07:04 2004 From: mal at egenix.com (M.-A. Lemburg) Date: Fri Jan 9 18:07:36 2004 Subject: [Python-Dev] collections module In-Reply-To: References: Message-ID: <3FFF3418.2080207@egenix.com> Tim Peters wrote: > [Josiah Carlson] > >>There's been a python-based FIFO that is O(1) on {push, pop} and uses >>lists in the ASPN Python cookbook (adding len is easy): >>http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/210459 >>You'd probably be more interested in the variant I posted as a >>comment; it is significantly faster when the output queue is >>exhausted. > > Yup! Yours is the nicest way to build an efficient (but thread-unsafe) > queue in Python. FWIW, if you're interested in fast stacks and queues you may want to have a look at mxStack and mxQueue from the egenix-mx-base distribution. They have nothing to do with threads (who needs them anyway when we have copy-on-write :-). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 06 2004) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From martin at v.loewis.de Fri Jan 9 18:39:56 2004 From: martin at v.loewis.de (Martin v.
Loewis) Date: Fri Jan 9 18:39:58 Subject: [Python-Dev] CJKCodecs integration into Python In-Reply-To: <1073689591.3926.62.camel@anthem> References: <20040109083021.GA4128@i18n.org> <3FFF1C66.5030902@v.loewis.de> <1073689591.3926.62.camel@anthem> Message-ID: <3FFF3BCC.7020200@v.loewis.de> Barry Warsaw wrote: > But also it seems like there isn't universal preference for cjkcodecs > over JapaneseCodecs: > > http://mail.python.org/pipermail/mailman-i18n/2003-December/001084.html There won't be universal preference for any package that we provide, or for any kind of change made to Python. Unfortunately, the message you quote is hearsay, so we don't know what the specific objections are. If enough users complain that a certain encoding should be modified to work in a specific different manner, we should consider changing the encoding. If the change does not find universal acceptance, we should consider providing an alternative encoding, under a different name; users could then change the alias at run-time if they want to. If too few users complain to justify a change, users could still implement their own modifications, by providing a new encoding that special-cases a few code points, and then falls back to the standard encoding. However, we can only start accepting changes, and documenting our design choices, when we actually have code to contribute to. OTOH, the changes to the email test module should probably be held back until after you have incorporated the current code base. Regards, Martin From jason at mastaler.com Fri Jan 9 19:03:31 2004 From: jason at mastaler.com (Jason R. Mastaler) Date: Fri Jan 9 19:03:43 2004 Subject: [Python-Dev] Re: CJKCodecs integration into Python References: <20040109083021.GA4128@i18n.org> <3FFF1C66.5030902@v.loewis.de> <1073689591.3926.62.camel@anthem> <3FFF3BCC.7020200@v.loewis.de> Message-ID: "Martin v. Loewis" writes: > Unfortunately, the message you quote is hearsay, so we don't know > what the specific objections are.
Also, since the objectionable message was written, a new release of CJKCodecs has been made, so it's possible these objections have been addressed. I'm all for including CJKCodecs in Python because it will mean one less download for the CJK users of my software. From niemeyer at conectiva.com Fri Jan 9 23:40:33 2004 From: niemeyer at conectiva.com (Gustavo Niemeyer) Date: Fri Jan 9 23:49:19 2004 Subject: [Python-Dev] Re: sre warnings In-Reply-To: <3FFF2166.90201@v.loewis.de> References: <20040109174051.GA6480@burma.localdomain> <001401c3d6e3$8a4c91e0$ceb99d8d@oemcomputer> <20040109191401.GA8086@burma.localdomain> <3FFF2166.90201@v.loewis.de> Message-ID: <20040110044032.GA2901@burma.localdomain>

> >>The changes to re are still throwing off signed/unsigned warnings. Can
> >>you please get that fixed up?
> >
> > My changes to sre have nothing to do with these warnings.
>
> This is not true. I get
>
> Modules/_sre.c:811: warning: comparison between signed and unsigned
>
> Now, line 811 is
>
>     DATA_ALLOC(SRE_MATCH_CONTEXT, ctx);

Ohh... I wasn't aware of this case, since I get no errors at all at this line. Do you see any others? I personally get these:

Modules/_sre.c:381: warning: comparison is always true due to limited range of data type
Modules/_sre.c:383: warning: comparison is always true due to limited range of data type
Modules/_sre.c:390: warning: comparison is always true due to limited range of data type
Modules/_sre.c:392: warning: comparison is always true due to limited range of data type
Modules/_sre.c: In function `sre_charset':
Modules/_sre.c:487: warning: comparison is always true due to limited range of data type
Modules/_sre.c: In function `sre_ucharset':
Modules/_sre.c:487: warning: comparison is always true due to limited range of data type

Suggestions welcome.

> The culprit is the line
>
>     if (state->data_stack_size < alloc_pos+sizeof(type)) { \
>
> because data_stack_size is signed, and sizeof(type) is
This line is yours > > 2.101 (niemeyer 17-Oct-03): if (state->data_stack_size < > alloc_pos+sizeof (type)) { > > Changing the type of data_stack_size to unsigned makes the > warning go away. It will be a pleasure to fix that once I'm back from vacation (dialup tsc tsc), and get the necessary time. Thanks for pointing me out. > Please change all uses of sizes/positions to "size_t", and > change the special -1 marker to (size_t)-1. Ditto. -- Gustavo Niemeyer http://niemeyer.net From tim.one at comcast.net Sat Jan 10 00:06:14 2004 From: tim.one at comcast.net (Tim Peters) Date: Sat Jan 10 00:06:22 2004 Subject: [Python-Dev] collections module In-Reply-To: <20040109131500.FB8E.JCARLSON@uci.edu> Message-ID: [Tim] >> Maybe; it probably doesn't matter; serious threaded apps pass >> maxsize to Queue, generally equal to the # of worker threads (or a >> small constant multiple thereof), and the bounded queue does >> double-duty then as a flow-control throttle too; list.pop(0) isn't >> really slow when len(list) is that small. [Josiah Carlson] > It really all depends on how fast the producers and consumers are. Mediating differences in speed is what "flow-control throttle" means. A "serious threaded app" can't allow speed differences to lead to unbounded resource consumption, and a bounded queue is a wonderful solution to that. > In the cases where I've used Queue.Queue, I didn't have small Queues. That doesn't demonstrate you had a sane design . >>> and a single conditional, >> Sorry, I don't know what that might mean. > oops, I misspelled it, threading.Condition() Ah, got it. > ... > The below is quite fast for most reasonable numbers of threads - it > does wake all threads up that were waiting, but I find it considerably > clearer, and can be done with a single synchronization primitive. I think you mean a single synchronization instance variable. As I noted last time, Queue was written before Condition variables-- or even the threading module --existed. 
(Note that it still imports the ancient low-level 'thread' module instead.) There's no question about condvars being clearer than raw locks for anything fancier than a plain critical section, and I'd be in favor of rewriting Queue.py to use condvars, provided nothing important is lost.

> from threading import Condition
> #assuming fifo exists and is fast
>
> class CQueue:
>     def __init__(self, maxsize):
>         if maxsize > 0:
>             self._m = maxsize
>         else:
>             #big enough, but Max would be better
>             self._m = 2**64

The way the code uses it, -1 would work fine -- even on 128-bit machines.

>         self._q = fifo()
>         self._l = Condition()
>     def qsize(self):
>         self._l.acquire()
>         l = len(self._q)
>         self._l.release()
>         return l
>     def empty(self):
>         return self.qsize() == 0
>     def full(self):
>         return self.qsize() == self._m
>     def put(self, item, block=1):
>         try:
>             self._l.acquire()

That line should go before the "try:" (if the acquire() attempt raises an exception for some reason, the "finally:" block will try to release an unacquired condvar, raising a new exception, and so masking the real problem).

>             while block and len(self._q) == self._m:
>                 self._l.wait()
>                 self._l.release()
>                 self._l.acquire()

Note that the last two lines there aren't necessary. Or, if they are, this is much more obscure than Queue.py. wait() releases the associated lock as part of waiting, and reacquires it as part of waking up.

>             if len(self._q) < self._m:
>                 self._q.put(item)
>                 self._l.notifyAll()

It's not clear why this isn't plain notify(). Note that if self._q.put() raises an exception (for example, MemoryError if the box is plain out of memory), no notify is done, and so other threads blocked in put() and/or get() that *could* make progress aren't woken up. Queue.py endures excruciating pain to try to prevent endcase glitches like that.

>             else:
>                 raise Full
>         finally:
>             self._l.release()
> ... [similar stuff for get()] ...

Part of the simplification here is due to the use of a condvar.
A great deal more is due to the fact that it's simply ignoring some of the bells and whistles Queue.py has sprouted: a defined protocol for subclassing so that "the real queue" can have any concrete representation and semantics (e.g., it's easy to make a stack-like Queue subclass now, or a priority-queue-like Queue subclass, without the subclasses needing to know anything about the thread convolutions); optional timeouts on get and put; and anal attention paid to the consequences of "unexpected" exceptions. A possible advantage of the condvar logic is that it doesn't raise Full unless the queue is actually full, or Empty unless the queue is actually empty. The current code can raise those now for another reason, and users have a hard time understanding this: self.mutex just doesn't happen to be instantly available. OTOH, that's also a feature, taking "don't block" very seriously, not even waiting for a queue operation that happens to take a long time. For example, one thread gets into self._q.put(item) and loses its timeslice. Some other thread tries to do put(item, block=False). Under Queue.py today, the latter thread raises Full at once, because it can't instantly acquire the mutex. Under the condvar scheme, it does block, and *may* block for a long time (the former thread has to get enough timeslices to finish its _q.put(item), get through its self._l.release(), and the latter thread may not acquire the condvar even then (any number of other threads may get it first)). The tradeoffs here are murky, but the condvar scheme would be a change in actual behavior in such cases, so they can't be ignored. > ... > I agree completely, most locking mechanisms are slow as hell. There > are some tricks that can be done with critical sections (or its > equivalent on multiple platforms); an article was written about > multithreaded locking performance on IBM at least a year ago. I > wonder if any of those performance-tuning semantics can be used with > the underlying locks in Python.
[and later] > One thing that bothered me about using the three mutexes as given > in Queue.Queue is that it relies on the fact that a thread can > unlock an underlying thread.lock that it didn't acquire. To me, > this is strange behavior. All those are related. If you port Python to a new thread implementation, there's only one synchronization object you have to implement in C: thread.lock. While it may seem strange to you, it's a "universal" primitive, meaning that all other synch primitives can be built on top of it (and, in Python, they are). When Guido was designing Python, "the Python lock" was the same synch primitive used in the OS he was working on at the time, and it didn't seem strange to me either because it was also the only synch primitive we built into hardware at Kendall Square Research at the time (on that machine, Python's lock.acquire() and lock.release() were each one machine instruction!). So I didn't gripe either at the time, especially seeing as KSR machines were clearly destined to take over the world . Note that the various synch objects supplied by threading.py (like RLocks and Conditions) are supplied via factory functions, not directly by Python classes. This was deliberate, so that someone motivated enough *could* replace them by faster platform-specific implementations, and not have to worry about breaking user-defined subclasses. For example, RLocks are supplied natively by Windows, so a much faster RLock gimmick *could* be implemented specific to Windows. > Oh, by adding a second Condition(), you don't need to notify everyone > and can replace the 'while' uses with 'if'. Even faster. You can't replace 'while' with 'if' then (think about it, and note that the docs explicitly warn that .notify() *may* wake more than one thread), and then you're well on the way to reinventing esema and fsema, with the current self.mutex being the lock shared by your two condvars. 
They'd be clearer as condvars, though, and it's always a good idea to name a "condition variable" after the *condition* it's modelling (like not_full or not_empty). From tim.one at comcast.net Sat Jan 10 00:26:23 2004 From: tim.one at comcast.net (Tim Peters) Date: Sat Jan 10 00:26:27 2004 Subject: [Python-Dev] collections module In-Reply-To: <200401092210.i09MAjP23610@c-24-5-183-134.client.comcast.net> Message-ID: [Guido] > You misunderstood -- the *fields* I was talking about are > counters or pointers that keep track of the overallocation, and 1 or > at most 2 is enough. One would be enough. To avoid breaking huge mounds of code all over the place, I think the existing ob_item should "float", to point always to the current element "at index 0". So ob_item would change when inserting or deleting "at the left end". The only new thing you need then is a field recording the number of unused "left-side" slots, between the start of the memory actually allocated and the slot ob_item points at; that field would have to be added or subtracted to/from ob_item in the obvious ways when allocating or deallocating the raw memory, but *most* of listobject.c could ignore it, and none of the PyList macros would need to change (in particular, the speed of indexed access wouldn't change). The overallocation strategy gets more complicated, though, and less effective for prepending. If you run out of room when prepending, it's not enough just to ask realloc() for more memory, you also have to memmove() the entire vector, to "make room at the left end". In previous lives I've found that "appending at the right" is as efficient as it is in practice largely because most non-trivial C realloc() calls end up extending the vector in-place, not needing to copy anything. The O() behavior is the same if each boost in size has to move everything that already existed, but the constant factor goes up more than just measurably.
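The floating-ob_item scheme is easy to model in Python (a toy illustration of the idea, not CPython's C code): keep spare slots on both ends of one buffer and let the index of the "first used slot" float, re-centering the payload in a proportionally larger buffer only when one end runs out of room.

```python
class DoubleEndedList(object):
    """Toy model of a list with free slots at both ends: the payload lives
    in self._buf[self._left:self._right], and self._left plays the role of
    the floating ob_item pointer."""
    def __init__(self):
        self._buf = [None] * 8
        self._left = 4          # index of the first used slot
        self._right = 4         # one past the last used slot
    def _grow(self):
        # Re-centre the payload in a buffer proportional to its size, so a
        # long run of appends (at either end) is amortized O(1) per append.
        used = self._buf[self._left:self._right]
        self._buf = [None] * max(8, 4 * len(used))
        self._left = (len(self._buf) - len(used)) // 2
        self._right = self._left + len(used)
        self._buf[self._left:self._right] = used
    def append(self, x):
        if self._right == len(self._buf):
            self._grow()
        self._buf[self._right] = x
        self._right += 1
    def appendleft(self, x):        # the newly-cheap insert(0, x)
        if self._left == 0:
            self._grow()
        self._left -= 1
        self._buf[self._left] = x
    def pop(self):
        self._right -= 1
        return self._buf[self._right]
    def popleft(self):              # the newly-cheap pop(0)
        x = self._buf[self._left]
        self._left += 1
        return x
    def __len__(self):
        return self._right - self._left
```

The _grow() here always copies, which is exactly the constant-factor cost Tim notes: a C realloc() can often extend in place on the right, but making room on the left always implies a memmove().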
From tim.one at comcast.net Sat Jan 10 01:02:04 2004
From: tim.one at comcast.net (Tim Peters)
Date: Sat Jan 10 01:02:15 2004
Subject: [Python-Dev] Re: sre warnings
In-Reply-To: <20040110044032.GA2901@burma.localdomain>
Message-ID:

[Gustavo Niemeyer]
> Ohh.. I wasn't aware of this case, since I get no errors at all at
> this line.

I think Martin is using a more recent version of gcc than most people here. Everyone on Windows (MSVC) sees these warnings, though.

> Do you see any others?

Windows doesn't complain about the ones your compiler complains about; it complains about 17 others, listed here (although the line numbers have probably changed in the last 3 months):

http://mail.python.org/pipermail/python-dev/2003-October/039059.html

> I personally get these:
>
> > Modules/_sre.c:381: warning: comparison is always true due to limited
> > range of data type
> > Modules/_sre.c:383: warning: comparison is always true due to limited
> > range of data type
> > Modules/_sre.c:390: warning: comparison is always true due to limited
> > range of data type
> > Modules/_sre.c:392: warning: comparison is always true due to limited
> > range of data type
> > Modules/_sre.c: In function `sre_charset':
> > Modules/_sre.c:487: warning: comparison is always true due to limited
> > range of data type
> > Modules/_sre.c: In function `sre_ucharset':
> > Modules/_sre.c:487: warning: comparison is always true due to limited
> > range of data type
>
> Suggestions welcome.

Upgrade to Windows.

Most of those seem to come from lines of the form

    SRE_LOC_IS_WORD((int) ptr[-1]) : 0;

where

    SRE_CHAR* ptr

and either

    #define SRE_CHAR unsigned char

or

    #define SRE_CHAR Py_UNICODE

and

    #define SRE_LOC_IS_WORD(ch) (SRE_LOC_IS_ALNUM((ch)) || (ch) == '_')
    #define SRE_LOC_IS_ALNUM(ch) ((ch) < 256 ? isalnum((ch)) : 0)

So it's apparently bitching about

    (((int) ptr[-1])) < 256

when the "unsigned char" expansion of SRE_CHAR is in effect.
I suppose that could be repaired by defining SRE_LOC_IS_ALNUM differently depending on how SRE_CHAR is defined.

The warning on line 487 comes from

    if (ch < 65536)

where

    SRE_CODE ch

and SRE_CODE is unsigned short or unsigned long, depending on Py_UNICODE_WIDE. This warning is really irritating. I suppose

    #if defined(Py_UNICODE_WIDE) || SIZEOF_SHORT > 2
        if (ch < 65536)
    #endif
            block = ((unsigned char*)set)[ch >> 8];
    #if defined(Py_UNICODE_WIDE) || SIZEOF_SHORT > 2
        else
            block = -1;
    #endif

would shut it up, but that's sure ugly.

From jcarlson at uci.edu Sat Jan 10 01:40:22 2004
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sat Jan 10 01:43:26 2004
Subject: [Python-Dev] collections module
In-Reply-To:
References: <20040109131500.FB8E.JCARLSON@uci.edu>
Message-ID: <20040109211753.FB9A.JCARLSON@uci.edu>

> That doesn't demonstrate you had a sane design.

Indeed, which is why the software was never released publicly. I still need to get around to rewriting it; it has only been 2 years.

> > The below is quite fast for most reasonable numbers of threads - it
> > does wake all threads up that were waiting, but I find it considerably
> > clearer, and can be done with a single synchronization primitive.
>
> I think you mean a single synchronization instance variable.

Right.

> As I noted
> last time, Queue was written before Condition variables-- or even the
> threading module --existed. (Note that it still imports the ancient
> low-level 'thread' module instead.) There's no question about condvars
> being clearer than raw locks for anything fancier than a plain critical
> section, and I'd be in favor of rewriting Queue.py to use condvars, provided
> nothing important is lost.

I don't believe it would be too difficult to modify what I provided to be in the style of the Queue module methods for allowing different types of sequences.
It may also be a desirable option to be able to pass in a class with put/get/__len__ methods, so that Queue wouldn't need to be subclassed when the object already exists.

> The way the code uses it, -1 would work fine -- even on 128-bit machines.

Actually it does need it. In put, there is a comparison that makes all the difference: len(self._q) < self._m, which would always return false if the underlying sequence weren't braindead.

> > try:
> >     self._l.acquire()
>
> That line should go before the "try:" (if the acquire() attempt raises an
> exception for some reason, the "finally:" block will try to release an
> unacquired condvar, raising a new exception, and so masking the real
> problem).

You're right, I was too quick with my fingers when I did a block indent for try/finally.

> > while block and len(self._q) == self._m:
> >     self._l.wait()
> > self._l.release()
> > self._l.acquire()
>
> Note that the last two lines there aren't necessary. Or, if they are, this
> is much more obscure than Queue.py. wait() releases the associated
> lock as part of waiting, and reacquires it as part of waking up.

*looks at docs* I could have sworn the docs said that a release and acquire was necessary, but I suppose the last time I read the docs for Condition was 2 years ago *shakes cobwebs out of head*, so I probably messed something up along the way.

> It's not clear why this isn't plain notify().

You're right, it should be notify.

> Note that if self._q.put()
> raises an exception (for example, MemoryError if the box is plain out of
> memory), no notify is done, and so other threads blocked in put() and/or
> get() that *could* make progress aren't woken up. Queue.py endures
> excruciating pain to try to prevent endcase glitches like that.
>
> Part of the simplification here is due to the use of a condvar.
A great > deal more is due to that it's simply ignoring some of the bells and whistles > Queue.py has sprouted: a defined protocol for subclassing so that "the real > queue" can have any concrete representation and semantics (e.g., it's easy > to make a stack-like Queue subclass now, or a priority-queue-like Queue > subclass) without the subclasses needing to know anything about the thread > convolutions); optional timeouts on get and put; and anal attention paid to > the consequences of "unexpected" exceptions. It would be relatively easy to drop in any object with put, get and __len__ methods. Adding code to be additionally anal about "unexpected" exceptions may or may not be too bad, depending on the level of "anal attention" required. In terms of having a timeout, I did not notice the functionality because I was looking on docs for 2.2, I see them now. Adding timeouts when using Condition() is fairly easy. Farther down is a snippet for both this functionality and the fast on nowait option. > A possible advantage of the condvar logic is that it doesn't raise Full > unless the queue is actually full, or Empty unless the queue is actually > empty. The current code can raise those now for another reason, and users > have a hard time understanding this: self.mutex just doesn't happen to be > instantly available. OTOH, that's also a feature, taking "don't block" very > seriously, not even waiting for a queue operation that happens to take a > long time. For example, one thread gets into self._q.put(item) and loses > its timeslice. Some other thread tries to do put(item, block=False). Under > Queue.py today, the latter thread raises Full at once, because it can't > instantly acquire the mutex. 
> Under the condvar scheme, it does block, and
> *may* block for a long time (the former thread has to get enough timeslices
> to finish its _q.put(item), get through its self._l.release(), and the
> latter thread may not acquire the condvar even then (any number of other
> threads may get it first)). The tradeoffs here are murky, but the condvar
> scheme would be a change in actual behavior in such cases, so they can't be
> ignored.

A quick modification to put/get alters it to be fast in those cases where you don't want to block.

    def get(self, block=1, timeout=None):
        if FAST:
            r = self._l.acquire(block)
            if not block and not r:
                raise Empty
        else:
            self._l.acquire()
        try:
            while block and len(self._q) == 0:
                self._l.wait(timeout)
                if timeout is not None:
                    break
            if len(self._q) > 0:
                a = self._q.get()
                self._l.notify()
                return a
            raise Empty
        finally:
            self._l.release()

Setting FAST to 1 makes put/get as fast as Queue when you want it to be, while setting FAST to 0 gives you the earlier behavior.

> > I agree completely, most locking mechanisms are slow as hell. There
> > are some tricks that can be done with critical sections (or its
> > equivalent on multiple platforms); an article was written about
> > multithreaded locking performance on IBM at least a year ago. I
> > wonder if any of those performance-tuning semantics can be used with
> > the underlying locks in Python.
>
> [and later]
>
> > One thing that bothered me about using the three mutexes as given
> > in Queue.Queue is that it relies on the fact that a thread can
> > unlock an underlying thread.lock that it didn't acquire. To me,
> > this is strange behavior.
>
> All those are related. If you port Python to a new thread implementation,
> there's only one synchronization object you have to implement in C:
> thread.lock. While it may seem strange to you, it's a "universal"

"One lock to rule them all" isn't strange at all.
I was just commenting that even using threading.Lock (the same as thread.allocate_lock) with acquire/release seems to be slow. Is it really that the underlying locks are slow, or is the thread switching such a performance killer that lock performance doesn't matter?

> > Oh, by adding a second Condition(), you don't need to notify everyone
> > and can replace the 'while' uses with 'if'. Even faster.
>
> You can't replace 'while' with 'if' then (think about it, and note that the
> docs explicitly warn that .notify() *may* wake more than one thread), and

I forgot about that. Maybe that is why my brain said you had to release and acquire after wait.

- Josiah

From kbk at shore.net Sat Jan 10 02:58:56 2004
From: kbk at shore.net (Kurt B. Kaiser)
Date: Sat Jan 10 02:58:59 2004
Subject: [Python-Dev] collections module
In-Reply-To: (Tim Peters's message of "Sat, 10 Jan 2004 00:26:23 -0500")
References:
Message-ID: <87hdz4dykf.fsf@hydra.localdomain>

"Tim Peters" writes:

> [Guido]
>> You misunderstood -- the *fields* I was talking about are
>> counters or pointers that keep track of the overallocation, and 1 or
>> at most 2 is enough.
>
> One would be enough. To avoid breaking huge mounds of code all over the
> place, I think the existing ob_item should "float", to point always to the
> current element "at index 0". So ob_item would change when inserting or
> deleting "at the left end". The only new thing you need then is a field
> recording the number of unused "left-side" slots, between the start of
> the memory actually allocated and the slot ob_item points at; that field
> would have to be added or subtracted to/from ob_item in the obvious ways
> when allocating or deallocating the raw memory, but *most* of listobject.c
> could ignore it, and none of the PyList macros would need to change (in
> particular, the speed of indexed access wouldn't change).
>
> The overallocation strategy gets more complicated, though, and less
> effective for prepending.
> If you run out of room when prepending, it's not
> enough just to ask realloc() for more memory; you also have to memmove() the
> entire vector, to "make room at the left end". In previous lives I've found
> that "appending at the right" is as efficient as it is in practice largely
> because most non-trivial C realloc() calls end up extending the vector
> in-place, not needing to copy anything. The O() behavior is the same if
> each boost in size has to move everything that already existed, but the
> constant factor goes up more than just measurably.

Aren't array-based implementations of a queue usually based on circular buffers? Head and tail pointers plus a count are sufficient. Then a push is: check count (resize if necessary), write, adjust head pointer, adjust count. A pull is similar, using the tail pointer. Constant time.

The resize is a little tricky because the gap between the tail and head pointers is what needs to grow.

The current implementation of Queue (with what appears to me to be a listpop(v, 0) for get()) involves an internal shift of the whole pointer vector, O(n), for every read of the queue. The Cookbook examples mentioned alleviate this, but still with excess copying or unbounded memory usage. IMHO it's better done in listobject.

I'm not up to speed on the reasons why ob_item needs to float, but perhaps there is a straightforward indirect relationship to the head pointer. Why not have listobject grow deque functionality using circular buffer techniques? That way you get high-performance queues and deques while avoiding the name collision with Queue.

    put(x)      == insert(0, x)
    get()       == pop()
    put_tail(x) == append(x) == push(x)  (?? this must have been discussed)
    get_head()  == pop(0)

All list operations would preserve the head and tail pointers.
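[Editorial sketch: the circular-buffer scheme described here, in pure Python — hypothetical illustration only, since a real version would live in listobject.c. The class and method names, the initial capacity, and the doubling policy are invented: a head index and a count over a fixed-size backing store, with modular arithmetic for the wrap-around and a resize that re-linearizes the items.]

```python
class RingDeque:
    """Circular-buffer deque: head index, count, fixed-size backing
    store that doubles (and re-linearizes) when full."""

    def __init__(self, capacity=8):
        self.buf = [None] * capacity
        self.head = 0      # index of the first (oldest) item
        self.count = 0

    def _resize(self):
        # grow the gap between tail and head: copy items, in order,
        # into the front of a twice-as-large buffer
        newbuf = [None] * (2 * len(self.buf))
        for i in range(self.count):
            newbuf[i] = self.buf[(self.head + i) % len(self.buf)]
        self.buf, self.head = newbuf, 0

    def push(self, item):                # append at the tail
        if self.count == len(self.buf):
            self._resize()
        self.buf[(self.head + self.count) % len(self.buf)] = item
        self.count += 1

    def pull(self):                      # pop from the head
        if not self.count:
            raise IndexError("pull from empty deque")
        item = self.buf[self.head]
        self.buf[self.head] = None       # drop the reference
        self.head = (self.head + 1) % len(self.buf)
        self.count -= 1
        return item
```

Both push and pull are O(1) apart from the occasional resize, which is amortized over the pushes that filled the buffer.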
-- KBK

From sjoerd at acm.org Sat Jan 10 05:49:40 2004
From: sjoerd at acm.org (Sjoerd Mullender)
Date: Sat Jan 10 05:49:45 2004
Subject: [Python-Dev] Fixes for imageop module
In-Reply-To: <200401091613.i09GDWO23032@c-24-5-183-134.client.comcast.net>
References: <3FEC9C64.5080805@acm.org> <200312262249.hBQMnhf24306@c-24-5-183-134.client.comcast.net> <3FECBFB3.7090906@acm.org> <200312271749.hBRHnkN25210@c-24-5-183-134.client.comcast.net> <3FEDCBA2.3030607@acm.org> <200312272241.hBRMfHb01333@c-24-5-183-134.client.comcast.net> <3FFE6DEF.9070209@acm.org> <200401091613.i09GDWO23032@c-24-5-183-134.client.comcast.net>
Message-ID: <3FFFD8C4.40304@acm.org>

Guido van Rossum wrote:
>>>>>Hm. Anybody who uses the imageop module currently on Linux will find
>>>>>their programs broken. The strings manipulated by imageop all end up
>>>>>either being written to a file or handed to some drawing code, and
>>>>>changing the byte order would definitely break things!
>>>>
>>>>That's why I asked.
>>>>
>>>>>So I don't think this is an acceptable change. I take it that for
>>>>>IRIX, the byte order implied by the old code is simply wrong? Maybe
>>>>>the module can be given a global (shudder?) byte order setting that
>>>>>you can change but that defaults to the old setting?
>>>>
>>>>The problem is, the documentation says: "This is the same format as used
>>>>by gl.lrectwrite() and the imgfile module." This implies the byte order
>>>>that you get on the SGI, which is opposite of what you get on Intel. The
>>>>code produced the correct results on the SGI, but not on Intel.
>>>
>>>Your fix would be okay if until now it was *only* used on IRIX with
>>>gl.lrectwrite() and/or the imgfile module. But how can we prove that?
>>
>>I don't know.
>>
>>>>(By the way, I'm away from computers for a week starting tomorrow
>>>>extremely early.)
>>>
>>>It's okay to wait a week before making a decision.
>>
>>I'm back (and have been this week). Any thoughts about a decision?
> > > I suggest that you implement a way to change the endianness of the > operations. A global flag in the module is fine. It should default > to the historic behavior, and there should be a call to switch it to > the behavior you want. I submitted a patch (874358) to SourceForge. -- Sjoerd Mullender From martin at v.loewis.de Sat Jan 10 07:07:51 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Sat Jan 10 07:08:43 2004 Subject: [Python-Dev] collections module In-Reply-To: <000701c3d759$1bb035e0$ceb99d8d@oemcomputer> References: <000701c3d759$1bb035e0$ceb99d8d@oemcomputer> Message-ID: <3FFFEB17.4090806@v.loewis.de> Raymond Hettinger wrote: > I truly don't understand your answer. The bag and queue collections > have been tried and true in many languages. I'm not so sure about a queue class, but I do believe that bag classes got into libraries only for academic reasons such as symmetry. In particular, they got into Smalltalk because of that reason, and everybody copies it because Smalltalk has it. This has to stop :-) > Do I need to turn this into a PEP, take it to comp.lang.python, > cheerlead for it, and make the total effort ten times harder than it > needs to be? If everybody else thinks bags are a great idea, go ahead. YAGNI. > Or am I missing something basic like it being detrimental to the > language in some way? They are perhaps detrimental because they violate the principle There should be one-- and preferably only one --obvious way to do it. For queues, this is debatable, as the obvious way to do it (list.append, list.pop(0)) is inefficient if the list is long. In the specific case of bags, I think a different generalization would be more useful. 
The only use case of bags is when you want to count things, so the current pattern is

    # count them
    d = {}
    for t in things:
        d[t] = d.get(t, 0) + 1
    # print them sorted by item
    for k, v in sorted(d.iteritems()):
        print k, v
    # print them sorted by frequency
    for k, v in sorted(d.iteritems(), key=operator.itemgetter(1)):
        print k, v

With a bag, this could be "simplified" to

    d = Bag()
    for t in things:
        d.add(t)
    # print them sorted by item
    for k, v in sorted(d.iteritems()):
        print k, v
    # print them sorted by frequency
    for k, v in d.sortedByFrequency():
        print k, v

I don't think this is reasonably simpler. What's worse: this type does not support a very similar activity, namely collecting similar items

    d = {}
    for t in things:
        k = category(t)  # e.g. if the t are files, one may collect them
                         # by size class (empty, <1k, <10k, <100k, ...)
        try:
            d[k].append(t)
        except KeyError:
            d[k] = [t]

Both applications could be implemented if dictionaries had the notion of a default value, which would be generated by a callable (whether with or without argument is debatable)

    # count things
    d = {}
    d.defaultmaker = lambda k: 0
    for t in things:
        d[t] += 1

    # sort things by category
    d = {}
    d.defaultmaker = lambda k: []
    for t in things:
        d[category(t)].append(t)

Having a key in the lambda is probably better, as it would also allow lazily-filled dictionaries.

Regards,
Martin

From martin at v.loewis.de Sat Jan 10 07:16:45 2004
From: martin at v.loewis.de (Martin v. Loewis)
Date: Sat Jan 10 07:16:44 2004
Subject: [Python-Dev] Re: sre warnings
In-Reply-To: <20040110044032.GA2901@burma.localdomain>
References: <20040109174051.GA6480@burma.localdomain> <001401c3d6e3$8a4c91e0$ceb99d8d@oemcomputer> <20040109191401.GA8086@burma.localdomain> <3FFF2166.90201@v.loewis.de> <20040110044032.GA2901@burma.localdomain>
Message-ID: <3FFFED2D.8060401@v.loewis.de>

Gustavo Niemeyer wrote:
> Ohh.. I wasn't aware of this case, since I get no errors at all at
> this line. Do you see any others?
As Tim explains, this is likely due to me using a more recent gcc (3.3). I'll attach my current list of warnings below. The signed/unsigned warnings are actually harmless: they cause a problem only when comparing negative and unsigned numbers (where the negative number gets converted to unsigned and may actually compare larger than the positive one). This should not be an issue for SRE - I believe all the numbers are always non-negative, unless they are the -1 marker (for which I'm uncertain whether it is ever relationally compared). Regards, Martin Modules/_sre.c: In function `sre_lower': Modules/_sre.c:149: warning: signed and unsigned type in conditional expression Modules/_sre.c: In function `sre_lower_locale': Modules/_sre.c:162: warning: signed and unsigned type in conditional expression In file included from Modules/_sre.c:296: Modules/_sre.c: In function `sre_at': Modules/_sre.c:381: warning: comparison is always true due to limited range of data type Modules/_sre.c:383: warning: comparison is always true due to limited range of data type Modules/_sre.c:390: warning: comparison is always true due to limited range of data type Modules/_sre.c:392: warning: comparison is always true due to limited range of data type Modules/_sre.c: In function `sre_match': Modules/_sre.c:811: warning: comparison between signed and unsigned Modules/_sre.c:824: warning: comparison between signed and unsigned Modules/_sre.c:980: warning: comparison between signed and unsigned Modules/_sre.c:991: warning: comparison between signed and unsigned Modules/_sre.c:1059: warning: comparison between signed and unsigned Modules/_sre.c:1076: warning: comparison between signed and unsigned Modules/_sre.c:1132: warning: comparison between signed and unsigned Modules/_sre.c:1166: warning: comparison between signed and unsigned Modules/_sre.c:1194: warning: comparison between signed and unsigned Modules/_sre.c:1198: warning: comparison between signed and unsigned Modules/_sre.c:1208: warning: 
comparison between signed and unsigned Modules/_sre.c:1215: warning: comparison between signed and unsigned Modules/_sre.c:1217: warning: comparison between signed and unsigned Modules/_sre.c:1220: warning: comparison between signed and unsigned Modules/_sre.c:1236: warning: comparison between signed and unsigned Modules/_sre.c:1257: warning: comparison between signed and unsigned Modules/_sre.c:1261: warning: comparison between signed and unsigned Modules/_sre.c:1275: warning: comparison between signed and unsigned Modules/_sre.c:1287: warning: comparison between signed and unsigned Modules/_sre.c:1292: warning: comparison between signed and unsigned Modules/_sre.c:1380: warning: comparison between signed and unsigned Modules/_sre.c:1392: warning: comparison between signed and unsigned Modules/_sre.c: In function `sre_umatch': Modules/_sre.c:811: warning: comparison between signed and unsigned Modules/_sre.c:824: warning: comparison between signed and unsigned Modules/_sre.c:980: warning: comparison between signed and unsigned Modules/_sre.c:991: warning: comparison between signed and unsigned Modules/_sre.c:1051: warning: comparison between signed and unsigned Modules/_sre.c:1059: warning: comparison between signed and unsigned Modules/_sre.c:1076: warning: comparison between signed and unsigned Modules/_sre.c:1132: warning: comparison between signed and unsigned Modules/_sre.c:1166: warning: comparison between signed and unsigned Modules/_sre.c:1194: warning: comparison between signed and unsigned Modules/_sre.c:1198: warning: comparison between signed and unsigned Modules/_sre.c:1208: warning: comparison between signed and unsigned Modules/_sre.c:1215: warning: comparison between signed and unsigned Modules/_sre.c:1217: warning: comparison between signed and unsigned Modules/_sre.c:1220: warning: comparison between signed and unsigned Modules/_sre.c:1236: warning: comparison between signed and unsigned Modules/_sre.c:1257: warning: comparison between signed and 
unsigned Modules/_sre.c:1261: warning: comparison between signed and unsigned Modules/_sre.c:1275: warning: comparison between signed and unsigned Modules/_sre.c:1287: warning: comparison between signed and unsigned Modules/_sre.c:1292: warning: comparison between signed and unsigned Modules/_sre.c:1380: warning: comparison between signed and unsigned Modules/_sre.c:1392: warning: comparison between signed and unsigned From martin at v.loewis.de Sat Jan 10 07:24:38 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Sat Jan 10 07:24:50 2004 Subject: [Python-Dev] collections module In-Reply-To: References: Message-ID: <3FFFEF06.5090109@v.loewis.de> Tim Peters wrote: > The overallocation strategy gets more complicated, though, and less > effective for prepending. If you run out of room when prepending, it's not > enough just to ask realloc() for more memory, you also have to memmove() the > entire vector, to "make room at the left end". In previous lives I've found > that "appending at the right" is as efficient as it is in practice largely > because most non-trivial C realloc() calls end up extending the vector > in-place, not needing to copy anything. The O() behavior is the same if > each boost in size has to move everything that already existed, but the > constant factor goes up more than just measurably . Shouldn't you malloc() in the case of prepending, and do the copying yourself? realloc can only avoid copying if it can append a free block. For the list, we know a) realloc is unlikely to find a new block, anyway, as it is pymalloc, which never extends a small block. b) even if realloc would be able to extend the block, we still would have to copy the data. I don't know whether the overallocation would make similar-sized room at both ends, or whether it would take the current operation into account. For the specific case of queues, we may find that doing *just* the copying might be sufficient, with no need to go to the allocator. 
Regards,
Martin

From sjoerd at acm.org Sat Jan 10 08:39:51 2004
From: sjoerd at acm.org (Sjoerd Mullender)
Date: Sat Jan 10 08:39:55 2004
Subject: [Python-Dev] Re: sre warnings
In-Reply-To: <3FFF2166.90201@v.loewis.de>
References: <20040109174051.GA6480@burma.localdomain> <001401c3d6e3$8a4c91e0$ceb99d8d@oemcomputer> <20040109191401.GA8086@burma.localdomain> <3FFF2166.90201@v.loewis.de>
Message-ID: <400000A7.6060009@acm.org>

Martin v. Loewis wrote:
> Please change all uses of sizes/positions to "size_t", and
> change the special -1 marker to (size_t)-1.

If sizeof(int) < sizeof(size_t), is it *guaranteed* that (size_t)-1 expands to a bit pattern of all 1's? Also, is it *guaranteed* that you don't get more warnings (converting a negative quantity to unsigned)? I've been using ~(size_t)0 for things like this, where these *are* guaranteed.

-- Sjoerd Mullender

From perky at i18n.org Sat Jan 10 10:09:48 2004
From: perky at i18n.org (Hye-Shik Chang)
Date: Sat Jan 10 10:09:49 2004
Subject: [Python-Dev] CJKCodecs integration into Python
In-Reply-To: <3FFF2E96.9000505@egenix.com>
References: <20040109083021.GA4128@i18n.org> <3FFF2E96.9000505@egenix.com>
Message-ID: <20040110150948.GA71512@i18n.org>

On Fri, Jan 09, 2004 at 11:43:34PM +0100, M.-A. Lemburg wrote:
> Hye-Shik Chang wrote:

[snip]

> > Any opinions will be appreciated.
>
> I like it except for the module names of the underlying C modules.
> Could you group them under a common prefix, e.g. _cjk_ ?!

Okay. How about _codecs_ or _encodings_ ? _cjk_ would easily lose naming consistency if another C codec is added for non-CJK multibyte encodings like Vietnamese.

> Even better would be to put them into a package, but I'm not
> sure whether that would complicate the setup.

-0 for this. We don't have any packages on lib-dynload currently.

> > Thank you!
>
> Thank you for contributing these !
> > Since this is a major contribution, we will probably have to ask
> > you to sign one of the contribution agreements that we are currently
> > having in the legal review queue. Would that pose a problem ?

How can I sign the agreements? Airmail and PGP-signed email work for me. But I'm not sure whether I am able to send an international FAX. :)

Hye-Shik

From tzot at sil-tec.gr Sat Jan 10 11:44:48 2004
From: tzot at sil-tec.gr (Christos Georgiou)
Date: Sat Jan 10 11:45:18 2004
Subject: [Python-Dev] collections module
Message-ID:

An idea about the queue object: if the whole problem is the performance of pop(0) operations when the queues grow a lot, why not implement a queue as a list of lists (sublists), where all the magic is to select an appropriate maximum size for the sublists? Wouldn't this actually be a simulation of the "extra space at the left side of the list" algorithm without touching the C list implementation?

The very general idea:

    class queue:
        maximum_sublist_size = ...  # value missing in the original post

        def __init__(self):
            self.superlist = [[]]

        def put(self, item):
            if len(self.superlist[-1]) >= self.maximum_sublist_size:
                self.superlist.append([])
            self.superlist[-1].append(item)

        def get(self):
            try:
                sublist = self.superlist[0]
            except IndexError:
                raise QueueEmpty
            try:
                return sublist.pop(0)
            except IndexError:
                del self.superlist[0]
                return self.get()

        def size(self):
            if self.superlist:
                return self.maximum_sublist_size * (len(self.superlist) - 1) \
                       + len(self.superlist[-1])
            return 0

Another idea, for a single-list solution where most of the time there are lots of puts and then mostly lots of gets: perhaps this would be fast. An attribute of the instance stores the last operation (put or get). If the current operation is not the same as the last, first reverse the underlying list, then do append() or pop(). But that is not practical for very large queues.

Just some ideas.
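[Editorial sketch: a self-contained, runnable rendition of the list-of-lists idea. It picks an arbitrary sublist size (the original post leaves the constant unspecified) and writes get() iteratively so emptied sublists are discarded without recursion; the class names are invented for illustration.]

```python
class QueueEmpty(Exception):
    pass

class SublistQueue:
    """Queue of sublists: pop(0) only ever shifts one short sublist,
    so its cost is bounded by maximum_sublist_size."""
    maximum_sublist_size = 4   # arbitrary value chosen for illustration

    def __init__(self):
        self.superlist = [[]]

    def put(self, item):
        if len(self.superlist[-1]) >= self.maximum_sublist_size:
            self.superlist.append([])
        self.superlist[-1].append(item)

    def get(self):
        while self.superlist:
            sublist = self.superlist[0]
            if sublist:
                return sublist.pop(0)
            if len(self.superlist) == 1:
                break              # keep one (empty) sublist for put()
            del self.superlist[0]
        raise QueueEmpty

    def size(self):
        return sum(map(len, self.superlist))
```

The size() here already uses the summed form from the follow-up correction, since the closed-form count is wrong once gets start emptying sublists.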
From guido at python.org Sat Jan 10 11:45:39 2004
From: guido at python.org (Guido van Rossum)
Date: Sat Jan 10 11:45:45 2004
Subject: [Python-Dev] collections module
In-Reply-To: Your message of "Sat, 10 Jan 2004 13:24:38 +0100." <3FFFEF06.5090109@v.loewis.de>
References: <3FFFEF06.5090109@v.loewis.de>
Message-ID: <200401101645.i0AGjdq24660@c-24-5-183-134.client.comcast.net>

> Shouldn't you malloc() in the case of prepending, and do the copying
> yourself? realloc can only avoid copying if it can append a free block.
> For the list, we know
> a) realloc is unlikely to find a new block, anyway, as it is pymalloc,
> which never extends a small block.

Not for large lists. pymalloc passes large objects off to real malloc.

> b) even if realloc would be able to extend the block, we still would
> have to copy the data.

For really large lists, you'd risk running out of memory due to the malloc(1GB), while realloc(1GB) very likely just extends (since such a large block effectively ends up in a memory segment of its own).

> I don't know whether the overallocation would make similar-sized room
> at both ends, or whether it would take the current operation into
> account. For the specific case of queues, we may find that doing
> *just* the copying might be sufficient, with no need to go to the
> allocator.

Hm. An idea: don't bother overallocating when inserting in the front, but do keep the free space in the front if possible. Then recommend that queues are implemented using append(x) and pop(0), rather than using insert(0, x) and pop().

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org Sat Jan 10 11:51:07 2004
From: guido at python.org (Guido van Rossum)
Date: Sat Jan 10 11:51:13 2004
Subject: [Python-Dev] Fixes for imageop module
In-Reply-To: Your message of "Sat, 10 Jan 2004 11:49:40 +0100."
<3FFFD8C4.40304@acm.org> References: <3FEC9C64.5080805@acm.org> <200312262249.hBQMnhf24306@c-24-5-183-134.client.comcast.net> <3FECBFB3.7090906@acm.org> <200312271749.hBRHnkN25210@c-24-5-183-134.client.comcast.net> <3FEDCBA2.3030607@acm.org> <200312272241.hBRMfHb01333@c-24-5-183-134.client.comcast.net> <3FFE6DEF.9070209@acm.org> <200401091613.i09GDWO23032@c-24-5-183-134.client.comcast.net> <3FFFD8C4.40304@acm.org> Message-ID: <200401101651.i0AGp8l24707@c-24-5-183-134.client.comcast.net> > I submitted a patch (874358) to SourceForge. Great, check it in! --Guido van Rossum (home page: http://www.python.org/~guido/) From mal at egenix.com Sat Jan 10 11:47:05 2004 From: mal at egenix.com (M.-A. Lemburg) Date: Sat Jan 10 11:53:32 2004 Subject: [Python-Dev] CJKCodecs integration into Python In-Reply-To: <20040110150948.GA71512@i18n.org> References: <20040109083021.GA4128@i18n.org> <3FFF2E96.9000505@egenix.com> <20040110150948.GA71512@i18n.org> Message-ID: <40002C89.4050105@egenix.com> Hye-Shik Chang wrote: > On Fri, Jan 09, 2004 at 11:43:34PM +0100, M.-A. Lemburg wrote: > >>Hye-Shik Chang wrote: > > [snip] > >>>Any opionions will be appreciated. >> >>I like it except for the module names of the underlying C modules. >>Could you group them under a common prefix, e.g. _cjk_ ?! > > > Okay. How about _codecs_ or _encodings_ ? > _cjk_ seems to be easy to lose naming consistency when > another C codec for non-CJK multibyte encodings like Vietnamese. _codec_ would be fine. This is really only to prevent cluttering up the global module namespace. >>Even better would be to put them into a package, but I'm not >>sure whether that would complicate the setup. > > -0 for this. We don't have any packages on lib-dynload currently. Right, so let's use a prefix. >>>Thank you! >> >>Thank you for contributing these ! 
>> >>Since this is a major contribution, we will probably have to ask >>you to sign one of the contribution agreements that we are currently >>having in the legal review queue. Would that pose a problem ? > > > How can I sign the agreements? Airmail and PGP-signed email work > for me. But I'm not sure whether I am able to send an international > FAX. :) We would need a wet-signed form (i.e. postal mail). I just wanted to let you know about this process. It is not yet set up, but eventually we will ask for these signatures from all major contributors. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 06 2004) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From tzot at sil-tec.gr Sat Jan 10 11:55:09 2004 From: tzot at sil-tec.gr (Christos Georgiou) Date: Sat Jan 10 11:57:21 2004 Subject: [Python-Dev] collections module (correction) Message-ID: And a correction to the size method I suggested for the queue object:

def size(self):
    return sum(map(len, self.superlist))

The previous one was giving wrong results when get was called N times and N % maximum_sublist_size != 0. I believe this should be implemented in C, otherwise the overhead would be too much for queues that never grow. From python at rcn.com Sat Jan 10 13:36:21 2004 From: python at rcn.com (Raymond Hettinger) Date: Sat Jan 10 13:37:27 2004 Subject: [Python-Dev] collections module In-Reply-To: <200401101645.i0AGjdq24660@c-24-5-183-134.client.comcast.net> Message-ID: <003801c3d7a8$a514fec0$ceb99d8d@oemcomputer> > > I don't know whether the overallocation would make similar-sized room > > at both ends, or whether it would take the current operation into > > account.
For the specific case of queues, we may find that doing > > *just* the copying might be sufficient, with no need to go to the > > allocator. > > Hm. An idea: don't bother overallocating when inserting in the front, > but do keep the free space in the front if possible. Then recommend > that queues are implemented using append(x) and pop(0), rather than > using insert(0, x) and pop(). That is a reasonable approach and probably not difficult to implement. Since external code (outside listobject.c) may already be relying on ob_item, the approach should just change that field to always point to the start of valid data. A new field would have to keep a pointer to the actual allocated block. Raymond Hettinger From tim.one at comcast.net Sat Jan 10 13:59:05 2004 From: tim.one at comcast.net (Tim Peters) Date: Sat Jan 10 13:59:14 2004 Subject: [Python-Dev] Re: sre warnings In-Reply-To: <400000A7.6060009@acm.org> Message-ID: [Martin v. Loewis] >> Please change all uses of sizes/positions to "size_t", and >> change the special -1 marker to (size_t)-1. [Sjoerd Mullender] > If sizeof(int) < sizeof(size_t), is it *guaranteed* that (size_t)-1 > expands to a bit pattern of all 1's? Guaranteed (not to mention *guaranteed*) is a very strong thing, and C explicitly allows that an integer type other than unsigned char may contain bits that don't contribute to the value (like parity bits, or a not-an-integer bit). This makes "a bit pattern of all 1's" hard to relate to what the standard says.
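(An aside, not from the thread: the conversion rule discussed here can be poked at from Python through the ctypes module -- ctypes is an assumption, as it only ships with Pythons later than the ones under discussion:)

```python
import ctypes

# Conversion of -1 to an unsigned type is reduction modulo 2**N, so
# (size_t)-1 is the maximum size_t value, which equals ~(size_t)0.
N = ctypes.sizeof(ctypes.c_size_t) * 8
max_size_t = ctypes.c_size_t(-1).value

assert max_size_t == 2 ** N - 1
assert max_size_t == ctypes.c_size_t(~0).value
```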
For casts from signed to unsigned, when the original value can't be represented in the new type (which includes all negative original values): the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type. If we assume a binary box and that int and size_t don't have "extra" bits (Python assumes all this in more than one place already), then yes, the result is guaranteed to be a string of 1 bits: it's mathematically 2**i-1, where i is the number of bits in a size_t. That resulting mathematical *value* is well-defined even if there are extra bits, although C doesn't define what the extra bits may contain. > Also, is it *guaranteed* that you don't get more warnings > (converting a negative quantity to unsigned)? Well, the standard never guarantees that you won't get a warning. It's traditional for C compilers not to whine about explicit casts, no matter how goofy they are. Casting from signed to unsigned is well-defined, so there's no reason at all to whine about that. Note that there are instances of (size_t)-1 in Python already (cPickle.c and longobject.c), and no reports of warnings from those. > I've been using ~(size_t)1 for things like this where these *are* > guaranteed. I think you mean ~(size_t)0. There are also two instances of that in Python's source now. I think either way is fine. From jcarlson at uci.edu Sat Jan 10 14:06:41 2004 From: jcarlson at uci.edu (Josiah Carlson) Date: Sat Jan 10 14:09:35 2004 Subject: [Python-Dev] collections module In-Reply-To: References: Message-ID: <20040110103339.FE45.JCARLSON@uci.edu> > If all the problem is about performance of pop(0) operations when the > queues grow a lot, why not implement a queue as a list of lists > (sublists), and all the magic will be to select an appropriate maximum > size for the sublists? 
Wouldn't this be actually a simulation of the > "extra space at the left side of the list" algorithm without touching > the C list implementation? *pulls out algorithm analysis hat* If your maximum size of the sublist is k, then any pop is at least O(k). If you have a total of n items in the queue, then every k items incurs an O(n/k) penalty for the outer list shift, or O(n/k^2) per item, or overall O(k + n/k^2) per pop for a total queue size of n, and a maximum sublist size of k. For k > n^(1/3), this is O(k) per pop operation; for k < n^(1/3), this is O(n/k^2) per pop operation, which is always greater than k. As k approaches 1, pops are horrible, but as long as k is greater than the cube root of n, the operations are reasonable. If one knows the maximal size of the queue beforehand, setting up a properly sized k would result in reasonable behavior using lists of lists. However, using Knuth's method, a dictionary fifo, or [val, [val, []]] all result in O(1) puts and gets, on either end (the [val, [val, []]] would require a previous pointer). > Another idea, for a single list solution where most of the times there > are lots of puts and then mostly lots of gets, perhaps this would be > fast: > > An attribute of the instance stores the last operation (put or get) > If the current operation is not the same as the last, first reverse the > underlying list, then do append() or pop(). > > But that is not practical for very large queues. It would be perfectly reasonable for streamed puts/gets, but I don't know how commonly such behavior is needed. - Josiah
In > particular, they got into Smalltalk because of that reason, and > everybody copies it because Smalltalk has it. This has to stop :-) So much for the "everybody else has one" justification ;-) > Both applications could be implemented if dictionaries had > the notion of a default value, which would be generated > by a callable (whether with or without argument is debatable) >
> # count things
> d = {}
> d.defaultmaker = lambda k:0
> for t in things:
>     d[t] += 1
>
> # sort things by category
> d = {}
> d.defaultmaker = lambda k:[]
> for t in things:
>     d[category(t)].append(t)
These both read cleanly. I really like this idea as a general purpose solution with broad applicability. It fully encapsulates the ideas behind bag construction and building dicts of lists. Instead of a method, it may be better to use a keyword argument in the constructor:

d = dict(default = lambda k:[])
for t in things:
    d[category(t)].append(t)

If lambda is made parameterless, it allows type constructors to be used for the most common cases:

dict(default=list)   # this is clear enough
dict(default=int)    # this may be too cute

Or, lambda could be avoided altogether by cloning a default value

dict(default=[])   # individual default values computed by copy([])
dict(default=0)

> Having a key in the lambda is probably better, as it would > also allow lazily-filled dictionaries. Can you give an example?
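(Martin's defaultmaker idea above can be sketched concretely. The name `DefaultDict` and the use of dict's `__missing__` hook are mine -- `__missing__` postdates this thread; the eventual `collections.defaultdict` took the parameterless-factory route instead:)

```python
class DefaultDict(dict):
    """Dict with Martin's key-taking 'defaultmaker': a callable
    invoked with the key on lookup of a missing entry."""

    def __init__(self, defaultmaker):
        dict.__init__(self)
        self.defaultmaker = defaultmaker

    def __missing__(self, key):
        value = self.defaultmaker(key)
        self[key] = value       # store it so += and .append() both work
        return value

# count things
d = DefaultDict(lambda k: 0)
for t in "abracadabra":
    d[t] += 1

# sort things by category (here: first letter)
groups = DefaultDict(lambda k: [])
for t in ["apple", "avocado", "banana"]:
    groups[t[0]].append(t)
```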
Raymond Hettinger From marktrussell at btopenworld.com Sat Jan 10 14:44:48 2004 From: marktrussell at btopenworld.com (Mark Russell) Date: Sat Jan 10 14:42:18 2004 Subject: [Python-Dev] collections module In-Reply-To: <003901c3d7af$9ed95e00$ceb99d8d@oemcomputer> References: <003901c3d7af$9ed95e00$ceb99d8d@oemcomputer> Message-ID: <1073763888.1387.11.camel@localhost> On Sat, 2004-01-10 at 19:26, Raymond Hettinger wrote: > Instead of a method, it may be better to use a keyword argument in the > constructor: > > d = dict(default = lambda k:[]) Unfortunately that has a meaning already:

>>> d = dict(default = lambda k:[])
>>> d
{'default': <function <lambda> at 0x40212ed4>}

A shame, because it does look nicer (OTOH I use and like the keyword argument dict constructor). I'd love to have the implicit default functionality though. Mark Russell From fumanchu at amor.org Sat Jan 10 14:37:46 2004 From: fumanchu at amor.org (Robert Brewer) Date: Sat Jan 10 14:43:52 2004 Subject: [Python-Dev] collections module Message-ID: Raymond H. quoted Martin L. thusly: > > Both applications could be implemented if dictionaries had > > the notion of a default value, which would be generated > > by a callable (whether with or without argument is debatable) > >
> > # count things
> > d = {}
> > d.defaultmaker = lambda k:0
> > for t in things:
> >     d[t] += 1
> >
> > # sort things by category
> > d = {}
> > d.defaultmaker = lambda k:[]
> > for t in things:
> >     d[category(t)].append(t)
Then answered with: > These both read cleanly. > > I really like this idea as a general purpose solution with broad > applicability. It fully encapsulates the ideas behind bag > construction and building dicts of lists.
> > Instead of a method, it may be better to use a keyword argument in the > constructor: > > d = dict(default = lambda k:[]) > for t in things: > d[category(t)].append(t) > > If lambda is made parameterless, it allows type constructors > to be used > for the most common cases: > > dict(default=list) # this is clear enough > dict(default=int) # this may be too cute This seems a bit too easy to do "as needed" to warrant a new builtin, but that's why we have proposals, I guess:

class DefaultingDict(dict):

    def __init__(self, default):
        self.default = default

    def __getitem__(self, key):
        return self.get(key, self.default())

>>> d = DefaultingDict(lambda: 0)
>>> d.update({'a': 1, 'b': 2, 'c': 3})
>>> d
{'a': 1, 'c': 3, 'b': 2}
>>> for x in ['a', 'b', 'c', 'd']:
...     d[x] += 1
...
>>> d
{'a': 2, 'c': 4, 'b': 3, 'd': 1}

Robert Brewer MIS Amor Ministries fumanchu@amor.org From tim.one at comcast.net Sat Jan 10 15:05:26 2004 From: tim.one at comcast.net (Tim Peters) Date: Sat Jan 10 15:05:34 2004 Subject: [Python-Dev] collections module In-Reply-To: <3FFFEF06.5090109@v.loewis.de> Message-ID: [Martin v. Loewis] >> Shouldn't you malloc() in the case of prepending, and do the copying >> yourself? realloc can only avoid copying if it can append a free >> block. For the list, we know >> >> a) realloc is unlikely to find a new block, anyway, as it is pymalloc, >> which never extends a small block. [Guido] > Not for large lists. pymalloc passes large objects off to real > malloc. pymalloc is irrelevant here: it's not used for list guts, just for list headers. This is an efficiency hack, to speed the append-at-the-end use case: as Martin says, pymalloc cannot extend in-place, so not using pymalloc saves the cycles pymalloc would have consumed copying small list guts repeatedly while the list was growing, and, after the list got large, re-deducing over and over that it has to redirect to the system realloc.
This expense is exacerbated because Python doesn't keep track of how much it's overallocated: it *always* calls realloc() on an append (but "almost always" calls realloc() with the same size it called realloc() with the last time). >> b) even if realloc would be able to extend the block, we still would >> have to copy the data. > For really large lists, you'd risk running out of memory due to the > malloc(1GB), while realloc(1GB) very likely just extends (since such a > large block effectively becomes a memory segment of its own). Right, that's the idea. >> I don't know whether the overallocation would make similar-sized >> room at both ends, or whether it would take the current operation >> into account. For the specific case of queues, we may find that >> doing *just* the copying might be sufficient, with no need to go >> to the allocator. For example, in this steady-state queue:

q = [x]
while True:
    q.append(x)
    q.pop(0)

q is never empty then, but never contains more than 1 element. > Hm. An idea: don't bother overallocating when inserting in the front, > but do keep the free space in the front if possible. I'm not clear on what that means, and the devil's in the details. > Then recommend that queues are implemented using append(x) and > pop(0), rather than using insert(0, x) and pop(). Given the way realloc() works, I expect that add-at-the-right will always be favored. If someone does "insert at the left", though, and we're out of room, I'd give "the extra space" to the left end (we have to do a memmove in that case regardless, and shifting stuff right N slots doesn't cost more than shifting stuff right 1 slot). There are many possible strategies, and it's worth some pain to make deque use of lists reasonably zippy too.
Kaiser] > Aren't array-based implementations of queue usually based on > circular buffers? Yes, but those are generally bounded queues. > Head and tail pointers plus a count are sufficient. Then a push is: > check count (resize if necessary), write, adjust head pointer, adjust > count. By analogy to a real-life queue (of, say, people), pushing is usually viewed as adding to the tail (when you get in line for movie tickets, you're likely to get punched if you add yourself to the head of the queue). I have to insist on using that terminology, just because it ties my head in knots to swap the usual meanings. > A pull is similar, using the tail pointer. Constant time. So is the speculative Python list gimmick we're talking about, until it runs out of room. We use mildly exponential overallocation when growing a list to ensure amortized O(1) behavior regardless. > The resize is a little tricky because the gap between the tail and > head pointers is what needs to grow. Resizing is easier with the speculative Python list gimmick, and 0-based indexing (where list[0] is the current head) is identical to the list-indexing implementation Python already has. The latter is important because it's fast. Needing to "wrap around" an index to account for circularity would introduce new test+branch expense into indexing. > The current implementation of Queue (with what appears to me to be a > listpop(v, 0) for get() ) involves an internal shift of the whole > pointer vector, O(n), for every read of the queue. The Cookbook > examples mentioned alleviate this but still with excess copying or > unbounded memory usage. IMHO it's better done in listobject. Queue.py and the speculative Python list gimmick are different issues. The overhead of mucking with locks, and Python-level method calls, in Queue.py surely overwhelms whatever savings you could get over the two-list queue implementation Josiah put in his Cookbook comment. 
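(The two-list queue mentioned here works roughly as follows -- a sketch of the idea with my own names, not Josiah's exact Cookbook code:)

```python
class TwoListQueue:
    """Push onto an input list; pop from the right end of an output
    list.  When the output list runs dry, reverse the input list into
    it.  Each element is reversed at most once, so put and get are
    amortized O(1) -- no O(n) pop(0) shifts ever happen."""

    def __init__(self):
        self.inlist = []    # new items appended here
        self.outlist = []   # items served from here, right end first

    def put(self, item):
        self.inlist.append(item)

    def get(self):
        if not self.outlist:
            self.inlist.reverse()   # one pointer swap per element
            self.inlist, self.outlist = [], self.inlist
        return self.outlist.pop()   # pop from the right: O(1)
```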
The only "extra" expense there over a straight push-and-pop-both-at-the-right-end Python list is that each element added to the queue is involved in (no more than) one C-level pointer swap (due to the reverse() call, when the input list is turned into the output list). If the speculative Python list gimmick were implemented, Queue.py could, of course, change to use it. > I'm not up to speed on the reasons why ob_item needs to float So that indexing a Python list is as fast as it is today, and so that a mountain of C code continues to work (all C code mucking with Python lists now believes that a list contains ob_size elements, starting at ob_items and ending at ob_items+ob_size-1). > but perhaps there is a straightforward indirect relationship to the > head pointer. ob_item would *be* the head pointer (for my meaning for "head"), whenever the list is non-empty. > Why not have listobject grow deque functionality using circular buffer > techniques? As above, I don't see much to recommend that over the speculative Python list gimmick: indexing is slower and more complicated, resizing is more complicated, and all the listobject.c internals would get more complicated by needing to deal with that the list contents are no longer necessarily in a contiguous slice of a C vector (from the C POV, the list contents could be in two disjoint contiguous slices). It would have the advantage that resizing is never needed until you're wholly out of room, but that seems minor compared to the drawbacks. From martin at v.loewis.de Sat Jan 10 16:53:45 2004 From: martin at v.loewis.de (Martin v. 
Loewis) Date: Sat Jan 10 16:54:12 2004 Subject: [Python-Dev] Re: sre warnings In-Reply-To: <400000A7.6060009@acm.org> References: <20040109174051.GA6480@burma.localdomain> <001401c3d6e3$8a4c91e0$ceb99d8d@oemcomputer> <20040109191401.GA8086@burma.localdomain> <3FFF2166.90201@v.loewis.de> <400000A7.6060009@acm.org> Message-ID: <40007469.4090503@v.loewis.de> Sjoerd Mullender wrote: > If sizeof(int) < sizeof(size_t), is it *guaranteed* that (size_t)-1 > expands to a bit pattern of all 1's? As Tim explains: If you assume that the machine uses two's complement, then yes. However, this is irrelevant. More interestingly: Is it guaranteed that you get the largest possible value of size_t? Based on the language spec fragment that Tim cites: yes. This actually isn't really relevant either. Is it guaranteed that you get a value of size_t different from all values that denote real sizes of things? To this, the answer clearly is no: There might be objects whose size is (size_t)-1. However, it is unlikely (perhaps even *unlikely* :-) that you ever meet such an object in real life. By definition, (size_t)-1 == ~(size_t)0 in all conforming implementations. Regards, Martin From martin at v.loewis.de Sat Jan 10 17:11:31 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Sat Jan 10 17:11:52 2004 Subject: [Python-Dev] collections module In-Reply-To: <003901c3d7af$9ed95e00$ceb99d8d@oemcomputer> References: <003901c3d7af$9ed95e00$ceb99d8d@oemcomputer> Message-ID: <40007893.9060900@v.loewis.de> Raymond Hettinger wrote: > If lambda is made parameterless, it allows type constructors to be used > for the most common cases: > > dict(default=list) # this is clear enough > dict(default=int) # this may be too cute A keyword parameter is a good idea - I had problems calling it "default", though, as it would give the impression that you specify the default value. 
> Or, lambda could be avoided altogether by cloning a default value
>
> dict(default=[])   # individual default values computed by copy([])
> dict(default=0)
This might work, but it would make copy.copy() part of the dictionary semantics, whereas at the moment copy is just a library. >>Having a key in the lambda is probably better, as it would >>also allow lazily-filled dictionaries. > > Can you give an example? We once had the need to evaluate code in an "unbounded" context, i.e. where it is not possible/efficient to compute a list of all variables upfront. So in principle, we would have done

class D(dict):
    def __getitem__(self, key):
        return compute_value(key)

exec some_code in D(params)

except that instances of D() could not have been passed to exec/eval - this requires a "true" dictionary. I see that Python now allows using D as the dictionary, but still won't use __getitem__ to find values. In the specific example, the "unbounded" context was a CORBA namespace, for which we knew that the bindings would not randomly change. So we would very much have liked a lazy filling of the dictionary - each access to an uncached value is a roundtrip on the wire. With a default value function, we could have done

exec some_code in dict(default=compute_value)

We ended up analyzing the code to find out what variables are referenced, and prefetching those into a proper dictionary. Regards, Martin From martin at v.loewis.de Sat Jan 10 17:21:13 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Sat Jan 10 17:21:24 2004 Subject: [Python-Dev] collections module In-Reply-To: References: Message-ID: <40007AD9.9030309@v.loewis.de> Tim Peters wrote: >>Hm. An idea: don't bother overallocating when inserting in the front, >>but do keep the free space in the front if possible. > > > I'm not clear on what that means, and the devil's in the details. I interpret it as: when doing an insert in the middle or at the beginning, continue doing what we do today. I.e.
never overallocate extra space in the front, unless the extra space is already there (so realloc would just copy it/leave it as-is). When appending, you would then first check whether there is space at the beginning, and move items if there is. Otherwise reallocate, with the overallocation-to-the-right formula. >>Then recommend that queues are implemented using append(x) and >>pop(0), rather than using insert(0, x) and pop(). > > > Given the way realloc() works, I expect that add-at-the-right will always be > favored. Indeed, I would always implement queues as .append/.pop(0) - it would not have occurred to me to write them as insert(0, x)/.pop(), as I consider .insert on a list unintuitive. Regards, Martin From dmorton at lab49.com Sat Jan 10 17:22:10 2004 From: dmorton at lab49.com (Damien Morton) Date: Sat Jan 10 17:22:32 2004 Subject: [Python-Dev] collections module Message-ID: <57B50771635BF54C8A9D1874F4C5BEF807AEC8@rasputin.lab49.com> """Indeed, I would always implement queues as .append/.pop(0) - it would not have occurred to me to write them as insert(0, x)/.pop(), as I consider .insert on a list unintuitive. """ How about list.prepend()? ------------- Damien Morton Lab49, Inc. Phone: 212 966 3468 Email: dmorton@lab49.com Web : www.lab49.com From martin at v.loewis.de Sat Jan 10 17:35:16 2004 From: martin at v.loewis.de (Martin v.
Loewis) Date: Sat Jan 10 17:35:37 2004 Subject: [Python-Dev] collections module In-Reply-To: References: Message-ID: <40007E24.6000109@v.loewis.de> Robert Brewer wrote: > This seems a bit too easy to do "as needed" to warrant a new builtin, > but that's why we have proposals, I guess: >
> class DefaultingDict(dict):
>
>     def __init__(self, default):
>         self.default = default
>
>     def __getitem__(self, key):
>         return self.get(key, self.default())
It would not necessarily have to be a builtin, but it should be part of the standard library somewhere. Having to write it anew every time you need it is too tedious, so I usually have a try:except: block, doing the defaulting in the except: part. There would also be more semantic issues, such as whether has_key should always return True on a DefaultingDict. Your implementation is incomplete, too - .pop() should invoke the default function, as should .get(). Regards, Martin From kbk at shore.net Sat Jan 10 18:21:06 2004 From: kbk at shore.net (Kurt B. Kaiser) Date: Sat Jan 10 18:21:09 2004 Subject: [Python-Dev] collections module In-Reply-To: (Tim Peters's message of "Sat, 10 Jan 2004 15:49:26 -0500") References: Message-ID: <87n08vcrvh.fsf@hydra.localdomain> "Tim Peters" writes: > [Kurt B. Kaiser] >> Aren't array-based implementations of queue usually based on >> circular buffers? > > Yes, but those are generally bounded queues. And Pythonic versions would not be :-) >> Head and tail pointers plus a count are sufficient. Then a push is: >> check count (resize if necessary), write, adjust head pointer, adjust >> count. > > By analogy to a real-life queue (of, say, people), pushing is usually viewed > as adding to the tail (when you get in line for movie tickets, you're likely > to get punched if you add yourself to the head of the queue). I have to > insist on using that terminology, just because it ties my head in knots to > swap the usual meanings.
Reading other posts in the thread and further reflection had already converted me to that point of view. We are agreed! Put to the tail and get from the head. Also, agreed that ob_item _is_ the headpointer. >> A pull is similar, using the tail pointer. Constant time. > > So is the speculative Python list gimmick we're talking about, until it runs > out of room. We use mildly exponential overallocation when growing a list > to ensure amortized O(1) behavior regardless. The whole game lies in what happens when you resize: how much data are you going to copy? >> The resize is a little tricky because the gap between the tail and >> head pointers is what needs to grow. > > Resizing is easier with the speculative Python list gimmick, and 0-based > indexing (where list[0] is the current head) is identical to the > list-indexing implementation Python already has. The latter is important > because it's fast. Needing to "wrap around" an index to account for > circularity would introduce new test+branch expense into indexing. If I understand the gimmick, as the queue is used the ob_item/headpointer will walk down the block and items will be appended at the tail until the tail collides with the end of the block. At that point the gimmick approach will memmove the queue to place ob_item at the beginning of the block. Since the block is only a little longer than the queue, that's quite a bit of copying, unless I'm missing something. With a circular buffer approach, the headpointer and tailpointer will walk down the block and when the end is reached the tailpointer will wrap around and "appended" items will be stored at the low end of the block. The process continues without any copying. The position of the pointers is calculated modulo the blocksize, so no test/branch is involved. If there is some space at the beginning of the block which has been freed up by pop(0), the list will automatically wrap around to fill it, even when the pop(0) has been done in a non-queue context. 
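(Kurt's head-index-plus-count scheme can be sketched in Python for concreteness -- my sketch, with my names; a real implementation would live in listobject.c and, per Tim's later point, would avoid the `%` in favor of a compare-and-subtract:)

```python
class RingDeque:
    """Circular-buffer queue: a head index plus a count over a fixed
    block; indices wrap modulo the block size.  A resize rotates the
    live items to the front of a block twice as large."""

    def __init__(self):
        self.block = [None] * 8     # allocated storage
        self.head = 0               # index of the oldest item
        self.count = 0              # number of live items

    def put(self, item):            # append at the tail
        if self.count == len(self.block):
            self._resize()
        tail = (self.head + self.count) % len(self.block)
        self.block[tail] = item
        self.count += 1

    def get(self):                  # pop from the head
        if not self.count:
            raise IndexError("get from empty deque")
        item = self.block[self.head]
        self.block[self.head] = None            # drop the reference
        self.head = (self.head + 1) % len(self.block)
        self.count -= 1
        return item

    def _resize(self):
        # Copy the live items, unwrapped, to the front of a bigger block.
        n = len(self.block)
        self.block = [self.block[(self.head + i) % n]
                      for i in range(self.count)] + [None] * n
        self.head = 0
```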
With a circular buffer, upon resize the gap between tail and head is widened by the amount added to the block. With a queue, that would on average amount to copying half the data, but with a known stride. One could take a conservative approach and rotate the buffer every resize so that ob_item is at the beginning of the block, but doing that seems to be unnecessary. The question of which way to rotate the buffer into the new free space seems to be irrelevant in the case of a queue/deque, but may be interesting for ordinary lists. I suspect pushing the tail to the left is optimal to minimize copying. [...] >> Why not have listobject grow deque functionality using circular buffer >> techniques? > > As above, I don't see much to recommend that over the speculative Python > list gimmick: indexing is slower and more complicated, pointer arithmetic is cheap compared to memory access. > resizing is more complicated, I'm not sure that's the case. And the gimmick approach has a 'complication' when the tail hits the end of the block even when the queue length is constant. > and all the listobject.c internals would get more complicated by > needing to deal with that the list contents are no longer > necessarily in a contiguous slice of a C vector (from the C POV, the > list contents could be in two disjoint contiguous slices). Not if modulo pointer arithmetic is used. > It would have the advantage that resizing is never needed until > you're wholly out of room, but that seems minor compared to the > drawbacks. -- KBK From martin at v.loewis.de Sat Jan 10 18:34:29 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Sat Jan 10 18:35:50 2004 Subject: [Python-Dev] collections module In-Reply-To: <87n08vcrvh.fsf@hydra.localdomain> References: <87n08vcrvh.fsf@hydra.localdomain> Message-ID: <40008C05.7030803@v.loewis.de> Kurt B.
Kaiser wrote: > With a circular buffer approach, the headpointer and tailpointer will > walk down the block and when the end is reached the tailpointer will > wrap around and "appended" items will be stored at the low end of the > block. The process continues without any copying. I think you two are talking about completely different things. Tim is talking about a modification to the builtin list type. You are talking about a new data type for queues. You are both right: Changing the list type to leave space in the front would make it more efficient for implementing queues than the current list type, especially if people use the standard idiom. Using a new type would be even more efficient if the queue is bounded. I'm still opposed to adding a new queue type, because it complicates the library, and it does not give any benefit to existing code - one actually has to change the existing implementations of queues. People don't want to change their code, especially if they need to support old Python versions. So changing the list type looks more promising to me. Not taking any action might be acceptable as well. Regards, Martin From kbk at shore.net Sat Jan 10 19:08:46 2004 From: kbk at shore.net (Kurt B. Kaiser) Date: Sat Jan 10 19:08:50 2004 Subject: [Python-Dev] collections module In-Reply-To: <40008C05.7030803@v.loewis.de> (Martin v. Loewis's message of "Sun, 11 Jan 2004 00:34:29 +0100") References: <87n08vcrvh.fsf@hydra.localdomain> <40008C05.7030803@v.loewis.de> Message-ID: <87ad4vcpo1.fsf@hydra.localdomain> "Martin v. Loewis" writes: > I think you two are talking about completely different things. > > Tim is talking about a modification to the builtin list type. > > You are talking about a new data type for queues. I am talking about a modification to listobject.c, also. The discussion is related to implementation and associated efficiencies. I'm -1 on adding a new queue datatype. I'm +1 on adding queue/deque functionality to the list datatype.
-- KBK From tim.one at comcast.net Sat Jan 10 23:22:21 2004 From: tim.one at comcast.net (Tim Peters) Date: Sat Jan 10 23:22:26 2004 Subject: [Python-Dev] collections module In-Reply-To: <87n08vcrvh.fsf@hydra.localdomain> Message-ID: [Tim] >> ... >> We use mildly exponential overallocation when growing a list to >> ensure amortized O(1) behavior regardless. [Kurt B. Kaiser] > The whole game lies in what happens when you resize: how much data are > you going to copy? We currently don't copy anything when we resize, although the platform realloc() may. To "make room at the left end" we would have to do a memmove() on a contiguous vector of len(list) pointers. I don't agree that's "the whole game", though -- it's a small piece of the game (the point of using exponential overallocation is to ensure that). As I said in the first msg, though, the overallocation gimmick we use now is less effective if a list can grow and shrink at both ends. > ... > If I understand the gimmick, as the queue is used If a list is used as a queue, yes. > the ob_item/headpointer will walk down the block and items will be > appended at the tail until the tail collides with the end of the > block. At that point the gimmick approach will memmove the queue to > place ob_item at the beginning of the block. Possibly -- nobody has given enough detail to say. I'd be inclined to continue doing what we do now when an append runs out of room, which is to ask for more room, overallocating by a small percentage of the current list size. If there's free room "at the left end", though, it may or may not be more efficient to memmove the guts left. Those are important details that haven't gotten beyond the hand-wavy stage. > Since the block is only a little longer than the queue, that's > quite a bit of copying, unless I'm missing something. The amount of copying in *any* case where the speculative gimmick copies is proportional to the number of elements currently in the list.
When a list contains a million items, it's expensive. If a list contains a few dozen items, it's cheap. The current scheme overallocates proportional to the number of elements in the list, so "a little longer" is "a little" in a relative sense rather than in an absolute sense (e.g., it's millions of elements longer if the list contains hundreds of millions of elements -- then you can go through millions of appends after a resize before needing to do anything fancy again). > With a circular buffer approach, the headpointer and tailpointer will > walk down the block and when the end is reached the tailpointer will > wrap around and "appended" items will be stored at the low end of the > block. The process continues without any copying. So long as it doesn't run out of room. Eventually it will, and then it has to move stuff too. At that level the schemes are the same. Given specific common queue access patterns, a queue in steady state will "run out of room" more often under the contiguous gimmick than the circular one (I gave a simple example of that before, which in fact is probably a best case for the circular implementation and a worst case for the other one). > The position of the pointers is calculated modulo the blocksize, so no > test/branch is involved. Integer mod is much worse than test/branch -- Python runs on boxes where an integer divide takes more than 100 cycles. So I was assuming the cheaper way to implement this ("if (physical_pointer >= fence) physical_pointer -= block_size;"; test-and-conditionally-add/subtract instead of integer mod; and don't suggest that allocated sizes be restricted to powers of 2 just to make integer mod efficient (we can't afford that much memory waste)). > If there is some space at the beginning of the block which has been > freed up by pop(0), the list will automatically wrap around to fill > it, even when the pop(0) has been done in a non-queue context. 
You can trust that my objection isn't that I don't understand how to implement a circular queue . > With a circular buffer, upon resize the gap between tail and head is > widened by the amount added to the block. With a queue, that would on > average amount to copying half the data, but with a known stride. I don't know what "known stride" means in this context. Regardless of whether a circular implementation is used, a Python list is a contiguous vector of C pointers, and they're all the same size. Only these pointers are moved. With a reasonably careful circular implementation, needing to move half the pointers would be the worst case; assuming that the head and tail are equally likely to bump into each other at each possible position, the average number of pointer moves would be len/4. > One could take a conservative approach and rotate the buffer every > resize so that ob_item is at the beginning of the block, but doing > that seems to be unnecessary. Also unhelpful -- every line of C code mucking with lists would still have to cater to the possibility that the list is spread across two disjoint memory slices. ... >> As above, I don't see much to recommend that over the speculative >> Python list gimmick: indexing is slower and more complicated, > pointer arithmetic is cheap compared to memory access. Let's inject some common sense here. Indexed list access is an extremely frequent operation, and cache-friendly sequential indexed list access is also extremely frequent. Moving memory around is a rare operation, and neither scheme can avoid it: the overwhelmingly most common kind of growing list will continue to be append-at-the-end, where the circular scheme buys nothing, but slows down all list operations. Growing lists are in turn rare compared to fixed-size lists, where the circular implementation is pure loss, in both time and space. The speculative gimmick is a loss in space there too, but doesn't slow it. 
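[Editorial note: Tim's indexing-cost point can be made concrete. With a contiguous vector, item i is a single offset from the base pointer; a circular block pays extra arithmetic on every access. A hedged Python sketch of the two index calculations -- the function names are invented, and the real concern is of course the C macros such as PyList_GET_ITEM, but the arithmetic is the same:]

```python
def contiguous_index(i):
    # Contiguous list: ob_item[i] is one offset from the base
    # pointer -- nothing else to compute.
    return i

def circular_index(head, i, blocksize):
    # Circular list: every access pays an extra add plus a
    # test/branch to wrap.  This is the cheap form Tim suggests in
    # place of 'i % blocksize' (an integer divide can cost more than
    # 100 cycles on some boxes).  Valid while head and i are both
    # smaller than blocksize, so head + i < 2 * blocksize.
    j = head + i
    if j >= blocksize:
        j -= blocksize
    return j
```

Even in the cheap form, the wrap adjustment lands on every indexed access, which is why the circular scheme slows the common case rather than only the queue case.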
>> resizing is more complicated, > I'm not sure that's the case. Neither am I, really . > And the gimmick approach has a 'complication' when the tail hits the > end of the block even when the queue length is constant. That appears to be its only drawback. >> and all the listobject.c internals would get more complicated by >> needing to deal with that the list contents are no longer >> necessarily in a contiguous slice of a C vector (from the C POV, the >> list contents could be in two disjoint contiguous slices). > Not if modulo pointer arithmetic is used. Sorry, but I consider this one too unrealistic to continue talking about. Start here: #define PyList_GET_ITEM(op, i) (((PyListObject *)(op))->ob_item[i]) #define PyList_SET_ITEM(op, i, v) \ (((PyListObject *)(op))->ob_item[i] = (v)) What are you going to replace those with? They must be fast, and they're expanded in over 100 places in the core. Then move on to here (the simplest function in listobject.c that actually does something ): /* Reverse a slice of a list in place, from lo up to (exclusive) hi. */ static void reverse_slice(PyObject **lo, PyObject **hi) { assert(lo && hi); --hi; while (lo < hi) { PyObject *t = *lo; *lo = *hi; *hi = t; ++lo; --hi; } } How are you going to rewrite that? For that matter, how are you going to change the function *signature* when "a slice of a Python list" can no longer be represented by a low and high pointer? When you're done with that, do this one: /* Merge the na elements starting at pa with the nb elements starting at pb * in a stable way, in-place. na and nb must be > 0, and pa + na == pb. * Must also have that *pb < *pa, that pa[na-1] belongs at the end of the * merge, and should have na <= nb. See listsort.txt for more info. * Return 0 if successful, -1 if error.
*/ static int merge_lo(MergeState *ms, PyObject **pa, int na, PyObject **pb, int nb) { int k; PyObject *compare; PyObject **dest; int result = -1; /* guilty until proved innocent */ int min_gallop = ms->min_gallop; assert(ms && pa && pb && na > 0 && nb > 0 && pa + na == pb); if (MERGE_GETMEM(ms, na) < 0) return -1; memcpy(ms->a, pa, na * sizeof(PyObject*)); dest = pa; pa = ms->a; *dest++ = *pb++; --nb; if (nb == 0) goto Succeed; if (na == 1) goto CopyB; compare = ms->compare; for (;;) { int acount = 0; /* # of times A won in a row */ int bcount = 0; /* # of times B won in a row */ /* Do the straightforward thing until (if ever) one run * appears to win consistently. */ for (;;) { assert(na > 1 && nb > 0); k = ISLT(*pb, *pa, compare); if (k) { if (k < 0) goto Fail; *dest++ = *pb++; ++bcount; acount = 0; --nb; if (nb == 0) goto Succeed; if (bcount >= min_gallop) break; } else { *dest++ = *pa++; ++acount; bcount = 0; --na; if (na == 1) goto CopyB; if (acount >= min_gallop) break; } } /* One run is winning so consistently that galloping may * be a huge win. So try that, and continue galloping until * (if ever) neither run appears to be winning consistently * anymore. */ ++min_gallop; do { assert(na > 1 && nb > 0); min_gallop -= min_gallop > 1; ms->min_gallop = min_gallop; k = gallop_right(*pb, pa, na, 0, compare); acount = k; if (k) { if (k < 0) goto Fail; memcpy(dest, pa, k * sizeof(PyObject *)); dest += k; pa += k; na -= k; if (na == 1) goto CopyB; /* na==0 is impossible now if the comparison * function is consistent, but we can't assume * that it is. 
*/ if (na == 0) goto Succeed; } *dest++ = *pb++; --nb; if (nb == 0) goto Succeed; k = gallop_left(*pa, pb, nb, 0, compare); bcount = k; if (k) { if (k < 0) goto Fail; memmove(dest, pb, k * sizeof(PyObject *)); dest += k; pb += k; nb -= k; if (nb == 0) goto Succeed; } *dest++ = *pa++; --na; if (na == 1) goto CopyB; } while (acount >= MIN_GALLOP || bcount >= MIN_GALLOP); ++min_gallop; /* penalize it for leaving galloping mode */ ms->min_gallop = min_gallop; } Succeed: result = 0; Fail: if (na) memcpy(dest, pa, na * sizeof(PyObject*)); return result; CopyB: assert(na == 1 && nb > 0); /* The last element of pa belongs at the end of the merge. */ memmove(dest, pb, nb * sizeof(PyObject *)); dest[nb] = *pa; return 0; } There are thousands of additional lines that assume a list is one contiguous vector of pointers, and that the dead-simple and killer-fast ordinary contiguous-vector C idioms work. >> It would have the advantage that resizing is never needed until >> you're wholly out of room, but that seems minor compared to the >> drawbacks. Just thought I'd repeat that . From martin at v.loewis.de Sun Jan 11 03:54:25 2004 From: martin at v.loewis.de (Martin v. Loewis) Date: Sun Jan 11 03:57:04 2004 Subject: [Python-Dev] collections module In-Reply-To: <87ad4vcpo1.fsf@hydra.localdomain> References: <87n08vcrvh.fsf@hydra.localdomain> <40008C05.7030803@v.loewis.de> <87ad4vcpo1.fsf@hydra.localdomain> Message-ID: <40010F41.1020206@v.loewis.de> Kurt B. Kaiser wrote: > I am talking about a modification to listobject.c, also. So you suggest that the list type become circular??? That will cause a major ABI incompatibility, as all modules that use PyList_GET_SIZE need to be recompiled. In addition, it causes a slow-down for the standard use of lists. So you would make the general case slower, just to support a special case. Why are you proposing that? 
Regards, Martin From pf_moore at yahoo.co.uk Sun Jan 11 07:48:40 2004 From: pf_moore at yahoo.co.uk (Paul Moore) Date: Sun Jan 11 07:48:42 2004 Subject: [Python-Dev] Re: collections module References: <200401042241.i04MfqE06932@c-24-5-183-134.client.comcast.net> <000a01c3d672$097e6500$95b52c81@oemcomputer> Message-ID: "Raymond Hettinger" writes: > I would like to establish a new module for some collection classes. > > The first type would be a bag, modeled after one of Smalltalk's > key collection classes (similar to MultiSets in C++ and bags in > Objective C). [...] Would it make sense to move the set type from builtins into this module? As set is moving in 2.4 anyway (from the sets module to a builtin) this isn't likely to cause any additional compatibility issues. Paul -- This signature intentionally left blank From skip at manatee.mojam.com Sun Jan 11 08:01:36 2004 From: skip at manatee.mojam.com (Skip Montanaro) Date: Sun Jan 11 08:01:43 2004 Subject: [Python-Dev] Weekly Python Bug/Patch Summary Message-ID: <200401111301.i0BD1aCb030583@manatee.mojam.com> Bug/Patch Summary ----------------- 641 open / 4495 total bugs (+128) 231 open / 2529 total patches (+45) New Bugs -------- Scripts need platform-dependent handling (2004-01-04) http://python.org/sf/870479 Standardize on systemwide bgenlocations INCLUDEDIR (2004-01-05) http://python.org/sf/870972 bgenlocationscustomize: per-user configuration of bgen (2004-01-05) http://python.org/sf/870997 PyOS_snprintf segfaults on missing native snprintf (2004-01-05) http://python.org/sf/871026 Py_SequenceFast can mask errors (2004-01-06) http://python.org/sf/871704 sys.exec_prefix does not work (2004-01-06) http://python.org/sf/871747 README build instructions for fpectl (2004-01-06) http://python.org/sf/872175 fpectl module needs updating (2004-01-07) http://python.org/sf/872265 Finding X11 fails on SuSE Opteron (2004-01-07) http://python.org/sf/872392 Python 2.3.3 test_tempfile test_mode() fails on AIX 5.2 (2004-01-07) 
http://python.org/sf/872686 Python 2.3.3 urllib proxy support fails (2004-01-07) http://python.org/sf/872736 os.access() documentation should stress race conditions (2004-01-07) http://python.org/sf/872769 How to pass the proxy server use socket (2004-01-07) http://python.org/sf/872815 platform module needs update (2004-01-08) http://python.org/sf/873005 Wrong URL for external documentation link (Lib ref 11.9) (2004-01-08) http://python.org/sf/873205 Platform module not documented (2004-01-09) http://python.org/sf/873652 wrong answers from ctime (2004-01-09) http://python.org/sf/874042 ConfigParser's get method gives utf-8 for a utf-16 config... (2004-01-10) http://python.org/sf/874354 Errors testing python 2.3.3 (2004-01-10) http://python.org/sf/874534 httplib fails on Akamai URLs (2004-01-11) http://python.org/sf/874842 New Patches ----------- make test_urllib2 work on Windows (2004-01-04) http://python.org/sf/870606 optparse int, long types support binary, octal and hex forma (2004-01-05) http://python.org/sf/870807 add link to DDJ's article (2004-01-05) http://python.org/sf/871402 trace.py: simple bug in write_results_file (2004-01-08) http://python.org/sf/873211 Marshal clean-up (2004-01-08) http://python.org/sf/873224 list.__setitem__(slice) behavior (2004-01-08) http://python.org/sf/873305 email/Message.py: del_param fails when specifying a header (2004-01-08) http://python.org/sf/873418 Bluetooth support for socket module (2004-01-09) http://python.org/sf/874083 Closed Bugs ----------- Configure error with 2.3.3 on AIX 5.2 using gcc 3.3.2 (2003-12-31) http://python.org/sf/868529 Closed Patches -------------- tests and processors patch for urllib2 (2003-12-02) http://python.org/sf/852995 From edcjones at erols.com Sun Jan 11 12:10:40 2004 From: edcjones at erols.com (Edward C. Jones) Date: Sun Jan 11 12:14:44 2004 Subject: [Python-Dev] Tracebacks into C code Message-ID: <40018390.10303@erols.com> Suppose: I have a Python extension written in C. 
"My_Func" is a function in the extension which is visible at the Python level. "My_Func" calls the C function "C_func1". "C_func1" calls the C function "C_func2". "C_func2" raises an exception by calling something like "PyErr_SetString". How do I make the Python traceback include "C_func1" and "C_func2"? From python at bobs.org Sun Jan 11 13:19:14 2004 From: python at bobs.org (Bob Arnson) Date: Sun Jan 11 13:19:41 2004 Subject: [Python-Dev] Python MSI Message-ID: <443093438.20040111131914@bobs.org> "Martin v. Loewis" writes: > I have committed python/nondist/sandbox/msi to the CVS. > This is a MSI generator for Python on Windows, which I'd > like to propose as a replacement for Wise. It looks good! There are a couple of UI quirks, as most MSIs have, when the system font settings are tweaked. One suggestion: You should author the InitialTargetDir custom action into the InstallExecuteSequence table so a silent install supplies a default directory. You can already run a silent install by passing in a TARGETDIR property setting to the MSIEXEC command line, but if you forget, everything lands in the root of your boot drive. -- sig://boB http://foobob.com From tjreedy at udel.edu Sun Jan 11 13:28:57 2004 From: tjreedy at udel.edu (Terry Reedy) Date: Sun Jan 11 13:28:56 2004 Subject: [Python-Dev] Re: collections module References: <87n08vcrvh.fsf@hydra.localdomain> Message-ID: On enhancing list object: queue vs. deque. My general impression is that queues, while less common than stacks, are used enough in enough different algorithms that adding one implementation field to list headers so that pop(0) is an O(1) pointer move might be an overall win for Python. The current O(n) move everything every time behavior is certainly a nuisance when queues are needed. Deques, on the other hand, seem to be more a theoretical construct. Algorithm books include them for completeness in describing possible list disciplines, but generally ignore them thereafter. 
I am hard put to remember an algorithm that actually needs them. They are certainly much rarer than queues. So I do not think trying to make prepends O(1) should be considered further. > Possibly -- nobody has given enough detail to say. I'd be inclined to > continue doing what we do now when an append runs out of room, which is to > ask for more room, overallocating by a small percentage of the current list > size. If there's free room "at the left end", though, it may or may not be > more efficient to memmove the guts left. Those are important details that > haven't gotten beyond the hand-wavy stage. My possibly naive thought is that before reallocating, look at the ratio of left free to total. If memmove would result in as much, or perhaps nearly as much, over-allocation as realloc, then just do that. Certainly if the ratio is small (like say .1), don't bother with memmove. The exact cutoff is a tuning matter. Separate issue: for best performance, one would like to be able to trade space for time and preallocate a large enough block so that most append faults result in a memmove back to the left. Would the current builtin shrinkage discipline allow this? Or does "q=[0]*1000; q[:]=[]" shrink the space for q back to nearly nothing? (One might also want to do same for stack. And yes, I remember: 'user tuning' is a 'bad idea'. But....) Terry J. Reedy From tim.one at comcast.net Sun Jan 11 13:30:28 2004 From: tim.one at comcast.net (Tim Peters) Date: Sun Jan 11 13:30:32 2004 Subject: [Python-Dev] collections module In-Reply-To: <40010F41.1020206@v.loewis.de> Message-ID: [Kurt B. Kaiser] >> I am talking about a modification to listobject.c, also. [Martin v. Loewis] > So you suggest that the list type become circular??? That's my understanding, but already belabored at length . > That will cause a major ABI incompatibility, as all modules > that use PyList_GET_SIZE need to be recompiled.
Well, the macros aren't supposed to be used outside the core -- external extensions are supposed to be using the representation-hiding PyList_Size, PyList_GetItem, PyList_SetItem functions. If we keep it contiguous, users of the macros may need to be recompiled anyway, since the layout of the list header would change. > In addition, it causes a slow-down for the standard use of lists. That's one of my two major objections. The other is the enormous amount of code changes that would be required throughout listobject.c -- all "the usual" C idioms for mucking with vectors would no longer work correctly as-is, whether it's the typical "for (p = list->ob_item; p < fence; ++p)" loop, or use of library functions like memmove and memcpy. That's a ton of additional complication, with obvious bad implications for transparency, stability, maintainability, code bloat, and efficiency. > So you would make the general case slower, just to support a > special case. Why are you proposing that? Kurt's worried about the expense of memmove(), but not about the expense of integer mod. To be fair(er >), I'm not convinced that anyone has dreamt up *detailed* suggestions sufficient to allow common queue access patterns to be reasonably zippy. Think about this one: q = [incoming() for i in range(1000)] while True: q.append(incoming()) consume(q.pop(0)) That's just a queue in steady-state, with a backlog of 1000 but thereafter with balanced entry and exit rates. We currently realloc() q on every append(), rounding up its size, where "its size" is taken from ob_size. That would no longer be *correct*: ob_size stays steady at 1000, but append needs ever-larger indices relative to the start of the allocated memory. IOW, while ob_size bears a useful relationship to the amount of memory we actually allocate today, that relationship would no longer hold (ob_size would become an arbitrarily poor lower bound). So what are we going to do instead, exactly?
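[Editorial note: the steady-state arithmetic here can be checked with a toy count of pointer moves. This is a hypothetical model with invented function names; it assumes the 1024-slot allocation Tim mentions and ignores realloc and overallocation growth entirely:]

```python
def contiguous_moves(backlog, iterations):
    # Today's list used as a queue: each pop(0) memmoves the
    # remaining 'backlog' pointers left by one slot, so every
    # iteration of the steady-state loop moves 'backlog' pointers.
    return iterations * backlog

def head_index_moves(backlog, iterations, allocated):
    # Speculative gimmick: remember a head offset instead.  pop(0)
    # just advances the head; only when append hits the end of the
    # allocated block are the live pointers memmoved back to the
    # front of the block.
    head, tail, moves = 0, backlog, 0
    for _ in range(iterations):
        if tail == allocated:        # out of room at the right end
            moves += backlog         # compact: memmove live items left
            tail -= head
            head = 0
        tail += 1                    # append at the tail
        head += 1                    # pop(0) advances the head
    return moves
```

With a backlog of 1000 and 1024 allocated slots, the head-offset scheme compacts only about every 24 appends, so it moves roughly 1/24th as many pointers as today's pop(0); shrink the slack toward zero (the 1023-element backlog case) and it degenerates back to moving the whole backlog on nearly every iteration.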
We actually allocate enough room for 1024 elements now (although Python doesn't currently remember that). If we start to remember when we're out of room, and memmove() the whole blob left 24 slots when append hits the end of available room, then every 24 (+-1 -- doesn't matter at this level) iterations of the loop, we end up memmove'ing 1000 pointers. If the steady-state backlog were, say, 1023 elements instead, we'd end up memmove'ing 1023 pointers on *every* loop iteration, and each append becomes a linear-time operation. OTOH, if we "round up the size" based on the number of allocated (not used) slots, then we get back amortized O(1) time per append -- but the amount of allocated memory needed by the example grows without bound, despite that the number of used slots remains constant. No "pure" strategy I've seen seems good enough. A circular implementation has no problem with the example, or, indeed, with any queue access pattern, although a circular implementation would be reluctant to reduce the amount of allocated space when the number of used slots decreases (since the unused slots are "a hole in the middle" of the allocated slots, and we can't give back a hole to the system malloc). From martin at v.loewis.de Sun Jan 11 14:00:19 2004 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun Jan 11 14:04:16 2004 Subject: [Python-Dev] Tracebacks into C code In-Reply-To: <40018390.10303@erols.com> References: <40018390.10303@erols.com> Message-ID: <40019D43.4090303@v.loewis.de> Edward C. Jones wrote: > Suppose: > I have a Python extension written in C. > "My_Func" is a function in the extension which is visible at the > Python level. > "My_Func" calls the C function "C_func1". > "C_func1" calls the C function "C_func2". > "C_func2" raises an exception by calling something like > "PyErr_SetString". > How do I make the Python traceback include "C_func1" and "C_func2"? You need to create frame objects. 
See Modules/pyexpat.c for an example. Regards, Martin P.S. This is OT for python-dev. From martin at v.loewis.de Sun Jan 11 14:13:49 2004 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun Jan 11 14:14:14 2004 Subject: [Python-Dev] Python MSI In-Reply-To: <443093438.20040111131914@bobs.org> References: <443093438.20040111131914@bobs.org> Message-ID: <4001A06D.6000608@v.loewis.de> Bob Arnson wrote: > One suggestion: You should author the InitialTargetDir custom action > into the InstallExecuteSequence table so a silent install supplies > a default directory. You can already run a silent install by passing in > a TARGETDIR property setting to the MSIEXEC command line, but if > you forget, everything lands in the root of your boot drive. Can you please elaborate? I don't quite understand the relationship between the UI sequence and the execute sequence: - Is it true that, with full UI, it first executes the entire UI sequence, and then the full execute sequence? Then why do people typically duplicate actions in both sequences? - If not: what parts of the execute sequence are executed under what conditions? I'm somewhat worried that the action would be executed even after the user was already queried for the targetdir (although the condition should avoid TARGETDIR being reset). Also, I currently install by default into [WindowsVolume]PythonXY. Would it be better to install into [ROOTDIR]PythonXY, as this would be the drive with most space? Regards, Martin From edcjones at erols.com Sun Jan 11 14:45:51 2004 From: edcjones at erols.com (Edward C. Jones) Date: Sun Jan 11 14:49:55 2004 Subject: [Python-Dev] Re: Tracebacks into C code Message-ID: <4001A7EF.1010503@erols.com> *Martin v. Löwis wrote:* > Edward C. Jones wrote: > > Suppose: > > I have a Python extension written in C. > > "My_Func" is a function in the extension which is visible at the > > Python level. > > "My_Func" calls the C function "C_func1".
> > "C_func1" calls the C function "C_func2". > > "C_func2" raises an exception by calling something like > > "PyErr_SetString". > > How do I make the Python traceback include "C_func1" and "C_func2"? > You need to create frame objects. See Modules/pyexpat.c for an example. That looks complicated. > Regards, > Martin > > P.S. This is OT for python-dev. I should have directly asked: Should some traceback-through-C capability be put in the Python API? From barry at barrys-emacs.org Sun Jan 11 15:25:01 2004 From: barry at barrys-emacs.org (Barry Scott) Date: Sun Jan 11 15:25:39 2004 Subject: [Python-Dev] re: The os module, unix and win32 In-Reply-To: <0HR7006O9F7GKD@mta1.srv.hcvlny.cv.net> References: <0HR7006O9F7GKD@mta1.srv.hcvlny.cv.net> Message-ID: <6.0.1.1.2.20040111201249.02129328@torment.chelsea.private> At 09-01-2004 04:02, Arthur wrote: >But tangentially on topic - is the possibility of getting enough of the >win32all API functions into the core distribution to give more Windows >standard functionality to distutils - start menu entries in particular. That needs COM support and that's a lot more API than is proposed for popen5 and similar. By the time you have COM support you have committed to include all of win32all in standard python. Which is basically what ActiveState Python is. Barry From martin at v.loewis.de Sun Jan 11 15:43:22 2004 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun Jan 11 15:43:51 2004 Subject: [Python-Dev] Re: Tracebacks into C code In-Reply-To: <4001A7EF.1010503@erols.com> References: <4001A7EF.1010503@erols.com> Message-ID: <4001B56A.2020607@v.loewis.de> Edward C. Jones wrote: > Should some traceback-through-C capability be put in the Python API? But there is traceback-through-C capability in the Python API! If you have a specific patch in mind, let us know.
Martin From python at bobs.org Sun Jan 11 15:56:59 2004 From: python at bobs.org (Bob Arnson) Date: Sun Jan 11 15:58:56 2004 Subject: [Python-Dev] Python MSI In-Reply-To: <4001A06D.6000608@v.loewis.de> References: <443093438.20040111131914@bobs.org> <4001A06D.6000608@v.loewis.de> Message-ID: <426179714.20040111155659@bobs.org> Sunday, January 11, 2004, 2:13:49 PM, you wrote: > Can you please elaborate? I don't quite understand the relationship > between the UI sequence and the execute sequence: You're not alone! At work, we have a wall where we post clippings about MSI weirdnesses; we call it the "MSI Hall of Misery" because it usually represents days of research and experimenting without much to show for the effort... > - Is it true that, with full UI, it first executes the entire UI > sequence, and then the full execute sequence? Then why do people > typically duplicate actions in both sequences? Yes. The UI sequence goes through the "ready to install" dialog and then the execute sequence takes over. That's where stuff actually gets installed. The UI sequence then picks back up for the "install complete" dialog. You need to duplicate many actions so that the UI accurately reflects what the actions do (usually setting properties like directory names). They have to exist in the execute sequence if they affect how the system is modified and also if you want a silent install to work. Interesting sidenote: Both Office (2000 and later) and Visual Studio (2002 and later) use MSI but only for the execute sequence. The UIs were coded in C/C++. > I'm somewhat worried that the action would be executed even after > the user was already queried for the targetdir (although the > condition should avoid TARGETDIR being reset). Yes, you want to keep the same condition in both sequences. That lets a user pass in a default installation directory for both full-UI and silent installs. > Also, I currently install by default into [WindowsVolume]PythonXY. 
> Would it be better to install into [ROOTDIR]PythonXY, as this would > be the drive with most space? I assume you mean ROOTDRIVE and I'd definitely agree that's the right default directory. If anyone still uses a network install of Windows, WindowsVolume would be on the network. -- sig://boB http://foobob.com From allison at sumeru.stanford.EDU Sun Jan 11 17:02:13 2004 From: allison at sumeru.stanford.EDU (Dennis Allison) Date: Sun Jan 11 17:02:32 2004 Subject: [Python-Dev] htmllib & formatter & rendering real htmldocs Message-ID: I'm pondering the problem of rendering webpages into pdf. The HTML involved is 4.01+ with some minimal Javascript which can either be special cased or ignored. Htmllib has not been brought up to date although I see from traffic in this list's archive from amk (amk at amk.ca) saying that he/she was working on it. Updating htmllib to 4.01+ does not seem to be a big task--updating the formatter and writer modules to handle things like CSS looks to be significant. Has anyone stepped up to this task? BTW, (this in answer to amk's question of a while back) I do use both the AbstractFormatter and the DumbWriter in production code where I need to render a text version of a web page to be sent as email. From kbk at shore.net Sun Jan 11 17:10:36 2004 From: kbk at shore.net (Kurt B. Kaiser) Date: Sun Jan 11 17:10:43 2004 Subject: [Python-Dev] collections module In-Reply-To: <40010F41.1020206@v.loewis.de> (Martin v. Loewis's message of "Sun, 11 Jan 2004 09:54:25 +0100") References: <87n08vcrvh.fsf@hydra.localdomain> <40008C05.7030803@v.loewis.de> <87ad4vcpo1.fsf@hydra.localdomain> <40010F41.1020206@v.loewis.de> Message-ID: <87k73y16hv.fsf@hydra.localdomain> "Martin v. Loewis" writes: > That will cause a major ABI incompatibility, as all modules > that use PyList_GET_SIZE need to be recompiled. In addition, > it causes a slow-down for the standard use of lists. So you > would make the general case slower, just to support a > special case.
Why are you proposing that? I'm not proposing it, I'd have to provide an implementation to do that :-) I just asked the question, "Why not a circular buffer implementation?" Tim has been patient enough to get my education on that subject started. ob_size should continue to be the number of pointers to objects. The GET_ITEM/SET_ITEM macros are more problematic in terms of overhead because an additional memory access would be required to get the current size of the block containing the pointer vector. As Tim said, other methods under consideration could also change the ABI. If the overall slowdown of lists to get fast queues was 5%, that would no doubt be unacceptable. If it was 0.5%, I imagine it would be acceptable? Note that if no pop(0)'s are ever done on the list there is no wrap around and the slowdown is limited to the access of blocksize plus a test; no modulo arithmetic is necessary. This is also the case if the list doesn't grow and items are only removed, including list[0]. And in the latter case, the memmoves associated with internal deletions will dominate the performance in all implementations. -- KBK From amk at amk.ca Sun Jan 11 17:54:27 2004 From: amk at amk.ca (A.M. Kuchling) Date: Sun Jan 11 17:55:42 2004 Subject: [Python-Dev] Re: htmllib & formatter & rendering real htmldocs In-Reply-To: References: Message-ID: <20040111225427.GB2140@rogue.amk.ca> On Sun, Jan 11, 2004 at 02:02:13PM -0800, Dennis Allison wrote: > Htmllib has not been brought up to date although I see from traffic in > this list's archive from amk (amk at amk.ca) saying that he/she was > working on it. It's patch #836088 on SourceForge. Updating the formatter and writer modules is not in the plan, though, only adding methods for elements that aren't currently handled.
--amk From allison at sumeru.stanford.EDU Sun Jan 11 18:03:51 2004 From: allison at sumeru.stanford.EDU (Dennis Allison) Date: Sun Jan 11 18:04:01 2004 Subject: [Python-Dev] Re: htmllib & formatter & rendering real htmldocs In-Reply-To: <20040111225427.GB2140@rogue.amk.ca> Message-ID: Thanks... I'll fetch it. It's a place to start (-: -d On Sun, 11 Jan 2004, A.M. Kuchling wrote: > On Sun, Jan 11, 2004 at 02:02:13PM -0800, Dennis Allison wrote: > > Htmllib has not been brought up to date although I see from traffic in > > this list's archive from amk (amk at amk.ca) saying that he/she was > > working on it. > > It's patch #836088 on SourceForge. Updating the formatter and writer > modules is not in the plan, though, only adding methods for elements that > aren't currently handled. > > --amk > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/allison%40sumeru.stanford.edu > From ajsiegel at optonline.net Sun Jan 11 19:07:59 2004 From: ajsiegel at optonline.net (Arthur) Date: Sun Jan 11 19:08:02 2004 Subject: [Python-Dev] re: The os module, unix and win32 Message-ID: <0HRC00H9FODC7D@mta7.srv.hcvlny.cv.net> Barry writes >At 09-01-2004 04:02, Arthur wrote: >>But tangentially on topic - is the possibility of getting enough of the >>win32all API functions into the core distribution to give more Windows >>standard functionality to distutils - start menu entries in particular. >That needs COM support and that's a lot more API than is proposed for >popen5 and similar. By the time you have COM support you have committed >to include all of win32all in standard python. Which is basically what >ActiveState Python is. But Thomas (who should know) wrote: >bdist_wininst can create Windows shortcuts in its postinstall script.
Thomas (off-list) had also given me a cite to some code, which I haven't had the opportunity to try to digest. Is this a semantic issue as to what can and cannot be done on Windows with distutils? I thought I gathered from Thomas that the functionality he is referencing is new as of 2.3. Is this the wrong place to be pursuing this? Art From tim.one at comcast.net Sun Jan 11 19:31:28 2004 From: tim.one at comcast.net (Tim Peters) Date: Sun Jan 11 19:31:30 2004 Subject: [Python-Dev] collections module In-Reply-To: <87k73y16hv.fsf@hydra.localdomain> Message-ID: [Kurt B. Kaiser] > ... > If the overall slowdown of lists to get fast queues was 5%, that would > no doubt be unacceptable. If it was 0.5%, I imagine it would be > acceptable? Probably not -- it's too much the tail wagging the dog, as Python lists, used for what they're best at, are core language workhorses already. Timing changes under 10% are very hard to establish convincingly, though, especially because the relative performance of Python features varies so widely across platforms (some mix of the code the platform C compiler generates, the quality of the platform C libraries, and pure accidents, like whether the top of the main eval loop happens to end up fighting severe I-cache conflicts with other high-frequency code). > Note that if no pop(0)'s are ever done on the list there is no wrap > around and the slowdown is limited to the access of blocksize plus a > test; no modulo arithmetic is necessary. This is also the case if the > list doesn't grow and items are only removed, including list[0]. And > in the latter case, the memmoves associated with internal deletions > will dominate the performance in all implementations. Neither case holds for a queue; the question that started this was whether to provide a fancy queue implementation in C, and (of course) Guido would rather see Python lists efficiently usable for that purpose too.
I suspected that might not be a realistic hope, and now I'm starting to *think* it too . From tim.one at comcast.net Sun Jan 11 19:31:29 2004 From: tim.one at comcast.net (Tim Peters) Date: Sun Jan 11 19:31:36 2004 Subject: [Python-Dev] Python MSI In-Reply-To: <426179714.20040111155659@bobs.org> Message-ID: [Martin v. Löwis] >> Can you please elaborate? I don't quite understand the relationship >> between the UI sequence and the execute sequence: [Bob Arnson] > You're not alone! At work, we have a wall where we post > clippings about MSI weirdnesses; we call it the "MSI Hall of > Misery" because it usually represents days of research and > experimenting without much to show for the effort... Hey, how would you like to join the SpamBayes project? We have a similar thing going with reverse-engineering a billion and one un- and mis-documented features of the Outlook and MAPI APIs. We could really use someone with experience in recording them all on one wall .
They are > certainly much rarer than queues. So I do not think trying to make > prepends O(1) should be considered further. I expect that if queue-like behavior *can* be supported efficiently using an always-contiguous vector, efficient deque-like behavior will fall out for free. > My possibly naive thought is that before reallocating, look at the > ratio of left free to total. I don't know what that means (neither the "left free" part nor what exactly "total" is counting). The devil is in the details here. > If memmove would result in as much, or perhaps nearly as much, > over-allocation as realloc, memmove() leaves the total number of unused slots unchanged. realloc can do that too, or increase it by any amount, or decrease it by any amount. I just don't know what you're trying to say ... explain how it works, step by step, in the

    q = [incoming() for i in range(1000)]
    while True:
        q.append(incoming())
        consume(q.pop(0))

example. > then just do that. Certainly if the ratio is small (like say .1), > don't bother with memmove. The exact cutoff is a tuning matter. > > Separate issue: for best performance, one would like to be able to trade > space for time and preallocate a large enough block so that most > append faults result in a memmove back to the left. Would the current > builtin shrinkage discipline allow this? Or does "q=[0]*1000; > q[:]=[]" shrink the space for q back to nearly nothing? Python *offers* to give 992 slots back; whether the platform realloc() accepts the offer is up to it. In the "details are everything" department, Python *also* creates a temp vector of length 1000, and copies all of q[:] into it before it deletes any of q: emptying a list is a linear-time operation, not constant-time. That's because it's not safe to decref anything in a list while tearing it down (e.g., the list may be visible to a __del__ method, which must not be allowed to see NULL pointers, or reclaimed objects, in the list). > (One might also want to do same for stack.
> And yes, I remember: 'user tuning' is a 'bad idea'. But....) From python at rcn.com Sun Jan 11 20:38:14 2004 From: python at rcn.com (Raymond Hettinger) Date: Sun Jan 11 20:39:04 2004 Subject: [Python-Dev] collections module In-Reply-To: Message-ID: <003901c3d8ac$bf750e40$a226c797@oemcomputer> > [Kurt B. Kaiser] > > Note that if no pop(0)'s are ever done on the list there is no wrap > > around and the slowdown is limited to the access of blocksize plus a > > test; no modulo arithmetic is necessary. This is also the case if the > > list doesn't grow and items are only removed, including list[0]. And > > in the latter case, the memmoves associated with internal deletions > > will dominate the performance in all implementations. [Timbot] > Neither case holds for a queue; the question that started this was whether > to provide a fancy queue implementation in C, and (of course) Guido would > rather see Python lists efficiently usable for that purpose too. I > suspected that might not be a realistic hope, and now I'm starting to > *think* it too . It needn't be especially fancy. Here is what I had in mind:

    n = 50
    class fifo(object):

        def __init__(self):
            # block[0:n] is for data and block[n] is a link field
            self.head = self.tail = [None] * (n+1)
            self.headptr = self.tailptr = 0

        def push(self, x):
            if self.headptr == n:
                newblock = [None] * (n+1)
                self.head[n] = newblock
                self.head = newblock
                self.headptr = 0
            self.head[self.headptr] = x
            self.headptr += 1

        def pop(self):
            if self.tail is self.head and self.tailptr >= self.headptr:
                raise IndexError
            if self.tailptr == n:
                self.tail = self.tail[n]
                self.tailptr = 0
            x = self.tail[self.tailptr]
            self.tailptr += 1
            return x

The memory manager is called once for every 50 queue insertions and memory is freed after every 50 pops. Outside of that, every push and pop is just a fast array access and pointer increment. Long queues incur about a 15-20% space overhead as compared to a straight-list implementation.
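The block-linked fifo above runs as posted; here it is again in runnable form, with a FIFO-order check appended at the bottom (the exercise is an addition for illustration, not part of the original message — the class body is unchanged):

```python
n = 50

class fifo(object):
    # transcription of the block-linked queue sketched above

    def __init__(self):
        # block[0:n] is for data and block[n] is a link field
        self.head = self.tail = [None] * (n + 1)
        self.headptr = self.tailptr = 0

    def push(self, x):
        if self.headptr == n:
            newblock = [None] * (n + 1)
            self.head[n] = newblock
            self.head = newblock
            self.headptr = 0
        self.head[self.headptr] = x
        self.headptr += 1

    def pop(self):
        if self.tail is self.head and self.tailptr >= self.headptr:
            raise IndexError
        if self.tailptr == n:
            self.tail = self.tail[n]
            self.tailptr = 0
        x = self.tail[self.tailptr]
        self.tailptr += 1
        return x

q = fifo()
for i in range(120):          # spans three 50-item blocks
    q.push(i)
assert [q.pop() for _ in range(120)] == list(range(120))
```

Pushing 120 items forces two block links, and the items still come back out in insertion order; a further pop on the now-empty fifo raises IndexError as intended.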
Speedwise, this beats the socks off of the current approach. Raymond Hettinger From kbk at shore.net Sun Jan 11 21:07:47 2004 From: kbk at shore.net (Kurt B. Kaiser) Date: Sun Jan 11 21:07:52 2004 Subject: [Python-Dev] collections module References: Message-ID: <87r7y5zzpo.fsf@hydra.localdomain> "Tim Peters" writes: > [Kurt B. Kaiser] >> If the overall slowdown of lists to get fast queues was 5%, that would >> no doubt be unacceptable. If it was 0.5%, I imagine it would be >> acceptable? > > Probably not -- it's too much the tail wagging the dog, as Python lists, > used for what they're best at, are core language workhorses already. Certainly true. > Timing changes under 10% are very hard to establish convincingly, > though, especially because the relative performance of Python > features varies so widely across platforms (some mix of the code the > platform C compiler generates, the quality of the platform C > libraries, and pure accidents, like whether the top of the main eval > loop happens to end up fighting severe I-cache conflicts with other > high-frequency code).
If 0.5% is too much, and if the impact can't be measured reliably at 20x that level it's Catch 22. >> Note that if no pop(0)'s are ever done on the list there is no wrap >> around and the slowdown is limited to the access of blocksize plus a >> test; no modulo arithmetic is necessary. This is also the case if the >> list doesn't grow and items are only removed, including list[0]. And >> in the latter case, the memmoves associated with internal deletions >> will dominate the performance in all implementations. > > Neither case holds for a queue; I was trying to make the point that in a non-queue situation the impact would be minimized. > the question that started this was whether to provide a fancy queue > implementation in C, and (of course) Guido would rather see Python > lists efficiently usable for that purpose too. I suspected that > might not be a realistic hope, and now I'm starting to *think* it > too . Well, the speculative gimmick appears to put almost the whole load on the queues and would surely get a significant improvement over Queue without (apparently) unduly affecting other list operations. Interesting discussion! If I need a queue extension I know where to start. -- KBK From tdelaney at avaya.com Sun Jan 11 21:08:33 2004 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Sun Jan 11 21:08:40 2004 Subject: [Python-Dev] collections module Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DE01063A16@au3010avexu1.global.avaya.com> > From: Tim Peters > > q = [incoming() for i in range(1000)] > while True: > q.append(incoming()) > consume(q.pop(0)) > > We currently realloc() q on every append(), rounding up its > size, where "its size" is taken from ob_size. That would no longer be > *correct*: ob_size > stays steady at 1000, but append needs ever-larger indices > relative to the > start of the allocated memory. 
IOW, while ob_size bears a useful > relationship to the amount of memory we actually allocate today, that > relationship would no longer hold (ob_size would become an > arbitrarily poor > lower bound). > > So what are we going to do instead, exactly? We actually > allocate enough > room for 1024 elements now (although Python doesn't currently remember > that). If we start to remember when we're out of room, and > memmove() the > whole blob left 24 slots when append hits the end of > available room, then > every 24 (+-1 -- doesn't matter at this level) iterations of > the loop, we > end up memmove'ing 1000 pointers. If the steady-state > backlog were, say, > 1023 elements instead, we'd end up memmove'ing 1023 pointers > on *every* loop > iteration, and each append becomes a linear-time operation. Hmm - could we do both? We could choose what to do based on the percentage of the memory block actually in use at the time we hit the end of the block. For example, if we hit the end of the block, and over 75% of it is used, we realloc, then memmove to the beginning of the block. If less than 75% of the memory block is used, we just do the memmove. So in the above case, when we've got 1024 elements allocated, with 1000 entries in the list, we hit the end. There's more than 75% of the block in use, so we realloc to (e.g.) 2048 elements and memmove the 1000 entries to the start of the block. After another 1048 times around the loop we hit the end of the block again. This time there's less than 75% of the block used, so we just memmove all the entries to the start of the block. This then becomes a new steady state, but instead of memmoveing 1000 pointers every 24 times round the loop, we memmove 1000 pointers every 1048 times round. I think that takes care of the steady-state problem (or at least reduces the problem by an order of magnitude or 2). The question is whether we would end up with more or less memory in use in the normal case. I suspect we wouldn't.
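The 75% rule sketched above can be checked with a toy model (the `simulate` helper below is purely illustrative, not the list implementation under discussion): on an append fault it doubles the block when more than the threshold fraction is live, otherwise it just memmoves, and it counts pointers moved per append/pop pair.

```python
def simulate(n, cap, ops=100000, threshold=0.75):
    """Toy model of append()+pop(0) on a contiguous block of `cap` slots.

    `n` items are live at all times; `start` is the index of the logical
    head.  On an append fault: grow (double) the block if more than
    `threshold` of it is in use, else memmove the live items to the front.
    Returns (final capacity, average pointers moved per operation).
    """
    start, moved = 0, 0
    for _ in range(ops):
        if start + n == cap:           # append fault
            if n > threshold * cap:
                cap *= 2               # realloc: grow, then slide to front
            moved += n                 # n pointers are moved either way
            start = 0
        start += 1                     # pop(0) advances the head
    return cap, moved / ops

cap, avg = simulate(1000, 1024)
assert cap == 2048                     # grew once, then reached steady state
assert avg < 1.0                       # ~1000 pointers moved every 1048 ops
```

With the threshold in play the block doubles exactly once and then settles into the 1048-iteration memmove cycle Tim Delaney describes; disabling growth (threshold = 1.0) reproduces the original 1000-moves-every-24-iterations behavior.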
Tim Delaney From tim.one at comcast.net Sun Jan 11 21:42:15 2004 From: tim.one at comcast.net (Tim Peters) Date: Sun Jan 11 21:42:18 2004 Subject: [Python-Dev] collections module In-Reply-To: <003901c3d8ac$bf750e40$a226c797@oemcomputer> Message-ID: [Tim] >> Neither case holds for a queue; the question that started this was >> whether to provide a fancy queue implementation in C, and (of >> course) Guido would rather see Python lists efficiently usable for >> that purpose too. I suspected that might not be a realistic hope, >> and now I'm starting to *think* it too . [Raymond Hettinger] > It needn't be especially fancy. Here is what I had in mind:
>
> n = 50
> class fifo(object):

This is a different issue. There are many ways to *use* lists to build a distinct fifo class. What I've been writing about is the possibility of making Python lists *directly* efficient for use as queues (the realistic possibility of making list.append and list.pop(0) both at worst O(1) amortized time). That's what I remain skeptical about.

>     def __init__(self):
>         # block[0:n] is for data and block[n] is a link field
>         self.head = self.tail = [None] * (n+1)

That *looks* so much like a common newbie coding error that it deserves a comment.

>         self.headptr = self.tailptr = 0
>
>     def push(self, x):
>         if self.headptr == n:
>             newblock = [None] * (n+1)
>             self.head[n] = newblock
>             self.head = newblock
>             self.headptr = 0
>         self.head[self.headptr] = x
>         self.headptr += 1

Please don't push at "the head" -- this reverses the common (and sensible) meanings of "head" and "tail" for queues. You join a queue at its tail, and leave the queue when you reach its head.

>     def pop(self):
>         if self.tail is self.head and self.tailptr >= self.headptr:
>             raise IndexError
>         if self.tailptr == n:
>             self.tail = self.tail[n]
>             self.tailptr = 0
>         x = self.tail[self.tailptr]
>         self.tailptr += 1
>         return x
>
> The memory manager is called once for every 50 queue insertions and
> memory is freed after every 50 pops.
> Outside of that, every push and pop is just a fast array access and pointer increment. Long queues incur > about a 15-20% space overhead as compared to a straight-list > implementation. Speedwise, this beats the socks off of the current > approach. Meaning pairing append() and pop(0)? Sure -- it would be hard not to beat that . You can save more by ensuring the blocks are small enough for pymalloc to satisfy the memory requests, more again by keeping a dedicated freelist of nodes, and more again by switching to a circular implementation when the queue appears to have reached steady-state. You might also look at the implementation of the Icon language, where the native list type is implemented similarly as a linked list of blocks, and supports efficient general deque operation. Note that Marc-Andre's mxQueue implements a queue as a circular dynamic array; when the queue grows at an enormous rate, he has to endure some reallocs and moving, but in steady-state it can run forever without any memory-management operations. I'll note it looks like an mxQueue never shrinks, though, which may or may not be "a problem", depending on the app. As to "fancy", Marc-Andre ended up with over 1,000 lines of C, and his implementation takes a simpler approach than yours. OTOH, it added lots of marginal features (like "q << x" adds x to q). I've never needed a queue in Python where "the standard" two-list trick (as shown in Josiah's Cookbook comment) wasn't more than good enough, so my interest in this is really limited to whether Python's native list type can (realistically) be made efficient for general deque operation. From tjreedy at udel.edu Sun Jan 11 23:21:30 2004 From: tjreedy at udel.edu (Terry Reedy) Date: Sun Jan 11 23:21:26 2004 Subject: [Python-Dev] Re: Re: collections module References: Message-ID: "Tim Peters" wrote in message news:LNBBLJKPBEHFEDALKOLCKEOHIEAB.tim.one@comcast.net...
> [Terry Reedy] >> So I do not think trying to make prepends O(1) should be considered further. > > I expect that if queue-like behavior *can* be supported efficiently using an > always-contiguous vector, efficient deque-like behavior will fall out for > free. Efficient inserts at 0 would seem to me to require space at the front which is usually wasted. But if you can avoid this, perhaps by only adding front space when prepends are attempted, great. > > My possibly naive thought is that before reallocating, look at the > > ratio of left free to total. > > I don't know what that means (neither the "left free" part nor what exactly > "total" is counting). The devil is in the details here. Whoops, I mixed Python and C viewpoints. By total, I meant the C size of the pointer array. By 'free', I meant the popped pointers 'below' the current 0 element, which from a C view, are not free but merely overwritable. > > If memmove would result in as much, or perhaps nearly as much, > > over-allocation as realloc, > > memmove() leaves the total number of unused slots unchanged. realloc can do > that too, or increase it by any amount, or decrease it by any amount. I > just don't know what you're trying to say ... explain how it works, step by > step, in the

> > q = [incoming() for i in range(1000)]

Let's assume that C array is 1024 at this point.

> while True:
>     q.append(incoming())

At 25th append, C array is full and there are 24 passé pointers. 24/1024 is small, so ignore them and expand as currently. Assume array is doubled to 2048 (since I don't know current actual value) and no memmove.

> consume(q.pop(0))

After additional 1024 appends, array appears full again. However, there are 1048 unneeded pointers to 'left' of 0 position and 1048/2048 > .5 >= threshold, so memmove top 1000 back to beginning of array which stays at size 2048. Every 1048 appends thereafter, repeat memmove in array that never again gets resized.
In the long run, the average pointer moves per push/pop is less than 1. > > then just do that. Certainly if the ratio is small (like say .1), > > don't bother with memmove. The exact cutoff is a tuning matter. If expansion is currently by fraction f, then after expansion, f/(1+f) of slots are free at end of array. Let threshold t be this fraction or perhaps a bit less. With proposed l[0] popping, suppose that at an append fault, p of s slots are below the l[0] pointer and that p/s >= t. Then do memmove instead of expansion. Rationale: after memmove, there will be as large (or almost as large) a fraction of slots available at end of list for append as there now are. I hope this is somewhat clearer. Terry J. Reedy From python at bobs.org Sun Jan 11 23:31:45 2004 From: python at bobs.org (Bob Arnson) Date: Sun Jan 11 23:32:27 2004 Subject: [Python-Dev] Python MSI In-Reply-To: References: <426179714.20040111155659@bobs.org> Message-ID: <537559422.20040111233145@bobs.org> Sunday, January 11, 2004, 7:31:29 PM, you wrote: > Hey, how would you like to join the SpamBayes project? We have a similar > thing going with reverse-engineering a billion and one un- and > mis-documented features of the Outlook and MAPI APIs. We could really use > someone with experience in recording them all on one wall . Well, we use a very small font... -- sig://boB http://foobob.com From python at rcn.com Mon Jan 12 00:32:54 2004 From: python at rcn.com (Raymond Hettinger) Date: Mon Jan 12 00:33:43 2004 Subject: [Python-Dev] collections module In-Reply-To: Message-ID: <006101c3d8cd$8742c0a0$a226c797@oemcomputer> > > The memory manager is called once for every 50 queue insertions and > > memory is freed after every 50 pops. Outside of that, every push and > > pop is just a fast array access and pointer increment. Long queues incur > > about a 15-20% space overhead as compared to a straight-list > > implementation. Speedwise, this beats the socks off of the current > > approach.
> > Meaning pairing append() and pop(0)? Sure -- it would be hard not to beat > that . No, I meant that it is *much* faster to do an indexed write to a pre-allocated list than to use anything relying on PyList_Append() or ins1(). All of those individual calls to realloc are expensive. The 50 to 1 dilution of that expense is how it beats Knuth's two-stack trick. Guess which of these two statements is faster and by how much:

    list(itertools.repeat(None, 5000))
    list(tuple(itertools.repeat(None, 5000)))

The answer is important because it points the way to faster list comprehensions and other list-building applications. > I've never needed a queue in Python where "the standard" two-list trick > (as shown in Josiah's Cookbook comment) wasn't more than good enough, so my > interest in this is really limited to whether Python's native list type > can (realistically) be made efficient for general deque operation. And my interest is in a collections module, which should probably be published elsewhere (too much negativity on the subject here). I am frankly surprised that so many here think that a collections module would be bad for the language. Go figure. It's not like these are new, untried ideas -- the goal was supposed to be improving productivity by providing higher level tools. A fast underlying implementation was just icing on the cake.
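Both of Raymond's statements build the same 5000-element list; the difference he is pointing at is allocation strategy -- a tuple has a known length, so `list(tuple(...))` can size the result in one step, while filling a list from a bare iterator may grow it repeatedly. (Which form actually wins on a given interpreter is an empirical question; modern CPython narrows the gap by using length hints for iterators like `repeat`.) A quick check that the two agree on the result:

```python
import itertools

# Built by consuming an iterator of a priori unknown length:
a = list(itertools.repeat(None, 5000))
# Built from a sized tuple, so the list can be allocated in one step:
b = list(tuple(itertools.repeat(None, 5000)))

assert a == b == [None] * 5000   # identical results either way
```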
Raymond Hettinger From jcarlson at uci.edu Mon Jan 12 04:28:26 2004 From: jcarlson at uci.edu (Josiah Carlson) Date: Mon Jan 12 04:31:23 2004 Subject: [Python-Dev] collections module In-Reply-To: <003901c3d8ac$bf750e40$a226c797@oemcomputer> References: <003901c3d8ac$bf750e40$a226c797@oemcomputer> Message-ID: <20040112011231.FE58.JCARLSON@uci.edu> > It needn't be especially fancy. Here is what I had in mind: > > n = 50 > class fifo(object): > > def __init__(self): > # block[0:n] is for data and block[n] is a link field > self.head = self.tail = [None] * (n+1) > self.headptr = self.tailptr = 0 ...etc... I've got a few of the standard FIFO implementations sitting here, and I thought I would give them a run against each other. Testing methods are as follows: Initially create a FIFO of n elements. Alternate between pushing and popping 2n elements from the FIFO. Empty the FIFO.
Timing each block, we get the following:

                    Create  Steady  Empty
    n = 10000:
    Fifo            0.047   0.328   0.141
    Fifo_list       0.171   0.485   0.14
    fast_list_fifo  0.0     0.297   0.157
    doublelistfifo  0.0     0.359   0.172
    fifo_raymond    0.109   0.422   0.172

    n = 20000:
    Fifo            0.093   0.657   0.297
    Fifo_list       0.375   0.984   0.266
    fast_list_fifo  0.0     0.594   0.312
    doublelistfifo  0.0     0.719   0.344
    fifo_raymond    0.203   0.859   0.329

    n = 30000:
    Fifo            0.157   0.984   0.437
    Fifo_list       0.563   1.453   0.422
    fast_list_fifo  0.0     0.89    0.469
    doublelistfifo  0.016   1.078   0.515
    fifo_raymond    0.297   1.297   0.5

    n = 40000:
    Fifo            0.203   1.328   0.594
    Fifo_list       0.984   1.953   0.594
    fast_list_fifo  0.015   1.188   0.625
    doublelistfifo  0.0     1.437   0.687
    fifo_raymond    0.438   1.734   0.656

Fifo: dictionary used as a fifo
fifo_list: [cur, [next, [...]]]
fast_list_fifo: push appends, pop increments a pointer, but never removes an element
doublelistfifo: self-explanatory
fifo_raymond: the fifo described by Raymond

Initialization for Raymond's fifo can be ignored; it is merely inserting objects one at a time (I am short on time this evening). Interpret the rest of the numbers as is reasonable, but consider that there is likely a few percent margin of error on each number, unless something always beats another. I must be going to bed now, the wife is yelling at me. - Josiah From arigo at tunes.org Mon Jan 12 05:58:43 2004 From: arigo at tunes.org (Armin Rigo) Date: Mon Jan 12 06:01:54 2004 Subject: [Python-Dev] [872326] generator expression implementation Message-ID: <20040112105843.GA1655@vicky.ecs.soton.ac.uk> Hello, Sorry to bring this out again, I didn't find a reference about the following issue in generator expressions:

    (for x in expr)

When should 'expr' be evaluated? A priori, one would expect it to be evaluated immediately, but it is not what the suggested implementation does:

    def __gen():
        for x in expr:
            yield x
    __gen()

This only computes expr when the first element of the generator is needed (so it may even never be computed). Is it done purposefully?
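Armin's observation is easy to reproduce (the `expr` function below is a hypothetical stand-in for an arbitrary iterable expression): wrapping the loop in a generator function defers evaluation of the iterable until the first value is requested.

```python
calls = []

def expr():
    calls.append("evaluated")
    return [1, 2, 3]

def __gen():            # the naive sketch: expr is evaluated inside the generator
    for x in expr():
        yield x

g = __gen()
assert calls == []                 # expr() has NOT run yet
assert next(g) == 1
assert calls == ["evaluated"]      # it ran only when the first element was needed
```

If `__gen` is never iterated, `expr()` is never called at all, and any exception it raises surfaces inside the anonymous generator's traceback rather than at the point of definition -- exactly the two surprises raised here.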
It may also be surprising to see the anonymous generator in tracebacks if an exception is raised within expr. A bientot, Armin. From Paul.Moore at atosorigin.com Mon Jan 12 08:15:07 2004 From: Paul.Moore at atosorigin.com (Moore, Paul) Date: Mon Jan 12 08:15:15 2004 Subject: [Python-Dev] collections module Message-ID: <16E1010E4581B049ABC51D4975CEDB8802C09B75@UKDCX001.uk.int.atosorigin.com> From: Raymond Hettinger > I am frankly surprised that so many here think that a collections module > would be bad for the language. Go figure. It's not like these are new, > untried ideas -- the goal was supposed be improving productivity by > providing higher level tools. A fast underlying implementation was just > icing on the cake. I think the "fast implementation" issue has triggered people's comments, rather than the "higher level tools" one. For the record, I'm +1 on the idea (as far as "higher level tools" goes). I'm not too bothered, in practice, how efficient the implementation is, as long as it's not visibly slower than using lists/dicts directly :-) Paul. From barry at python.org Mon Jan 12 10:24:44 2004 From: barry at python.org (Barry Warsaw) Date: Mon Jan 12 10:24:52 2004 Subject: [Python-Dev] CJKCodecs integration into Python In-Reply-To: <3FFF3BCC.7020200@v.loewis.de> References: <20040109083021.GA4128@i18n.org> <3FFF1C66.5030902@v.loewis.de> <1073689591.3926.62.camel@anthem> <3FFF3BCC.7020200@v.loewis.de> Message-ID: <1073921083.17326.49.camel@anthem> On Fri, 2004-01-09 at 18:39, Martin v. Loewis wrote: > Barry Warsaw wrote: > > But also it seems like there isn't universal preference for cjkcodecs > > over JapaneseCodecs: > > > > http://mail.python.org/pipermail/mailman-i18n/2003-December/001084.html > > There won't be universal preference for any package that we provide, > or, for any kind of change made to Python. Unfortunately, the message > you quote is hearsay, so we don't know what the specific objections > are. 
I don't expect there to be universal acceptance, but there should be enough momentum to make cjkcodecs the one way to do it for the large majority of users. But we still haven't heard any concrete objections, so you should go for it. -Barry From mal at egenix.com Mon Jan 12 10:28:48 2004 From: mal at egenix.com (M.-A. Lemburg) Date: Mon Jan 12 10:29:20 2004 Subject: [I18n-sig] Re: [Python-Dev] CJKCodecs integration into Python In-Reply-To: <1073921083.17326.49.camel@anthem> References: <20040109083021.GA4128@i18n.org> <3FFF1C66.5030902@v.loewis.de> <1073689591.3926.62.camel@anthem> <3FFF3BCC.7020200@v.loewis.de> <1073921083.17326.49.camel@anthem> Message-ID: <4002BD30.9060707@egenix.com> Barry Warsaw wrote: > On Fri, 2004-01-09 at 18:39, Martin v. Loewis wrote: > >>Barry Warsaw wrote: >> >>>But also it seems like there isn't universal preference for cjkcodecs >>>over JapaneseCodecs: >>> >>>http://mail.python.org/pipermail/mailman-i18n/2003-December/001084.html >> >>There won't be universal preference for any package that we provide, >>or, for any kind of change made to Python. Unfortunately, the message >>you quote is hearsay, so we don't know what the specific objections >>are. > > I don't expect there to be universal acceptance, but there should be > enough momentum to make cjkcodecs the one way to do it for the large > majority of users. > > But we still haven't heard any concrete objections, so you should go for > it. +1 Users who want to use other codec packages can easily do so, installing them in packages and then redirecting Python to use these by patching encodings.aliases.aliases to point to the codecs in the package rather than Python's default one. This is best done in sitecustomize.py. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 06 2004) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ...
http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From perky at i18n.org Mon Jan 12 10:37:31 2004 From: perky at i18n.org (Hye-Shik Chang) Date: Mon Jan 12 10:37:31 2004 Subject: [I18n-sig] Re: [Python-Dev] CJKCodecs integration into Python In-Reply-To: <4002BD30.9060707@egenix.com> References: <20040109083021.GA4128@i18n.org> <3FFF1C66.5030902@v.loewis.de> <1073689591.3926.62.camel@anthem> <3FFF3BCC.7020200@v.loewis.de> <1073921083.17326.49.camel@anthem> <4002BD30.9060707@egenix.com> Message-ID: <20040112153731.GA83022@i18n.org> On Mon, Jan 12, 2004 at 04:28:48PM +0100, M.-A. Lemburg wrote: > Barry Warsaw wrote: > >On Fri, 2004-01-09 at 18:39, Martin v. Loewis wrote: > >>There won't be universal preference for any package that we provide, > >>or, for any kind of change made to Python. Unfortunately, the message > >>you quote is hearsay, so we don't know what the specific objections > >>are. > > > >I don't expect there to be universal acceptance, but there should be > >enough momentum to make cjkcodecs the one way to do it for the large > >majority of users. > > > >But we still haven't heard any concrete objections, so you should go for > >it. > > +1 > > Users who want to use other codec packages can easily do so, > installing them in packages and then redirecting Python to use > these by patching encodings.aliases.aliases to point to the > codecs in the package rather than Python's default one. > > This is best done in sitecustomize.py. > Thanks! Then, I'll apply the patch to CVS if no strong objection is raised by this Saturday.
Hye-Shik From jeremy at alum.mit.edu Mon Jan 12 10:41:48 2004 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Mon Jan 12 10:46:43 2004 Subject: [Python-Dev] [872326] generator expression implementation In-Reply-To: <20040112105843.GA1655@vicky.ecs.soton.ac.uk> References: <20040112105843.GA1655@vicky.ecs.soton.ac.uk> Message-ID: <1073922107.6341.357.camel@localhost.localdomain> On Mon, 2004-01-12 at 05:58, Armin Rigo wrote: > Sorry to bring this out again, I didn't find a reference about the following > issue in generator expressions: > > (for x in expr) > > When should 'expr' be evaluated? A priori, one would expect it to be > evaluated immediately, but it is not what the suggested implementation does:

> def __gen():
>     for x in expr:
>         yield x
> __gen()

First a quick clarification: Is the "suggested implementation" SF patch 872326 or is it the implementation sketch in PEP 289? The latter certainly suggests it, but it's hard to tell even then if it's intentional. Perhaps we should work out the formal specification part of the PEP in parallel with the implementation. The PEP currently says that the reference manual should contain a specification, but that spec is supposed to be part of the PEP, too. I agree, at any rate, that the expression should be evaluated immediately.
That's easy enough to implement: def __gen(it): for x in it: yield x __gen() Jeremy From mwh at python.net Mon Jan 12 10:55:50 2004 From: mwh at python.net (Michael Hudson) Date: Mon Jan 12 10:55:53 2004 Subject: [Python-Dev] [872326] generator expression implementation In-Reply-To: <1073922107.6341.357.camel@localhost.localdomain> (Jeremy Hylton's message of "Mon, 12 Jan 2004 10:41:48 -0500") References: <20040112105843.GA1655@vicky.ecs.soton.ac.uk> <1073922107.6341.357.camel@localhost.localdomain> Message-ID: <2mhdz1w48p.fsf@starship.python.net> Jeremy Hylton writes: > On Mon, 2004-01-12 at 05:58, Armin Rigo wrote: >> Sorry to bring this out again, I didn't find a reference about the following >> issue in generator expressions: >> >> (for x in expr) >> >> When should 'expr' be evaluated? A priori, one would expect it to be >> evaluated immediately, but it is not what the suggested implementation does: >> >> def __gen(): >> for x in expr: >> yield x >> __gen() > > First a quick clarification: Is the "suggested implementation" SF patch > 872326 or is it that implementation sketch in PEP 289? The latter > certainly suggests it, but it's hard to tell even then if it's > intentional. Perhaps we should work out the formal specification part > of the PEP in parallel with the implementation. The PEP currently says > that the reference manual should contain a specification, but that spec > is supposed to be part of the PEP, too. > > I agree, at any rate, that the expression should be evaluated > immediately. That's easy enough to implement: > > def __gen(it): > for x in it: > yield x > __gen() Should be __gen(expr), right? Cheers, mwh -- For their next act, they'll no doubt be buying a firewall running under NT, which makes about as much sense as building a prison out of meringue. 
-- -:Tanuki:- -- http://home.xnet.com/~raven/Sysadmin/ASR.Quotes.html From arigo at tunes.org Mon Jan 12 11:13:04 2004 From: arigo at tunes.org (Armin Rigo) Date: Mon Jan 12 11:15:59 2004 Subject: [Python-Dev] [872326] generator expression implementation In-Reply-To: <1073922107.6341.357.camel@localhost.localdomain> References: <20040112105843.GA1655@vicky.ecs.soton.ac.uk> <1073922107.6341.357.camel@localhost.localdomain> Message-ID: <20040112161304.GA222@vicky.ecs.soton.ac.uk> Hello Jeremy, On Mon, Jan 12, 2004 at 10:41:48AM -0500, Jeremy Hylton wrote: > First a quick clarification: Is the "suggested implementation" SF patch > 872326 or is it that implementation sketch in PEP 289? That's both the implementation in PEP 289 and in the latest version of the SF patch. > I agree, at any rate, that the expression should be evaluated > immediately. If nobody disagrees I will change this in the PEP and post a note on the SF patch page. Armin From jeremy at zope.com Mon Jan 12 11:13:45 2004 From: jeremy at zope.com (Jeremy Hylton) Date: Mon Jan 12 11:19:41 2004 Subject: [Python-Dev] [872326] generator expression implementation In-Reply-To: <2mhdz1w48p.fsf@starship.python.net> References: <20040112105843.GA1655@vicky.ecs.soton.ac.uk> <1073922107.6341.357.camel@localhost.localdomain> <2mhdz1w48p.fsf@starship.python.net> Message-ID: <1073924025.6341.375.camel@localhost.localdomain> On Mon, 2004-01-12 at 10:55, Michael Hudson wrote: > > def __gen(it): > > for x in it: > > yield x > > __gen() > > Should be __gen(expr), right? Right. 
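The practical difference between the two spellings is observable whenever the iterable expression has side effects or can raise. Here is a minimal sketch contrasting the lazy draft semantics with immediate evaluation — the `get_items` helper is invented purely to make the evaluation time visible:

```python
calls = []

def get_items():
    calls.append("evaluated")
    return [1, 2, 3]

def make_lazy():
    # PEP 289 draft sketch: get_items() only runs on the first next()
    def _gen():
        for x in get_items():
            yield x
    return _gen()

def make_immediate():
    # proposed fix: the iterable is evaluated when the genexp is created
    it = iter(get_items())
    def _gen(it):
        for x in it:
            yield x
    return _gen(it)

g = make_lazy()
assert calls == []                 # not evaluated yet
assert list(g) == [1, 2, 3]        # evaluated only here

del calls[:]
g = make_immediate()
assert calls == ["evaluated"]      # evaluated at creation time
assert list(g) == [1, 2, 3]
```

Either way the values are produced lazily; only the evaluation time of the outermost iterable expression differs.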
Jeremy From aahz at pythoncraft.com Mon Jan 12 11:28:18 2004 From: aahz at pythoncraft.com (Aahz) Date: Mon Jan 12 11:28:22 2004 Subject: [Python-Dev] collections module In-Reply-To: <006101c3d8cd$8742c0a0$a226c797@oemcomputer> References: <006101c3d8cd$8742c0a0$a226c797@oemcomputer> Message-ID: <20040112162818.GB8049@panix.com> On Mon, Jan 12, 2004, Raymond Hettinger wrote: > > I am frankly surprised that so many here think that a collections module > would be bad for the language. Go figure. It's not like these are new, > untried ideas -- the goal was supposed be improving productivity by > providing higher level tools. A fast underlying implementation was just > icing on the cake. I'm suspecting that most of the negativity came from the lack of use cases for bags. Let me suggest an alternate idea: in the past, some people have suggested a "data structures" module to give reference implementations of standard data structures and algorithms. This would be a tool aimed more at making it easy to learn computer science with Python than providing the highest-speed quirky implementations; the focus would be on documenting the code. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ A: No. Q: Is top-posting okay? From guido at python.org Mon Jan 12 12:31:14 2004 From: guido at python.org (Guido van Rossum) Date: Mon Jan 12 12:31:24 2004 Subject: [Python-Dev] [872326] generator expression implementation In-Reply-To: Your message of "Mon, 12 Jan 2004 10:58:43 GMT." <20040112105843.GA1655@vicky.ecs.soton.ac.uk> References: <20040112105843.GA1655@vicky.ecs.soton.ac.uk> Message-ID: <200401121731.i0CHVFC03119@c-24-5-183-134.client.comcast.net> > Sorry to bring this out again, I didn't find a reference about the following > issue in generator expressions: > > (for x in expr) > > When should 'expr' be evaluated? 
A priori, one would expect it to be > evaluated immediately, but it is not what the suggested implementation does: > > def __gen(): > for x in expr: > yield x > __gen() > > This only computes expr when the first element of the generator is needed (so > it may even never be computed). Is it done purposefully? It may also be > surprising to see the anonymous generator in tracebacks if an exception is > raised within expr. Looks like a bug to me; I'd like to hear arguments to the contrary. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Jan 12 12:42:28 2004 From: guido at python.org (Guido van Rossum) Date: Mon Jan 12 12:42:43 2004 Subject: [Python-Dev] collections module In-Reply-To: Your message of "Mon, 12 Jan 2004 13:15:07 GMT." <16E1010E4581B049ABC51D4975CEDB8802C09B75@UKDCX001.uk.int.atosorigin.com> References: <16E1010E4581B049ABC51D4975CEDB8802C09B75@UKDCX001.uk.int.atosorigin.com> Message-ID: <200401121742.i0CHgTN03146@c-24-5-183-134.client.comcast.net> [Raymond Hettinger] > > I am frankly surprised that so many here think that a collections > > module would be bad for the language. Go figure. It's not like > > these are new, untried ideas -- the goal was supposed be improving > > productivity by providing higher level tools. A fast underlying > > implementation was just icing on the cake. > > I think the "fast implementation" issue has triggered people's > comments, rather than the "higher level tools" one. > > For the record, I'm +1 on the idea (as far as "higher level tools" > goes). I'm not too bothered, in practice, how efficient the > implementation is, as long as it's not visibly slower than using > lists/dicts directly :-) I may be the source of some of this kind of feedback. My main concern is that when we offer too many fast (*fast*!) 
data types competing with list and dict, inexperienced users will more easily be misled into picking the wrong data type for their application, simply by the overwhelming choices. Something that advertises itself as *fast*! is incredibly attractive to many people who have absolutely no need for it -- just like automobiles that can go several times the speed limit. I'm not worried about the absolute newbies -- they will read the tutorial and use the training wheels provided by the examples. It's the middle tier of programmers who have very little understanding of algorithm complexity (that small amount of knowledge that can be such a dangerous thing) but are excited by all the new and wondrous stuff hidden in the standard library. Marking it with danger signs or "advanced use only" will have the inverse effect -- these are even more attractive than speed claims. Remember Knuth: the root of all evil lies in premature optimization. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Jan 12 13:26:23 2004 From: guido at python.org (Guido van Rossum) Date: Mon Jan 12 13:26:36 2004 Subject: [Python-Dev] PyCon Reminder: Proposal deadline 1/15 (LAST WARNING) Message-ID: <200401121826.i0CIQNI03412@c-24-5-183-134.client.comcast.net> This is the *last* reminder that the deadline for sending proposals for presentations at PyCon DC 2004 is January 15, 2004. That's this coming Thursday! I'm also reminding everybody, speakers and non-speakers, of the upcoming deadline for Early Bird Registration: January 31. Until then, the registration fee is $175 (student rate $125); after that date, the registration price goes up to $250 (student rate $150). Don't forget to register! (Sorry, there's no speakers discount.) If you're coming to the conference, consider coming a few days early and participating in a coding sprint. Sprints are free (courtesy of the PSF!), and are held from Saturday March 20 through Tuesday March 23 (i.e. 
the four days leading up to the conference). For more info on sprints, see http://pycon.org/dc2004 . Now back to proposal submissions: We are interested in any and all submissions about uses of Python and the development of the language. Since there is expected to be a strong educational community presence for the next PyCon, teaching materials of various kinds are also encouraged. You can submit your proposal at: http://submit.pycon.org/ For more information about proposals, see: http://www.pycon.org/dc2004/cfp/ If you have further questions about the submission web interface or the format of submissions, please write to: pycon-organizers@python.org We would like to publish all accepted papers on the web. If your paper is accepted and you prepare an electronic presentation (in PDF, PythonPoint or PowerPoint) we will also happily publish that on the web site once PyCon is over. If you don't want to make a formal presentation, there will be a significant amount of Open Space to allow for informal and spur-of-the-moment presentations for which no formal submission is required. There will also be several Lightning Talk sessions (five minutes or less). About PyCon: PyCon is a community-oriented conference targeting developers (both those using Python and those working on the Python project). It gives you opportunities to learn about significant advances in the Python development community, to participate in a programming sprint with some of the leading minds in the Open Source community, and to meet fellow developers from around the world. The organizers work to make the conference affordable and accessible to all. PyCon DC 2004 will be held March 24-26, 2004 in Washington, D.C. The keynote speaker is Mitch Kapor of the Open Source Applications Foundation (http://www.osafoundation.org/). There will be a four-day development sprint before the conference. We're looking for volunteers to help run PyCon. 
If you're interested, subscribe to http://mail.python.org/mailman/listinfo/pycon-organizers Don't miss any PyCon announcements! Subscribe to http://mail.python.org/mailman/listinfo/pycon-announce You can discuss PyCon with other interested people by subscribing to http://mail.python.org/mailman/listinfo/pycon-interest The central resource for PyCon DC 2004 is http://www.pycon.org/ Pictures from last year's PyCon: http://www.python.org/cgi-bin/moinmoin/PyConPhotos I'm looking forward to seeing you all in DC in March!!! --Guido van Rossum (home page: http://www.python.org/~guido/) _______________________________________________ Pycon-organizers mailing list Pycon-organizers@python.org http://mail.python.org/mailman/listinfo/pycon-organizers From pf_moore at yahoo.co.uk Mon Jan 12 14:26:19 2004 From: pf_moore at yahoo.co.uk (Paul Moore) Date: Mon Jan 12 14:26:15 2004 Subject: [Python-Dev] Re: collections module References: <16E1010E4581B049ABC51D4975CEDB8802C09B75@UKDCX001.uk.int.atosorigin.com> <200401121742.i0CHgTN03146@c-24-5-183-134.client.comcast.net> Message-ID: Guido van Rossum writes: > [Raymond Hettinger] >> > I am frankly surprised that so many here think that a collections >> > module would be bad for the language. [...] >> I think the "fast implementation" issue has triggered people's >> comments, rather than the "higher level tools" one. [...] > I may be the source of some of this kind of feedback. My main concern > is that when we offer too many fast (*fast*!) data types competing > with list and dict, inexperienced users will more easily be mislead > into picking the wrong data type for their application, simply by the > overwhelming choices. Something that advertises itself as *fast*! is > incredibly attractive to many people who have absolutely no need for > it -- just like automobiles that can go several times the speed limit. Agreed. 
I think that simply creating a module called "collections", advertised as containing "a variety of data structures designed to handle collections of data" should be enough. Low-key enough to avoid being an attractive nuisance, accurately-named enough that people should find it easily enough if they need it. Add an "efficiency" section to the documentation, if it matters that much (and if so, I'd suggest that memory use be addressed as well as speed...) In my experience, though (assuming I classify in the not-newbie-but-not-expert-either category...), I prefer data types built into the language over ones in modules, unless there is a clear need to go further. For example, the mere fact that re is a module persuades me to save it for cases where it's really needed - I use string methods like startswith and find much more than I ever would in Perl, where REs are built in. This is a good thing - REs are too much of an "attractive nuisance" in Perl. I'd expect the same to be true with the collections module. I'd tend to use lists and dictionaries as my first choice, saving collections for cases where I really need them. So, if my perception is typical, I don't think there will be that big an issue. Paul. -- This signature intentionally left blank From brendano at stanford.edu Mon Jan 12 14:45:06 2004 From: brendano at stanford.edu (Brendan O'Connor) Date: Mon Jan 12 14:45:16 2004 Subject: [Python-Dev] patching webbrowser.py for OS X Message-ID: Hi wonderful python folks, I was trying to use the webbrowser module with OS X's preinstalled python; I'm not very familiar with OS X, but I just patched webbrowser.py to use the very generic "open" command, which works for the simple webbrowser.open(url). I've heard that Fink or another port would be more complete; on the other hand, I'm using computers where I can't install software myself, so this is useful for me. Any thoughts or issues? 
-Brendan (student) --- webbrowser.py Mon Jan 12 11:43:26 2004 +++ /usr/lib/python2.2/webbrowser.py Sun Jul 14 23:33:32 2002 @@ -1,6 +1,4 @@ """Interfaces for launching and remotely controlling Web browsers.""" -#patched to work on os x native python - import os import sys @@ -299,15 +297,6 @@ # so don't mess with the default! _tryorder = ["internet-config"] register("internet-config", InternetConfig) - -# -# Platform support for OS X (builtin unix version) -# the os x 'open' command opens ANYTHING including url's. -# ISSUE: what if InternetConfig is present? Is this OK? - -if sys.platform[:6] == 'darwin' and _iscommand("open"): - _tryorder = ['open'] - register("osx", None, GenericBrowser("open %s")) # # Platform support for OS/2 From guido at python.org Mon Jan 12 14:46:24 2004 From: guido at python.org (Guido van Rossum) Date: Mon Jan 12 14:46:32 2004 Subject: [Python-Dev] Re: collections module In-Reply-To: Your message of "Mon, 12 Jan 2004 19:26:19 GMT." References: <16E1010E4581B049ABC51D4975CEDB8802C09B75@UKDCX001.uk.int.atosorigin.com> <200401121742.i0CHgTN03146@c-24-5-183-134.client.comcast.net> Message-ID: <200401121946.i0CJkOL03741@c-24-5-183-134.client.comcast.net> > > [Raymond Hettinger] > >> > I am frankly surprised that so many here think that a collections > >> > module would be bad for the language. > [...] > >> I think the "fast implementation" issue has triggered people's > >> comments, rather than the "higher level tools" one. > [...] [Guido] > > I may be the source of some of this kind of feedback. My main concern > > is that when we offer too many fast (*fast*!) data types competing > > with list and dict, inexperienced users will more easily be mislead > > into picking the wrong data type for their application, simply by the > > overwhelming choices. Something that advertises itself as *fast*! is > > incredibly attractive to many people who have absolutely no need for > > it -- just like automobiles that can go several times the speed limit. 
[Paul] > Agreed. I think that simply creating a module called "collections", > advertised as containing "a variety of data structures designed to > handle collections of data" should be enough. Low-key enough to avoid > being an attractive nuisance, accurately-named enough that people > should find it easily enough if they need it. Add an "efficiency" > section to the documentation, if it matters that much (and if so, I'd > suggest that memory use be addressed as well as speed...) > > In my experience, though (assuming I classify in the not-newbie-but- > not-expert-either category...), I prefer data types built into the > language over ones in modules, unless there is a clear need to go > further. For example, the mere fact that re is a module persuades me > to save it for cases where it's really needed - I use string methods > like startswith and find much more than I ever would in Perl, where > REs are built in. This is a good thing - REs are too much of an > "attractive nuisance" in Perl. I'd expect the same to be true with the > collections module. I'd tend to use lists and dictionaries as my first > choice, saving collections for cases where I really need them. So, if > my perception is typical, I don't think there will be that big an > issue. That sounds reasonable. I think set and frozenset should also be moved into this new module then. (That would also make it easier for someone to provide a backwards compatible version for Python 2.3 and before as a 3rd party add-on.) 
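For code that has to straddle both versions, the usual compatibility shim is a try/except at import time — a sketch only (on 2.4 and later the except branch is never taken; `sets` is the 2.3 library module):

```python
try:
    set, frozenset                  # builtins since Python 2.4
except NameError:                   # Python 2.3: fall back to the sets module
    from sets import Set as set, ImmutableSet as frozenset

# after the shim, the same spelling works on either version
assert set([1, 2, 2]) == set([2, 1])
assert len(frozenset("aab")) == 2
```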
--Guido van Rossum (home page: http://www.python.org/~guido/) From aleaxit at yahoo.com Mon Jan 12 14:59:15 2004 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Jan 12 14:59:24 2004 Subject: [Python-Dev] patching webbrowser.py for OS X In-Reply-To: References: Message-ID: On Jan 12, 2004, at 8:45 PM, Brendan O'Connor wrote: > Hi wonderful python folks, > > I was trying to use the webbrowser module with OS X's preinstalled > python; I'm not very familiar with OS X, but I just patched > webbrowser.py > to use the very generic "open" command, which works for the simple > webbrowser.open(url). > > I've heard that Fink or another port would be more complete; on the > other > hand, I'm using computers where I can't install software myself, so > this > is useful for me. > > Any thoughts or issues? As a brand-new user of Mac OS X, "open" appears to be the right solution to me, picking up whatever settings one may have made for a different browser than Safari, not requiring fink, etc, etc. However, we should probably double-check on pythonmac-sig, where the REAL Mac Pythonistas hang out... Alex From skip at pobox.com Mon Jan 12 15:06:15 2004 From: skip at pobox.com (Skip Montanaro) Date: Mon Jan 12 15:06:24 2004 Subject: [Python-Dev] patching webbrowser.py for OS X In-Reply-To: References: Message-ID: <16386.65079.223181.271143@montanaro.dyndns.org> (python-dev is not the best place for this. If you want to submit a proposed patch, please use the SourceForge bug tracker. For more "how do I do this" sorts of stuff, please use comp.lang.python (aka python-list@python.org). Your post sort of straddles both domains.) Brendan> I was trying to use the webbrowser module with OS X's Brendan> preinstalled python; I'm not very familiar with OS X, but I Brendan> just patched webbrowser.py to use the very generic "open" Brendan> command, which works for the simple webbrowser.open(url). ... Brendan> Any thoughts or issues? 
All you need to do to get this to work is set your BROWSER environment variable to "open": % env | grep BROWSER BROWSER=open No source mods necessary. Skip From pje at telecommunity.com Mon Jan 12 15:11:59 2004 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon Jan 12 15:08:46 2004 Subject: [Python-Dev] Tracebacks into C code In-Reply-To: <40018390.10303@erols.com> Message-ID: <5.1.0.14.0.20040112151023.04518ae0@mail.telecommunity.com> At 12:10 PM 1/11/04 -0500, Edward C. Jones wrote: >Suppose: > I have a Python extension written in C. > "My_Func" is a function in the extension which is visible at the > Python level. > "My_Func" calls the C function "C_func1". > "C_func1" calls the C function "C_func2". > "C_func2" raises an exception by calling something like "PyErr_SetString". >How do I make the Python traceback include "C_func1" and "C_func2"? Pyrex generates code to do this, at least for C functions that it "compiles" from Python code; you might look at its generated C code for working examples. From bob at redivi.com Mon Jan 12 15:16:10 2004 From: bob at redivi.com (Bob Ippolito) Date: Mon Jan 12 15:13:56 2004 Subject: [Python-Dev] patching webbrowser.py for OS X In-Reply-To: References: Message-ID: <29BC9016-453C-11D8-97F6-000A95686CD8@redivi.com> On Jan 12, 2004, at 2:59 PM, Alex Martelli wrote: > > On Jan 12, 2004, at 8:45 PM, Brendan O'Connor wrote: > >> I was trying to use the webbrowser module with OS X's preinstalled >> python; I'm not very familiar with OS X, but I just patched >> webbrowser.py >> to use the very generic "open" command, which works for the simple >> webbrowser.open(url). >> >> I've heard that Fink or another port would be more complete; on the >> other >> hand, I'm using computers where I can't install software myself, so >> this >> is useful for me. >> >> Any thoughts or issues? 
> > As a brand-new user of Mac OS X, "open" appears to be the right > solution to me, picking up whatever settings one may have made for a > different browser than Safari, not requiring fink, etc, etc. However, > we should probably double-check on pythonmac-sig, where the REAL Mac > Pythonistas hang out... I don't think that this patch is necessary.. On my Mac OS X 10.3 machine (comes with Python 2.3.0), webbrowser.py uses Internet Config to launch the url (the 'ic' module). This is basically equivalent to the open command. It works like this: * Internet Config is an old MacOS API that's mapped to Launch Services which launches the URL with your preferred browser * open is a command line utility that uses the NSWorkspace Cocoa API which uses Launch Services to launch the URL with your preferred browser I don't think your patch fixes anything, it's launched by Launch Services either way.. at least on OS X 10.3. A patch doesn't do us any good against Python 2.2.0 on Mac OS 10.2 if its behavior is broken. Apple doesn't update Python between major releases. You should be able to install MacPython 2.3 in your home directory. Don't bother with Fink. -bob From skip at pobox.com Mon Jan 12 15:22:54 2004 From: skip at pobox.com (Skip Montanaro) Date: Mon Jan 12 15:23:00 2004 Subject: [Python-Dev] patching webbrowser.py for OS X In-Reply-To: <29BC9016-453C-11D8-97F6-000A95686CD8@redivi.com> References: <29BC9016-453C-11D8-97F6-000A95686CD8@redivi.com> Message-ID: <16387.542.986999.429532@montanaro.dyndns.org> Bob> * Internet Config is an old MacOS API that's mapped to Launch Bob> Services which launches the URL with your preferred browser If Internet Config is an "old MacOS API" will it eventually disappear? 
Skip From bob at redivi.com Mon Jan 12 15:39:30 2004 From: bob at redivi.com (Bob Ippolito) Date: Mon Jan 12 15:37:16 2004 Subject: [Pythonmac-SIG] Re: [Python-Dev] patching webbrowser.py for OS X In-Reply-To: <16387.542.986999.429532@montanaro.dyndns.org> References: <29BC9016-453C-11D8-97F6-000A95686CD8@redivi.com> <16387.542.986999.429532@montanaro.dyndns.org> Message-ID: <6BEA9533-453F-11D8-97F6-000A95686CD8@redivi.com> On Jan 12, 2004, at 3:22 PM, Skip Montanaro wrote: > Bob> * Internet Config is an old MacOS API that's mapped to Launch > Bob> Services which launches the URL with your preferred browser > > If Internet Config is an "old MacOS API" will it eventually disappear? In a few years, and if hell freezes over, maybe. It's recommended that new software uses System Configuration and Launch Services to replace Internet Config usage, but disabling Internet Config would break a whole lot of Carbon software. Python doesn't have an official wrapper for System Configuration (though I have one available), and the Launch Services wrapper is new for Python 2.4. -bob From tim.one at comcast.net Mon Jan 12 16:17:39 2004 From: tim.one at comcast.net (Tim Peters) Date: Mon Jan 12 16:17:53 2004 Subject: [Python-Dev] collections module In-Reply-To: <006101c3d8cd$8742c0a0$a226c797@oemcomputer> Message-ID: [Raymond Hettinger] > No, I meant that it is *much* faster to do an indexed write to a > pre-allocated list than to use anything relying on PyList_Append() or > ins1(). All of those individual calls to realloc are expensive. The > 50 to 1 dilution of that expense is how it beats Knuth's two stack > trick. That's fine. 
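For reference, the two-stack trick amounts to just a few lines of pure Python — a minimal sketch, not a tuned implementation:

```python
class TwoStackQueue:
    """FIFO queue built from two LIFO stacks (plain Python lists).

    Each element is appended once, reversed once, and popped once,
    so put() and get() are both amortized O(1).
    """
    def __init__(self):
        self._front = []   # next items to dequeue, in reversed order
        self._back = []    # newly enqueued items

    def put(self, item):
        self._back.append(item)

    def get(self):
        if not self._front:
            if not self._back:
                raise IndexError("get from empty queue")
            # shift everything from the enqueue stack to the dequeue stack
            self._back.reverse()
            self._front, self._back = self._back, self._front
        return self._front.pop()

q = TwoStackQueue()
for i in range(4):
    q.put(i)
assert [q.get() for _ in range(4)] == [0, 1, 2, 3]
```

The cost of the occasional reverse() is spread over all the puts that preceded it, which is why the trick is usually fast enough in practice.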
As I said before, I've never had a use for a queue in Python where the two-stack trick (coded in Python, even) wasn't more than fast enough for me. If I did, there are a universe of possibilities, including Marc-Andre's mxQueue. That doesn't use Python lists; doesn't call realloc on every queue put (only when it's out of room, which is "never" when a queue reaches steady state); and, when it does realloc, overallocates by a much larger amount than Python's list type. > Guess which of these two statements is faster and by how much: > > list(itertools.repeat(None, 5000)) > list(tuple(itertools.repeat(None, 5000))) I did guess, and was correct to 6 significant decimal digits . > The answer is important because it points the way to faster list > comprehensions and other list building applications. That Python's builtin list calls realloc() on every append is pretty gross, but it's also a tradeoff, saving the bytes in the list header that would be required to remember how many allocated-but-unused slots exist. If you've got a million tiny lists (and some apps do), burning an additional 8MB for something your app doesn't really need isn't attractive (on a 32-bit box today, the list header is 16 bytes; pymalloc aligns to 8-byte boundaries, so the next possible size is 24 bytes). list's very small overallocation (in the ballpark of 5%) is also aimed at minimizing memory burden. Speed isn't everything, tradeoffs are hard, and something loses when something else is favored. (BTW, I never would have considered implementing a resizable list type *without* keeping a field to record the amount of overallocation, so this isn't the tradeoff I would have made .) > ... > And my interest is in a collections module which should probably > published elsewhere (too much negativity on the subject here). A collections *package* is a fine idea. I say "package" instead of "module" because there are literally hundreds of container types, and queues and bags are a tiny corner of that space. 
An example dear to my heart is BTrees (I maintain Zope's BTree implementation), which is a wonderful way to get an associative mapping sorted by key value. BTrees have a large API all by themselves, so it wouldn't make sense to try to squash them into a module along with 50 other container types. > I am frankly surprised that so many here think that a collections > module would be bad for the language. I was more of the impression that bags didn't have fans . > Go figure. It's not like these are new, untried ideas -- the goal > was supposed be improving productivity by providing higher level > tools. A fast underlying implementation was just icing on the cake. I'm strongly in favor of a collections package. From guido at python.org Mon Jan 12 16:43:56 2004 From: guido at python.org (Guido van Rossum) Date: Mon Jan 12 16:44:02 2004 Subject: [Python-Dev] collections module In-Reply-To: Your message of "Mon, 12 Jan 2004 16:17:39 EST." References: Message-ID: <200401122143.i0CLhut04149@c-24-5-183-134.client.comcast.net> > > Guess which of these two statements is faster and by how much: > > > > list(itertools.repeat(None, 5000)) > > list(tuple(itertools.repeat(None, 5000))) > > I did guess, and was correct to 6 significant decimal digits . > > > The answer is important because it points the way to faster list > > comprehensions and other list building applications. > > That Python's builtin list calls realloc() on every append is pretty > gross, but it's also a tradeoff, saving the bytes in the list header > that would be required to remember how many allocated-but-unused > slots exist. If you've got a million tiny lists (and some apps do), > burning an additional 8MB for something your app doesn't really need > isn't attractive (on a 32-bit box today, the list header is 16 > bytes; pymalloc aligns to 8-byte boundaries, so the next possible > size is 24 bytes). list's very small overallocation (in the > ballpark of 5%) is also aimed at minimizing memory burden. 
Speed > isn't everything, tradeoffs are hard, and something loses when > something else is favored. (BTW, I never would have considered > implementing a resizable list type *without* keeping a field to > record the amount of overallocation, so this isn't the tradeoff I > would have made .) Just in case nobody had thought of it before, couldn't the realloc() call be avoided when roundupsize(oldsize) == roundupsize(newsize)??? > > ... > > And my interest is in a collections module which should probably > > published elsewhere (too much negativity on the subject here). > > A collections *package* is a fine idea. I say "package" instead of > "module" because there are literally hundreds of container types, > and queues and bags are a tiny corner of that space. An example > dear to my heart is BTrees (I maintain Zope's BTree implementation), > which is a wonderful way to get an associative mapping sorted by key > value. BTrees have a large API all by themselves, so it wouldn't > make sense to try to squash them into a module along with 50 other > container types. Right. +1 for a package. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one at comcast.net Mon Jan 12 17:14:31 2004 From: tim.one at comcast.net (Tim Peters) Date: Mon Jan 12 17:14:53 2004 Subject: [Python-Dev] collections module In-Reply-To: <200401122143.i0CLhut04149@c-24-5-183-134.client.comcast.net> Message-ID: [Guido] > Just in case nobody had thought of it before, couldn't the realloc() > call be avoided when roundupsize(oldsize) == roundupsize(newsize)??? I had thought of it before, but never acted on it . roundupsize() has gotten more expensive over the years too, and it may vary across boxes whether it's cheaper to call that a second time than it is to make the system realloc deduce that nothing needs to be done. roundupsize() is certainly cheaper than Microsoft's realloc() in the do-nothing case. Note that there's an endcase glitch here when oldsize==0 and newsize==1. 
roundupsize() returns 8 for both, but it doesn't follow that ob_item isn't NULL. realloc(p, n) special-cases the snot out of p==NULL to avoid crashing then. Note in passing that roundupsize() is expensive *because* lists don't remember the amount of overallocated space too: roundupsize() has to arrange to deliver the same result for many consecutive inputs, to prevent asking the platform for non-trivial reallocs() ("trivial realloc" == "do-nothing realloc" == realloc(p, n) called with fixed p and n multiple successive times). If we knew the point at which the "excess" slots were exhausted, then, e.g., doing realloc(oldsize + (oldsize >> 4)) at that point would give similar behavior with a much cheaper "new size" function (if we only computed a new size when a new size was actually needed, the "new size" function wouldn't have to arrange to remain constant over (eventually) unboundedly long stretches of consecutive inputs). From tim.one at comcast.net Mon Jan 12 19:46:58 2004 From: tim.one at comcast.net (Tim Peters) Date: Mon Jan 12 19:47:04 2004 Subject: [Python-Dev] collections module In-Reply-To: Message-ID: [Guido] > Just in case nobody had thought of it before, couldn't the realloc() > call be avoided when roundupsize(oldsize) == roundupsize(newsize)??? Followup: it's not quite that easy, because (at least) PyList_New(int size) can create a list whose allocated size hasn't been rounded up. If I "fix" that, then it's possible to get away with just one roundupsize() call on most list.insert() calls (including list.append()), while avoiding the realloc() call too. Patch attached. 
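The growth pattern under discussion is easy to model — the following roundupsize() is a Python model of the 2.3-era C function in Objects/listobject.c (a sketch for illustration, not the C code itself), along with the skip-the-realloc check Guido proposed:

```python
def roundupsize(n):
    # Model of the 2.3-era roundupsize(): round n up to a multiple
    # of 8, 64, 512, ... depending on its magnitude.
    nbits = 0
    n2 = n >> 5
    while True:
        n2 >>= 3
        nbits += 3
        if not n2:
            break
    return ((n >> nbits) + 1) << nbits

# the endcase glitch: empty and one-element lists round up to the same size
assert roundupsize(0) == roundupsize(1) == 8

# a realloc() could be skipped whenever the rounded sizes agree
def needs_realloc(oldsize, newsize):
    return roundupsize(oldsize) != roundupsize(newsize)

assert not needs_realloc(5, 6)   # both round up to 8
assert needs_realloc(7, 8)       # 8 rounds up to 16, crossing a boundary
```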
I timed it like so:

    def punchit():
        from time import clock as now
        a = []
        push = a.append
        start = now()
        for i in xrange(1000000):
            push(i)
        finish = now()
        return finish - start

    for i in range(10):
        print punchit()

and got these elapsed times (this is Windows, so this is elapsed wall-clock time):

    before          after
    -------------   --------------
    1.05227710823   1.02203188119
    1.05532442716   0.569660068053
    1.05461539751   0.568627533147
    1.0564449622    0.569336562799
    1.05964146231   0.573247959235
    1.05679528655   0.568503494862
    1.05569402772   0.569745553898
    1.05383177727   0.569499991619
    1.05528000805   0.569163914916
    1.05636618113   0.570356526258

So pretty consistent here, apart from the glitch on the first "after" run, and about an 80% speedup.

Yawn .

If someone wants to run with this, be my guest.  There may be other holes (c.f. PyList_New above), and it could sure use a comment about why it's correct (if it in fact is <0.9 wink>).

-------------- next part --------------
Index: Objects/listobject.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Objects/listobject.c,v
retrieving revision 2.175
diff -u -r2.175 listobject.c
--- Objects/listobject.c	4 Jan 2004 06:08:16 -0000	2.175
+++ Objects/listobject.c	13 Jan 2004 00:31:54 -0000
@@ -57,13 +57,15 @@
 {
 	PyListObject *op;
 	size_t nbytes;
+	int allocated_size = roundupsize(size);
+
 	if (size < 0) {
 		PyErr_BadInternalCall();
 		return NULL;
 	}
-	nbytes = size * sizeof(PyObject *);
+	nbytes = allocated_size * sizeof(PyObject *);
 	/* Check for overflow */
-	if (nbytes / sizeof(PyObject *) != (size_t)size) {
+	if (nbytes / sizeof(PyObject *) != (size_t)allocated_size) {
 		return PyErr_NoMemory();
 	}
 	op = PyObject_GC_New(PyListObject, &PyList_Type);
@@ -78,7 +80,7 @@
 		if (op->ob_item == NULL) {
 			return PyErr_NoMemory();
 		}
-		memset(op->ob_item, 0, sizeof(*op->ob_item) * size);
+		memset(op->ob_item, 0, sizeof(*op->ob_item) * allocated_size);
 	}
 	op->ob_size = size;
 	_PyObject_GC_TRACK(op);
@@ -154,7 +156,8 @@
 		return -1;
 	}
 	items = self->ob_item;
-	NRESIZE(items, PyObject *, self->ob_size+1);
+	if (roundupsize(self->ob_size) - 1 == self->ob_size || items == NULL)
+		NRESIZE(items, PyObject *, self->ob_size+1);
 	if (items == NULL) {
 		PyErr_NoMemory();
 		return -1;

From jiwon at softwise.co.kr Mon Jan 12 21:53:05 2004
From: jiwon at softwise.co.kr (jiwon)
Date: Mon Jan 12 21:54:50 2004
Subject: [Python-Dev] [872326] generator expression implementation
References: <20040112105843.GA1655@vicky.ecs.soton.ac.uk><1073922107.6341.357.camel@localhost.localdomain> <2mhdz1w48p.fsf@starship.python.net>
Message-ID: <083001c3d980$706b0910$d70aa8c0@jiwon>

I'm the originator of the patch (SF patch 872326 - generator expression).
If "immediate evaluation of expr in generator expression" means that generator expression:

    (x for x in range())

should be equivalent to:

    def __foo(_range=range()):
        for x in _range:
            yield x

it seems to me that generator expression has almost no merit over list comprehension.
I think the merit of generator is that it does not need to allocate all the memory that it needs in the initial time, right?
Or, am I confusing something? :)

Regards, Jiwon.

----- Original Message -----
From: "Michael Hudson"
To:
Sent: Tuesday, January 13, 2004 12:55 AM
Subject: Re: [Python-Dev] [872326] generator expression implementation

> Jeremy Hylton writes:
>
> > On Mon, 2004-01-12 at 05:58, Armin Rigo wrote:
> >> Sorry to bring this out again, I didn't find a reference about the
> >> following issue in generator expressions:
> >>
> >>     (for x in expr)
> >>
> >> When should 'expr' be evaluated?  A priori, one would expect it to be
> >> evaluated immediately, but it is not what the suggested implementation
> >> does:
> >>
> >>     def __gen():
> >>         for x in expr:
> >>             yield x
> >>     __gen()
> >
> > First a quick clarification: Is the "suggested implementation" SF patch
> > 872326 or is it that implementation sketch in PEP 289?
The latter
> > certainly suggests it, but it's hard to tell even then if it's
> > intentional.  Perhaps we should work out the formal specification part
> > of the PEP in parallel with the implementation.  The PEP currently says
> > that the reference manual should contain a specification, but that spec
> > is supposed to be part of the PEP, too.
> >
> > I agree, at any rate, that the expression should be evaluated
> > immediately.  That's easy enough to implement:
> >
> >     def __gen(it):
> >         for x in it:
> >             yield x
> >     __gen()
>
> Should be __gen(expr), right?
>
> Cheers,
> mwh
>
> --
>   For their next act, they'll no doubt be buying a firewall
>   running under NT, which makes about as much sense as
>   building a prison out of meringue.  -- -:Tanuki:-
>     -- http://home.xnet.com/~raven/Sysadmin/ASR.Quotes.html

From jeremy at alum.mit.edu Mon Jan 12 22:01:15 2004
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon Jan 12 22:06:18 2004
Subject: [Python-Dev] [872326] generator expression implementation
In-Reply-To: <083001c3d980$706b0910$d70aa8c0@jiwon>
References: <20040112105843.GA1655@vicky.ecs.soton.ac.uk> <1073922107.6341.357.camel@localhost.localdomain> <2mhdz1w48p.fsf@starship.python.net> <083001c3d980$706b0910$d70aa8c0@jiwon>
Message-ID: <1073962875.6341.1202.camel@localhost.localdomain>

On Mon, 2004-01-12 at 21:53, jiwon wrote:
> I'm the originator of the patch (SF patch 872326 - generator expression).
> If "immediate evaluation of expr in generator expression" means that generator expression:
>
>     (x for x in range())
>
> should be equivalent to:
>
>     def __foo(_range=range()):
>         for x in _range:
>             yield x
>
> it seems to me that generator expression has almost no merit over list comprehension.
> I think the merit of generator is that it does not need to allocate all the memory that it needs in the initial time, right?
> Or, am I confusing something? :)

The range() case isn't all that interesting, because you wouldn't delay
The first time an element was consumed from the g.expr it would materialize the whole list. So you would only delay for the likely brief time between creating the generator and consuming its first element. There are two more common cases. One is that the expression is already a fully materialized sequence. Most of the examples in PEP 289 fall into this category: You've got a list of objects and you want to sum their spam_score attributes. The other case is that the expression is itself a generator, e.g. dotproduct = sum(x*y for x,y in itertools.izip(x_vector, y_vector)) For all these cases, the benefit is that the result list is never materialized, not that the input list is never materialized. Jeremy From oussoren at cistron.nl Tue Jan 13 01:13:30 2004 From: oussoren at cistron.nl (Ronald Oussoren) Date: Tue Jan 13 01:13:37 2004 Subject: [Python-Dev] patching webbrowser.py for OS X In-Reply-To: <29BC9016-453C-11D8-97F6-000A95686CD8@redivi.com> References: <29BC9016-453C-11D8-97F6-000A95686CD8@redivi.com> Message-ID: <9C108902-458F-11D8-B390-0003931CFE24@cistron.nl> On 12 jan 2004, at 21:16, Bob Ippolito wrote: > On Jan 12, 2004, at 2:59 PM, Alex Martelli wrote: > >> >> On Jan 12, 2004, at 8:45 PM, Brendan O'Connor wrote: >> >>> I was trying to use the webbrowser module with OS X's preinstalled >>> python; I'm not very familiar with OS X, but I just patched >>> webbrowser.py >>> to use the very generic "open" command, which works for the simple >>> webbrowser.open(url). >>> >>> I've heard that Fink or another port would be more complete; on the >>> other >>> hand, I'm using computers where I can't install software myself, so >>> this >>> is useful for me. >>> >>> Any thoughts or issues? >> >> As a brand-new user of Mac OS X, "open" appears to be the right >> solution to me, picking up whatever settings one may have made for a >> different browser than Safari, not requiring fink, etc, etc. 
>> However, we should probably double-check on pythonmac-sig, where the >> REAL Mac Pythonistas hang out... > > I don't think that this patch is necessary.. It might even be insecure, open will open more than just URLs (try 'open /bin/ls'). Ronald From bob at redivi.com Tue Jan 13 01:27:43 2004 From: bob at redivi.com (Bob Ippolito) Date: Tue Jan 13 01:25:27 2004 Subject: [Pythonmac-SIG] Re: [Python-Dev] patching webbrowser.py for OS X In-Reply-To: <9C108902-458F-11D8-B390-0003931CFE24@cistron.nl> References: <29BC9016-453C-11D8-97F6-000A95686CD8@redivi.com> <9C108902-458F-11D8-B390-0003931CFE24@cistron.nl> Message-ID: <98742090-4591-11D8-8D71-000A95686CD8@redivi.com> On Jan 13, 2004, at 1:13 AM, Ronald Oussoren wrote: > On 12 jan 2004, at 21:16, Bob Ippolito wrote: > >> On Jan 12, 2004, at 2:59 PM, Alex Martelli wrote: >> >>> >>> On Jan 12, 2004, at 8:45 PM, Brendan O'Connor wrote: >>> >>>> I was trying to use the webbrowser module with OS X's preinstalled >>>> python; I'm not very familiar with OS X, but I just patched >>>> webbrowser.py >>>> to use the very generic "open" command, which works for the simple >>>> webbrowser.open(url). >>>> >>>> I've heard that Fink or another port would be more complete; on the >>>> other >>>> hand, I'm using computers where I can't install software myself, so >>>> this >>>> is useful for me. >>>> >>>> Any thoughts or issues? >>> >>> As a brand-new user of Mac OS X, "open" appears to be the right >>> solution to me, picking up whatever settings one may have made for a >>> different browser than Safari, not requiring fink, etc, etc. >>> However, we should probably double-check on pythonmac-sig, where the >>> REAL Mac Pythonistas hang out... >> >> I don't think that this patch is necessary.. > > It might even be insecure, open will open more than just URLs (try > 'open /bin/ls'). insecure? 
>>> from __future__ import audited_code
  File "<stdin>", line 1
SyntaxError: future feature audited_code is not defined

FWIW, ic.launchurl('file:///bin/ls') will do the same thing: open a Terminal.app window and execute /bin/ls.  Safari will open a Finder window at "/bin" and select the "ls" executable.  I'm not sure where the code is that handles unsafe URLs differently, but it's probably somewhere near WebCore.. probably not in Safari itself?  I'd have to care, then I'd have to look :)

-bob

-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 2357 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20040113/5660f28c/smime.bin

From brendano at stanford.edu Tue Jan 13 04:00:15 2004
From: brendano at stanford.edu (Brendan O'Connor)
Date: Tue Jan 13 04:00:32 2004
Subject: [Python-Dev] patching webbrowser.py for OS X
In-Reply-To: <29BC9016-453C-11D8-97F6-000A95686CD8@redivi.com>
Message-ID:

> I don't think that this patch is necessary..  On my Mac OS X 10.3
> machine (comes with Python 2.3.0), webbrowser.py uses Internet Config
> to launch the url (the 'ic' module).  This is basically equivalent to
> the open command.  It works like this:
> * Internet Config is an old MacOS API that's mapped to Launch Services
> which launches the URL with your preferred browser
> * open is a command line utility that uses the NSWorkspace Cocoa API
> which uses Launch Services to launch the URL with your preferred
> browser
>
> I don't think your patch fixes anything, it's launched by Launch
> Services either way.. at least on OS X 10.3.  A patch doesn't do us any
> good against Python 2.2.0 on Mac OS 10.2 if its behavior is broken.
> Apple doesn't update Python between major releases.
>
> You should be able to install MacPython 2.3 in your home directory.
> Don't bother with Fink.

That's it, I'm on 10.2; I guess this is just a quick fix for that one version.
Well, thanks all for the pointers.

-Brendan

From jiwon at softwise.co.kr Tue Jan 13 05:33:52 2004
From: jiwon at softwise.co.kr (jiwon)
Date: Tue Jan 13 05:36:26 2004
Subject: [Python-Dev] [872326] generator expression implementation
References: <20040112105843.GA1655@vicky.ecs.soton.ac.uk><1073922107.6341.357.camel@localhost.localdomain><2mhdz1w48p.fsf@starship.python.net><083001c3d980$706b0910$d70aa8c0@jiwon> <1073962875.6341.1202.camel@localhost.localdomain>
Message-ID: <091201c3d9c0$d01967e0$d70aa8c0@jiwon>

ok. I was confused, sorry. :)
(I thought range() is evaluated somewhat like xrange() if it's in "for ... in range()")
I'll fix the patch so that expr is evaluated when the generator expression is created.

Regards,
Jiwon

----- Original Message -----
From: "Jeremy Hylton"
To: "jiwon"
Cc: "Michael Hudson" ;
Sent: Tuesday, January 13, 2004 12:01 PM
Subject: Re: [Python-Dev] [872326] generator expression implementation

> On Mon, 2004-01-12 at 21:53, jiwon wrote:
> > I'm the originator of the patch (SF patch 872326 - generator expression).
> > If "immediate evaluation of expr in generator expression" means that generator expression:
> >
> >     (x for x in range())
> >
> > should be equivalent to:
> >
> >     def __foo(_range=range()):
> >         for x in _range:
> >             yield x
> >
> > it seems to me that generator expression has almost no merit over list comprehension.
> > I think the merit of generator is that it does not need to allocate all the memory that it needs in the initial time, right?
> > Or, am I confusing something? :)
>
> The range() case isn't all that interesting, because you wouldn't delay
> the memory allocation very long.  The first time an element was consumed
> from the g.expr it would materialize the whole list.  So you would only
> delay for the likely brief time between creating the generator and
> consuming its first element.
>
> There are two more common cases.  One is that the expression is already
> a fully materialized sequence.
Most of the examples in PEP 289 fall > into this category: You've got a list of objects and you want to sum > their spam_score attributes. The other case is that the expression is > itself a generator, e.g. > > dotproduct = sum(x*y for x,y in itertools.izip(x_vector, y_vector)) > > For all these cases, the benefit is that the result list is never > materialized, not that the input list is never materialized. > > Jeremy > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jiwon%40softwise.co.kr > From mwh at python.net Tue Jan 13 06:08:43 2004 From: mwh at python.net (Michael Hudson) Date: Tue Jan 13 06:08:47 2004 Subject: [Python-Dev] collections module In-Reply-To: <200401122143.i0CLhut04149@c-24-5-183-134.client.comcast.net> (Guido van Rossum's message of "Mon, 12 Jan 2004 13:43:56 -0800") References: <200401122143.i0CLhut04149@c-24-5-183-134.client.comcast.net> Message-ID: <2m7jzww1fo.fsf@starship.python.net> Guido van Rossum writes: >> A collections *package* is a fine idea. I say "package" instead of >> "module" because there are literally hundreds of container types, >> and queues and bags are a tiny corner of that space. An example >> dear to my heart is BTrees (I maintain Zope's BTree implementation), >> which is a wonderful way to get an associative mapping sorted by key >> value. BTrees have a large API all by themselves, so it wouldn't >> make sense to try to squash them into a module along with 50 other >> container types. > > Right. +1 for a package. I've added this idea to PEP 320 to try to stop it disappearing into the murk... Cheers, mwh -- In short, just business as usual in the wacky world of floating point . 
-- Tim Peters, comp.lang.python

From guido at python.org Tue Jan 13 12:14:22 2004
From: guido at python.org (Guido van Rossum)
Date: Tue Jan 13 12:14:28 2004
Subject: [Python-Dev] collections module
In-Reply-To: Your message of "Mon, 12 Jan 2004 19:46:58 EST."
References:
Message-ID: <200401131714.i0DHEMB06084@c-24-5-183-134.client.comcast.net>

> [Guido]
> > Just in case nobody had thought of it before, couldn't the realloc()
> > call be avoided when roundupsize(oldsize) == roundupsize(newsize)???

[Tim]
> Followup: it's not quite that easy, because (at least) PyList_New(int size)
> can create a list whose allocated size hasn't been rounded up.
>
> If I "fix" that, then it's possible to get away with just one roundupsize()
> call on most list.insert() calls (including list.append()), while avoiding
> the realloc() call too.  Patch attached.
>
> I timed it like so:
>
>     def punchit():
>         from time import clock as now
>         a = []
>         push = a.append
>         start = now()
>         for i in xrange(1000000):
>             push(i)
>         finish = now()
>         return finish - start
>
>     for i in range(10):
>         print punchit()
>
> and got these elapsed times (this is Windows, so this is elapsed wall-clock
> time):
>
>     before          after
>     -------------   --------------
>     1.05227710823   1.02203188119
>     1.05532442716   0.569660068053
>     1.05461539751   0.568627533147
>     1.0564449622    0.569336562799
>     1.05964146231   0.573247959235
>     1.05679528655   0.568503494862
>     1.05569402772   0.569745553898
>     1.05383177727   0.569499991619
>     1.05528000805   0.569163914916
>     1.05636618113   0.570356526258
>
> So pretty consistent here, apart from the glitch on the first "after" run,
> and about an 80% speedup.
>
> Yawn .
>
> If someone wants to run with this, be my guest.  There may be other holes
> (c.f. PyList_New above), and it could sure use a comment about why it's
> correct (if it in fact is <0.9 wink>).

Awesome!  We need timing results on some other platforms (Linux, FreeBSD, MacOSX).  Volunteers?
--Guido van Rossum (home page: http://www.python.org/~guido/)

From python at rcn.com Tue Jan 13 12:53:27 2004
From: python at rcn.com (Raymond Hettinger)
Date: Tue Jan 13 12:54:32 2004
Subject: [Python-Dev] list.append optimization Was: collections module
In-Reply-To:
Message-ID: <00bf01c3d9fe$25f2de20$6e02a044@oemcomputer>

> So pretty consistent here, apart from the glitch on the first "after" run,
> and about an 80% speedup.
>
> Yawn .
>
> If someone wants to run with this, be my guest.  There may be other holes
> (c.f. PyList_New above), and it could sure use a comment about why it's
> correct (if it in fact is <0.9 wink>).

I'll look at this this weekend.

The improvement is awesome.  Hope it doesn't muck up anything else.

Raymond Hettinger

From mcherm at mcherm.com Tue Jan 13 13:37:46 2004
From: mcherm at mcherm.com (Michael Chermside)
Date: Tue Jan 13 13:37:51 2004
Subject: [Python-Dev] collections module
Message-ID: <1074019066.40043afacdfc6@mcherm.com>

[Guido:]
> My own ideal solution [for queues] would be a subtle hack to the list
> implementation to make pop(0) and insert(0, x) go a lot faster -- this
> can be done by adding one or two extra fields to the list head and
> allowing empty space at the front of the list as well as at the end.

[Tim:]
> That Python's builtin list calls realloc() on every append is pretty gross,
> but it's also a tradeoff, saving the bytes in the list header that would be
> required to remember how many allocated-but-unused slots exist.  If you've
If you've > got a million tiny lists (and some apps do), burning an additional 8MB for > something your app doesn't really need isn't attractive (on a 32-bit box > today, the list header is 16 bytes; pymalloc aligns to 8-byte boundaries, so > the next possible size is 24 bytes). [Guido:] > Just in case nobody had thought of it before, couldn't the realloc() > call be avoided when roundupsize(oldsize) == roundupsize(newsize)??? [Tim:] > I had thought of it before, but never acted on it . roundupsize() has > gotten more expensive over the years too, and it may vary across boxes > whether it's cheaper to call that a second time than it is to make the > system realloc deduce that nothing needs to be done. roundupsize() is > certainly cheaper than Microsoft's realloc() in the do-nothing case. > > Note that there's an endcase glitch here when oldsize==0 and newsize==1. > roundupsize() returns 8 for both, but it doesn't follow that ob_item isn't > NULL. realloc(p, n) special-cases the snot out of p==NULL to avoid crashing > then. [...] > Note in passing that roundupsize() is expensive *because* lists don't > remember the amount of overallocated space too: roundupsize() has to > arrange to deliver the same result for many consecutive inputs I'm sure this is just pointing out the obvious, but *IF* the list header were increased, we could add TWO fields... on for allocated_size, and another for left_side_slots. According to Tim, this would still "only" bump list objects up to 24 bytes, but would allow: [1] O(1) performance for pop(0) [2] No need to call realloc() on every append(), just on "true" resizes. I guess I'm proposing the following (all ideas previously mentioned): [1] pop(0) would increment ob_item and left_side_slots. [2] insert(0,x) would decrement ob_item and left_side_slots *IF* left_side_slots were > 0. Lists should still grow at the tail for efficiency, but we can re-use space at the head left by a pop(0). 
[3] append(0,x) (and other functions that insert) would compare ob_size and allocated_size (do we need left_side_slots in here too?) to see whether new space is needed at the end. If so, it would EITHER call realloc() for more space, OR it would memmove so left_side_slots became 0. I'm still not quite sure what rule to use to determine which to do (although for MOST lists, left_side_slots would be 0, so there wouldn't be a question). So, to recap advantages and disadvantages: lists would still grow just as they do now. Lists used as Queues by calling append() and pop(0) would be time efficient so long as we avoid calling memmove for each and every append() (that's why we don't want to ALWAYS memmove when left_side_slots > 0). Lists on which pop(0) was never called (that's most of 'em) would have NO degradation of space used, and IMPROVED speed in cases (like Windows) where a realloc() noop takes noticable time. And (this seems the most significant drawback) the size of a list would increase by 8 bytes, even for very short lists. I wouldn't want to make the change without trying out the performance in some "real applications", but it seems like there are some potentially useful ideas floating around here. -- Michael Chermside From sdeibel at wingide.com Tue Jan 13 13:51:51 2004 From: sdeibel at wingide.com (Stephan Deibel) Date: Tue Jan 13 13:50:57 2004 Subject: [Python-Dev] patching webbrowser.py for OS X In-Reply-To: References: Message-ID: Just wanted to make sure people looking at webbrowser.py know about the following patch I posted on Source Forge last April: [ 728278 ] Multiple webbrowser.py bug fixes / improvements It makes quite a few changes, including fixing some OS X support. Hopefully some form of this can make it into 2.4... I've been lurking here for years but I think this is my first post so I'll do a quick introduction: I'm co-founder of Archaeopteryx Software, Inc, which makes Wing IDE, one of the commercial IDEs for Python. 
I also run pythonology.org, edited the Python Success Stories collection found there, and try to help out with Python advocacy and things like press releases.

Thanks to all for Python's elegance and incredibly high quality!

Stephan Deibel

--
Wing IDE for Python
Archaeopteryx Software, Inc
Take Flight!  www.wingide.com

From maketo at sdf.lonestar.org Tue Jan 13 14:05:48 2004
From: maketo at sdf.lonestar.org (Ognen Duzlevski)
Date: Tue Jan 13 14:05:57 2004
Subject: [Python-Dev] a possible bug (or something I don't understand)?
Message-ID:

Hi,

Python 2.3.3 (#1, Dec 24 2003, 00:21:04) [GCC 3.3] on linux2

Please forgive me if I am bold in the assumption of a bug.  It probably is not a bug but sure has me puzzled.  I have written a function using PyInline to modify three matrices (or, lists of lists) of integers.  The function appears to run fine.  After the function has been called, I am using the following assignment:

    minscore = min(scoremat[tlen][qlen],delmat[tlen][qlen],insertmat[teln][qlen])

However, the call breaks with:

    TypeError: an integer is required

Then I went a step further and here is the sequence of calls:

    def SeqMapping(tseq,qseq,seqmap,invmap,tlen,qlen):
        # determine each energy components given the alignments
        # modifies seqmap, invmap
        # returns (minscore,identical)
        global LARGESCORE,OPENGAPPENALTY,ONEINDELPENALTY,EMPTY
        opengappenalty = OPENGAPPENALTY
        oneindelpenalty = ONEINDELPENALTY
        largescore = LARGESCORE
        empty = EMPTY
        minscore = largescore
        scoremat = []
        insertmat = []
        delmat = []
        st = time.clock()
        DynAlign(scoremat,insertmat,delmat,''.join(tseq),''.join(qseq),tlen,qlen)
        et = time.clock()
        print "time in nested loop: %f" % (et-st)
        m1 = int(delmat[tlen][qlen])
        print m1
        m2 = int(insertmat[tlen][qlen])
        print m2
        m3 = int(scoremat[tlen][qlen])
        print m3
        #minscore = min(int(delmat[tlen][qlen]), int(insertmat[tlen][qlen]), \
        #int(scoremat[tlen][qlen]))
        print min(m1,m2,m3)

This is the output:

    time in nested loop: 0.250000
    403
    394
    402
    Traceback (most recent call last):
      File "./primegens_old.py", line 1653, in ?
        GetSegment()
      File "./primegens_old.py", line 1496, in GetSegment
        tmp_score,sim = SeqMapping(seq[ndx1],seq[0],seqmapping,inversemapping,length[ndx1],length[0])
      File "./primegens_old.py", line 676, in SeqMapping
        print min(m1,m2,m3)
    TypeError: an integer is required

Clearly scoremat[tlen][qlen], delmat[tlen][qlen] and insertmat[tlen][qlen] are integers (I even had type() show me they are 'int').  I am even using int() to make sure it is all right.  Still the output is as above and I am really puzzled.  Is it something really trivial I am missing?

Thank you,
Ognen

p.s. just in case, here is the inlined C function:

    from PyInline import C
    import __main__

    C.Builder(code="""
    void DynAlign(PyObject *scoremat,PyObject *insertmat,PyObject *delmat,PyObject *tseq,PyObject *qseq,PyObject *tlen,PyObject$
    {
    #define MIN(a,b) ((a) < (b) ? (a):(b))
      int opengappenalty;
      int oneindelpenalty;
      int largescore;
      int cnst;
      int ndx1, ndx2;
      int tl,ql;
      char *tsq;
      char *qsq;

      opengappenalty = 4;
      oneindelpenalty = 2;
      largescore = 999999;
      tl = PyInt_AsLong(tlen);
      ql = PyInt_AsLong(qlen);
      tsq = PyString_AsString(tseq);
      qsq = PyString_AsString(qseq);

      for (ndx1=0; ndx1<=tl; ndx1++) {
        PyList_Append(scoremat,PyList_New(ql+1));
        PyList_Append(insertmat,PyList_New(ql+1));
        PyList_Append(delmat,PyList_New(ql+1));
      }

      for (ndx1=1; ndx1<=tl; ndx1++) {
        long num;
        PyObject* inner_list = NULL;
        PyObject* item;
        num = opengappenalty + (ndx1 * oneindelpenalty);
        inner_list = PyList_GetItem(scoremat,ndx1);
        item = PyInt_FromLong(num);
        PyList_SetItem(inner_list,0,PyNumber_Int(item));
        inner_list = PyList_GetItem(insertmat,ndx1);
        PyList_SetItem(inner_list,0,PyNumber_Int(item));
        inner_list = PyList_GetItem(delmat,ndx1);
        item = PyInt_FromLong(largescore);
        PyList_SetItem(inner_list,0,PyNumber_Int(item));
      }

      for (ndx1=1; ndx1<=ql; ndx1++) {
        long num;
        PyObject* inner_list = NULL;
        PyObject* item;
        num = opengappenalty + (ndx1 * oneindelpenalty);
        inner_list = PyList_GetItem(scoremat,0);
        item = PyInt_FromLong(num);
        PyList_SetItem(inner_list,ndx1,PyNumber_Int(item));
        inner_list = PyList_GetItem(delmat,0);
        PyList_SetItem(inner_list,ndx1,PyNumber_Int(item));
        inner_list = PyList_GetItem(insertmat,0);
        item = PyInt_FromLong(largescore);
        PyList_SetItem(inner_list,ndx1,PyNumber_Int(item));
      }

      cnst = oneindelpenalty + opengappenalty;
      for (ndx1=1; ndx1<=tl; ndx1++) {
        for (ndx2=1; ndx2<=ql; ndx2++) {
          PyObject *delm1 = NULL;
          PyObject *delm2 = NULL;
          PyObject *scmat1 = NULL;
          PyObject *scmat2 = NULL;
          PyObject *insmat1 = NULL;
          PyObject *insmat2 = NULL;
          PyObject *item = NULL;
          long n1,n2,n3,minm,add;
          delm1 = PyList_GetItem(delmat,ndx1);
          delm2 = PyList_GetItem(delmat,ndx1-1);
          scmat1 = PyList_GetItem(scoremat,ndx1);
          scmat2 = PyList_GetItem(scoremat,ndx1-1);
          insmat1 = PyList_GetItem(insertmat,ndx1);
          insmat2 = PyList_GetItem(insertmat,ndx1-1);
          item = PyList_GetItem(delm2,ndx2);
          n1 = PyInt_AsLong(item) + oneindelpenalty;
          item = PyList_GetItem(scmat2,ndx2);
          n2 = PyInt_AsLong(item) + cnst;
          item = PyList_GetItem(insmat2,ndx2);
          n3 = PyInt_AsLong(item) + cnst;
          minm = MIN(n1,MIN(n2,n3));
          item = PyInt_FromLong(minm);
          PyList_SetItem(delm1,ndx2,PyNumber_Int(item));
          item = PyList_GetItem(insmat1,ndx2-1);
          n1 = PyInt_AsLong(item) + oneindelpenalty;
          item = PyList_GetItem(scmat1,ndx2-1);
          n2 = PyInt_AsLong(item) + cnst;
          item = PyList_GetItem(delm1,ndx2-1);
          n3 = PyInt_AsLong(item) + cnst;
          minm = MIN(n1,MIN(n2,n3));
          item = PyInt_FromLong(minm);
          PyList_SetItem(insmat1,ndx2,PyNumber_Int(item));
          if (tsq[ndx1-1] == qsq[ndx2-1])
            add = -1;
          else
            add = 1;
          item = PyList_GetItem(scmat2,ndx2-1);
          n1 = PyInt_AsLong(item);
          item = PyList_GetItem(delm2,ndx2-1);
          n2 = PyInt_AsLong(item);
          item = PyList_GetItem(insmat2,ndx2-1);
          n3 = PyInt_AsLong(item) + cnst;
          minm = MIN(n1,MIN(n2,n3)) + add;
          item = PyInt_FromLong(minm);
          PyList_SetItem(scmat1,ndx2,PyNumber_Int(item));
        }
      }
    }
    """,targetmodule=__main__).build()

From jrw at pobox.com Tue Jan 13 14:10:10 2004
From: jrw at pobox.com (John Williams)
Date: Tue Jan 13
14:10:30 2004 Subject: [Python-Dev] list.append optimization Was: collections module In-Reply-To: <200401131714.i0DHEMB06084@c-24-5-183-134.client.comcast.net> References: <200401131714.i0DHEMB06084@c-24-5-183-134.client.comcast.net> Message-ID: <1074021011.3668.8682.camel@lcc54.languagecomputer.com> Tim's optimization seems to help even more on Linux than Windows. On my box each iteration of the test program takes 0.80s without the patch and 0.27 with it. jw From sdeibel at wingide.com Tue Jan 13 14:21:24 2004 From: sdeibel at wingide.com (Stephan Deibel) Date: Tue Jan 13 14:20:29 2004 Subject: [Python-Dev] patching webbrowser.py for OS X In-Reply-To: References: Message-ID: > On Jan 13, 2004, at 1:51 PM, Stephan Deibel wrote: > > [ 728278 ] Multiple webbrowser.py bug fixes / improvements Sorry, should have posted the link. Apparently it's not easy to find this w/ the above info: http://sf.net/tracker/index.php?func=detail&aid=728278&group_id=5470&atid=305470 - Stephan From tim.one at comcast.net Tue Jan 13 14:24:08 2004 From: tim.one at comcast.net (Tim Peters) Date: Tue Jan 13 14:24:15 2004 Subject: [Python-Dev] list.append optimization Was: collections module In-Reply-To: <1074021011.3668.8682.camel@lcc54.languagecomputer.com> Message-ID: [John Williams] > Tim's optimization seems to help even more on Linux than Windows. On > my box each iteration of the test program takes 0.80s without the > patch and 0.27 with it. Wow! That's better than I had hoped for. I'll note one subtlety: the call to realloc() probably endures mutex locking overhead to keep it threadsafe, even if the realloc() is (in the end) a nop. It certainly does under Microsoft's realloc(), but the native locking gimmicks on Windows are very fast (Windows loves threads). 
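Behind these numbers is a simple cost model: before the patch every append paid for a realloc() call (plus, as Tim notes above, whatever locking the allocator does) even when the allocation didn't change; with remembered overallocation only a handful of appends out of a million would.  A toy Python model of the "remember the spare slots" scheme — the GrowList class and its growth rule needed + (needed >> 3) + 8 are purely illustrative, not the patch's actual rule:

```python
class GrowList:
    """Toy list that remembers its allocated size, so a 'realloc'
    is only needed when the spare slots are exhausted."""
    def __init__(self):
        self.items = []        # stands in for the ob_item buffer
        self.allocated = 0     # remembered capacity
        self.reallocs = 0      # how many real realloc() calls we'd make
    def append(self, x):
        needed = len(self.items) + 1
        if needed > self.allocated:
            # overallocate by ~1/8th plus a constant (illustrative rule)
            self.allocated = needed + (needed >> 3) + 8
            self.reallocs += 1
        self.items.append(x)

g = GrowList()
for i in range(1000000):
    g.append(i)
print(g.reallocs)  # roughly a hundred "real" reallocs for a million appends
```

The multiplicative growth keeps the realloc count logarithmic in the final length, which is why the do-nothing realloc (and its locking) stops mattering.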
From tim.one at comcast.net Tue Jan 13 14:34:17 2004
From: tim.one at comcast.net (Tim Peters)
Date: Tue Jan 13 14:34:21 2004
Subject: [Python-Dev] list.append optimization Was: collections module
In-Reply-To: <00bf01c3d9fe$25f2de20$6e02a044@oemcomputer>
Message-ID:

[Raymond Hettinger]
> I'll look at this this weekend.
>
> The improvement is awesome.  Hope it doesn't muck up anything
> else.

It probably does.  It introduces a deep assumption that a list is *never* allocated, or resized, to hold ob_size slots, but always to hold (at least) roundupsize(ob_size) slots.  If that's ever wrong, the "after" code may not call realloc() when a realloc is needed (for example, leave out the part that patched PyList_New(), and watch Python crash all over the place).

One of the standard tests crashed, but I didn't have time to figure out why.  It seemed odd because it died in some sorting test.

A note about why it "should be" correct:  for all i >= 0,

    roundupsize(i) > i                                  [1]

is true.  So, calling ob_size "n" for brevity, if we ever reach a point where

    roundupsize(n) == n+1

then since

    roundupsize(n+1) > n+1      (by [1], letting i = n+1)

it follows that

    roundupsize(n+1) > n+1 == roundupsize(n)

Or, IOW, when roundupsize(n) == n+1, we're exactly at a boundary where roundupsize(n+1) grows, so that's a point at which adding a new element requires calling realloc.  Otherwise roundupsize(n) > n+1 (since roundupsize(n) > n by [1], it's not possible that roundupsize(n) < n+1), so there's room enough already for an additional element, and realloc() doesn't need to be called.
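Tim's boundary argument can be spot-checked numerically.  The roundupsize() below is a rough Python transcription of the 2.3-era C function (a sketch, not the exact source); the loop asserts both invariant [1] and the boundary claim — growth is needed exactly when roundupsize(n) == n+1:

```python
def roundupsize(n):
    # Rough Python version of the 2.3-era roundupsize() (illustrative).
    nbits = 0
    n2 = n >> 5
    while True:
        n2 >>= 3
        nbits += 3
        if not n2:
            break
    return ((n >> nbits) + 1) << nbits

for n in range(100000):
    assert roundupsize(n) > n  # invariant [1]
    at_boundary = roundupsize(n) == n + 1
    grows = roundupsize(n + 1) > roundupsize(n)
    assert at_boundary == grows, n
print("boundary condition holds for n < 100000")
```

So checking roundupsize(self->ob_size) - 1 == self->ob_size, as the patch does, is a valid stand-in for comparing roundupsize(oldsize) and roundupsize(newsize).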
From raymond.hettinger at verizon.net Tue Jan 13 14:51:56 2004 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Tue Jan 13 14:53:08 2004 Subject: [Python-Dev] Accumulation module Message-ID: <00cf01c3da0e$b3a2e3e0$6e02a044@oemcomputer>

I'm working on a module with some accumulation/reduction/statistical formulas:

    average(iterable)
    stddev(iterable, sample=False)
    product(iterable)
    nlargest(iterable, n=1)
    nsmallest(iterable, n=1)

The questions that have arisen so far are:

* What to call the module

* What else should be in it?

* Is "sample" a good keyword to distinguish from population stddev?

* There seems to be a speed/space choice on how to implement nlargest/nsmallest. The faster way lists out the entire iterable, heapifies it, and pops off the top n elements. The slower way is less memory intensive: build only an n-length heap and then just do a heapreplace when necessary. Note, heapq is used for both (I use operator.neg to swap between largest and smallest).

* Is there a way to compute the standard deviation without multiple passes over the data (one to compute the mean and a second to sum the squares of the differences from the mean)?

Raymond Hettinger

From jepler at unpythonic.net Tue Jan 13 14:53:31 2004 From: jepler at unpythonic.net (Jeff Epler) Date: Tue Jan 13 14:53:48 2004 Subject: [Python-Dev] collections module In-Reply-To: <200401131714.i0DHEMB06084@c-24-5-183-134.client.comcast.net> References: <200401131714.i0DHEMB06084@c-24-5-183-134.client.comcast.net> Message-ID: <20040113195331.GB6601@unpythonic.net>

To repeat findings on a Linux machine (redhat9, x86, 2.4GHz):

    Old    New
    0.74   0.22
    0.74   0.22
    0.74   0.22
    0.74   0.22
    0.72   0.21
    0.76   0.22
    0.74   0.22
    0.77   0.22
    0.75   0.22
    0.74   0.22

I also got the same (?)
failure as tim described: test_list make: *** [test] Segmentation fault Jeff From bh at intevation.de Tue Jan 13 15:06:54 2004 From: bh at intevation.de (Bernhard Herzog) Date: Tue Jan 13 15:07:03 2004 Subject: [Python-Dev] Accumulation module In-Reply-To: <00cf01c3da0e$b3a2e3e0$6e02a044@oemcomputer> (Raymond Hettinger's message of "Tue, 13 Jan 2004 14:51:56 -0500") References: <00cf01c3da0e$b3a2e3e0$6e02a044@oemcomputer> Message-ID: <6qvfnfmx41.fsf@salmakis.intevation.de> "Raymond Hettinger" writes: > nlargest(iterable, n=1) > nsmallest(iterable, n=1) [...] > Note, heapq is used for both (I use > operator.neg to swap between largest and smallest). Does that mean nlargest/nsmallest only work for numbers? I think it might be useful for e.g. strings too. Bernhard -- Intevation GmbH http://intevation.de/ Sketch http://sketch.sourceforge.net/ Thuban http://thuban.intevation.org/ From jeremy at alum.mit.edu Tue Jan 13 15:02:02 2004 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Tue Jan 13 15:07:06 2004 Subject: [Python-Dev] Accumulation module In-Reply-To: <00cf01c3da0e$b3a2e3e0$6e02a044@oemcomputer> References: <00cf01c3da0e$b3a2e3e0$6e02a044@oemcomputer> Message-ID: <1074024120.6341.2238.camel@localhost.localdomain> On Tue, 2004-01-13 at 14:51, Raymond Hettinger wrote: > I'm working a module with some accumulation/reduction/statistical > formulas: > > average(iterable): > stddev(iterable, sample=False) > product(iterable) > nlargest(iterable, n=1) > nsmallest(iterable, n=1) > > The questions that have arisen so far are: > > * What to call the module > > * What else should be in it? median() And a function like bins() or histogram() that accumulates the values in buckets of some size. 
Jeremy From skip at pobox.com Tue Jan 13 15:14:45 2004 From: skip at pobox.com (Skip Montanaro) Date: Tue Jan 13 15:14:50 2004 Subject: [Python-Dev] patching webbrowser.py for OS X In-Reply-To: References: Message-ID: <16388.20917.972458.843319@montanaro.dyndns.org> >> > [ 728278 ] Multiple webbrowser.py bug fixes / improvements Stephan> Sorry, should have posted the link. Apparently it's not easy Stephan> to find this w/ the above info... Well, yeah, just convert the patch id to http://python.org/sf/728278 Skip From theller at python.net Tue Jan 13 15:17:01 2004 From: theller at python.net (Thomas Heller) Date: Tue Jan 13 15:17:11 2004 Subject: [Python-Dev] Accumulation module In-Reply-To: <00cf01c3da0e$b3a2e3e0$6e02a044@oemcomputer> (Raymond Hettinger's message of "Tue, 13 Jan 2004 14:51:56 -0500") References: <00cf01c3da0e$b3a2e3e0$6e02a044@oemcomputer> Message-ID: "Raymond Hettinger" writes: > * Is there a way to compute the standard deviation without multiple > passes over the data (one to compute the mean and a second to sum the > squares of the differences from the mean? Yes ;-). I would look it up in my HP-11C pocket calculator manual, if I had it with me. But I'm also sure someone has it in his head. Thomas From aahz at pythoncraft.com Tue Jan 13 15:20:48 2004 From: aahz at pythoncraft.com (Aahz) Date: Tue Jan 13 15:20:53 2004 Subject: [Python-Dev] a possible bug (or something I don't understand)? In-Reply-To: References: Message-ID: <20040113202048.GA9457@panix.com> On Tue, Jan 13, 2004, Ognen Duzlevski wrote: > > Please forgive me if I am bold in the assumption of a bug. It probably is > not a bug but sure has me puzzled. You'll probably get more help solving this if you post to comp.lang.python. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ A: No. Q: Is top-posting okay? 
From barry at barrys-emacs.org Tue Jan 13 15:21:09 2004 From: barry at barrys-emacs.org (Barry Scott) Date: Tue Jan 13 15:21:30 2004 Subject: [Python-Dev] re: The os module, unix and win32 In-Reply-To: <0HRC00H9FODC7D@mta7.srv.hcvlny.cv.net> References: <0HRC00H9FODC7D@mta7.srv.hcvlny.cv.net> Message-ID: <6.0.1.1.2.20040113194123.02188e98@torment.chelsea.private> At 12-01-2004 00:07, Arthur wrote: >But Thomas (who should know) wrote: > > >bdist_wininst can create Windows shortcuts in it's postinstall script. > >Thomas (off-list) had also given me a cite to some code, which I haven't had >the opportunity to try to digest. > >Is this a semantic issue as to what can and cannot be done on Windows with >disutils? > >I thought I gathered from Thomas that the functionality he is referencing is >new as of 2.3. You could say that we are both right. Turns out that the key is the special wininst.exe that is used to run the postinstall script. It's wininst.exe that implements the shortcut code in C and makes it available to the postinstall script that is its argument. wininst.exe only implements creating shortcuts; a standard library module would need to implement support for reading shortcuts and the other shortcut actions, such as resolving(?). Using win32all you can do everything with shortcuts: create, modify, read, resolve, etc.
Barry

From aahz at pythoncraft.com Tue Jan 13 15:21:58 2004 From: aahz at pythoncraft.com (Aahz) Date: Tue Jan 13 15:22:03 2004 Subject: [Python-Dev] Accumulation module In-Reply-To: <00cf01c3da0e$b3a2e3e0$6e02a044@oemcomputer> References: <00cf01c3da0e$b3a2e3e0$6e02a044@oemcomputer> Message-ID: <20040113202158.GB9457@panix.com> On Tue, Jan 13, 2004, Raymond Hettinger wrote: > > I'm working a module with some accumulation/reduction/statistical > formulas: > > average(iterable): > stddev(iterable, sample=False) > product(iterable) > nlargest(iterable, n=1) > nsmallest(iterable, n=1) > > The questions that have arisen so far are: > > * What to call the module

stats

-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ A: No. Q: Is top-posting okay?

From jacobs at penguin.theopalgroup.com Tue Jan 13 15:26:43 2004 From: jacobs at penguin.theopalgroup.com (Kevin Jacobs) Date: Tue Jan 13 15:26:46 2004 Subject: [Python-Dev] Accumulation module In-Reply-To: <00cf01c3da0e$b3a2e3e0$6e02a044@oemcomputer> Message-ID: On Tue, 13 Jan 2004, Raymond Hettinger wrote: > * Is there a way to compute the standard deviation without multiple > passes over the data (one to compute the mean and a second to sum the > squares of the differences from the mean? Yes --

    def one_pass_stddev(l):
        n  = 0
        x  = 0.
        xx = 0.
        for y in l:
            n  += 1
            x  += y
            xx += y*y
        x  /= n
        xx /= n
        var = max(0,xx - x*x)
        dev = var**0.5
        return dev

Skewness and kurtosis can also be computed in one pass, though numerical stability problems can occur (even with std.dev) with these kinds of methods.
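The cancellation Kevin warns about is easy to trigger. Below is an editorial sketch (not from the original thread) comparing the textbook one-pass formula against a plain two-pass computation on data with a large mean and a tiny spread; with IEEE doubles the one-pass result collapses to zero:

```python
def one_pass_stddev(data):
    # Textbook one-pass formula: E[x^2] - E[x]^2.  Numerically fragile:
    # it subtracts two nearly equal quantities when the mean is large.
    n = 0
    x = 0.0
    xx = 0.0
    for y in data:
        n += 1
        x += y
        xx += y * y
    x /= n
    xx /= n
    return max(0.0, xx - x * x) ** 0.5

def two_pass_stddev(data):
    # Straightforward two-pass version: mean first, then deviations.
    data = list(data)
    mean = sum(data) / len(data)
    return (sum((y - mean) ** 2 for y in data) / len(data)) ** 0.5

data = [1e9, 1e9 + 0.5, 1e9 + 1.0]   # large mean, tiny spread
print(two_pass_stddev(data))   # ~0.408, the right answer
print(one_pass_stddev(data))   # 0.0 -- every significant digit cancels
```

(Both compute the population deviation, like Kevin's version; divide by n-1 instead of n to get the sample deviation Raymond asked about.)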
-Kevin

From tim.one at comcast.net Tue Jan 13 15:30:13 2004 From: tim.one at comcast.net (Tim Peters) Date: Tue Jan 13 15:30:16 2004 Subject: [Python-Dev] collections module In-Reply-To: <20040113195331.GB6601@unpythonic.net> Message-ID:

[Jeff Epler]
> To repeat findings on a Linux machine (redhat9, x86, 2.4GHz):
>     Old    New
>     0.74   0.22
>     0.74   0.22
>     0.74   0.22
>     0.74   0.22
>     0.72   0.21
>     0.76   0.22
>     0.74   0.22
>     0.77   0.22
>     0.75   0.22
>     0.74   0.22
>
> I also got the same (?) failure as tim described:
> test_list
> make: *** [test] Segmentation fault

Yup, that's the one. This line in listsort():

    self->ob_item = empty_ob_item = PyMem_NEW(PyObject *, 0);

violates the new assumption too; it can be replaced with, e.g.,

    self->ob_item = empty_ob_item = PyMem_NEW(PyObject *,
        roundupsize(0) * sizeof(PyObject*));

although that should really be checked for a NULL return too.

From jepler at unpythonic.net Tue Jan 13 15:53:00 2004 From: jepler at unpythonic.net (Jeff Epler) Date: Tue Jan 13 15:53:11 2004 Subject: [Python-Dev] collections module In-Reply-To: <20040113195331.GB6601@unpythonic.net> References: <200401131714.i0DHEMB06084@c-24-5-183-134.client.comcast.net> <20040113195331.GB6601@unpythonic.net> Message-ID: <20040113205259.GC6601@unpythonic.net> On Tue, Jan 13, 2004 at 01:53:31PM -0600, Jeff Epler wrote:
> I also got the same (?) failure as tim described:
> test_list
> make: *** [test] Segmentation fault

test_sort (test.test_list.ListTest) ...
==32746==
==32746== Invalid write of size 4
==32746==    at 0x806A855: ins1 (listobject.c:175)
==32746==    by 0x806B144: listappend (listobject.c:637)
==32746==    by 0x80A5928: call_function (ceval.c:3440)
==32746==    by 0x80A3F16: eval_frame (ceval.c:2127)
==32746==  Address 0x41302B2C is 0 bytes inside a block of size 1 alloc'd
==32746==    at 0x40160DB8: malloc (in /usr/lib/valgrind/valgrind.so)
==32746==    by 0x806B50E: listsort (listobject.c:1884)
==32746==    by 0x80EEF07: PyCFunction_Call (methodobject.c:118)
==32746==    by 0x805AAD3: PyObject_Call (abstract.c:1759)

It's in the self-modifying list test:

    z = range(12)
    def c(x, y):
        z.append(1) # HERE
        return cmp(x, y)
    z.sort(c)

Jeff

From tim.one at comcast.net Tue Jan 13 15:59:18 2004 From: tim.one at comcast.net (Tim Peters) Date: Tue Jan 13 15:59:24 2004 Subject: [Python-Dev] Accumulation module In-Reply-To: <00cf01c3da0e$b3a2e3e0$6e02a044@oemcomputer> Message-ID: [Raymond Hettinger] > I'm working a module with some accumulation/reduction/statistical > formulas: There are several statistical pkgs available for Python right now, so I hope you're scouring them for ideas and algorithms. > average(iterable): > stddev(iterable, sample=False) sample should probably default to True (since that's what most people really want, even if they don't realize it). > product(iterable) > nlargest(iterable, n=1) > nsmallest(iterable, n=1) > > The questions that have arisen so far are: > > * What to call the module > > * What else should be in it? > > * Is "sample" a good keyword to distinguish from population stddev? I think so, but I'm sure you're looking at what other pkgs do . > * There seems to be a speed/space choice on how to implement > nlargest/nsmallest. The faster way lists out the entire iterable, > heapifies it, and pops off the top n elements. The slower way is less > memory intensive: build only a n-length heap and then just do a > heapreplace when necessary.
Note, heapq is used for both (I use > operator.neg to swap between largest and smallest). So long as n is significantly smaller than the # of elements in the iterable, I expect the latter way is also faster. For example, consider n=1, and let N be the # of incoming elements. Then N-1 comparisons are necessary and sufficient (to find the min, or the max). Using a heap of size 1 achieves that. Heapifying a list of N elements requires more comparisons than that ("it varies", and while heapifying is O(N), it's not exactly N). In effect, all the work a heap of size N does to distinguish among elements larger/smaller than the n smallest/largest so far is wasted, as those can have no effect on the final result. > * Is there a way to compute the standard deviation without multiple > passes over the data (one to compute the mean and a second to sum the > squares of the differences from the mean? The textbook formula does it in one pass, by accumulating running totals of the elements and of their squares. It's numerically atrocious, though, and should never be used (it can even end up delivering a negative result!). Knuth gives a much better recurrence formula for this, doing one-pass computation of variance in a numerically stable way. From doko at cs.tu-berlin.de Tue Jan 13 15:56:41 2004 From: doko at cs.tu-berlin.de (Matthias Klose) Date: Tue Jan 13 16:01:34 2004 Subject: [Python-Dev] Accumulation module In-Reply-To: <00cf01c3da0e$b3a2e3e0$6e02a044@oemcomputer> References: <00cf01c3da0e$b3a2e3e0$6e02a044@oemcomputer> Message-ID: <16388.23433.654722.329442@gargle.gargle.HOWL> Raymond Hettinger writes: > I'm working a module with some accumulation/reduction/statistical > formulas: > > average(iterable): > stddev(iterable, sample=False) > product(iterable) > nlargest(iterable, n=1) > nsmallest(iterable, n=1) > > The questions that have arisen so far are: > > * What to call the module > > * What else should be in it? 
you may want to have a look at http://www.nmr.mgh.harvard.edu/Neural_Systems_Group/gary/python.html From tim.one at comcast.net Tue Jan 13 16:03:29 2004 From: tim.one at comcast.net (Tim Peters) Date: Tue Jan 13 16:03:32 2004 Subject: [Python-Dev] collections module In-Reply-To: <1074019066.40043afacdfc6@mcherm.com> Message-ID: [Michael Chermside] > ... > I'm sure this is just pointing out the obvious, but *IF* the list > header were increased, we could add TWO fields... According to Tim, > this would still "only" bump list objects up to 24 bytes, but ... There's something else to note here, in connection with the realloc-avoiding list patch/hack I posted yesterday: for that logic to work, a non-empty list must always contain enough slots for roundupsize(ob->size) elements. So, for example, a list of length 1 must nevertheless contain room for 8 objects, which on a 32-bit machine is 7*4 = 28 bytes in the list *guts* it doesn't need today (but does need after that patch). So if people like that patch, it would be more *memory*-efficient too to drop the roundupsize() business and just expand the list header to remember how many slots have been allocated. Then no overallocation would be needed. Adding 8 bytes in the list header should be a net win over asking for up to 28 unused bytes in the guts (of small lists; it can be larger than that for large lists). From kbk at shore.net Tue Jan 13 21:02:51 2004 From: kbk at shore.net (Kurt B. Kaiser) Date: Tue Jan 13 21:02:55 2004 Subject: [Python-Dev] collections module References: <200401131714.i0DHEMB06084@c-24-5-183-134.client.comcast.net> Message-ID: <87d69ne184.fsf@hydra.localdomain> Guido van Rossum writes: > Awesome! We need timing results on some other platforms (Linux, > FreeBSD, MacOSX). 
OpenBSD 3.3, i386 2.4Ghz P4 Celeron, 256MB, 128kB cache

    before  after
    ------  -----
    1.2     0.64
    1.15    0.60
    1.16    0.60
    1.16    0.60
    1.16    0.61
    1.15    0.60
    1.13    0.61
    1.15    0.59
    1.14    0.62
    1.15    0.60

before: 15773 pystones, 29.9 mPB
after: 12690 pystones, 31.4 mPB

Switch back to CVS: 19531, 19379, 19531, 19379, 19305, 19084 pystones, 31.4 mPB

Switch back to Tim's patch: 15015, 15480, 14588 pystones

should not need, but: make clean && make (just make on previous, only listobject.c being recompiled)

Still patched: 13158, 14006, 13193; restart python; 15923, 16722, 16023 restart python; 12953, 13698, 12953, 13513

Well, pystones are all over the map on this box. But the improvement is there for punchit and the parrotbench benchmark is solid across the change.

-- KBK

From bob at redivi.com Tue Jan 13 22:55:56 2004 From: bob at redivi.com (Bob Ippolito) Date: Tue Jan 13 22:53:37 2004 Subject: [Python-Dev] collections module In-Reply-To: <200401131714.i0DHEMB06084@c-24-5-183-134.client.comcast.net> References: <200401131714.i0DHEMB06084@c-24-5-183-134.client.comcast.net> Message-ID: <8E5CCA0E-4645-11D8-9430-000A95686CD8@redivi.com> On Jan 13, 2004, at 12:14 PM, Guido van Rossum wrote:

>> [Guido]
>>> Just in case nobody had thought of it before, couldn't the realloc()
>>> call be avoided when roundupsize(oldsize) == roundupsize(newsize)???
>
> [Tim]
>> Followup: it's not quite that easy, because (at least)
>> PyList_New(int size)
>> can create a list whose allocated size hasn't been rounded up.
>>
>> If I "fix" that, then it's possible to get away with just one
>> roundupsize()
>> call on most list.insert() calls (including list.append()), while
>> avoiding
>> the realloc() call too. Patch attached.
>>
>> I timed it like so:
>>
>> def punchit():
>>     from time import clock as now
>>     a = []
>>     push = a.append
>>     start = now()
>>     for i in xrange(1000000):
>>         push(i)
>>     finish = now()
>>     return finish - start
>>
>> for i in range(10):
>>     print punchit()
>>
>> and got these elapsed times (this is Windows, so this is elapsed
>> wall-clock time):
>>
>>     before          after
>>     -------------   --------------
>>     1.05227710823   1.02203188119
>>     1.05532442716   0.569660068053
>>     1.05461539751   0.568627533147
>>     1.0564449622    0.569336562799
>>     1.05964146231   0.573247959235
>>     1.05679528655   0.568503494862
>>     1.05569402772   0.569745553898
>>     1.05383177727   0.569499991619
>>     1.05528000805   0.569163914916
>>     1.05636618113   0.570356526258
>>
>> So pretty consistent here, apart from the glitch on the first "after"
>> run, and about an 80% speedup.
>>
>> Yawn .
>>
>> If someone wants to run with this, be my guest. There may be other
>> holes (c.f. PyList_New above), and it could sure use a comment about
>> why it's correct (if it in fact is <0.9 wink>).
>>
> Awesome! We need timing results on some other platforms (Linux,
> FreeBSD, MacOSX).
>
> Volunteers?

Mac OS X 10.3.2, gcc (GCC) 3.3 20030304 (Apple Computer, Inc. build 1495), dual g5 2ghz w/ 1gb ram, ./configure --disable-framework

    before  after
    0.38    0.29
    0.36    0.26
    0.36    0.26
    0.36    0.27
    0.36    0.26
    0.36    0.26
    0.36    0.26
    0.36    0.26
    0.36    0.27
    0.36    0.26

Mac OS X 10.3.2, gcc (GCC) 3.3 20030304 (Apple Computer, Inc. build 1495), dvi titanium powerbook g4 1ghz w/ 1gb ram, ./configure --disable-framework

    before  after
    0.78    0.57
    0.74    0.53
    0.73    0.53
    0.74    0.52
    0.73    0.52
    0.76    0.53
    0.73    0.53
    0.75    0.53
    0.75    0.53
    0.75    0.52

So it seems that the 2ghz g5 is about twice as fast as the 1ghz g4 for Python (when running Tim's test), and that either way you save about 25% with Tim's patch.

-bob

-------------- next part -------------- A non-text attachment was scrubbed...
Name: smime.p7s Type: application/pkcs7-signature Size: 2357 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20040113/7c6d47d1/smime.bin From python at rcn.com Wed Jan 14 03:24:41 2004 From: python at rcn.com (Raymond Hettinger) Date: Wed Jan 14 03:26:30 2004 Subject: [Python-Dev] Accumulation module In-Reply-To: <20040113202158.GB9457@panix.com> Message-ID: <005801c3da78$01246940$e102a044@oemcomputer> > > * What to call the module [Aahz] > stats There is already a stat module. Any chance of confusion? The other naming issue is that some of the functions have non-statistical uses: product() is general purpose; nlargest() and nsmallest() will accept any datatype (though most of the use cases are with numbers). Are there other general purpose (non-statistical) accumulation/reduction formulas that go here? > > * What else should be in it? [Matthias Klose] > you may want to have a look at > http://www.nmr.mgh.harvard.edu/Neural_Systems_Group/gary/python.html Ages ago, when the idea for this module first arose, a certain bot recommended strongly against including any but the most basic statistical functions (muttering something about the near impossibility of doing it well in either python or portable C and something about not wanting to maintain anything that wasn't dirt simple). His words would have of course fallen on deaf ears, but a certain dictatorial type had just finished teaching advanced programming skills to people who couldn't operate a high school calculator. Sooooo, no Kurtosis for you, no gamma function for me! It's possible that chi-square or regression could slip in, but it would require considerable cheerleading and a rare planetary alignment. > > * What else should be in it? [Jeremy] > median() > And a function like bins() or histogram() that accumulates > the values in buckets of some size. 
That sounds beginner simple and reasonably useful though it would have been nice if all the reduction formulas could work with one-pass and never need to manifest the whole dataset in memory. > > Note, heapq is used for both (I use > > operator.neg to swap between largest and smallest). [Bernhard Herzog] > Does that mean nlargest/nsmallest only work for numbers? I think it > might be useful for e.g. strings too. The plan was to make them work with anything defining __lt__; however, if it is coded in python and uses heapq, I don't see a straight-forward way around using operator.neg without wrapping everything in some sense reverser object. Raymond Hettinger From python at rcn.com Wed Jan 14 07:40:43 2004 From: python at rcn.com (Raymond Hettinger) Date: Wed Jan 14 07:41:35 2004 Subject: [Python-Dev] Accumulation module Message-ID: <007401c3da9b$a05336e0$e102a044@oemcomputer> > > * Is there a way to compute the standard deviation without multiple > > passes over the data (one to compute the mean and a second to sum the > > squares of the differences from the mean? [Tim] > The textbook formula does it in one pass, by accumulating running totals > of > the elements and of their squares. It's numerically atrocious, though, > and > should never be used (it can even end up delivering a negative result!). > Knuth gives a much better recurrence formula for this, doing one-pass > computation of variance in a numerically stable way. I found it. There is a suggestion that the formula would be better off if the recurrence sums are accumulated along with a cumulative error term that tracks the loss of significance when adding a small adjustment to a larger sum. The textbook exercise indicates that that technique is the most useful in the presence of crummy rounding. Do you think there is any value in using the error term tracking? 
def stddev(data):
    # Knuth -- Seminumerical Algorithms 4.2.2 p.216
    it = iter(data)
    m = float(it.next())    # new running mean
    s = 0.0                 # running sum((x-mean)**2)
    k = 1                   # number of items
    dm = ds = 0             # carried forward error term
    for x in it:
        k += 1
        adjm = (x-m)/k - dm
        newm = m + adjm
        dm = (newm - m) - adjm
        adjs = (x-m)*(x-newm) - ds
        news = s + adjs
        ds = (news - s) - adjs
        s = news
        m = newm
    return (s / (k-1)) ** 0.5    # sample standard deviation

Without the error term tracking, the inner loop simplifies to:

    k += 1
    newm = m+(x-m)/k
    s += (x-m)*(x-newm)
    m = newm

Raymond Hettinger

From guido at python.org Wed Jan 14 11:06:27 2004 From: guido at python.org (Guido van Rossum) Date: Wed Jan 14 11:06:38 2004 Subject: [Python-Dev] collections module In-Reply-To: Your message of "Tue, 13 Jan 2004 21:02:51 EST." <87d69ne184.fsf@hydra.localdomain> References: <200401131714.i0DHEMB06084@c-24-5-183-134.client.comcast.net> <87d69ne184.fsf@hydra.localdomain> Message-ID: <200401141606.i0EG6Se10228@c-24-5-183-134.client.comcast.net>

> OpenBSD 3.3, i386 2.4Ghz P4 Celeron, 256MB, 128kB cache
>
> before after
> ------ -----
> 1.2 0.64
> 1.15 0.60
> 1.16 0.60
> 1.16 0.60
> 1.16 0.61
> 1.15 0.60
> 1.13 0.61
> 1.15 0.59
> 1.14 0.62
> 1.15 0.60
>
> before: 15773 pystones, 29.9 mPB
> after: 12690 pystones, 31.4 mPB

What's the mPB number? If it's the time to complete parrotbench, this is actually a slowdown, just as for pystone.

> Switch back to CVS:
>
> 19531, 19379, 19531, 19379, 19305, 19084 pystones, 31.4 mPB
>
> Switch back to Tim's patch:
>
> 15015, 15480, 14588 pystones
>
> should not need, but:
> make clean && make (just make on previous, only listobject.c being recompiled)
>
> Still patched:
> 13158, 14006, 13193; restart python; 15923, 16722, 16023
> restart python; 12953, 13698, 12953, 13513
>
> Well, pystones are all over the map on this box. But the
> improvement is there for punchit and the parrotbench benchmark is
> solid across the change.

Unclear. Please clarify mPB.
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jepler at unpythonic.net Wed Jan 14 11:40:21 2004 From: jepler at unpythonic.net (Jeff Epler) Date: Wed Jan 14 11:49:59 2004 Subject: [Python-Dev] collections module In-Reply-To: <87d69ne184.fsf@hydra.localdomain> References: <200401131714.i0DHEMB06084@c-24-5-183-134.client.comcast.net> <87d69ne184.fsf@hydra.localdomain> Message-ID: <20040114164021.GB10507@unpythonic.net>

I did my own pybenching, and then wanked with a spreadsheet. Python-listopt seems a hair faster (~2%). I think the t-test says there's 90% confidence that the difference is statistically significant. I haven't had a chance to run the pie benchmark. redhat9, x86 2.4GHz, Python CVS.

            Python-cvs  Python-listopt
            29239.8     30303
            29940.1     30864.2
            29585.9     30487.8
            29585.8     29411.8
            29940.1     30487.8
            29411.8     30303
            29761.9     30303
            27027       30303
            30303       30487.8
            29069.8     27777.8
    Minimum 27027       27777.8
    Maximum 30303       30864.2
    Median  29585.85    30303
    Average 29386.52    30072.92
    Std.Dev 904.49      885.61
    T-test  10.4%       =ttest(b2:b11; c2:c11; 2; 3)

From jepler at unpythonic.net Wed Jan 14 12:01:42 2004 From: jepler at unpythonic.net (Jeff Epler) Date: Wed Jan 14 12:01:54 2004 Subject: [Python-Dev] collections module In-Reply-To: <20040114164021.GB10507@unpythonic.net> References: <200401131714.i0DHEMB06084@c-24-5-183-134.client.comcast.net> <87d69ne184.fsf@hydra.localdomain> <20040114164021.GB10507@unpythonic.net> Message-ID: <20040114170142.GD10507@unpythonic.net>

Here are the parrotbench results, using the "user" field from "make time" output. There's a clear speedup of nearly 1 second in the -listopt version. During parts of these tests my system was also running nice'd distcc jobs, but since user time was measured this shouldn't matter much. "real" times were slower for python-listopt (0.56 seconds difference of median values) and more scattered (1.52 std dev vs 0.32 std dev) for this reason.
            Python-cvs  Python-listopt
            18.32       18.39
            18.44       18.39
            18.83       17.53
            18.75       17.38
            18.63       17.37
            18.91       17.36
            18.55       17.63
            18.66       18.06
            18.68       17.38
            18.74       17.92
    Minimum 18.32       17.36
    Maximum 18.91       18.39
    Median  18.67       17.58
    Average 18.65       17.74
    Std.Dev 0.18        0.42
    T-test  0.00%       =ttest(b2:b11; c2:c11; 2; 3)

    Difference of median values -0.91

From james.kew at btinternet.com Wed Jan 14 14:53:16 2004 From: james.kew at btinternet.com (James Kew) Date: Wed Jan 14 14:53:07 2004 Subject: [Python-Dev] Re: Accumulation module References: <00cf01c3da0e$b3a2e3e0$6e02a044@oemcomputer> Message-ID: "Raymond Hettinger" wrote in message news:00cf01c3da0e$b3a2e3e0$6e02a044@oemcomputer... > I'm working a module with some accumulation/reduction/statistical > formulas: > > average(iterable)

"Average" is a notoriously vague word. Prefer "mean" -- which opens the door for "median" and "mode" to join the party. I could have used a stats module a few months ago -- feels like a bit of an omission from the standard library, especially compared against the richness of the random module.

James

From kbk at shore.net Wed Jan 14 15:04:15 2004 From: kbk at shore.net (Kurt B. Kaiser) Date: Wed Jan 14 15:04:20 2004 Subject: [Python-Dev] collections module In-Reply-To: <200401141606.i0EG6Se10228@c-24-5-183-134.client.comcast.net> (Guido van Rossum's message of "Wed, 14 Jan 2004 08:06:27 -0800") References: <200401131714.i0DHEMB06084@c-24-5-183-134.client.comcast.net> <87d69ne184.fsf@hydra.localdomain> <200401141606.i0EG6Se10228@c-24-5-183-134.client.comcast.net> Message-ID: <87u12ycn5s.fsf@hydra.localdomain> Guido van Rossum writes:

>> before: 15773 pystones, 29.9 mPB
>> after: 12690 pystones, 31.4 mPB
>
> What's the mPB number? If its the time to complete parrotbench, this
> is actually a slowdown, just as for pystone.
> >> Switch back to CVS: >> >> 19531, 19379, 19531, 19379, 19305, 19084 pystones, 31.4 mPB >> >> Switch back to Tim's patch: >> >> 15015, 15480, 14588 pystones >> >> should not need, but: >> make clean && make (just make on previous, only listobject.c being recompiled) >> >> Still patched: >> 13158, 14006, 13193; restart python; 15923, 16722, 16023 >> restart python; 12953, 13698, 12953, 13513 >> >> Well, pystones are all over the map on this box. But the >> improvement is there for punchit and the parrotbench benchmark is >> solid across the change. > > Unclear. Please clarify mPB. milliParrotBenches/sec. 1000/(time to complete Parrotbench [sec]) In a few years we'll be spec'ing kPB :-) So the data so far is 29.9 mPB unpatched 31.4 mPB patched 31.4 mPB unpatched This morning: 30.7, 30.4, 30.1, 32.1, 32.2, 31.9 mPB patched (using make time) rm Objects/listobject.c && cvs up Objects/listobject.c make clean && make 30.1, 30.2, 30.3, 32.1, 29.6, 29.9, 28.2 mPB unpatched Average unpatched: 30.2 mPB Average patched: 31.3 mPB So a 3 - 4% improvement, maybe. Needs more data. I reported the equivalent of 31.1 mPB on 2Jan04, which was the best of several trials. The 'best of 3' pystones at that time was 17182. Pystone ------- My impression is that Pystone isn't a good benchmark for this box unless run many times over a long period with python restarts, perhaps a cron at night. It may be the 128kB cache, but the results are very variable. 
This morning, with patched listobject.c, it's 15015, 15060, 14881; restart python; 16891, 17182, 17361 restart python; 14925, 14970, 14925, 14492, 14662, 14836, 15060, 15106, 15060 append sys.path and import a module: 13889, 13927, 14084 do some stuff, then: 14662, 14368, 14577 restart python; 14836, 14881, 14836 restart python; 15337, 15337, 15337, 15773, 15873, 15674, 15479, 15432, 14792

Following the switch back to unpatched mentioned above: 18182, 18050, 18248, 18450, 18248, 18181 restart python; 14881, 14970, 14705, 14836, 14749, 14970 restart python; 16835, 16835, 16778, 17064, 17182, 16892 restart python; append sys.path, import module; 20000, 20408, 20080, 20808, 19763, 20000 do some stuff; 19305, 19762, 19607, 19084, 19455, 19455 restart python; 17605, 17731, 17482, 17544, 17544, 17483

On these two sets: Average unpatched: 17865 pystones Average patched: 15136 pystones

A 15% reduction with the patch. This *may* be significant, but requires a lot more data here and investigation by others. On this box the pystone rating jumps between python restarts. I haven't calculated any more advanced statistics; IMHO with small data sets, especially jumpy, ragged ones like these, you can often get a better feel by just eyeballing it. The downside being, you tend to see what you want to see....

-- KBK

From martin at v.loewis.de Wed Jan 14 15:08:58 2004 From: martin at v.loewis.de ("Martin v. Löwis") Date: Wed Jan 14 15:09:33 2004 Subject: [Python-Dev] Allowing non-ASCII identifiers Message-ID: <4005A1DA.207@v.loewis.de>

I'd like to work on adding support for non-ASCII characters in identifiers, using the following principles:

1. At run-time, identifiers are represented as Unicode objects unless they are pure ASCII. IOW, they are converted from the source encoding to Unicode objects in the process of parsing.

2. As a consequence of 1), all places where identifiers appear need to support Unicode objects (e.g. __dict__, __getattr__, etc)

3.
Legal non-ASCII identifiers are what legal non-ASCII identifiers are in Java, except that Python may use a different version of the Unicode character database. Python would share the property that future versions allow more characters in identifiers than older versions.

If you are too lazy to look up the Java definition, here is a rough overview:

An identifier is "JavaLetter JavaLetterOrDigit*"

JavaLetter is a character of the classes Lu, Ll, Lt, Lm, or Lo, or a currency symbol (for Python: excluding $), or a connecting punctuation character (which is unfortunately underspecified - will research the implementation).

JavaLetterOrDigit is a JavaLetter, or a digit, a numeric letter, a combining mark, a non-spacing mark, or an ignorable control character.

Does this need a PEP?

Regards, Martin

From guido at python.org Wed Jan 14 15:16:56 2004 From: guido at python.org (Guido van Rossum) Date: Wed Jan 14 15:17:04 2004 Subject: [Python-Dev] Allowing non-ASCII identifiers In-Reply-To: Your message of "Wed, 14 Jan 2004 21:08:58 +0100." <4005A1DA.207@v.loewis.de> References: <4005A1DA.207@v.loewis.de> Message-ID: <200401142016.i0EKGuV10980@c-24-5-183-134.client.comcast.net>

> I'd like to work on adding support for non-ASCII characters
> in identifiers, using the following principles:
>
> 1. At run-time, identifiers are represented as Unicode
> objects unless they are pure ASCII. IOW, they are
> converted from the source encoding to Unicode objects
> in the process of parsing.
>
> 2. As a consequence of 1), all places there identifiers
> appear need to support Unicode objects (e.g. __dict__,
> __getattr__, etc)
>
> 3. Legal non-ASCII identifiers are what legal non-ASCII
> identifiers are in Java, except that Python may use
> a different version of the Unicode character database.
> Python would share the property that future versions
> allow more characters in identifiers than older versions.
> > If you are too lazy too look up the Java definition, > here is a rough overview: > An identifier is "JavaLetter JavaLetterOrDigit*" > > JavaLetter is a character of the classes Lu, Ll, > Lt, Lm, or Lo, or a currency symbol (for Python: > excluding $), or a connecting punctuation character > (which is unfortunately underspecified - will > research the implementation). > > JavaLetterOrDigit is a JavaLetter, or a digit, > a numeric letter, a combining mark, a non-spacing > mark, or an ignorable control character. > > Does this need a PEP? Sure does. Since this could create a serious burden for code portability, I'd like to see a serious section on motivation and discussion on how to keep Unicode out of the standard library and out of most 3rd party distributions. Without that I'm strongly -1. --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz at pythoncraft.com Wed Jan 14 16:12:18 2004 From: aahz at pythoncraft.com (Aahz) Date: Wed Jan 14 16:12:22 2004 Subject: [Python-Dev] Allowing non-ASCII identifiers In-Reply-To: <4005A1DA.207@v.loewis.de> References: <4005A1DA.207@v.loewis.de> Message-ID: <20040114211218.GC27649@panix.com> On Wed, Jan 14, 2004, "Martin v. Löwis" wrote: > > If you are too lazy too look up the Java definition, > here is a rough overview: > An identifier is "JavaLetter JavaLetterOrDigit*" Is that an actual space between the characters? > Does this need a PEP? Yes. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ A: No. Q: Is top-posting okay?
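Martin's "JavaLetter JavaLetterOrDigit*" rule can be sketched against the Unicode character database with the standard unicodedata module. The function below is only a rough illustration, not the actual proposal: it checks the letter categories Martin lists (Lu, Ll, Lt, Lm, Lo) for a leading character and additionally allows digits, numeric letters, marks, and connector punctuation afterwards, but deliberately omits the currency-symbol and leading connecting-punctuation cases and the ignorable control characters he mentions:

```python
import unicodedata

# Leading characters: the letter categories Martin lists. (His JavaLetter
# also admits currency symbols and connecting punctuation; omitted here.)
START = ("Lu", "Ll", "Lt", "Lm", "Lo")
# Continuation characters additionally allow digits, numeric letters,
# combining/non-spacing marks, and connector punctuation such as "_".
CONTINUE = START + ("Nd", "Nl", "Mn", "Mc", "Pc")

def is_identifier(name):
    """Roughly: does name match 'JavaLetter JavaLetterOrDigit*'?"""
    if not name:
        return False
    if unicodedata.category(name[0]) not in START:
        return False
    return all(unicodedata.category(ch) in CONTINUE for ch in name[1:])

print(is_identifier("gr\u00f6\u00dfe"))  # True: non-ASCII letters are fine
print(is_identifier("x1"))               # True: digits may continue a name
print(is_identifier("1x"))               # False: a digit may not start one
```

Note that what this accepts depends on the Unicode database version the interpreter ships with, which is exactly the versioning caveat Martin raises.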
From aahz at pythoncraft.com Wed Jan 14 16:06:50 2004 From: aahz at pythoncraft.com (Aahz) Date: Wed Jan 14 16:26:48 2004 Subject: [Python-Dev] Accumulation module In-Reply-To: <005801c3da78$01246940$e102a044@oemcomputer> References: <20040113202158.GB9457@panix.com> <005801c3da78$01246940$e102a044@oemcomputer> Message-ID: <20040114210650.GA27649@panix.com> On Wed, Jan 14, 2004, Raymond Hettinger wrote: > >>> * What to call the module > > [Aahz] >> stats > > There is already a stat module. Any chance of confusion? Make it statistics, then. > The other naming issue is that some of the functions have > non-statistical uses: product() is general purpose; nlargest() and > nsmallest() will accept any datatype (though most of the use cases are > with numbers). Are there other general purpose (non-statistical) > accumulation/reduction formulas that go here? Doesn't matter, IMO. The primary use-case for this module will be statistics; people looking for the more unusual functionality will still be likely to poke in this module for related uses. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ A: No. Q: Is top-posting okay? From jepler at unpythonic.net Wed Jan 14 16:42:06 2004 From: jepler at unpythonic.net (Jeff Epler) Date: Wed Jan 14 16:42:14 2004 Subject: [Python-Dev] collections module In-Reply-To: <20040114170142.GD10507@unpythonic.net> References: <200401131714.i0DHEMB06084@c-24-5-183-134.client.comcast.net> <87d69ne184.fsf@hydra.localdomain> <20040114164021.GB10507@unpythonic.net> <20040114170142.GD10507@unpythonic.net> Message-ID: <20040114214206.GA17742@unpythonic.net> On Wed, Jan 14, 2004 at 11:01:42AM -0600, Jeff Epler wrote: > Here are the parrotbench results, using the "user" field from "make [snip] I tried to reproduce these results on a Fedora Core 1 machine (x86, 1GHz) but found that the unpatched version performed better by about .4%. I realized that the binary I used had threading disabled. 
I recompiled and measured the patched version as 1% slower. Jeff

"user" time on parrotbench, FC1, no threading:

                 CVS     List Optimization
                 28.73   28.9
                 28.78   28.84
                 28.65   28.78
                 28.77   28.83
                 28.81   28.87
                 28.84   28.9
                 28.65   28.83
                 28.75   28.87
                 28.67   28.82
                 28.79   28.95
    Minimum      28.65   28.78
    Maximum      28.84   28.95
    Median       28.76   28.86
    Mean         28.74   28.86
    St.Dev        0.07    0.05
    T-test        0.05%
    Difference    0.11  (0.40% of means)

"user" time on parrotbench, FC1, threading:

                 CVS     List Optimization
                 29.08   29.32
                 29.03   29.24
                 29.52   29.37
                 29.2    29.31
                 29.5    29.82
                 29.12   29.81
                 29.12   29.69
                 29.1    29.3
                 28.99   29.67
                 29.11   29.12
    Minimum      28.99   29.12
    Maximum      29.52   29.82
    Median       29.12   29.35
    Mean         29.18   29.47
    St.Dev        0.18    0.26
    T-test        1.05%
    Difference    0.29  (0.99% of means)

From skip at pobox.com Wed Jan 14 17:25:01 2004 From: skip at pobox.com (Skip Montanaro) Date: Wed Jan 14 17:29:17 2004 Subject: [Python-Dev] Accumulation module In-Reply-To: <20040114210650.GA27649@panix.com> References: <20040113202158.GB9457@panix.com> <005801c3da78$01246940$e102a044@oemcomputer> <20040114210650.GA27649@panix.com> Message-ID: <16389.49597.207503.489274@montanaro.dyndns.org> >>>> * What to call the module >> >> [Aahz] >>> stats >> >> There is already a stat module. Any chance of confusion? aahz> Make it statistics, then. Statistics is a rather big field. Maybe this should be a package, regardless what the final name is so that it would more easily allow future extensions. Skip From martin at v.loewis.de Wed Jan 14 17:29:26 2004 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed Jan 14 17:29:43 2004 Subject: [Python-Dev] Allowing non-ASCII identifiers In-Reply-To: <20040114211218.GC27649@panix.com> References: <4005A1DA.207@v.loewis.de> <20040114211218.GC27649@panix.com> Message-ID: <4005C2C6.1030904@v.loewis.de> Aahz wrote: > On Wed, Jan 14, 2004, "Martin v.
Löwis" wrote: > >> If you are too lazy too look up the Java definition, >> here is a rough overview: >> An identifier is "JavaLetter JavaLetterOrDigit*" > > > Is that an actual space between the characters? No. It's more like the left-hand side of an EBNF grammar rule. Martin From DavidA at ActiveState.com Thu Jan 15 00:47:15 2004 From: DavidA at ActiveState.com (David Ascher) Date: Thu Jan 15 00:52:26 2004 Subject: [Python-Dev] Accumulation module In-Reply-To: <20040114210650.GA27649@panix.com> References: <20040113202158.GB9457@panix.com> <005801c3da78$01246940$e102a044@oemcomputer> <20040114210650.GA27649@panix.com> Message-ID: <40062963.4070808@ActiveState.com> Aahz wrote: >On Wed, Jan 14, 2004, Raymond Hettinger wrote: > >>>>* What to call the module >> >>[Aahz] >>>stats >> >>There is already a stat module. Any chance of confusion? > >Make it statistics, then. There is considerable prior art here worth considering: see http://www.nmr.mgh.harvard.edu/Neural_Systems_Group/gary/python.html From guido at python.org Thu Jan 15 10:07:01 2004 From: guido at python.org (Guido van Rossum) Date: Thu Jan 15 10:07:12 2004 Subject: [Python-Dev] PyCon Reminders (proposal submission ends tonight!) Message-ID: <200401151507.i0FF72J14249@c-24-5-183-134.client.comcast.net>

Proposal Submission Deadline: Tonight!
--------------------------------------
Tonight at midnight (in *some* timezone :-) is the last time to submit PyCon proposals. http://pycon.org/dc2004/cfp/

Early Bird Registration
-----------------------
Deadline for Early Bird Registration: January 31.
*** Speakers, don't forget to register!!!!!! *** http://pycon.org/dc2004/register/

Sprint pre-announcement
-----------------------
Come a few days early and participate in a coding sprint. Sprints are free (courtesy of the PSF!), and are held from Saturday March 20 through Tuesday March 23 (i.e. the four days leading up to the conference). http://pycon.org/dc2004 .

Volunteers
----------
We're looking for volunteers to help run PyCon. http://mail.python.org/mailman/listinfo/pycon-organizers

About PyCon
-----------
PyCon DC 2004 will be held March 24-26, 2004 in Washington, D.C. Keynote speaker is Mitch Kapor of the Open Source Applications Foundation (http://www.osafoundation.org/). Don't miss any PyCon announcements! Subscribe to http://mail.python.org/mailman/listinfo/pycon-announce You can discuss PyCon with other interested people by subscribing to http://mail.python.org/mailman/listinfo/pycon-interest The central resource for PyCon DC 2004 is http://www.pycon.org/ Pictures from last year's PyCon: http://www.python.org/cgi-bin/moinmoin/PyConPhotos I'm looking forward to seeing you all in DC in March!!! --Guido van Rossum (home page: http://www.python.org/~guido/) Open Source Awards: http://builder.com.com/5100-6375_14-5136758.html From Jack.Jansen at cwi.nl Thu Jan 15 15:09:12 2004 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Thu Jan 15 15:09:23 2004 Subject: [Python-Dev] No more releases of MacPython-OS9? Message-ID: [I'm cc-ing python-dev to get some feedback on this issue from a wider community too, let me know if the crowd there don't want this]. There is a serious problem with distributing future versions of MacPython-OS9. While the 2.3.3 installer has been sitting on my machine for the last three weeks I cannot distribute it due to licensing issues with Installer VISE from Mindvision.
Mindvision's licensing scheme is a subscription-based model: you pay a yearly fee based on the number of copies you intend to distribute and whether you're a commercial/noncommercial operation. Built installers continue to work when you don't renew your subscription, but you cannot create new installers. For many years they have had a program whereby shareware and freeware developers could apply for a free license, and MacPython-OS9 installers have been created with Installer VISE for many many years. And, I must say, with great satisfaction on my side: even though MacPython installers have been very complex with 68K versus PowerPC, Classic versus Carbon execution model, inclusion of various optional bits and other issues all handled fairly easily. However, Mindvision has discontinued the free licensing program. And while there is some leeway for renewing existing freeware licenses apparently I should have applied for that before last october, when the previous license expired (apparently Mindvision thinks that people create installers every couple of days?). So now I'm looking at three choices: 1. Stop distributing MacPython-OS9. I was planning to continue the 2.3.X series for as long as I had access to compatible hardware, but as a community service (personally I couldn't be bothered with OS9 anymore:-). 2. Buy an Installer VISE license. If I read correctly this would cost $500 per year, probably for the next 2 years. This will only happen if someone else comes up with the money: I don't have it... 3. Switch to a different installer. Given the complexity of the MacPython distribution (even though it is a lot simpler nowadays, with only Carbon/PPC supported) I don't want to invest time in this. So: someone else will have to step in and create an installer. 
I'd like to hear what people think the best course of action is, especially from people who have a current interest in MacPython-OS9 (and/or are willing to donate time or money), -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From barry at python.org Thu Jan 15 15:18:30 2004 From: barry at python.org (Barry Warsaw) Date: Thu Jan 15 15:18:41 2004 Subject: [Python-Dev] No more releases of MacPython-OS9? In-Reply-To: References: Message-ID: <1074197910.3167.75.camel@anthem> On Thu, 2004-01-15 at 15:09, Jack Jansen wrote: > 2. Buy an Installer VISE license. If I read > correctly this would cost $500 > per year, probably for the next 2 years. This will only happen if > someone else comes up with the money: I don't have it... If continuing the MacOS9 distro of Python 2.3.x is important enough to the Python community, then I think the PSF should pick up the cost of the license. Personally, I don't care about OS9 either myself. :) OTOH, I think it's perfectly reasonable to declare an EOL date for MacPython-OS9 now. -Barry From paul at fxtech.com Thu Jan 15 15:18:51 2004 From: paul at fxtech.com (Paul Miller) Date: Thu Jan 15 15:19:03 2004 Subject: [Python-Dev] Re: [Pythonmac-SIG] No more releases of MacPython-OS9? In-Reply-To: References: Message-ID: <6.0.1.1.2.20040115141555.050ed148@mail.fxtech.com> >I'd like to hear what people think the best course of action is, >especially from people who have a current interest in MacPython-OS9 >(and/or are willing to donate time or money), We currently use MacPython on OS9 and OS X systems in one of our commercial effects packages, and appreciate the effort Jack and many others have contributed to making it happen. With that said, I believe fully that OS9 is a dead horse and we should encourage people to migrate to OS X and Mach-O as fast as possible. IMHO, any effort spent on OS9 is a waste of valuable resources. There are only so many hours in the day. 
I'd rather you have a beer on my behalf than spend any more time on beating the dead horse. From aahz at pythoncraft.com Thu Jan 15 16:27:02 2004 From: aahz at pythoncraft.com (Aahz) Date: Thu Jan 15 16:27:46 2004 Subject: [Python-Dev] No more releases of MacPython-OS9? In-Reply-To: References: Message-ID: <20040115212700.GA7170@panix.com> On Thu, Jan 15, 2004, Jack Jansen wrote: > > 2. Buy an Installer VISE license. If I read > correctly this would cost $500 > per year, probably for the next 2 years. This will only happen if > someone else comes up with the money: I don't have it... Looks like 2.3 is the latest release currently available. There've been a lot of bugfixes since then. Do you have any download statistics for the OS9 version? (I.e. rough idea of how many people care.) -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ A: No. Q: Is top-posting okay? From barry at barrys-emacs.org Fri Jan 9 14:42:38 2004 From: barry at barrys-emacs.org (Barry Scott) Date: Thu Jan 15 16:45:55 2004 Subject: [Python-Dev] The os module, unix and win32 In-Reply-To: References: <200401082211.i08MBGx21879@c-24-5-183-134.client.comcast.net> Message-ID: <6.0.1.1.2.20040109193232.021a3d28@torment.chelsea.private> For pythonw execution you will need AllocConsole and SetStdHandle to allow CMD.EXE to be run. At 08-01-2004 22:31, Peter Astrand wrote: >I'm not sure since I haven't implemented much yet, but we'll need at >least: > >win32.CreatePipe >win32api.DuplicateHandle >win32api.GetCurrentProcess >win32process.CreateProcess >win32process.STARTUPINFO >win32gui.EnumThreadWindows >win32event.WaitForSingleObject >win32process.TerminateProcess From Jack.Jansen at cwi.nl Fri Jan 16 07:37:10 2004 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Fri Jan 16 07:36:39 2004 Subject: [Python-Dev] No more releases of MacPython-OS9? 
In-Reply-To: <20040115212700.GA7170@panix.com> References: <20040115212700.GA7170@panix.com> Message-ID: On 15-jan-04, at 22:27, Aahz wrote: > On Thu, Jan 15, 2004, Jack Jansen wrote: >> >> 2. Buy an Installer VISE license. If I read >> correctly this would cost $500 >> per year, probably for the next 2 years. This will only happen if >> someone else comes up with the money: I don't have it... > > Looks like 2.3 is the latest release currently available. There've > been > a lot of bugfixes since then. Do you have any download statistics for > the OS9 version? (I.e. rough idea of how many people care.) I have now:-) I'm a bit surprised at the results: MacPython-OS9 is a *lot* more popular than I had thought.... Over the last few months I see about 6000 MacPython downloads per month from my site (I didn't check versiontracker and such, that'll be a couple hundred more, I guess). For december the breakdown is 2000 downloads for 10.2, 2000 downloads for 10.3 and 1700 downloads for OS9. The other 300 are various things like source and such. 10.2 downloads are slowly declining, 10.3 downloads increasing. But the fact that 30% of the MacPython users are still downloading the OS9 binary really surprises me... So it seems MacPython-OS9 does have a place, still... -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From tree at basistech.com Fri Jan 16 08:53:42 2004 From: tree at basistech.com (Tom Emerson) Date: Fri Jan 16 08:53:56 2004 Subject: [Python-Dev] No more releases of MacPython-OS9? In-Reply-To: References: <20040115212700.GA7170@panix.com> Message-ID: <16391.60646.924678.704941@magrathea.basistech.com> Jack Jansen writes: > But the fact that 30% of the MacPython users are still downloading the > OS9 binary really surprises me... Have you considered sending mail to MindVision and asking them to donate a license to the PSF? 
No doubt they are being inundated with such requests, but it may be worth trying. Are you using InstallerVise on OS X as well (I build from the tarball)? If so I'd be willing to help migrate to the Apple Installer. Then you could tell MindVision that you would like to continuing distributing an installer built with InstallerVise only on Mac OS 9 and earlier, and ask if they'll donate the appropriate license. Who knows... perhaps they'll toss in the full deal. -tree -- Tom Emerson Basis Technology Corp. Software Architect http://www.basistech.com "Beware the lollipop of mediocrity: lick it once and you suck forever" From guido at python.org Fri Jan 16 10:20:28 2004 From: guido at python.org (Guido van Rossum) Date: Fri Jan 16 10:20:34 2004 Subject: [Python-Dev] No more releases of MacPython-OS9? In-Reply-To: Your message of "Fri, 16 Jan 2004 13:37:10 +0100." References: <20040115212700.GA7170@panix.com> Message-ID: <200401161520.i0GFKS218572@c-24-5-183-134.client.comcast.net> > So it seems MacPython-OS9 does have a place, still... How about: if you manage to get one of its users to sign up as a PSF sponsor, the PSF will buy you an installer. (A better deal for the sponsor might be to buy you an installer directly, of course...) --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz at pythoncraft.com Fri Jan 16 10:34:55 2004 From: aahz at pythoncraft.com (Aahz) Date: Fri Jan 16 10:34:59 2004 Subject: [Python-Dev] No more releases of MacPython-OS9? In-Reply-To: References: <20040115212700.GA7170@panix.com> Message-ID: <20040116153455.GA1401@panix.com> On Fri, Jan 16, 2004, Jack Jansen wrote: > > I'm a bit surprised at the results: MacPython-OS9 is a *lot* more > popular than I had thought.... Over the last few months I see about > 6000 MacPython downloads per month from my site (I didn't check > versiontracker and such, that'll be a couple hundred more, I guess). 
> For december the breakdown is 2000 downloads for 10.2, 2000 downloads > for 10.3 and 1700 downloads for OS9. The other 300 are various things > like source and such. 10.2 downloads are slowly declining, 10.3 > downloads increasing. Then I think you should apply to the PSF for funds for the license (or ask the PSF to buy the license directly and send the software to you). Getting the license for one year should allow you to finish the 2.3 line. (While I appreciate what Guido is saying about finding a sponsor, I don't think we should hold your license pending that -- spending $500 to support several thousand people in the Python community is exactly what the PSF should be doing IMO. At this point, I'd suggest that further discussion move to the psf-members list.) One thing you *could* do right now: post a message to c.l.py.a with a Subject: saying "Save OS9!", requesting donations to the PSF. ;) -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ A: No. Q: Is top-posting okay? From webmaster at beyond-thoughts.com Fri Jan 16 12:33:05 2004 From: webmaster at beyond-thoughts.com (Christoph Becker-Freyseng) Date: Fri Jan 16 12:34:05 2004 Subject: [Python-Dev] No more releases of MacPython-OS9? In-Reply-To: References: <20040115212700.GA7170@panix.com> Message-ID: <40082051.5010602@beyond-thoughts.com> Hello, I'm very sorry I just posted my reply to python-list instead of python-dev. Christoph Becker-Freyseng ################ Jack Jansen wrote: > > On 15-jan-04, at 22:27, Aahz wrote: > >> On Thu, Jan 15, 2004, Jack Jansen wrote: >> >>> >>> 2. Buy an Installer VISE license. If I read >>> correctly this would cost $500 >>> per year, probably for the next 2 years. This will only happen if >>> someone else comes up with the money: I don't have it... >> >> >> >> Looks like 2.3 is the latest release currently available. There've been >> a lot of bugfixes since then. Do you have any download statistics for >> the OS9 version? (I.e. rough idea of how many people care.)
> > > > I have now:-) > > I'm a bit surprised at the results: MacPython-OS9 is a *lot* more > popular than I had thought.... Over the last few months I see about > 6000 MacPython downloads per month from my site (I didn't check > versiontracker and such, that'll be a couple hundred more, I guess). > For december the breakdown is 2000 downloads for 10.2, 2000 downloads > for 10.3 and 1700 downloads for OS9. The other 300 are various things > like source and such. 10.2 downloads are slowly declining, 10.3 > downloads increasing. > > But the fact that 30% of the MacPython users are still downloading the > OS9 binary really surprises me... > > So it seems MacPython-OS9 does have a place, still... What about asking if someone of these Mac Users has such a license? Christoph Becker-Freyseng From rowen at cesmail.net Fri Jan 16 13:43:47 2004 From: rowen at cesmail.net (Russell E. Owen) Date: Fri Jan 16 13:43:53 2004 Subject: [Python-Dev] Re: No more releases of MacPython-OS9? References: <20040115212700.GA7170@panix.com> Message-ID: In article , Jack Jansen wrote: > I'm a bit surprised at the results: MacPython-OS9 is a *lot* more > popular than I had thought.... Over the last few months I see about > 6000 MacPython downloads per month from my site (I didn't check > versiontracker and such, that'll be a couple hundred more, I guess)... This may not only be people using MacOS 9. Having a carbon version of MacPython running on MacOS X makes a certain amount of sense. The main one being that drag-and-drop applets have an actual console window. I've not had the nerve to ask my users to open the system console to see
Also, I doubt any significant # of users are affected by this, but....Carbon MacPython made it trivial to get to the file creation date, and I ended up using it for processing some files. I'm sure it can be obtained in framework MacPython, but so far haven't figured out how. Due to these issues, I still use carbon MacPython for a fair # of my file conversion scripts, although most of my work is done using framework MacPython. In any case, I'd be willing to contact VISE about a possible exception to their licensing. Or perhaps we could work up a petition if you think it might help. -- Russell From jordan at krushen.com Fri Jan 16 15:50:56 2004 From: jordan at krushen.com (Jordan Krushen) Date: Fri Jan 16 15:51:14 2004 Subject: [Python-Dev] Re: No more releases of MacPython-OS9? In-Reply-To: References: <20040115212700.GA7170@panix.com> Message-ID: On 16-Jan-04, at 10:43 AM, Russell E. Owen wrote: > In any case, I'd be willing to contact VISE about a possible exception > to their licensing. Or perhaps we could work up a petition if you think > it might help. I don't think bullying a software company into giving you free software is the way to go, here. I'd ask all the potential petition-signers to donate $10 each to the PSF, instead. ;) J. From barry at python.org Fri Jan 16 16:03:08 2004 From: barry at python.org (Barry Warsaw) Date: Fri Jan 16 16:06:13 2004 Subject: [Python-Dev] Re: No more releases of MacPython-OS9? In-Reply-To: References: <20040115212700.GA7170@panix.com> Message-ID: <1074286988.3167.257.camel@anthem> On Fri, 2004-01-16 at 15:50, Jordan Krushen wrote: > I don't think bullying a software company into giving you free software > is the way to go, here. I'd ask all the potential petition-signers to > donate $10 each to the PSF, instead. ;) I hope the PSF will just pony up the dough. It's not that much and it clearly helps the Python community. 
I'd also hope that such goodwill would inspire lots of MacPython users to donate what they can to the PSF to show their appreciation! -Barry From jeremy at zope.com Fri Jan 16 16:46:44 2004 From: jeremy at zope.com (Jeremy Hylton) Date: Fri Jan 16 16:53:03 2004 Subject: [Python-Dev] Re: No more releases of MacPython-OS9? In-Reply-To: <1074286988.3167.257.camel@anthem> References: <20040115212700.GA7170@panix.com> <1074286988.3167.257.camel@anthem> Message-ID: <1074289603.24983.38.camel@localhost.localdomain> On Fri, 2004-01-16 at 16:03, Barry Warsaw wrote: > On Fri, 2004-01-16 at 15:50, Jordan Krushen wrote: > > > I don't think bullying a software company into giving you free software > > is the way to go, here. I'd ask all the potential petition-signers to > > donate $10 each to the PSF, instead. ;) > > I hope the PSF will just pony up the dough. It's not that much and it > clearly helps the Python community. I'd also hope that such goodwill > would inspire lots of MacPython users to donate what they can to the PSF > to show their appreciation! Sounds like the right arrangement to me. Jeremy From skip at pobox.com Fri Jan 16 17:08:56 2004 From: skip at pobox.com (Skip Montanaro) Date: Fri Jan 16 17:09:00 2004 Subject: [Python-Dev] PEP 11 mistake? Message-ID: <16392.24824.367536.660812@montanaro.dyndns.org> Looking at the future unsupported platforms I saw this:

    Name:            MacOS 9
    Unsupported in:  Python 2.4
    Code removed in: Python 2.4

Surely the code shouldn't be removed until 2.5, right? Skip From pf_moore at yahoo.co.uk Fri Jan 16 17:18:05 2004 From: pf_moore at yahoo.co.uk (Paul Moore) Date: Fri Jan 16 17:17:50 2004 Subject: [Python-Dev] Python 2.4 and MSVC 7.1 - extension compatibility Message-ID: <3cafedwi.fsf@yahoo.co.uk> One of the issues with moving to VC7 for the Python Windows build is the fact that extensions (or embedding apps) built with MSVC 6 "won't work" with it. But I've never actually managed to provoke a crash (not that I've tried *very* hard, mind).
And thinking about the issue a little, I wonder whether there is a real problem here at all... A quick check of the DLLs in my Windows\System32 directory shows a lot which use MSVCRT. Including OLEAUT32, which is one of the key OLE/COM DLLs. So, presumably, this means that DLLs built with MSVCRT *can* be used by applications built with VC7. And a Python extension is just a DLL, with an unusual extension. So, the question is - is there really any problem with Python 2.4, built with MSVC7, using an extension DLL built with MSVC6? The areas where I can see a potential issue are where CRT objects are shared between the extension, and the Python runtime: 1. Memory allocated by the extension CRT, and freed by Python 2. FILE* objects passed from one CRT to the other 3. Any others? Memory allocation "should" be OK, if extension writers follow the rules in the Python/C API manual, section 9.1 - "To avoid memory corruption, extension writers should never try to operate on Python objects with the functions exported by the C library: malloc(), calloc(), realloc() and free()". And no-one ever writes extensions which violate the rules :-) There are a series of APIs in the "Very High Level Layer" (PyRun_* and friends), and some in the PyMarshal series, which pass FILE* objects about. I wouldn't expect these to be common in extensions. So, a question - would it be worth documenting which APIs are *not* supported in extensions built with an incompatible CRT to the one Python was built with? And if this was done, would it be OK to state "if you don't use these APIs, you don't have to build your extension or embedding application with the same CRT as that of Python"? If it is deemed acceptable to make these sort of guarantees, I'll volunteer to search out the relevant APIs, and draft a section for the Python/C API manual. But I don't want to spend the time and effort, if it's still not possible to say that it's safe to use MSVCRT in those cases. Paul. 
-- This signature intentionally left blank From barry at python.org Fri Jan 16 17:48:17 2004 From: barry at python.org (Barry Warsaw) Date: Fri Jan 16 17:48:24 2004 Subject: [Python-Dev] Bug #878572 Message-ID: <1074293297.24216.2.camel@anthem> The bug and attached patch improve the error messages when pwd.getpwnam() and friends raise a KeyError. I expect the change to be uncontroversial for Python 2.4 and I believe it should be uncontroversial for Python 2.3, since it affects only the error string. I'll commit these changes in the next few days unless objections are raised. -Barry From martin at v.loewis.de Fri Jan 16 18:01:31 2004 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri Jan 16 18:01:31 2004 Subject: [Python-Dev] Python 2.4 and MSVC 7.1 - extension compatibility In-Reply-To: <3cafedwi.fsf@yahoo.co.uk> References: <3cafedwi.fsf@yahoo.co.uk> Message-ID: <40086D4B.4090105@v.loewis.de> Paul Moore wrote: > So, the question is - is there really any problem with Python 2.4, > built with MSVC7, using an extension DLL built with MSVC6? Just try and see for yourself. I saw the omniORB IDL compiler crash because of that. > Memory allocation "should" be OK, if extension writers follow the > rules in the Python/C API manual, section 9.1 - "To avoid memory > corruption, extension writers should never try to operate on Python > objects with the functions exported by the C library: malloc(), > calloc(), realloc() and free()". And no-one ever writes extensions > which violate the rules :-) This is not sufficient. For example, using the s# argument parser means the extension module needs to call PyMem_Free on the resulting memory block, allocated by getargs. According to the documentation, the extension module can safely call PyMem_FREE instead, which would cause heap confusion.
> So, a question - would it be worth documenting which APIs are *not* > supported in extensions built with an incompatible CRT to the one > Python was built with? And if this was done, would it be OK to state > "if you don't use these APIs, you don't have to build your extension > or embedding application with the same CRT as that of Python"? I would not make such a statement. It is *never* OK to mix CRTs, even if one cannot construct a case where this causes problems. That said, people will be using other compilers to create extensions if they find it works for them. They need to perform their own testing, and once they are satisfied, they will ship the code no matter what the documentation says. Regards, Martin From pf_moore at yahoo.co.uk Fri Jan 16 18:43:22 2004 From: pf_moore at yahoo.co.uk (Paul Moore) Date: Fri Jan 16 18:43:02 2004 Subject: [Python-Dev] Re: Python 2.4 and MSVC 7.1 - extension compatibility References: <3cafedwi.fsf@yahoo.co.uk> <40086D4B.4090105@v.loewis.de> Message-ID: "Martin v. Löwis" writes: > Paul Moore wrote: >> So, the question is - is there really any problem with Python 2.4, >> built with MSVC7, using an extension DLL built with MSVC6? > > Just try and see for yourself. I saw the omniORB IDL compiler > crash because of that. Hmm. In itself, that's not conclusive, as it's possible that omniORB uses the "dangerous" APIs I mention below. But your other arguments *are* sufficient, so having a specific example is very helpful. I've seen too many "well, it seems to work OK for me" comments to be entirely comfortable with the fact that I had managed to find no evidence of crashes myself. >> Memory allocation "should" be OK, [...] > This is not sufficient. For example, using the s# argument parser > means the extension module needs to call PyMem_Free on the resulting > memory block, allocated by getargs. According to the documentation, > the extension module can safely call PyMem_FREE instead, which would > cause heap confusion. OK.
That's a documented usage which requires using the same C runtime. It could be argued that the documentation could be changed to "tighten up" the requirements, but that way lies madness :-) >> So, a question - would it be worth documenting which APIs are *not* >> supported in extensions built with an incompatible CRT to the one >> Python was built with? And if this was done, would it be OK to state >> "if you don't use these APIs, you don't have to build your extension >> or embedding application with the same CRT as that of Python"? > > I would not make such a statement. It is *never* OK to mix CRTs, even > if one cannot construct a case where this causes problems. Thanks. I guess my only reservation was that I wasn't 100% clear on what constitutes "mixing", given that, for example, Python, built with MSVC7, can load win32api.pyd (from win32all), built with MSVC7, which uses oleaut32.dll (MS system file) which in turn uses msvcrt.dll. That is expected to work, so we are, in some sense, not "mixing" CRTs in this case. I'm still not entirely clear, but I'm happy to accept your statement on the matter. > That said, people will be using other compilers to create extensions > if they find it works for them. They need to perform their own testing, > and once they are satisfied, they will ship the code no matter what > the documentation says. I know :-( And given the fact that there's nothing in a binary distribution to make it obvious what compiler was used to build the extension, there's the risk that binary distributions will be shipped with subtle problems like this. [Checks] On the other hand, I've just seen that the (MSVC7) Python 2.4 distutils refuses to let me build an extension with MSVC 6. So maybe things aren't as bad as all that - if you get a distutils-built Windows binary, you can be pretty happy that it uses a compatible CRT. Thanks for the comments. Most of this is quite probably FUD, and in reality everything will go smoothly.
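For reference, one rough way to see which MSC compiler a given Python was built with is to look at sys.version, which on Windows builds embeds the compiler banner, e.g. "[MSC v.1310 32 bit (Intel)]" (distutils consults the same string). The helper below is a hypothetical sketch, not anything in the stdlib; on non-Windows builds it simply returns None:

```python
# Sketch: parse the MSC compiler version out of sys.version.
# msc_version() is a made-up helper name; on non-Windows builds
# (no "[MSC v.NNNN" banner) it returns None.
import re
import sys

def msc_version():
    m = re.search(r"\[MSC v\.(\d+)", sys.version)
    if m:
        return int(m.group(1))
    return None
```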
I just shocked myself this evening by going through my Python environment, and realising how many binary extensions I rely on (many of which I don't have the facilities to build for myself, even if I had the time...) Paul. -- This signature intentionally left blank From bob at redivi.com Fri Jan 16 19:27:18 2004 From: bob at redivi.com (Bob Ippolito) Date: Fri Jan 16 19:24:54 2004 Subject: [Python-Dev] Re: No more releases of MacPython-OS9? In-Reply-To: References: <20040115212700.GA7170@panix.com> Message-ID: On Jan 16, 2004, at 1:43 PM, Russell E. Owen wrote: > In article , > Jack Jansen wrote: > >> I'm a bit surprised at the results: MacPython-OS9 is a *lot* more >> popular than I had thought.... Over the last few months I see about >> 6000 MacPython downloads per month from my site (I didn't check >> versiontracker and such, that'll be a couple hundred more, I guess)... > > This may not only be people using MacOS 9. Having a carbon version of > MacPython running on MacOS X makes a certain amount of sense. The main > one being that drag-and-drop applets have an actual console window. > I've not had the nerve to ask my users to open the system console to > see > results. > > Also, I doubt any significant # of users are affected by this, > but....Carbon MacPython made it trivial to get to the file creation > date, and I ended up using it for processing some files. I'm sure it > can > be obtained in framework MacPython, but so far haven't figured out how. > > Due to these issues, I still use carbon MacPython for a fair # of my > file conversion scripts, although most of my work is done using > framework MacPython. You should speak up about things like this -- it's totally possible to develop a Framework Python BuildApplet that redirects stdin/stdout to a window. File a feature request and/or make some noise in pythonmac-sig.
-bob

From skip at pobox.com Fri Jan 16 22:14:55 2004 From: skip at pobox.com (Skip Montanaro) Date: Fri Jan 16 22:15:00 2004 Subject: [Python-Dev] bad exec* prototypes check - obsolete? Message-ID: <16392.43183.967488.201235@montanaro.dyndns.org>

In configure.in there is a check for bad exec* prototypes:

    if test "$have_prototypes" = yes; then
      bad_prototypes=no
      AC_MSG_CHECKING(for bad exec* prototypes)
      AC_TRY_COMPILE([#include <unistd.h>], [char **t;execve("@",t,t);], ,
        AC_DEFINE(BAD_EXEC_PROTOTYPES, 1,
          [Define if your <unistd.h> contains bad prototypes for exec*()
           (as it does on SGI IRIX 4.x)])
        bad_prototypes=yes
      )
      AC_MSG_RESULT($bad_prototypes)
    fi

Since we assume a proper ANSI C compiler at this point, can we get rid of this test?

Skip

From skip at pobox.com Fri Jan 16 23:11:47 2004 From: skip at pobox.com (Skip Montanaro) Date: Fri Jan 16 23:11:49 2004 Subject: [Python-Dev] anydbm test failures using Berkeley DB 4.2.52 on Solaris 8 Message-ID: <16392.46595.508694.305314@montanaro.dyndns.org>

Trying to test some of my obsolete platform excisions on more than just Mac OS X, I came across these test failures this evening on Solaris 8 when Berkeley DB 4.2.52 was installed:

bash-2.03$ ./python Lib/test/test_anydbm.py
test_anydbm_creation (__main__.AnyDBMTestCase) ... ok
test_anydbm_keys (__main__.AnyDBMTestCase) ... ERROR
test_anydbm_modification (__main__.AnyDBMTestCase) ... ERROR
test_anydbm_read (__main__.AnyDBMTestCase) ...
ERROR

======================================================================
ERROR: test_anydbm_keys (__main__.AnyDBMTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "Lib/test/test_anydbm.py", line 58, in test_anydbm_keys
    self.init_db()
  File "Lib/test/test_anydbm.py", line 69, in init_db
    f = anydbm.open(_fname, 'n')
  File "/export/home/skip/src/python/Lib/anydbm.py", line 83, in open
    return mod.open(file, flag, mode)
  File "/export/home/skip/src/python/Lib/dbhash.py", line 16, in open
    return bsddb.hashopen(file, flag, mode)
  File "/export/home/skip/src/python/Lib/bsddb/__init__.py", line 293, in hashopen
    d.open(file, db.DB_HASH, flags, mode)
DBInvalidArgError: (22, 'Invalid argument -- DB_TRUNCATE illegal with locking specified')

======================================================================
ERROR: test_anydbm_modification (__main__.AnyDBMTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "Lib/test/test_anydbm.py", line 45, in test_anydbm_modification
    self.init_db()
  File "Lib/test/test_anydbm.py", line 69, in init_db
    f = anydbm.open(_fname, 'n')
  File "/export/home/skip/src/python/Lib/anydbm.py", line 83, in open
    return mod.open(file, flag, mode)
  File "/export/home/skip/src/python/Lib/dbhash.py", line 16, in open
    return bsddb.hashopen(file, flag, mode)
  File "/export/home/skip/src/python/Lib/bsddb/__init__.py", line 293, in hashopen
    d.open(file, db.DB_HASH, flags, mode)
DBInvalidArgError: (22, 'Invalid argument -- DB_TRUNCATE illegal with locking specified')

======================================================================
ERROR: test_anydbm_read (__main__.AnyDBMTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "Lib/test/test_anydbm.py", line 52, in test_anydbm_read
    self.init_db()
  File "Lib/test/test_anydbm.py", line 69, in init_db
    f = anydbm.open(_fname, 'n')
  File "/export/home/skip/src/python/Lib/anydbm.py", line 83, in open
    return mod.open(file, flag, mode)
  File "/export/home/skip/src/python/Lib/dbhash.py", line 16, in open
    return bsddb.hashopen(file, flag, mode)
  File "/export/home/skip/src/python/Lib/bsddb/__init__.py", line 293, in hashopen
    d.open(file, db.DB_HASH, flags, mode)
DBInvalidArgError: (22, 'Invalid argument -- DB_TRUNCATE illegal with locking specified')

----------------------------------------------------------------------
Ran 4 tests in 0.102s

FAILED (errors=3)
Traceback (most recent call last):
  File "Lib/test/test_anydbm.py", line 95, in ?
    test_main()
  File "Lib/test/test_anydbm.py", line 90, in test_main
    test_support.run_unittest(AnyDBMTestCase)
  File "/export/home/skip/src/python/Lib/test/test_support.py", line 290, in run_unittest
    run_suite(suite, testclass)
  File "/export/home/skip/src/python/Lib/test/test_support.py", line 274, in run_suite
    raise TestFailed(msg)
test.test_support.TestFailed: errors occurred in __main__.AnyDBMTestCase

I don't recall seeing these before, but this is the first time I've built from CVS on Solaris 8 w/ BDB 4.2.52. The same tests work fine with Python 2.3.3+Berkeley DB 4.2.52 (Lib/test/test_anydbm.py is the same in 2.3.3 and CVS). I'm going to make a wild ass guess and say these aren't related to what I'm doing. Greg, have you seen this before?

Skip

From tim.one at comcast.net Fri Jan 16 23:21:10 2004 From: tim.one at comcast.net (Tim Peters) Date: Fri Jan 16 23:21:33 2004 Subject: [Python-Dev] Accumulation module In-Reply-To: <007401c3da9b$a05336e0$e102a044@oemcomputer> Message-ID: [Raymond Hettinger, on using Knuth's recurrence for variance]

> I found it.
>
> There is a suggestion that the formula would be better off
> if the recurrence sums are accumulated along with a cumulative
> error term that tracks the loss of significance when adding
> a small adjustment to a larger sum.
> The textbook exercise indicates that that technique is the most
> useful in the presence of crummy rounding. Do you think there is
> any value in using the error term tracking?

The tension between speed and accuracy in floating-point numerics bites on every operation. Tricks like Kahan summation can be valuable, but they can also so grossly complicate the code as to introduce subtle bugs of their own, and can also lull you into believing they deliver more than they do. For a discussion of numeric *problems* with Kahan's gimmick, and yet another layer of complication to overcome them (well, to a milder level of weakness):

    http://www.owlnet.rice.edu/~caam420/lectures/fpsum.html

The numeric problems for mean and variance are worst when the inputs vary in sign and span a large dynamic range. But that's unusual in real-life data. For example, computing the mean height of American adults involves all-positive numbers that don't even span a factor of 10. As a bad example, consider this vector:

    x = [1e10]*10 + [21] + [-1e10]*10

There are 21 values total, the first and last 10 cancel out, so the true mean is 21/21 = 1. Here are two ways to compute the mean:

    def mean1(x):
        return sum(x) / float(len(x))

    def mean2(x):
        m = 0.0
        for i, elt in enumerate(x):
            m += (elt - m) / (i+1.0)
        return m

Then

    >>> mean1(x)
    1.0
    >>> mean2(x)
    0.99999988079071045
    >>>

mean2 is relatively atrocious, and you'll recognize it now as Knuth's recurrence for the mean. sum(x) is actually exact with 754 double arithmetic (we're not even close to using up 53 bits), and the 9 orders of magnitude spanned by the input elements is much larger than seen in most real-life sources of data. The problem with mean2 here is subtler than just that. So the one thing I absolutely recommend with Knuth's method is keeping a running sum, computing the mean fresh on each iteration, instead of using the mean recurrence. If that's done, Knuth's recurrence for the sum of (x_i-mean)**2 works very well even for the "bad example" above.

> ...
> Without the error term tracking, the inner loop simplifies to:
>
>     k += 1
>     newm = m+(x-m)/k
>     s += (x-m)*(x-newm)
>     m = newm

I hope you worked out the derivation for s! It's astonishing that such a complicated update could collapse to such a simple expression. The suggestion above amounts to replacing

    newm = m+(x-m)/k

by

    sum += x
    newm = sum / k

From sjoerd at acm.org Sat Jan 17 02:47:54 2004 From: sjoerd at acm.org (Sjoerd Mullender) Date: Sat Jan 17 02:48:09 2004 Subject: [Python-Dev] bad exec* prototypes check - obsolete? In-Reply-To: <16392.43183.967488.201235@montanaro.dyndns.org> References: <16392.43183.967488.201235@montanaro.dyndns.org> Message-ID: <4008E8AA.8060707@acm.org>

Skip Montanaro wrote:
> In configure.in there is a check for bad exec* prototypes:
>
>     if test "$have_prototypes" = yes; then
>       bad_prototypes=no
>       AC_MSG_CHECKING(for bad exec* prototypes)
>       AC_TRY_COMPILE([#include <unistd.h>], [char **t;execve("@",t,t);], ,
>         AC_DEFINE(BAD_EXEC_PROTOTYPES, 1,
>           [Define if your <unistd.h> contains bad prototypes for exec*()
>            (as it does on SGI IRIX 4.x)])
>         bad_prototypes=yes
>       )
>       AC_MSG_RESULT($bad_prototypes)
>     fi
>
> Since we assume a proper ANSI C compiler at this point, can we get rid of
> this test?

That code is ancient. The fi on the last line is still from revision 1.1 in 1993! I wrote the original version of this and Guido incorporated it into the Python configure. I think it can go, especially since IRIX 4 support has been deleted.

-- Sjoerd Mullender

From pf_moore at yahoo.co.uk Sat Jan 17 06:42:02 2004 From: pf_moore at yahoo.co.uk (Paul Moore) Date: Sat Jan 17 06:41:46 2004 Subject: [Python-Dev] Re: Switching to VC.NET 2003 References: <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com> <3FF059A1.8090202@v.loewis.de> Message-ID: "Martin v.
Loewis" writes: > Sure: Please try my installer at > > http://www.dcl.hpi.uni-potsdam.de/home/loewis/python2.4.0.msi One item I noticed with this installer (I used the build 12421, which I believe is the latest) is that there is no ability to specify where you want the start menu entries to go. I regularly, with Python 2.3, move them from "All Programs\Python 2.3" to "All Programs\Programming \Python 2.3" I can manually move the start menu entries, but doing it via the installer means (a) that the uninstaller knows where to remove them from, and (b) other extensions such as win32all, can find them to add their own extras. Paul. -- This signature intentionally left blank From barry at python.org Sat Jan 17 11:06:32 2004 From: barry at python.org (Barry Warsaw) Date: Sat Jan 17 11:06:39 2004 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/email/test test_email_codecs.py, 1.4, 1.5 In-Reply-To: References: Message-ID: <1074355591.27252.77.camel@anthem> On Sat, 2004-01-17 at 09:29, perky@users.sourceforge.net wrote: > Update of /cvsroot/python/python/dist/src/Lib/email/test > In directory sc8-pr-cvs1:/tmp/cvs-serv14239/Lib/email/test > > Modified Files: > test_email_codecs.py > Log Message: > Add CJK codecs support as discussed on python-dev. (SF #873597) Should we backport this to 2.3? I still intend to backport 852347 when the branch is unfrozen . -Barry From aahz at pythoncraft.com Sat Jan 17 11:11:28 2004 From: aahz at pythoncraft.com (Aahz) Date: Sat Jan 17 11:11:38 2004 Subject: [Python-Dev] python/dist/src/Lib/email/test test_email_codecs.py, 1.4, 1.5 In-Reply-To: <1074355591.27252.77.camel@anthem> References: <1074355591.27252.77.camel@anthem> Message-ID: <20040117161128.GA8138@panix.com> On Sat, Jan 17, 2004, Barry Warsaw wrote: > On Sat, 2004-01-17 at 09:29, perky@users.sourceforge.net wrote: >> >> Modified Files: >> test_email_codecs.py >> Log Message: >> Add CJK codecs support as discussed on python-dev. 
(SF #873597) > > Should we backport this to 2.3? I still intend to backport 852347 when > the branch is unfrozen . -1 to both Unless, that is, you can say with complete certainty that the codecs will behave the same as in 2.3.3. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ A: No. Q: Is top-posting okay? From amk at amk.ca Sat Jan 17 12:14:01 2004 From: amk at amk.ca (A.M. Kuchling) Date: Sat Jan 17 12:14:16 2004 Subject: [Python-Dev] Python in 2003 summary Message-ID: <20040117171401.GA18383@rogue.amk.ca> [Cc'ed to python-dev, marketing-python; trim followups accordingly] Here's a first draft of a summary of significant Pythonological matters in 2003. * What significant things did I miss? * Are there additional pages I should link to? * Can anyone help fill out the conferences section? For example, are the papers for OSCON, PythonUK, or EuroPython available online anywhere? Does anyone have pages with photos/conference reports/anything? If you don't want to read ReST source, an HTML-formatted version is at http://www.amk.ca/python/writing/2003 . Please don't link to or weblog it yet; when it's done, it'll go on python.org. --amk ================================================== In 2003, there was one new major release of Python and several minor bugfix releases. The Python Software Foundation began to assume a greater role and visibility in the community, organizing the first PyCon conference. A number of noteworthy books were published, and the conference calendar was also full. The Python Language ============================ Python 2.3 was released in July. Compared to last year's Python 2.2 release, 2.3 was fairly conservative, making relatively few changes to the language itself such as some new built-in functions, and a true Boolean type. There were also a number of low-level changes such as new import hooks, minor optimizations, and a specialized object allocator. 
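To make the "relatively few changes" concrete, here is a toy illustration (nothing more) of two of the 2.3 language-level additions mentioned above, the enumerate() built-in (PEP 279) and the true Boolean type (PEP 285):

```python
# enumerate() pairs items with their indices without manual counters.
pairs = list(enumerate(["spam", "eggs"]))   # [(0, 'spam'), (1, 'eggs')]

# True and False are now instances of a real bool type, which still
# subclasses int so that existing truth-value arithmetic keeps working.
flag = isinstance(True, bool) and issubclass(bool, int)
```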
The real action was in the standard library, where new packages were added for date/time handling, sets, heaps, logging, bzip2 data compression, reading and writing tar files, word-wrapping paragraphs, command-line parsing, and importing Python modules from ZIP archives. Some external packages were absorbed into the standard library such as the latest version of IDLE and the PyBSDDB wrapper for BerkeleyDB. Another new feature in 2.3 was support for cataloging Python modules and applications in the Package Index. The list of packages can be browsed and searched at http://www.python.org/pypi . Since the release of 2.3, three bugfix releases have been issued. All of them were ably coordinated by Anthony Baxter. The current version is Python 2.3.3: http://www.python.org/2.3.3/. For the highlights of the new features, see "What's New in Python 2.3": http://www.python.org/doc/2.3.3/whatsnew/. For a full list of changes to 2.3, see the release notes at http://www.python.org/2.3/NEWS.txt. There was also a bugfix release of Python 2.2. Python 2.2.3 was released in May, coordinated by Barry Warsaw. http://www.python.org/2.2.3/. The Python Software Foundation ================================================== The Python Software Foundation, or PSF, is the copyright holder for current versions of Python. This year the PSF became a 501(c)(3) charity for the purposes of US taxation, so donations to the PSF are now tax-deductible in the US. To donate funds, go to http://www.python.org/psf/donations.html ; you can use PayPal or mail a cheque. In December the ownership of the python.org domain name was transferred from CNRI, which has held it since the mid-1990s, to the PSF. The largest PSF activity in 2003 was organizing the first PyCon conference, held in Washington DC in late March. Conferences =================== PyCon '''''' PyCon DC 2003 was held in Washington DC in late March, organized by the PSF. The conference had an experimental, informal flavor. 
While several tracks of refereed presentations were on the program, there was also a lot of free time for informally scheduled presentations and discussions, and a few development sprints took place before the official beginning of the conference. Links to papers and slides from PyCon 2003 are at http://www.python.org/cgi-bin/moinmoin/PyConPapers . Various people posted pictures: http://www.python.org/cgi-bin/moinmoin/PyConPhotos Mike Orr had a writeup of the conference in the Linux Journal: http://www.linuxjournal.com/print.php?sid=6800. A.M Kuchling wrote about his impressions of PyCon at http://www.amk.ca/diary/2003/03/27#2003-03-27-2 and http://www.amk.ca/diary/2003/03/28. Coming Soon: PyCon 2004 """""""""""""""""""""""""""" The production work for PyCon 2004 is well under way. The conference will be from March 24-26 in Washington DC. Mitch Kapor will be a keynote speaker. The deadline for proposals is January 15 2004; early bird registration ends February 1 2004. See the conference web pages at http://www.python.org/pycon/dc2004/ for more information. OSCON '''''''' The Python 11 conference was held as part of O'Reilly's OSCON2003 in Portland, Oregon in July. The schedule for Python 11 is at http://conferences.oreillynet.com/pub/w/23/track_python.html . XXX is there a page with links to papers and slides? The Python 12 conference will be part of OSCON 2004, July 26-30 in Portland. The conference site is http://conferences.oreillynet.com/os2004/ . EuroPython '''''''''''''''''''' EuroPython, the major European Python conference, was held in Charleroi, Belgium, in June. Links to papers and slides from EuroPython 2003 are at http://www.europython.org/sessions/talks/slidespapers (for now, anyway). The conference organizers interviewed various participants, available from http://www.europython.org/other/interviews . PythonUK '''''''''''''' The PythonUK conference was held in Oxford in April, organized with the assistance of the Association of C/C++ Users. 
XXX any links to papers? Workshop on Scientific Computing with Python '''''''''''''''''''''''''''''''''''''''''''''''''' The second SciPy conference was held in 2003, taking place at Caltech in Pasadena, California, in September. The conference site is at http://www.scipy.org/ . You can see the schedule at http://www.scipy.org/site_content/scipy03/scipy03sched.htm . Chuck Esterbrook wrote up his impressions: http://www.sandpyt.org/pipermail/sandpyt/2003-September/000271.html Community =================== At OSCON, a number of awards were given out to notable members of the Python community. * Mark Hammond won the ActiveState Programmer's Choice award for his work on the Windows and COM frameworks for Python. * Martin von Loewis won the Activator's Choice award for his constant and efficient work on maintaining the Python core and handling bugs and patches. * The Frank Willison award went to Fredrik Lundh, another skilled contributor who wrote the regular expression engine for Python 1.6, implemented XML-RPC, and wrote an introduction to the Python standard library. He also started the Daily Python-URL weblog, which posts Python-related news and links. His comments on receiving the award are at http://effbot.org/zone/frank-willison-award.htm. The daily Python-URL is at http://www.pythonware.com/daily/. There was one significant loss to the community; Bryan Richard, editor and publisher of the Py 'zine, had to abandon it due to other personal commitments. Py was taken over by beehive KG, the firm that's been publishing ZopeMag magazine, and will become an electronic-only publication. The Py web site is still at http://www.pyzine.com. New Books =================== A number of significant books about basic Python were released in 2003. O'Reilly released :title-reference:`Python in a Nutshell` by Alex Martelli. At 654 pages long it's a pretty large nutshell, but the book is an excellent one-volume Python reference. 
Soon after its release the book reached Amazon's list of top 100 bestsellers for a while. The second edition of :title-reference:`Learning Python` by Mark Lutz and David Ascher, another noteworthy title, also came out this past year; :title-reference:`Learning Python` is most commonly recommended as the book for beginning Python users, so this update was most welcome. The new edition brings the coverage up to Python 2.3, including features such as iterators and generators. One title went out of print and was resurrected as an e-book. Fredrik Lundh's :title-reference:`Python Standard Library` was published in 2001, but the electronic version has additional updates, including examples for some modules introduced in Python 2.2 and 2.3. See http://effbot.org/zone/librarybook-index.htm for the online edition. Some more specialized titles appeared. Two books on games appeared: Tom Gutschmidt's :title-reference:`Game Programming w/ Lua, Python, Ruby` and Sean Riley's :title-reference:`Game Programming with Python`. John Zelle's :title-reference:`Python Programming: An Introduction to Computer Science` uses Python as the vehicle for an introductory course. David Mertz's :title-reference:`Text Processing in Python` takes its chosen topic and carefully covers every possible aspect of it, mixing sections of reference material with tutorial explanations. A more complete listing of Python books is available in the Python Wiki: http://www.python.org/cgi-bin/moinmoin/PythonBooks From mwh at python.net Sat Jan 17 12:29:37 2004 From: mwh at python.net (Michael Hudson) Date: Sat Jan 17 12:29:41 2004 Subject: [Python-Dev] Python in 2003 summary In-Reply-To: <20040117171401.GA18383@rogue.amk.ca> (A. M. Kuchling's message of "Sat, 17 Jan 2004 12:14:01 -0500") References: <20040117171401.GA18383@rogue.amk.ca> Message-ID: <2m3caescu6.fsf@starship.python.net> "A.M.
Kuchling" writes: > [Cc'ed to python-dev, marketing-python; trim followups accordingly] > > Here's a first draft of a summary of significant Pythonological matters in > 2003. > > * What significant things did I miss? If you're going to pimp PyCon 2004, you could mention that both EuroPython and PythonUK are set to happen again this year :-) We might have a cool keynote speaker too, but it's probably too early to go spamming that around. > * Are there additional pages I should link to? > * Can anyone help fill out the conferences section? > For example, are the papers for OSCON, PythonUK, or EuroPython > available online anywhere? Does anyone have pages with > photos/conference reports/anything? Well, googling for "EuroPython 2003" gets a few reports, including mine: http://starship.python.net/crew/mwh/europython2003.html but perhaps it would be better to link to the talk matrix: http://www.europython.org/sessions/talks/matrix (which has links to the slides) and the photo album: http://www.europython.org/other/photos/ And I notice that http://www.python.org/cgi-bin/moinmoin/EuroPython_2f2003 actually links to most of the above... Cheers, mwh -- Strangely enough I saw just such a beast at the grocery store last night. Starbucks sells Javachip. (It's ice cream, but that shouldn't be an obstacle for the Java marketing people.) -- Jeremy Hylton, 29 Apr 1997 From bob at redivi.com Sat Jan 17 13:01:39 2004 From: bob at redivi.com (Bob Ippolito) Date: Sat Jan 17 12:59:12 2004 Subject: [Python-Dev] Python in 2003 summary In-Reply-To: <20040117171401.GA18383@rogue.amk.ca> References: <20040117171401.GA18383@rogue.amk.ca> Message-ID: <32D85CD4-4917-11D8-84EF-000A95686CD8@redivi.com> On Jan 17, 2004, at 12:14 PM, A.M. Kuchling wrote: > [Cc'ed to python-dev, marketing-python; trim followups accordingly] > > Here's a first draft of a summary of significant Pythonological > matters in > 2003. > > * What significant things did I miss? 
The 2.3 release was pretty huge for the MacPython community (off the top of my head):

- PackageManager
- framework build included in OS X 10.3 (versus a buggy non-linkable 2.2 build in OS X 10.2)
- near-complete set of extensions "out of the box" (versus a big list of missing ones in OS X 10.2)
- some support from Apple in their Developer Tools
- a SWIG wrapper for the CoreGraphics API
- example scripts/documentation for using the CoreGraphics API through Python
- the inclusion of BuildApplet

Important enough to Apple such that they mentioned Python and CoreGraphics in their 10.3 press release. I'm pretty sure the Python 2.3 release schedule was even tweaked such that it could make it into OS X 10.3 on time.

New MacPython community site (wiki and FAQ): http://pythonmac.org/

-bob

From skip at pobox.com Sat Jan 17 13:14:56 2004 From: skip at pobox.com (Skip Montanaro) Date: Sat Jan 17 13:15:00 2004 Subject: [Python-Dev] bad exec* prototypes check - obsolete? In-Reply-To: <4008E8AA.8060707@acm.org> References: <16392.43183.967488.201235@montanaro.dyndns.org> <4008E8AA.8060707@acm.org> Message-ID: <16393.31648.598730.482316@montanaro.dyndns.org>

>> In configure.in there is a check for bad exec* prototypes: ... >> Since we assume a proper ANSI C compiler at this point, can we get >> rid of this test?

Sjoerd> That code is ancient. The fi on the last line is still from revision Sjoerd> 1.1 in 1993! I wrote the original version of this and Guido Sjoerd> incorporated it into the Python configure. I think it can go, Sjoerd> especially since IRIX 4 support has been deleted.

It's gone. Thanks for the feedback. It goes without saying that with these rather substantial subtractions, testing should be carried out on as many platforms as possible. I'm doing Mac OS X and Solaris 8, and can give things a whirl on RH Linux and Cygwin, but there are a number of platforms (DEC Alpha comes to mind) which appear to overlap more with the changes I've made so far, especially in thread_pthread.h.
Skip From garth at garthy.com Sat Jan 17 15:54:29 2004 From: garth at garthy.com (Garth) Date: Sat Jan 17 15:51:24 2004 Subject: [Python-Dev] Compiling python with the free .NET SDK Message-ID: <4009A105.30303@garthy.com> Hi, I've written a script that allows you to compile Python with the free .NET Framework SDK. It reads in the XML project file and uses a rule set to build the commands. At the moment it just executes them, but I'm going to write a Makefile writer (GNU make is all I know though). It's probably possible to write rules to compile with mingw32 as well. What would be a real help is if someone could send me the console output from the Debug and Release builds when you use command-line devstudio to build it, as some of the XML properties make no sense but I could probably work out what they mean. Things like 'RuntimeLibrary="2"' have no meaning unless I can see the output. You need the free .NET Framework SDK and the Platform SDK, both available for free from Microsoft. Just put the two Python files in the PCBuild directory and run vcproj_read.py pythoncore.vcproj "Release|Win32" or vcproj_read.py pythoncore.vcproj "Debug|Win32" I've only tested it with pythoncore.vcproj and python.vcproj, so if you get a KeyError around line 187 you need to add more rules to interpret the settings.
Cheers
Garth
-------------- next part --------------
import sys
from xml.dom.ext.reader import Sax2
import xml.dom.ext
import file_rules

if len(sys.argv) < 3:
    print " "
    sys.exit(1)

xml_file = file( sys.argv[1], 'rU' )
reader = Sax2.Reader()
doc = reader.fromStream( xml_file )
xml_file.close()

vsproj = doc.getElementsByTagName('VisualStudioProject')[0]

def attributes2dict(dom):
    """ """
    d = {}
    for attr in dom.attributes:
        #print attr
        d[attr.name] = attr.nodeValue
    return d

def xmlnode2dict( dom ):
    #print dom
    #print dir(dom)
    #self = { }
    self = attributes2dict( dom )
    if dom.hasChildNodes():
        for child in dom.childNodes:
            try:
                childdict = xmlnode2dict( child )
                try:
                    self[child.nodeName].append( childdict )
                except KeyError:
                    self[child.nodeName] = [ childdict ]
            except TypeError:
                pass
                #print "skipping",child
    return self

project = xmlnode2dict( vsproj )

# Select the configuration
for config in project['Configurations'][0]['Configuration']:
    if config['Name'] == sys.argv[2]:
        currentconfig = config

try:
    print "Config selected", currentconfig['Name']
except NameError:
    print "Couldn't find config",sys.argv[2]
    sys.exit(1)

file_rules.currentconfig = currentconfig

# Write the makefile by sending each command off to a MakefileWriter class.
# For now just execute.
objectlist = []
import os
for f in project['Files'][0]['File']:
    print "Processing ",f['RelativePath']
    cmdline,obj = file_rules.generate_command_line( f )
    print cmdline
    if cmdline:
        os.system( cmdline )
    objectlist.append( obj )

# Now link.....
print "link ",objectlist
print "When I get around to writing it"
-------------- next part --------------
import os.path

# Tool rule setting
# Sometimes file settings override the config rule and sometimes they
# just add to it; maybe this is wrong?
import os  # used below; harmless if already imported earlier in the file

TOOL_RULE_OVERRIDE = 0
TOOL_RULE_BOTH = 1

currentconfig = None

def find_file_config( fileconfig ):
    """Return the file-specific configuration matching the current config."""
    for config in fileconfig['FileConfiguration']:
        if config['Name'] == currentconfig['Name']:
            return config

def find_tool_settings( fileconfig, name, setting, rule ):
    """Look up a tool setting, letting any file-specific value override."""
    settings = []
    try:
        currentfileconfig = find_file_config( fileconfig )
        for tool in currentfileconfig['Tool']:
            if tool['Name'] == name:
                if tool[setting] != '':
                    settings.append( tool[setting] )
                    if rule == TOOL_RULE_OVERRIDE:
                        return tool[setting]
    except KeyError:
        # OK, no file-specific config for this file
        pass
    for tool in currentconfig['Tool']:
        if tool['Name'] == name:
            settings.append( tool[setting] )
            if rule == TOOL_RULE_BOTH:
                return settings
            else:
                return tool[setting]
    raise KeyError, "no setting %r for tool %r: %r" % (setting, name, fileconfig)

# Guessing at this... need to fill in the real values
DebugInformationFormatmap = {'3' : '-Zi'}
Optimizationmap = {'0' : "", '1' : "/O1", '2' : "/O2", '3' : "/Ox"}

def is_true( value ):
    """Interpret a .vcproj boolean attribute, which is a string.

    bool('0') is True in Python, so a plain bool() test is wrong here."""
    return value not in ('', '0', 'FALSE', 'false')

def option_list( fconfig, tool, var, delimiter, prefix ):
    """Split a delimited list setting and prefix each item with a flag."""
    options = find_tool_settings( fconfig, tool, var, TOOL_RULE_BOTH )
    optionlist = []
    for option in options:
        optionlist += option.split( delimiter )
    return prefix + prefix.join( optionlist )

def option_param( fconfig, tool, var, prefix ):
    """Emit a single flag built from a prefix and the setting's value."""
    option = find_tool_settings( fconfig, tool, var, TOOL_RULE_OVERRIDE )
    return prefix + option

def option_bool( fconfig, tool, boolname, prefix ):
    """Emit a flag if a boolean setting is true."""
    # Use the tool argument; this used to hardcode 'VCCLCompilerTool'.
    line = ""
    b = find_tool_settings( fconfig, tool, boolname, TOOL_RULE_OVERRIDE )
    if is_true( b ):
        line = prefix
    return line

def option_bool_param( fconfig, tool, boolname, paramname, prefix ):
    """Emit a flag plus parameter if its guarding boolean setting is true."""
    # Use the tool argument; this used to hardcode 'VCCLCompilerTool'.
    line = ""
    b = find_tool_settings( fconfig, tool, boolname, TOOL_RULE_OVERRIDE )
    if is_true( b ):
        paramvalue = find_tool_settings( fconfig, tool, paramname,
                                         TOOL_RULE_OVERRIDE )
        line = prefix + paramvalue
    return line

def option_map( fconfig, tool, var, optionmap ):
    """Translate a setting's value through a value-to-flag lookup table."""
    option = find_tool_settings( fconfig, tool, var, TOOL_RULE_OVERRIDE )
    return optionmap[option]

toolextmap = {'c'   : 'VCCLCompilerTool',
              'rc'  : 'VCResourceCompilerTool',
              'ico' : None,
              }

# HMM marks the ones I'm not sure about...
OPTION_TYPE_HMM = 0
OPTION_TYPE_BOOL = 1
OPTION_TYPE_BOOL_PARAM = 2
OPTION_TYPE_MAP = 3
OPTION_TYPE_PARAM = 4
OPTION_TYPE_LIST = 5
# IGNORE is there as those settings are probably consumed by some other
# parameter's handling
OPTION_TYPE_IGNORE = 6

toolmap = {
    'VCCLCompilerTool' : {
        'pre'     : 'cl ',
        'post'    : ' -c ',
        'objpath' : 'ObjectFile',
        'objext'  : 'obj',
        # 'Name' must be present (and ignored), or run_option raises KeyError
        'Name': {'type' : OPTION_TYPE_IGNORE },
        'AdditionalOptions' : {'type' : OPTION_TYPE_PARAM, 'flag' : '' },
        'PrecompiledHeaderFile': {'type' : OPTION_TYPE_IGNORE },
        'EnableFunctionLevelLinking' : {'type' : OPTION_TYPE_BOOL, 'flag' : '-Gy' },
        'StringPooling': {'type' : OPTION_TYPE_BOOL, 'flag' : '-Gf' },
        # Not sure about this one
        'InlineFunctionExpansion': {'type' : OPTION_TYPE_BOOL, 'flag' : '-Ob1' },
        'ProgramDataBaseFileName' : {'type' : OPTION_TYPE_IGNORE },
        'RuntimeLibrary' : {'type' : OPTION_TYPE_HMM },
        'AssemblerListingLocation' : {'type' : OPTION_TYPE_HMM },
        'CompileAs' : {'type' : OPTION_TYPE_HMM },
        'AdditionalIncludeDirectories' : {'type' : OPTION_TYPE_LIST,
                                          'delim' : ',', 'flag' : ' -I' },
        'PreprocessorDefinitions' : {'type' : OPTION_TYPE_LIST,
                                     'delim' : ';', 'flag' : ' -D' },
        'BrowseInformation' : {'type' : OPTION_TYPE_BOOL_PARAM,
                               'paramname' : 'ProgramDataBaseFileName',
                               'flag' : '-FR' },
        'DebugInformationFormat' : {'type' : OPTION_TYPE_MAP,
                                    'map' : DebugInformationFormatmap },
        'ObjectFile' : {'type' : OPTION_TYPE_PARAM, 'flag' : '-Fo' },
        'Optimization': {'type' : OPTION_TYPE_MAP, 'map' : Optimizationmap },
        'WarningLevel' : {'type' : OPTION_TYPE_PARAM, 'flag' : '-W' },
        'UsePrecompiledHeader': {'type' : OPTION_TYPE_BOOL_PARAM,
                                 'paramname' : 'PrecompiledHeaderFile',
                                 'flag' : '-YX' },
        'SuppressStartupBanner' : {'type' : OPTION_TYPE_BOOL, 'flag' : '-nologo' },
        },
    'VCResourceCompilerTool' : {
        'pre'     : 'rc ',
        'post'    : '',
        'objpath' : None,
        'objext'  : 'res',
        'Name': {'type' : OPTION_TYPE_IGNORE },
        'PreprocessorDefinitions' : {'type' : OPTION_TYPE_LIST,
                                     'delim' : ';', 'flag' : ' -d' },
        'AdditionalIncludeDirectories' : {'type' : OPTION_TYPE_LIST,
                                          'delim' : ',', 'flag' : ' -i' },
        'Culture': {'type' : OPTION_TYPE_PARAM, 'flag' : '-l' },
        },
    }

def run_option( f, tool, option ):
    """Dispatch one tool setting to the matching option_* handler."""
    line = ""
    oinfo = toolmap[tool][option]
    # print oinfo
    if oinfo['type'] == OPTION_TYPE_HMM:
        print "Option", option, "not implemented yet!"
    elif oinfo['type'] == OPTION_TYPE_BOOL:
        line = option_bool( f, tool, option, oinfo['flag'] )
    elif oinfo['type'] == OPTION_TYPE_BOOL_PARAM:
        line = option_bool_param( f, tool, option, oinfo['paramname'],
                                  oinfo['flag'] )
    elif oinfo['type'] == OPTION_TYPE_MAP:
        # Use this option's own map (was hardcoded to Optimizationmap)
        line = option_map( f, tool, option, oinfo['map'] )
    elif oinfo['type'] == OPTION_TYPE_PARAM:
        line = option_param( f, tool, option, oinfo['flag'] )
    elif oinfo['type'] == OPTION_TYPE_LIST:
        line = option_list( f, tool, option, oinfo['delim'], oinfo['flag'] )
    return line

def generate_command_line( f ):
    """Build the compile command and object-file path for one source file."""
    cmd = ""
    obj = ""
    ext = f['RelativePath'].split(".")[-1]
    toolname = toolextmap[ ext ]
    if toolname:
        lines = []
        for tool in currentconfig['Tool']:
            if tool['Name'] == toolname:
                for setting in tool:
                    lines.append( run_option( f, toolname, setting ) )
        cmd = toolmap[ toolname ]['pre']
        cmd += " ".join( lines )
        cmd += toolmap[ toolname ]['post']
        cmd += " " + f['RelativePath']
        if toolmap[ toolname ]['objpath']:
            objectpath = find_tool_settings( f, toolname,
                                             toolmap[ toolname ]['objpath'],
                                             TOOL_RULE_OVERRIDE )
        else:
            objectpath = os.path.split( f['RelativePath'] )[0]
        # Insert the missing "." so foo.c maps to foo.obj, not fooobj
        objectfile = os.path.splitext( os.path.split( f['RelativePath'] )[1] )[0] \
                     + "." + toolmap[ toolname ]['objext']
        obj = os.path.join( objectpath, objectfile )
    return (cmd, obj)

def generate_command_line_old( f ):
    """Old dispatch-by-extension version; filetype is defined elsewhere."""
    ext = f['RelativePath'].split(".")[-1]
    cmdlinefunc = filetype[ ext ]
    return cmdlinefunc( f )

def generate_link_line( files ):
    """Not written yet."""
    line = ['link']
    raise SystemError, "WriteME!"
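The table-driven translation above (map a file extension to a tool, then push each stored setting through a small per-option table) can be condensed into a self-contained sketch. The table entries and setting values below are illustrative stand-ins, not taken from a real .vcproj:

```python
# Condensed sketch of the per-option dispatch; all names are illustrative.
OPTION_TYPE_PARAM = 'param'  # emit flag + value
OPTION_TYPE_LIST = 'list'    # split value, prefix each item with flag

table = {
    'PreprocessorDefinitions': {'type': OPTION_TYPE_LIST, 'delim': ';', 'flag': '-D'},
    'ObjectFile': {'type': OPTION_TYPE_PARAM, 'flag': '-Fo'},
    'WarningLevel': {'type': OPTION_TYPE_PARAM, 'flag': '-W'},
}

def translate(settings):
    parts = []
    for name in sorted(settings):
        oinfo = table[name]
        if oinfo['type'] == OPTION_TYPE_PARAM:
            parts.append(oinfo['flag'] + settings[name])
        elif oinfo['type'] == OPTION_TYPE_LIST:
            for item in settings[name].split(oinfo['delim']):
                parts.append(oinfo['flag'] + item)
    return ' '.join(parts)

print(translate({'PreprocessorDefinitions': 'WIN32;NDEBUG',
                 'ObjectFile': 'build/',
                 'WarningLevel': '3'}))
# prints: -Fobuild/ -DWIN32 -DNDEBUG -W3
```

Sorting the setting names keeps the generated command line stable across runs, which makes the output easy to diff.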
From martin at v.loewis.de Sat Jan 17 19:19:29 2004 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat Jan 17 19:19:48 2004 Subject: [Python-Dev] Re: Switching to VC.NET 2003 In-Reply-To: References: <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com> <3FF059A1.8090202@v.loewis.de> Message-ID: <4009D111.8070108@v.loewis.de> Paul Moore wrote: > One item I noticed with this installer (I used the build 12421, which > I believe is the latest) is that there is no ability to specify where > you want the start menu entries to go. I regularly, with Python 2.3, > move them from "All Programs\Python 2.3" to "All Programs\Programming > \Python 2.3" I don't know how to do this. Does the current (Wise) installer offer user interface for this? Do you know of other .msi-based software that has that feature (and for which I might have the .msi package somewhere). Regards, Martin From martin at v.loewis.de Sat Jan 17 19:20:50 2004 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat Jan 17 19:21:03 2004 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/email/test test_email_codecs.py, 1.4, 1.5 In-Reply-To: <1074355591.27252.77.camel@anthem> References: <1074355591.27252.77.camel@anthem> Message-ID: <4009D162.3000700@v.loewis.de> Barry Warsaw wrote: > Should we backport this to 2.3? I think the maintainer of 2.3 has voiced a clear "no to new features" policy for 2.3. This would be certainly a new feature. 
Regards, Martin From python at bobs.org Sat Jan 17 20:24:20 2004 From: python at bobs.org (Bob Arnson) Date: Sat Jan 17 20:24:56 2004 Subject: [Python-Dev] Re: Switching to VC.NET 2003 In-Reply-To: <4009D111.8070108@v.loewis.de> References: <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com> <3FF059A1.8090202@v.loewis.de> <4009D111.8070108@v.loewis.de> Message-ID: <1508525992.20040117202420@bobs.org> Saturday, January 17, 2004, 7:19:29 PM, you wrote: > Paul Moore wrote: >> One item I noticed with this installer (I used the build 12421, which >> I believe is the latest) is that there is no ability to specify where >> you want the start menu entries to go. I regularly, with Python 2.3, >> move them from "All Programs\Python 2.3" to "All Programs\Programming >> \Python 2.3" > I don't know how to do this. Does the current (Wise) installer offer > user interface for this? Do you know of other .msi-based software > that has that feature (and for which I might have the .msi package > somewhere). To do this, you'd have to treat MENUDIR the same way you treat TARGETDIR: Provide a default of [ProgramMenuFolder]Python 2.4 and provide a dialog to change it. I'm not aware of any MSI-based installers that offer this feature; the current "Windows User Experience" book says you should have critical shortcuts in the root of Programs and a subfolder only if absolutely necessary, so selecting a shortcut folder is kinda awkward... 
-- sig://boB http://foobob.com From martin at v.loewis.de Sun Jan 18 04:12:16 2004 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun Jan 18 04:12:35 2004 Subject: [Python-Dev] Re: Switching to VC.NET 2003 In-Reply-To: <1508525992.20040117202420@bobs.org> References: <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com> <3FF059A1.8090202@v.loewis.de> <4009D111.8070108@v.loewis.de> <1508525992.20040117202420@bobs.org> Message-ID: <400A4DF0.3070005@v.loewis.de> Bob Arnson wrote: > To do this, you'd have to treat MENUDIR the same way you treat TARGETDIR: > Provide a default of [ProgramMenuFolder]Python 2.4 and provide a > dialog to change it. I'm not aware of any MSI-based installers that offer > this feature; the current "Windows User Experience" book says you > should have critical shortcuts in the root of Programs and a subfolder > only if absolutely necessary, so selecting a shortcut folder > is kinda awkward... But isn't it a problem that [ProgramMenuFolder] depends on the value of ALLUSERS? I don't set ALLUSERS until the feature selection dialog is complete, so the magic computing ProgramMenuFolder certainly isn't invoked at the point I would show that dialog. However, a specific directory must be computed for the dialog to work, as the directory browser would have to show the existing items. It would be best if browsing for the shortcut folder, in the just-for-me case, would perform the same directory merging that the start menu does (i.e. combining the all users start menu with the current user's start menu, then perform modifications only in the current user's directories). I somewhat doubt this is possible at all. Regards, Martin From drewp at bigasterisk.com Sun Jan 18 06:16:58 2004 From: drewp at bigasterisk.com (Drew Perttula) Date: Sun Jan 18 06:17:04 2004 Subject: [Python-Dev] PEP0004: audioop deprecated? In-Reply-To: Your message of "Mon, 08 Dec 2003 16:01:37 +1100." 
	<200312080501.hB851bP7000438@maxim.off.ekorp.com>
Message-ID: <200401181116.i0IBGwQS011514@bang.bigasterisk.com>

> According to PEP-0004 and PEP-0320, the audioop module is deprecated,
> and scheduled for removal in 2.4. I hadn't noticed this before, but
> I'd like to ask that it be kept. I've a number of pieces of code using
> it - both internal (our unified messaging server makes heavy use of
> it) and external (the shtoom python SIP phone uses it).
>

I'll second this request for keeping audioop. I use it in my public
project cuisine.bigasterisk.com for assembling stereo audio out of two
audio channels and for mixing down multiple audio tracks. I also intend
to try out audioop's search routines as a way of automatically syncing
video clips from two cameras that recorded approximately the same audio.

From anthony at interlink.com.au  Sun Jan 18 07:26:33 2004
From: anthony at interlink.com.au (Anthony Baxter)
Date: Sun Jan 18 07:27:17 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/email/test test_email_codecs.py, 1.4, 1.5
In-Reply-To: <4009D162.3000700@v.loewis.de>
Message-ID: <20040118122633.F068625AF47@bonanza.off.ekorp.com>

>>> =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= wrote
> I think the maintainer of 2.3 has voiced a clear "no to new features"
> policy for 2.3. This would be certainly a new feature.

If it can be shown that this is a pure win (that is, it improves code
for 2.3.4, but is both backwards and forwards compatible), then I don't
mind.  I _do_ mind if we end up with the horror of 2.2.2's not-really-
booleans, which has resulted in FAR FAR too much code of the form:

try:
    True, False
except:
    True, False = 1, 0

I do not wish to see something like this again.

Anthony

-- 
Anthony Baxter
It's never too late to have a happy childhood.
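The try/except shim above is the kind of thing explicit probing avoids. As a hedged sketch (the `has_codec` helper is mine, not from any proposed patch), availability of an optional codec can be tested once instead of guarded at every use; the Japanese encoding name is one that both JapaneseCodecs and CJKCodecs provide:

```python
# Probe codec availability instead of the try/except import dance.
# has_codec() is an illustrative helper, not part of any patch.
import codecs

def has_codec(name):
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True

if has_codec("iso-2022-jp"):
    text = u"\u3053\u3093\u306b\u3061\u306f"  # a Japanese sample string
    # round-trip check: encode then decode gives the original back
    assert text.encode("iso-2022-jp").decode("iso-2022-jp") == text
```

The probe degrades gracefully: code that only needs the codec when it is present can branch on the result instead of raising at import time.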
From pf_moore at yahoo.co.uk Sun Jan 18 07:32:55 2004 From: pf_moore at yahoo.co.uk (Paul Moore) Date: Sun Jan 18 07:32:59 2004 Subject: [Python-Dev] Re: Switching to VC.NET 2003 References: <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com> <3FF059A1.8090202@v.loewis.de> <4009D111.8070108@v.loewis.de> Message-ID: <7jzpbfns.fsf@yahoo.co.uk> "Martin v. L?wis" writes: > Paul Moore wrote: >> One item I noticed with this installer (I used the build 12421, which >> I believe is the latest) is that there is no ability to specify where >> you want the start menu entries to go. I regularly, with Python 2.3, >> move them from "All Programs\Python 2.3" to "All Programs\Programming >> \Python 2.3" > > I don't know how to do this. Unfortunately, I have no experience with MSI installers, so I can't help. > Does the current (Wise) installer offer user interface for this? Yes, that's how I was aware of it. It gives a (fairly standard - but I don't know if MSI based installers have it) dialog, listing the subfolders of the start menu (Accessories, etc) and a default of "Python 2.3". You can override this, and if you include backslashes (as in Programming\Python 2.3) then that translates to a subfolder. There's no GUI way of specifying a subfolder, though. > Do you know of other .msi-based software that has that feature (and > for which I might have the .msi package somewhere). No, I'm afraid I don't. I tend not to be particularly aware of what installer technology particular software uses. Paul. 
-- 
This signature intentionally left blank

From skip at manatee.mojam.com  Sun Jan 18 08:01:34 2004
From: skip at manatee.mojam.com (Skip Montanaro)
Date: Sun Jan 18 08:07:00 2004
Subject: [Python-Dev] Weekly Python Bug/Patch Summary
Message-ID: <200401181301.i0ID1Yxa013161@manatee.mojam.com>

Bug/Patch Summary
-----------------

659 open / 4522 total bugs (+127)
237 open / 2535 total patches (+44)

New Bugs
--------

threading module can deadlock after fork (2004-01-11)
        http://python.org/sf/874900
file.read returns reaches EOF when it shouldn't (2004-01-11)
        http://python.org/sf/875157
global stmt causes breakpoints to be ignored (2004-01-12)
        http://python.org/sf/875404
Make popen raise an exception if cmd doesn't exist (2004-01-12)
        http://python.org/sf/875471
email.Utils.formataddr doesn't handle whitespace too well (2004-01-12)
        http://python.org/sf/875646
add support for installations compiled for debugging (2004-01-12)
        http://python.org/sf/875654
Ctrl-C doesn't work with sleepy main thread (2004-01-12)
        http://python.org/sf/875692
Unbounded recursion in modulefinder (2004-01-13)
        http://python.org/sf/876278
logging handlers raise exception on level (2004-01-13)
        http://python.org/sf/876421
potential leak in ensure_fromlist (import.c) (2004-01-13)
        http://python.org/sf/876533
Random stack corruption from socketmodule.c (2004-01-13)
        http://python.org/sf/876637
configure detects incorrect compiler optimization (2004-01-14)
        http://python.org/sf/877121
configure links unnecessary library libdl (2004-01-14)
        http://python.org/sf/877124
distutils - compiling C++ ext in cyg or mingw w/ std Python (2004-01-14)
        http://python.org/sf/877165
freeze: problems excluding site (2004-01-15)
        http://python.org/sf/877904
Zipfile archive name can't be unicode (2004-01-16)
        http://python.org/sf/878120
optparse indents without respect to $COLUMNS (2004-01-16)
        http://python.org/sf/878453
optparse help text is not easily extensible (2004-01-16)
        http://python.org/sf/878458
Would like a console window for applets (2004-01-16)
        http://python.org/sf/878560
pwd and grp modules error msg improvements (2004-01-16)
        http://python.org/sf/878572
I would like an easy way to get to file creation date (2004-01-16)
        http://python.org/sf/878581
suspect documentation of operator.isNumberType (2004-01-17)
        http://python.org/sf/878908

New Patches
-----------

Shadow Password Support Module (2002-07-09)
        http://python.org/sf/579435
>100k alloc wasted on startup (2004-01-12)
        http://python.org/sf/875689
input() vs __future__ (2004-01-13)
        http://python.org/sf/876178
reorganize, extend function call optimizations (2004-01-13)
        http://python.org/sf/876193
easiest 1% improvement in pystone ever (2004-01-13)
        http://python.org/sf/876198
scary frame speed hacks (2004-01-13)
        http://python.org/sf/876206

Closed Bugs
-----------

inconsistent popen[2-4]() docs (2003-11-16)
        http://python.org/sf/843293
Multiple problems with GC in 2.3.3 (2004-01-01)
        http://python.org/sf/868896
Py_SequenceFast can mask errors (2004-01-06)
        http://python.org/sf/871704
Wrong URL for external documentation link (Lib ref 11.9) (2004-01-08)
        http://python.org/sf/873205

Closed Patches
--------------

add link to DDJ's article (2004-01-05)
        http://python.org/sf/871402

From python at rcn.com  Sun Jan 18 08:33:48 2004
From: python at rcn.com (Raymond Hettinger)
Date: Sun Jan 18 08:34:50 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/email/testtest_email_codecs.py, 1.4, 1.5
In-Reply-To: <20040118122633.F068625AF47@bonanza.off.ekorp.com>
Message-ID: <000101c3ddc7$b4335800$311dc797@oemcomputer>

[Martin v. L?wis]
> > I think the maintainer of 2.3 has voiced a clear "no to new features"
> > policy for 2.3. This would be certainly a new feature.

[Anthony Baxter]
> If it can be shown that this is a pure win (that is, it improves code
> for 2.3.4, but is both backwards and forwards compatible), then I don't
> mind.
> I _do_ mind if we end up with the horror of 2.2.2's not-really-
> booleans, which has resulted in FAR FAR too much code of the form:
>
> try:
>     True, False
> except:
>     True, False = 1, 0
>
> I do not wish to see something like this again.

+1 on backporting it.

The CJK codecs are essentially invisible to most users. However, their
availability will increase Python's acceptance in places where it is not
currently being used. Increasing the user base should not have to wait
for Py2.4.

Raymond Hettinger

From amk at amk.ca  Sun Jan 18 09:28:17 2004
From: amk at amk.ca (A.M. Kuchling)
Date: Sun Jan 18 09:28:33 2004
Subject: [Python-Dev] Python in 2003 summary
In-Reply-To: <2m3caescu6.fsf@starship.python.net>
References: <20040117171401.GA18383@rogue.amk.ca> <2m3caescu6.fsf@starship.python.net>
Message-ID: <20040118142817.GC19930@rogue.amk.ca>

On Sat, Jan 17, 2004 at 05:29:37PM +0000, Michael Hudson wrote:
> If you're going to pimp PyCon 2004, you could mention that both
> EuroPython and PythonUK are set to happen again this year :-)

I couldn't figure out if PythonUK was going to be held again;
www.python-uk.org just shows a directory listing. Is it going to be
held with the ACCU again, meaning that it'll be April 14-17 2004 in
Oxford?

> Well, googling for "EuroPython 2003" gets a few reports, including
> mine:

Thanks for the pointers!

--amk

From martin at v.loewis.de  Sun Jan 18 09:56:51 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun Jan 18 09:57:45 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/email/test test_email_codecs.py, 1.4, 1.5
In-Reply-To: <20040118122633.F068625AF47@bonanza.off.ekorp.com>
References: <20040118122633.F068625AF47@bonanza.off.ekorp.com>
Message-ID: <400A9EB3.6030602@v.loewis.de>

Anthony Baxter wrote:
> If it can be shown that this is a pure win (that is, it improves code
> for 2.3.4, but is both backwards and forwards compatible), then I don't
> mind.  I _do_ mind if we end up with the horror of 2.2.2's not-really-
> booleans, which has resulted in FAR FAR too much code of the form:
>
> try:
>     True, False
> except:
>     True, False = 1, 0

It depends on the usage. If applications just do

    foo.encode("some-cjk-encoding")

then all will work fine. However, it may be that applications do

    try:
        codecs.lookup("some-cjk-encoding")
    except LookupError:
        try:
            import japanese
            japanese.register_aliases()
        except ImportError:
            pass

Regards,
Martin

From aahz at pythoncraft.com  Sun Jan 18 10:15:09 2004
From: aahz at pythoncraft.com (Aahz)
Date: Sun Jan 18 10:15:15 2004
Subject: [Python-Dev] python/dist/src/Lib/email/testtest_email_codecs.py, 1.4, 1.5
In-Reply-To: <000101c3ddc7$b4335800$311dc797@oemcomputer>
References: <20040118122633.F068625AF47@bonanza.off.ekorp.com> <000101c3ddc7$b4335800$311dc797@oemcomputer>
Message-ID: <20040118151509.GA19225@panix.com>

On Sun, Jan 18, 2004, Raymond Hettinger wrote:
>
> The CJK codecs are essentially invisible to most users. However, their
> availability will increase Python's acceptance in places where it is not
> currently being used. Increasing the user base should not have to wait
> for Py2.4.

That's not the question. IIUC, there are already some kind of CJK codecs
available; this is a different set of codecs. Unless these new codecs
produce the same results, we can't afford to switch.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

A: No.
Q: Is top-posting okay?

From perky at i18n.org  Sun Jan 18 10:26:16 2004
From: perky at i18n.org (Hye-Shik Chang)
Date: Sun Jan 18 10:25:49 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/email/test test_email_codecs.py, 1.4, 1.5
In-Reply-To: <400A9EB3.6030602@v.loewis.de>
References: <20040118122633.F068625AF47@bonanza.off.ekorp.com> <400A9EB3.6030602@v.loewis.de>
Message-ID: <20040118152616.GA56270@i18n.org>

On Sun, Jan 18, 2004 at 03:56:51PM +0100, "Martin v. L?wis" wrote:
> Anthony Baxter wrote:
> >If it can be shown that this is a pure win (that is, it improves code
> >for 2.3.4, but is both backwards and forwards compatible), then I don't
> >mind.  I _do_ mind if we end up with the horror of 2.2.2's not-really-
> >booleans, which has resulted in FAR FAR too much code of the form:
> >
> >try:
> >    True, False
> >except:
> >    True, False = 1, 0
>
> It depends on the usage. If applications just do
>
>     foo.encode("some-cjk-encoding")
>
> then all will work fine. However, it may be that applications do
>
>     try:
>         codecs.lookup("some-cjk-encoding")
>     except LookupError:
>         try:
>             import japanese
>             japanese.register_aliases()
>         except ImportError:
>             pass

This usage is useful only in -S mode: the recent Asian codec packages
register their aliases via their respective .pth files. So backporting
the CJK codecs won't make Japanese developers code this way. If someone
needs the real JapaneseCodecs rather than the CJK codecs, they can use
them by installing the JapaneseCodecs package, as they always have.

Hye-Shik

From barry at python.org  Sun Jan 18 10:43:50 2004
From: barry at python.org (Barry Warsaw)
Date: Sun Jan 18 10:43:55 2004
Subject: [Python-Dev] python/dist/src/Lib/email/testtest_email_codecs.py, 1.4, 1.5
In-Reply-To: <20040118151509.GA19225@panix.com>
References: <20040118122633.F068625AF47@bonanza.off.ekorp.com> <000101c3ddc7$b4335800$311dc797@oemcomputer> <20040118151509.GA19225@panix.com>
Message-ID: <1074440629.28485.16.camel@anthem>

On Sun, 2004-01-18 at 10:15, Aahz wrote:
> That's not the question. IIUC, there are already some kind of CJK codecs
> available; this is a different set of codecs. Unless these new codecs
> produce the same results, we can't afford to switch.

We're not switching codecs since 2.3 doesn't ship with either
JapaneseCodecs or CJKCodecs. No one's proposing that 2.3 start doing
that now.

What we're doing is removing a hardcoded dependency on exactly the
JapaneseCodecs, which have to be installed separately anyway.
With these changes, an end-user could install either JapaneseCodecs or
CJKCodecs and the rest of the code would/should work without
modification.

-Barry

From barry at python.org  Sun Jan 18 10:50:26 2004
From: barry at python.org (Barry Warsaw)
Date: Sun Jan 18 10:50:33 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/email/test test_email_codecs.py, 1.4, 1.5
In-Reply-To: <20040118122633.F068625AF47@bonanza.off.ekorp.com>
References: <20040118122633.F068625AF47@bonanza.off.ekorp.com>
Message-ID: <1074441025.28485.23.camel@anthem>

On Sun, 2004-01-18 at 07:26, Anthony Baxter wrote:
> If it can be shown that this is a pure win (that is, it improves code
> for 2.3.4, but is both backwards and forwards compatible), then I don't
> mind.

Understood. I'll do the best I can to test this, but it would help if
someone less monolingual than I am pitched in. Here's the plan: I'm
going to run the tests w/o the patch, but with JapaneseCodecs installed.
Then I'll apply the patch and re-run the test. Then I'll install/enable
CJKCodecs and re-run the test. I'd expect everything to pass, but I'm
not sure that's a complete enough test.

In addition, I plan on running Mailman 2.1.x with Python 2.3.x under
similar environments as above. I'd expect to see the same pretty
pictures in the web interface under both scenarios. I'll try to cobble
together some Japanese charset emails and send them through, but since I
can't read Japanese, I'm really shooting blind.
-Barry

From aahz at pythoncraft.com  Sun Jan 18 10:54:33 2004
From: aahz at pythoncraft.com (Aahz)
Date: Sun Jan 18 10:54:36 2004
Subject: [Python-Dev] python/dist/src/Lib/email/testtest_email_codecs.py, 1.4, 1.5
In-Reply-To: <1074440629.28485.16.camel@anthem>
References: <20040118122633.F068625AF47@bonanza.off.ekorp.com> <000101c3ddc7$b4335800$311dc797@oemcomputer> <20040118151509.GA19225@panix.com> <1074440629.28485.16.camel@anthem>
Message-ID: <20040118155433.GA721@panix.com>

On Sun, Jan 18, 2004, Barry Warsaw wrote:
> On Sun, 2004-01-18 at 10:15, Aahz wrote:
>>
>> That's not the question. IIUC, there are already some kind of CJK codecs
>> available; this is a different set of codecs. Unless these new codecs
>> produce the same results, we can't afford to switch.
>
> We're not switching codecs since 2.3 doesn't ship with either
> JapaneseCodecs or CJKCodecs. No one's proposing that 2.3 start doing
> that now.
>
> What we're doing is removing a hardcoded dependency on exactly the
> JapaneseCodecs, which have to be installed separately anyway. With
> these change, an end-user could install either JapaneseCodecs or
> CJKCodecs and the rest of the code would/should work without
> modification.

Just to make completely sure I understand this: if someone has an
existing installation with JapaneseCodecs and they upgrade with these
code changes, will there be any changes in the way they access codecs or
in the results they get?
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

A: No.
Q: Is top-posting okay?

From mwh at python.net  Sun Jan 18 12:13:34 2004
From: mwh at python.net (Michael Hudson)
Date: Sun Jan 18 12:26:42 2004
Subject: [Python-Dev] Python in 2003 summary
In-Reply-To: <20040118142817.GC19930@rogue.amk.ca> (A. M. Kuchling's message of "Sun, 18 Jan 2004 09:28:17 -0500")
References: <20040117171401.GA18383@rogue.amk.ca> <2m3caescu6.fsf@starship.python.net> <20040118142817.GC19930@rogue.amk.ca>
Message-ID: <2m4qutqiwx.fsf@starship.python.net>

"A.M. Kuchling" writes:

> On Sat, Jan 17, 2004 at 05:29:37PM +0000, Michael Hudson wrote:
>> If you're going to pimp PyCon 2004, you could mention that both
>> EuroPython and PythonUK are set to happen again this year :-)
>
> I couldn't figure out if PythonUK was going to be held again;
> www.python-uk.org just shows a directory listing. Is it going to be
> held with the ACCU again, meaning that it'll be April 14-17 2004 in
> Oxford?

I think that's the plan. I'm much less involved in PythonUK than
EuroPython, though.

Cheers,
mwh

-- 
6. The code definitely is not portable - it will produce incorrect
   results if run from the surface of Mars.
-- James Bonfield, http://www.ioccc.org/2000/rince.hint From python at bobs.org Sun Jan 18 13:30:36 2004 From: python at bobs.org (Bob Arnson) Date: Sun Jan 18 13:32:12 2004 Subject: [Python-Dev] Re: Switching to VC.NET 2003 In-Reply-To: <400A4DF0.3070005@v.loewis.de> References: <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com> <3FF059A1.8090202@v.loewis.de> <4009D111.8070108@v.loewis.de> <1508525992.20040117202420@bobs.org> <400A4DF0.3070005@v.loewis.de> Message-ID: <1005711322.20040118133036@bobs.org> Sunday, January 18, 2004, 4:12:16 AM, you wrote: > But isn't it a problem that [ProgramMenuFolder] depends on the value of > ALLUSERS? I don't set ALLUSERS until the feature selection dialog is > complete, so the magic computing ProgramMenuFolder certainly isn't > invoked at the point I would show that dialog. However, a specific > directory must be computed for the dialog to work, as the directory > browser would have to show the existing items. That's true. Erg...I now remember facing the same problem at work, though in this case it's to give different Start menu folders for different products. We fixed it by post-processing the different MSIs. You can't even use a property in the Directory table DefaultDir column. I hacked the Python MSI to add a custom action to set MENUDIR and it works. So you could have a text box on the Advanced dialog that defaults to "Python 2.4" and can be overridden. Of course, then you have to worry about short file names... -- sig://boB http://foobob.com From chrisandannreedy at comcast.net Sun Jan 18 14:54:58 2004 From: chrisandannreedy at comcast.net (Chris Reedy) Date: Sun Jan 18 14:55:11 2004 Subject: [Python-Dev] Python in 2003 summary In-Reply-To: <20040117171401.GA18383@rogue.amk.ca> References: <20040117171401.GA18383@rogue.amk.ca> Message-ID: <400AE492.8030606@comcast.net> A.M. 
Kuchling wrote:
> [Cc'ed to python-dev, marketing-python; trim followups accordingly]
>
> Here's a first draft of a summary of significant Pythonological matters
> in 2003.
>
>  * What significant things did I miss?

It seems to me that something should be said about the pie-thon contest.
I would guess that the right emphasis would be to mention the challenge
as part of the OSCON 2003 discussion and that the result will be seen at
OSCON 2004.

Chris

From tjreedy at udel.edu  Sun Jan 18 15:22:13 2004
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun Jan 18 15:22:12 2004
Subject: [Python-Dev] Re: Switching to VC.NET 2003
References: <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com> <3FF059A1.8090202@v.loewis.de> <4009D111.8070108@v.loewis.de> <7jzpbfns.fsf@yahoo.co.uk>
Message-ID:

"Paul Moore" wrote in message news:7jzpbfns.fsf@yahoo.co.uk...
> "Martin v. Löwis" writes:
> > Paul Moore wrote:
> >> One item I noticed with this installer (I used the build 12421, which
> >> I believe is the latest) is that there is no ability to specify where
> >> you want the start menu entries to go. I regularly, with Python 2.3,
> >> move them from "All Programs\Python 2.3" to "All Programs\Programming
> >> \Python 2.3"
> >
> > I don't know how to do this.
>
> Unfortunately, I have no experience with MSI installers, so I can't
> help.

Part of how I judge games (and other software - but I install and remove
more demos than anything else) is by whether the installer allows me to
put start menu entries in a directory/group of my own naming under my
current start/games group, or whether it imperialistically insists on
adding a new top-level entry like egocorp/ (Microsoft itself being an
example of the latter :-(). So, although I am personally happy with
pythonx.y/, I agree it would be preferable to give people a choice. But
I have no idea of the mechanics. I know that InstallShield can do this,
but have no idea if they now use msi or not.

Another related matter.
Under XP, programs can be installed so they only work for the admin user
(ugh); so that they are only obviously accessible by admin (through
desktop icon and start menu) but actually work for any account if one
uses Explorer to find and double-click the executable; or so that they
are visible and work from any account. Except for admin-only programs,
which Python is not, I prefer the latter (and will test for this with
the latest installer soon on a new XP system).

Terry J. Reedy

From perky at i18n.org  Sun Jan 18 15:48:38 2004
From: perky at i18n.org (Hye-Shik Chang)
Date: Sun Jan 18 15:48:08 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects listobject.c, 2.176, 2.177
In-Reply-To:
References:
Message-ID: <20040118204838.GA75072@i18n.org>

On Sun, Jan 18, 2004 at 12:31:04PM -0800, tim_one@users.sourceforge.net wrote:
> Update of /cvsroot/python/python/dist/src/Objects
> In directory sc8-pr-cvs1:/tmp/cvs-serv17990
>
> Modified Files:
> 	listobject.c
> Log Message:
> Revert change accidentally checked in as part of a whitespace normalization
> patch.
>
>
> Index: listobject.c
> ===================================================================
> RCS file: /cvsroot/python/python/dist/src/Objects/listobject.c,v
[snip]
> ***************
> *** 1882,1887 ****
>   	saved_ob_item = self->ob_item;
>   	self->ob_size = 0;
> ! 	/* self->ob_item = empty_ob_item = PyMem_NEW(PyObject *, 0); */
> ! 	self->ob_item = empty_ob_item = PyMem_NEW(PyObject *, roundupsize(0));
>
>   	if (keyfunc != NULL) {
> --- 1879,1883 ----
>   	saved_ob_item = self->ob_item;
>   	self->ob_size = 0;
> ! 	self->ob_item = empty_ob_item = PyMem_NEW(PyObject *, 0);
>
>   	if (keyfunc != NULL) {

Is there a particular reason for allocating zero-sized memory here? In
my test, assigning NULL to self->ob_item instead works as well.
Hye-Shik From martin at v.loewis.de Sun Jan 18 16:00:09 2004 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun Jan 18 16:00:54 2004 Subject: [Python-Dev] Compiling python with the free .NET SDK In-Reply-To: <4009A105.30303@garthy.com> References: <4009A105.30303@garthy.com> Message-ID: <400AF3D9.2000508@v.loewis.de> Garth wrote: > What would be a real help is if someone could send me the console output > from the Debug and Release builds when you use command line devstudio > to build it as some of the xml properties make no sense but I could > probably work out what they mean. Things like 'RuntimeLibrary="2"' have > no meaning unless I can see the output. I've put the build logs into http://www.dcl.hpi.uni-potsdam.de/home/loewis/buildlog.tgz I believe the console logs would be useless, as devenv generates response files which contain the actual command line arguments. HTH, Martin From martin at v.loewis.de Sun Jan 18 15:57:45 2004 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun Jan 18 16:04:43 2004 Subject: [Python-Dev] Re: Switching to VC.NET 2003 In-Reply-To: <1005711322.20040118133036@bobs.org> References: <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com> <3FF059A1.8090202@v.loewis.de> <4009D111.8070108@v.loewis.de> <1508525992.20040117202420@bobs.org> <400A4DF0.3070005@v.loewis.de> <1005711322.20040118133036@bobs.org> Message-ID: <400AF349.3000001@v.loewis.de> Bob Arnson wrote: > You can't even use a property in the Directory table DefaultDir column. > I hacked the Python MSI to add a custom action to set MENUDIR and > it works. So you could have a text box on the Advanced dialog that > defaults to "Python 2.4" and can be overridden. Of course, then you > have to worry about short file names... But can you use that also to address a subfolder? I.e. if that were to contain a backslash ("Programming\Python 2.4" instead of "Python 2.4"), would that work?
Regards, Martin From jepler at unpythonic.net Sun Jan 18 16:09:13 2004 From: jepler at unpythonic.net (Jeff Epler) Date: Sun Jan 18 16:12:41 2004 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects listobject.c, 2.176, 2.177 In-Reply-To: <20040118204838.GA75072@i18n.org> References: <20040118204838.GA75072@i18n.org> Message-ID: <20040118210913.GB20915@unpythonic.net> On Mon, Jan 19, 2004 at 05:48:38AM +0900, Hye-Shik Chang wrote: > Is there a particular reason for allocating zero-sized memory for > this? In my test, assigning NULL to self->ob_item instead worked as > well. Yes. I think that the explanation goes something like this: Only values returned by malloc/calloc/realloc are suitable as an argument to realloc/free. So, if you want to start with 0 bytes but not special-case the deallocation or reallocation case, you write f = malloc(0); instead of f = NULL; later, you can use f = realloc(f, newsize); /* Ignoring error checking */ and free(f) instead of if (f) f = realloc(f, newsize); else f = malloc(newsize); and if (f) free(f); now, some systems have malloc(0) return NULL, and accept free(NULL) as a no-op, and realloc(NULL, newsize) as malloc(newsize), but behavior other than that is allowed by the C standard. Jeff From perky at i18n.org Sun Jan 18 16:20:29 2004 From: perky at i18n.org (Hye-Shik Chang) Date: Sun Jan 18 16:20:04 2004 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects listobject.c, 2.176, 2.177 In-Reply-To: <20040118210913.GB20915@unpythonic.net> References: <20040118204838.GA75072@i18n.org> <20040118210913.GB20915@unpythonic.net> Message-ID: <20040118212029.GA75402@i18n.org> On Sun, Jan 18, 2004 at 03:09:13PM -0600, Jeff Epler wrote: > On Mon, Jan 19, 2004 at 05:48:38AM +0900, Hye-Shik Chang wrote: > > Is there a particular reason for allocating zero-sized memory for > > this? In my test, assigning NULL to self->ob_item instead worked as > > well. > > Yes.
I think that the explanation goes something like this: > > Only values returned by malloc/calloc/realloc are suitable as an argument > to realloc/free. So, if you want to start with 0 bytes but not special > case the deallocation or reallocation case, you write > f = malloc(0); > instead of > f = NULL; > later, you can use > f = realloc(f, newsize); /* Ignoring error checking */ > and > free(f) > instead of > if (f) f = realloc(f, newsize); > else f = malloc(newsize); > and > if(f) free(f); > > now, some systems have malloc(0) return NULL, and accept free(NULL) as a > no-op, and realloc(NULL, newsize) as malloc(newsize), but behavior other > than that is allowed by the C standard. > Thanks for the explanation. But listobject isn't using realloc, and the ob_item == NULL case is already handled in the code. In PyList_New: 73 if (size <= 0) { 74 op->ob_item = NULL; 75 } Excuse me, but have I missed something? Hye-Shik From martin at v.loewis.de Sun Jan 18 16:24:12 2004 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun Jan 18 16:25:10 2004 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/email/test test_email_codecs.py, 1.4, 1.5 In-Reply-To: <20040118152616.GA56270@i18n.org> References: <20040118122633.F068625AF47@bonanza.off.ekorp.com> <400A9EB3.6030602@v.loewis.de> <20040118152616.GA56270@i18n.org> Message-ID: <400AF97C.7060609@v.loewis.de> Hye-Shik Chang wrote: >>try: >> codecs.lookup("some-cjk-encoding") >>except LookupError: >> try: >> import japanese >> japanese.register_aliases() >> except ImportError: >> pass > > > This usage is useful only for -S mode. The recent Asian codecs > register its aliases by their respective .pth. So, backporting > CJK codecs won't let Japanese developers code in this way. Of course, users may have installed JapaneseCodecs with --without-aliases, in which case the aliases won't be there (neither automatically, nor after explicitly importing japanese).
I believe Mailman uses it that way. So people who currently do encodings.aliases.aliases.update( {"shift_jis" : "japanese.shift_jis"}) would have to adjust their code. I also find that installing JapaneseCodecs on top of a CJK-enabled Python 2.4 causes shift_jis to use the CJK codec, not the japanese codec. Regards, Martin From tim.one at comcast.net Sun Jan 18 16:51:24 2004 From: tim.one at comcast.net (Tim Peters) Date: Sun Jan 18 16:51:35 2004 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects listobject.c, 2.176, 2.177 In-Reply-To: <20040118212029.GA75402@i18n.org> Message-ID: [Hye-Shik Chang] > Thanks for the explanation. But listobject isn't using realloc To the contrary, listobject uses realloc() a lot, but that doesn't matter here. empty_ob_item has to be a unique pointer value so that the later if (self->ob_item != empty_ob_item || self->ob_size) { /* The user mucked with the list during the sort. */ error check is robust. The C standard doesn't promise much about malloc(0), but it does promise that pointers returned by distinct malloc(0) calls are non-equal so long as none have been passed to free(). In contrast, all NULL pointers must compare equal. If empty_ob_item were NULL, and user code (e.g.) added something to the list during sorting, then managed to NULL it out again, the error check above couldn't catch that. In any case, the patch you're talking about was simply reverting a change I made by mistake, and the restored code has been there forever. Ain't broke, don't fix . From perky at i18n.org Sun Jan 18 17:01:08 2004 From: perky at i18n.org (Hye-Shik Chang) Date: Sun Jan 18 17:00:39 2004 Subject: Changing codec lookup order.
(was: Re: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/email/test test_email_codecs.py, 1.4, 1.5) In-Reply-To: <400AF97C.7060609@v.loewis.de> References: <20040118122633.F068625AF47@bonanza.off.ekorp.com> <400A9EB3.6030602@v.loewis.de> <20040118152616.GA56270@i18n.org> <400AF97C.7060609@v.loewis.de> Message-ID: <20040118220108.GA75991@i18n.org> On Sun, Jan 18, 2004 at 10:24:12PM +0100, "Martin v. Löwis" wrote: > > I also find that installing JapaneseCodecs on top of a CJK-enabled > Python 2.4 causes shift_jis to use the CJK codec, not the japanese > codec. > Aah. We need to change the codec lookup order of encodings.search_function so that a 3rd-party codec can override the default one. With the attached patch, I could get the JapaneseCodecs one. Now, I admit that CJK codecs are obviously un-backportable to 2.3 due to this change. ;-) Hye-Shik -------------- next part -------------- *** __init__.py.orig Mon Jan 19 06:49:27 2004 --- __init__.py Mon Jan 19 06:46:16 2004 *************** *** 28,33 **** --- 28,34 ---- """#" import codecs, exceptions, types + import aliases _cache = {} _unknown = '--unknown--' *************** *** 74,96 **** # Import the module: # ! # First look in the encodings package, then try to lookup the ! # encoding in the aliases mapping and retry the import using the ! # default import module lookup scheme with the alias name. # modname = normalize_encoding(encoding) ! try: ! mod = __import__('encodings.' + modname, ! globals(), locals(), _import_tail) ! except ImportError: ! import aliases ! modname = (aliases.aliases.get(modname) or ! aliases.aliases.get(modname.replace('.', '_')) or ! modname) try: ! mod = __import__(modname, globals(), locals(), _import_tail) except ImportError: mod = None try: getregentry = mod.getregentry --- 75,97 ---- # Import the module: # ! # First lookup the encoding in the aliases mapping and try the ! # import using the default import module lookup scheme with the ! # alias name if available.
Then look in the encodings package. # modname = normalize_encoding(encoding) ! modnamestry = ['encodings.' + modname] ! aliasedname = (aliases.aliases.get(modname) or ! aliases.aliases.get(modname.replace('.', '_'))) ! if aliasedname is not None: ! modnamestry.insert(0, aliasedname) ! for mn in modnamestry: try: ! mod = __import__(mn, globals(), locals(), _import_tail) except ImportError: mod = None + else: + break try: getregentry = mod.getregentry *************** *** 125,131 **** except AttributeError: pass else: - import aliases for alias in codecaliases: if not aliases.aliases.has_key(alias): aliases.aliases[alias] = modname --- 126,131 ---- From python at bobs.org Sun Jan 18 17:21:00 2004 From: python at bobs.org (Bob Arnson) Date: Sun Jan 18 17:21:21 2004 Subject: [Python-Dev] Re: Switching to VC.NET 2003 In-Reply-To: <400AF349.3000001@v.loewis.de> References: <5.1.1.6.0.20031229113056.02e151a0@telecommunity.com> <3FF059A1.8090202@v.loewis.de> <4009D111.8070108@v.loewis.de> <1508525992.20040117202420@bobs.org> <400A4DF0.3070005@v.loewis.de> <1005711322.20040118133036@bobs.org> <400AF349.3000001@v.loewis.de> Message-ID: <383695409.20040118172100@bobs.org> Sunday, January 18, 2004, 3:57:45 PM, you wrote: > But can you use that also to address a subfolder? I.e. if that were > to contain a backslash ("Programming\Python 2.4" instead of > "Python 2.4"), would that work? Yes, that works. Surprised me too. I would have expected a problem with MSI creating the subdirectory automatically. However, there is some problem with my hack because it doesn't remove the shortcuts at uninstall time. I authored the custom action into the execute sequence, so it should have run at uninstall time too, so I'm not sure what's wrong... 
-- sig://boB http://foobob.com From guido at python.org Sun Jan 18 19:17:45 2004 From: guido at python.org (Guido van Rossum) Date: Sun Jan 18 19:17:53 2004 Subject: [Python-Dev] Python in 2003 summary In-Reply-To: Your message of "Sun, 18 Jan 2004 09:28:17 EST." <20040118142817.GC19930@rogue.amk.ca> References: <20040117171401.GA18383@rogue.amk.ca> <2m3caescu6.fsf@starship.python.net> <20040118142817.GC19930@rogue.amk.ca> Message-ID: <200401190017.i0J0HjO03474@c-24-5-183-134.client.comcast.net> > I couldn't figure out if PythonUK was going to be held again; > www.python-uk.org just shows a directory listing. Is it going to be > held with the ACCU again, meaning that it'll be April 14-17 2004 in > Oxford? You should ask Andy Robinson ; last I heard they were going to do it April 14-17, with a keynote by Eric Raymond. Unfortunately I have decided to skip it this year -- too much travel in too short a period of time. Too bad really, because I enjoyed it immensely last year. It's a very different conference than PyCon or EuroPython, probably because of the presence of lots of non-Python folks (it's the yearly ACCU conference, and they do C, C++, Java and other OO languages). The Python offerings are limited in volume but (last year at least) were of very high quality. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal at egenix.com Mon Jan 19 05:38:22 2004 From: mal at egenix.com (M.-A. Lemburg) Date: Mon Jan 19 05:39:18 2004 Subject: [Python-Dev] Re: Changing codec lookup order In-Reply-To: <20040118220108.GA75991@i18n.org> References: <20040118122633.F068625AF47@bonanza.off.ekorp.com> <400A9EB3.6030602@v.loewis.de> <20040118152616.GA56270@i18n.org> <400AF97C.7060609@v.loewis.de> <20040118220108.GA75991@i18n.org> Message-ID: <400BB39E.9040305@egenix.com> Hye-Shik Chang wrote: > On Sun, Jan 18, 2004 at 10:24:12PM +0100, "Martin v. 
Löwis" wrote: > >>I also find that installing JapaneseCodecs on top of a CJK-enabled >>Python 2.4 causes shift_jis to use the CJK codec, not the japanese >>codec. >> > > > Aah. We need to change the codec lookup order of encodings.search_function > so that a 3rd-party codec can override the default one. With the attached > patch, I could get the JapaneseCodecs one. > > Now, I admit that CJK codecs are obviously un-backportable to 2.3 > due to this change. ;-) I think it's a good idea to have the aliases dictionary checked before trying to do the import. I'll fix that. Backporting the fix should also be possible, since it is not related to any particular codec package. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 18 2004) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From sebastian.wangnick at eurocontrol.int Mon Jan 19 08:42:13 2004 From: sebastian.wangnick at eurocontrol.int (WANGNICK Sebastian) Date: Mon Jan 19 08:42:37 2004 Subject: [Python-Dev] Unicode collation algorithm Message-ID: <15ACCB26C192D411BADE0000F64664CB01BFC40C@agnnl02.mas.eurocontrol.be> Dear all, I'm working on a Python program to access a database of populated places in the world. This database originates from an Excel table containing Unicode place names. I'd like to present this huge list to the users properly sorted. There were plans to provide the Unicode collation algorithm back in 2002 at the EuroPython conference in Brussels, but seemingly these didn't materialize. Is anybody working on this? Regards, Sebastian -- Dipl.-Inform.
Sebastian Wangnick Office: Eurocontrol Maastricht UAC, Horsterweg 11, NL-6199AC Maastricht-Airport, Tel: +31-433661-370, Fax: -300 ____ This message and any files transmitted with it are legally privileged and intended for the sole use of the individual(s) or entity to whom they are addressed. If you are not the intended recipient, please notify the sender by reply and delete the message and any attachments from your system. Any unauthorised use or disclosure of the content of this message is strictly prohibited and may be unlawful. Nothing in this e-mail message amounts to a contractual or legal commitment on the part of EUROCONTROL unless it is confirmed by appropriately signed hard copy. Any views expressed in this message are those of the sender. From mal at egenix.com Mon Jan 19 09:09:51 2004 From: mal at egenix.com (M.-A. Lemburg) Date: Mon Jan 19 09:10:48 2004 Subject: [Python-Dev] Unicode collation algorithm In-Reply-To: <15ACCB26C192D411BADE0000F64664CB01BFC40C@agnnl02.mas.eurocontrol.be> References: <15ACCB26C192D411BADE0000F64664CB01BFC40C@agnnl02.mas.eurocontrol.be> Message-ID: <400BE52F.6050204@egenix.com> WANGNICK Sebastian wrote: > Dear all, > > I'm working on a Python program to access a database of populated places in the world. This database originates from an Excel table containing Unicode place names. > > I'd like to present this huge list to the users properly sorted. There were plans to provide the Unicode collation algorithm back in 2002 at the EuroPython conference in Brussels, but seemingly these didn't materialize. > > Is anybody working on this? There were a few attempts, but no contributions. Since you only seem to need one collation order, I'd suggest to normalize all Unicode strings and then write a compare function which implements your particular collation order. 
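A minimal sketch of that normalize-then-compare suggestion, assuming one particular collation choice (strip accents, fold case); this is not the full Unicode Collation Algorithm, and the place names are invented examples:

```python
# Normalize each name, then sort by a derived key.  NFD decomposition
# splits accented letters into a base letter plus combining marks;
# dropping the marks and folding case makes "Zurich" and "Zürich"
# sort together.
import unicodedata

def collation_key(s):
    decomposed = unicodedata.normalize('NFD', s)
    stripped = ''.join([c for c in decomposed
                        if not unicodedata.combining(c)])
    return stripped.lower()

# Decorate-sort-undecorate, so no key= argument to sort() is needed.
places = [u'Z\u00fcrich', u'Zagreb', u'\u00c5rhus', u'Aachen']
decorated = [(collation_key(name), name) for name in places]
decorated.sort()
places = [name for (key, name) in decorated]
# places is now Aachen, Århus, Zagreb, Zürich
```

For a language-specific ordering, only the key function has to change.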
Patches for full collation support are welcome, of course :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 19 2004) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From amk at amk.ca Mon Jan 19 09:48:15 2004 From: amk at amk.ca (A.M. Kuchling) Date: Mon Jan 19 09:48:33 2004 Subject: [Python-Dev] Python in 2003 summary In-Reply-To: <200401190017.i0J0HjO03474@c-24-5-183-134.client.comcast.net> References: <20040117171401.GA18383@rogue.amk.ca> <2m3caescu6.fsf@starship.python.net> <20040118142817.GC19930@rogue.amk.ca> <200401190017.i0J0HjO03474@c-24-5-183-134.client.comcast.net> Message-ID: <20040119144815.GA22016@rogue.amk.ca> On Sun, Jan 18, 2004 at 04:17:45PM -0800, Guido van Rossum wrote: > It's a very different conference than PyCon or > EuroPython, probably because of the presence of lots of non-Python > folks (it's the yearly ACCU conference, and they do C, C++, Java and > other OO languages). The Python offerings are limited in volume but > (last year at least) were of very high quality. Can I quote that description in the summary, please? It's a good summation. --amk From amk at amk.ca Mon Jan 19 09:58:27 2004 From: amk at amk.ca (A.M.
Kuchling) Date: Mon Jan 19 09:58:44 2004 Subject: [Python-Dev] Python in 2003 summary In-Reply-To: <400AE492.8030606@comcast.net> References: <20040117171401.GA18383@rogue.amk.ca> <400AE492.8030606@comcast.net> Message-ID: <20040119145827.GC22016@rogue.amk.ca> On Sun, Jan 18, 2004 at 11:54:58AM -0800, Chris Reedy wrote: > It seems to me that something should be said about the pie-thon contest. Nothing much actually *happened* in relation to the contest, so I'm happy leaving it out. Thanks for the suggestion, though. --amk From guido at python.org Mon Jan 19 10:15:31 2004 From: guido at python.org (Guido van Rossum) Date: Mon Jan 19 10:15:36 2004 Subject: [Python-Dev] Python in 2003 summary In-Reply-To: Your message of "Mon, 19 Jan 2004 09:48:15 EST." <20040119144815.GA22016@rogue.amk.ca> References: <20040117171401.GA18383@rogue.amk.ca> <2m3caescu6.fsf@starship.python.net> <20040118142817.GC19930@rogue.amk.ca> <200401190017.i0J0HjO03474@c-24-5-183-134.client.comcast.net> <20040119144815.GA22016@rogue.amk.ca> Message-ID: <200401191515.i0JFFV222753@c-24-5-183-134.client.comcast.net> > On Sun, Jan 18, 2004 at 04:17:45PM -0800, Guido van Rossum wrote: > > It's a very different conference than PyCon or > > EuroPython, probably because of the presence of lots of non-Python > > folks (it's the yearly ACCU conference, and they do C, C++, Java and > > other OO languages). The Python offerings are limited in volume but > > (last year at least) were of very high quality. > > Can I quote that description in the summary, please? It's a good summation. Of course! --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Jan 19 10:15:56 2004 From: guido at python.org (Guido van Rossum) Date: Mon Jan 19 10:16:03 2004 Subject: [Python-Dev] Python in 2003 summary In-Reply-To: Your message of "Mon, 19 Jan 2004 09:58:27 EST." 
<20040119145827.GC22016@rogue.amk.ca> References: <20040117171401.GA18383@rogue.amk.ca> <400AE492.8030606@comcast.net> <20040119145827.GC22016@rogue.amk.ca> Message-ID: <200401191515.i0JFFu922764@c-24-5-183-134.client.comcast.net> > Nothing much actually *happened* in relation to the contest, > so I'm happy leaving it out. Thanks for the suggestion, though. No? I sweated several days to come up with a benchmark. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Jan 19 12:50:31 2004 From: guido at python.org (Guido van Rossum) Date: Mon Jan 19 12:50:51 2004 Subject: [Python-Dev] PyCon Reminder: Early bird reg deadline 2/1 Message-ID: <200401191750.i0JHoVm23314@c-24-5-183-134.client.comcast.net> Info ---- This is a reminder that the deadline for early bird registration for PyCon DC 2004 is February 1, 2004. Early bird registration is $175; after that, it will be $200 through March 17, then $250 at the door. To register, visit: http://www.pycon.org/dc2004/register/ Plug ---- PyCon is the most important conference in the year for me. It brings together the largest group of Python users and developers imaginable. I come to PyCon for many reasons: to meet with other core Python developers face to face; to hear about the exciting things that users from all over the world are doing with Python; to interact in person with Python users of all types, from newbies to veterans of many years; to hear everybody's feedback on where Python is and where they'd like it to go; and perhaps most of all to continue to encourage this wonderful community by my presence, words and actions to grow, reach out to new users of all kinds, and keep listening to each other. I hope everyone who comes to the conference will return from it with a new or renewed feeling of excitement about Python, whether they are developers, sophisticated users, beginners, or even skeptical passers-by. 
The Python community includes everyone, from grade schoolers just learning about computer science to renowned scientists interested in using the best tools and business people looking for a secret weapon. I'm looking forward to seeing you all at PyCon! Background ---------- PyCon is a community-oriented conference targeting developers (both those using Python and those working on the Python project). It gives you opportunities to learn about significant advances in the Python development community, to participate in a programming sprint with some of the leading minds in the Open Source community, and to meet fellow developers from around the world. The organizers work to make the conference affordable and accessible to all. DC 2004 will be held March 24-26, 2004 in Washington, D.C. The keynote speaker is Mitch Kapor of the Open Source Applications Foundation (http://www.osafoundation.org/). There will be a four-day development sprint before the conference. We're looking for volunteers to help run PyCon. If you're interested, subscribe to http://mail.python.org/mailman/listinfo/pycon-organizers Don't miss any PyCon announcements! Subscribe to http://mail.python.org/mailman/listinfo/pycon-announce You can discuss PyCon with other interested people by subscribing to http://mail.python.org/mailman/listinfo/pycon-interest The central resource for PyCon DC 2004 is http://www.pycon.org/ --Guido van Rossum (home page: http://www.python.org/~guido/) From garth at garthy.com Mon Jan 19 15:50:41 2004 From: garth at garthy.com (Garth) Date: Mon Jan 19 15:47:34 2004 Subject: [Python-Dev] Compiling python with the free .NET SDK In-Reply-To: <400AF3D9.2000508@v.loewis.de> References: <4009A105.30303@garthy.com> <400AF3D9.2000508@v.loewis.de> Message-ID: <400C4321.2040302@garthy.com> Martin v. 
Löwis wrote: > Garth wrote: > >> What would be a real help is if someone could send me the console >> output from the Debug and Release builds when you use command line >> devstudio to build it as some of the xml properties make no sense >> but I could probably work out what they mean. Things like >> 'RuntimeLibrary="2"' have no meaning unless I can see the output. > > > I've put the build logs into > > http://www.dcl.hpi.uni-potsdam.de/home/loewis/buildlog.tgz > > I believe the console logs would be useless, as devenv generates > response files which contain the actual command line arguments. > > HTH, > Martin > Hi, Thanks for the build logs. Very useful, and I can now build the whole of Python with the free Microsoft tools and GNU make. It appears that you still have a dependency on 'largeint.lib' in pythoncore. This is no longer required, and largeint.h was removed in the move to VS.NET (I think). When I remove it by hand-editing the file, Python still builds. Can you update the project files in CVS? Cheers Garth From aahz at pythoncraft.com Mon Jan 19 16:45:29 2004 From: aahz at pythoncraft.com (Aahz) Date: Mon Jan 19 16:45:34 2004 Subject: [Python-Dev] Draft: PEP for imports Message-ID: <20040119214528.GA9767@panix.com> Here's the draft PEP for imports. I'll wait a few days for comments before sending to the PEPmeister. PEP: XXX Title: Imports: multi-line and absolute/relative Version: Last-Modified: Author: Aahz `_ - `Re: Python-Dev Digest, Vol 5, Issue 57 `_ - `Relative import `_ - `Another Strategy for Relative Import `_ Copyright ========= This document has been placed in the public domain. ..
Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 End: From claird at lairds.com Mon Jan 19 17:01:29 2004 From: claird at lairds.com (Cameron Laird) Date: Mon Jan 19 17:01:36 2004 Subject: [marketing-python] Re: [Python-Dev] Python in 2003 summary In-Reply-To: <400AE492.8030606@comcast.net> Message-ID: > From marketing-python-admin@wingide.com Mon Jan 19 16:53:44 2004 > . > . > . > It seems to me that something should be said about the pie-thon contest. > I would guess that the right emphasis would be to mention the challenge > as part of the OSCON 2003 discussion and that the result will be seen at > OSCON 2004. > . > . > . Somebody tell me what the next "Python-URL!" should say about it. From martin at v.loewis.de Mon Jan 19 17:30:46 2004 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon Jan 19 17:31:06 2004 Subject: [Python-Dev] Re: Changing codec lookup order In-Reply-To: <400BB39E.9040305@egenix.com> References: <20040118122633.F068625AF47@bonanza.off.ekorp.com> <400A9EB3.6030602@v.loewis.de> <20040118152616.GA56270@i18n.org> <400AF97C.7060609@v.loewis.de> <20040118220108.GA75991@i18n.org> <400BB39E.9040305@egenix.com> Message-ID: <400C5A96.1070604@v.loewis.de> M.-A. Lemburg wrote: > I think it's a good idea, to have the aliases dictionary > checked before trying to do the import. > > I'll fix that. Backporting the fix should also be possible, since > it is not related to any particular codec package. No. Instead, it applies uniformly to all codecs - so in some cases, users may see changes in the behaviour. 
Regards, Martin From martin at v.loewis.de Mon Jan 19 17:33:37 2004 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon Jan 19 17:33:50 2004 Subject: [Python-Dev] Unicode collation algorithm In-Reply-To: <15ACCB26C192D411BADE0000F64664CB01BFC40C@agnnl02.mas.eurocontrol.be> References: <15ACCB26C192D411BADE0000F64664CB01BFC40C@agnnl02.mas.eurocontrol.be> Message-ID: <400C5B41.20901@v.loewis.de> WANGNICK Sebastian wrote: > Is anybody working on this? See http://cvs.sourceforge.net/viewcvs.py/python-codecs/picu/ This package offers Python wrappers around ICU, in particular wrt. collation. Regards, Martin From skip at pobox.com Mon Jan 19 17:34:58 2004 From: skip at pobox.com (Skip Montanaro) Date: Mon Jan 19 17:35:07 2004 Subject: [Python-Dev] Draft: PEP for imports In-Reply-To: <20040119214528.GA9767@panix.com> References: <20040119214528.GA9767@panix.com> Message-ID: <16396.23442.906721.756478@montanaro.dyndns.org> aahz> Here's the draft PEP for imports. I'll wait a few days for aahz> comments before sending to the PEPmeister. I don't know if it's relevant to the discussion, but at various times people have asked about pushing the current global module/package namespace into a package, so instead of import sys you might use import std.sys or from std import sys to reference the sys module as delivered with Python. It might be worthwhile to consider that as an alternative or addition to this capability. Because of all the code changes it would cause, it's not likely to be considered before Python 3, but it seems to me that it might solve many common module collision problems without resorting to major changes to the semantics of the import statement. Skip From guido at python.org Mon Jan 19 17:49:49 2004 From: guido at python.org (Guido van Rossum) Date: Mon Jan 19 17:49:56 2004 Subject: [Python-Dev] Draft: PEP for imports In-Reply-To: Your message of "Mon, 19 Jan 2004 16:34:58 CST." 
<16396.23442.906721.756478@montanaro.dyndns.org> References: <20040119214528.GA9767@panix.com> <16396.23442.906721.756478@montanaro.dyndns.org> Message-ID: <200401192249.i0JMno624240@c-24-5-183-134.client.comcast.net> > I don't know if it's relevant to the discussion, but at various times people > have asked about pushing the current global module/package namespace into a > package, so instead of > > import sys > > you might use > > import std.sys > > or > > from std import sys > > to reference the sys module as delivered with Python. > > It might be worthwhile to consider that as an alternative or addition to > this capability. Because of all the code changes it would cause, it's not > likely to be considered before Python 3, but it seems to me that it might > solve many common module collision problems without resorting to major > changes to the semantics of the import statement. I strongly suggest to keep this out of Aahz' PEP; it's just going to confuse the issue in the short run, and in the long run it's pretty independent. I also don't think that it's ever going to be that way. --Guido van Rossum (home page: http://www.python.org/~guido/) From barry at python.org Mon Jan 19 18:01:21 2004 From: barry at python.org (Barry Warsaw) Date: Mon Jan 19 18:01:31 2004 Subject: [Python-Dev] Draft: PEP for imports In-Reply-To: <200401192249.i0JMno624240@c-24-5-183-134.client.comcast.net> References: <20040119214528.GA9767@panix.com> <16396.23442.906721.756478@montanaro.dyndns.org> <200401192249.i0JMno624240@c-24-5-183-134.client.comcast.net> Message-ID: <1074553280.11494.25.camel@anthem> On Mon, 2004-01-19 at 17:49, Guido van Rossum wrote: > I strongly suggest to keep this out of Aahz' PEP; it's just going to > confuse the issue in the short run, and in the long run it's pretty > independent. I also don't think that it's ever going to be that way. There was also the "import __me__" idea, which should also not go in Aahz's pep . 
Maybe we should just go ahead and reserve peps 800-899 for import related enhancements. -Barry From hopper at omnifarious.org Mon Jan 19 18:55:34 2004 From: hopper at omnifarious.org (Eric Mathew Hopper) Date: Mon Jan 19 18:55:42 2004 Subject: [Python-Dev] Stackless Python Message-ID: <20040119235534.GA27195@omnifarious.org> I'm working on a project (http://www.cakem.net/) that involves networking. I have to do a lot of decoding of network data into an internal representation, and encoding internal representations into a network format. The decoding part is going to present a problem, a problem that could easily be solved by continuations. I want to write the code so that it will work when there is no guarantee that I'll have an entire message before I start decoding it. This means the parser may have to stop at any random point in time, and then be restarted again when more data is available. I see no way to do this without continuations, or threads, and I refuse to use threads for something like this. For now, I think I will do one of two things. Either build in the assumption that I will have the entire message before I start parsing, or organize parsing as a series of steps that can be aborted in the middle and then restarted, and use exceptions to track whether the step was fully completed or not. Sort of a poor man's continuation. Anyway, I'm not a Python developer, just a user. I just wanted to add this to the things you consider when you decide what you intend to do next with Python. Continuations are important, and AFAIK, the stackless mod is the only way to get them. Thanks, -- "It does me no injury for my neighbor to say there are twenty gods or no God. It neither picks my pocket nor breaks my leg." --- Thomas Jefferson "Go to Heaven for the climate, Hell for the company." -- Mark Twain -- Eric Hopper (hopper@omnifarious.org http://www.omnifarious.org/~hopper) --
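Stop-anywhere, resume-later parsing of this kind is something generators can already express without threads or full continuations. A minimal sketch; the Buffer class and the four-digit length-prefixed message format are made up purely for illustration:

```python
class Buffer:
    """Holds the bytes received so far; the network layer calls feed()."""
    def __init__(self):
        self.data = ''
    def feed(self, more):
        self.data += more

def message_parser(buf):
    # Yields None whenever more data is needed (the caller feeds the
    # buffer and resumes the generator), and yields each complete
    # message as soon as it has been assembled.
    while True:
        while len(buf.data) < 4:            # wait for the length prefix
            yield None
        length = int(buf.data[:4])
        while len(buf.data) < 4 + length:   # wait for the full body
            yield None
        message = buf.data[4:4 + length]
        buf.data = buf.data[4 + length:]
        yield message

buf = Buffer()
parser = message_parser(buf)
buf.feed('0005he')       # a partial message arrives
first = next(parser)     # None: parser is paused mid-message
buf.feed('llo0003abc')   # the rest of it, plus a second message
second = next(parser)    # 'hello'
third = next(parser)     # 'abc'
```

The generator keeps all of its parsing state (position, partially assembled message) alive between resumptions, which is the "poor man's continuation" being asked for.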
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20040119/bd2cb4ca/attachment-0001.bin From aahz at pythoncraft.com Mon Jan 19 19:02:14 2004 From: aahz at pythoncraft.com (Aahz) Date: Mon Jan 19 19:02:17 2004 Subject: [Python-Dev] Stackless Python In-Reply-To: <20040119235534.GA27195@omnifarious.org> References: <20040119235534.GA27195@omnifarious.org> Message-ID: <20040120000214.GA26119@panix.com> On Mon, Jan 19, 2004, Eric Mathew Hopper wrote: > > I want to right the code so that it will work when there is no guarantee > that I'll have an entire message before I start decoding it. This means > the parser may have have to stop at any random point in time, and then > be restarted again when more data is available. I see no way to do this > without continuations, or threads, and I refuse to use threads for > something like this. Python 2.2 added generators that likely can handle what you're trying to do. However, python-dev is not the place to get help on this subject; please use comp.lang.python instead. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ A: No. Q: Is top-posting okay? From pinard at iro.umontreal.ca Mon Jan 19 18:36:57 2004 From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard) Date: Mon Jan 19 19:58:15 2004 Subject: [Python-Dev] Re: 2.2 features In-Reply-To: References: Message-ID: <20040119233657.GA10668@titan.progiciels-bpi.ca> [Guido van Rossum] > Marcin 'Qrczak' Kowalczyk writes: > > Now thow in 'isinstance' as an old way to spell a specific membership > > test (for types and classes) which doesn't have to be learned about > > at all. > I love it. 'x in type' as a shorthand for isinstance(x, type). > Checked into CVS! The above exchange is dated 2001-07-31, and I'm merely glancing over some old saved email. The above feature does not seem to be implemented in Python 2.3.2, as demonstrated by the script below. 
Has it been retracted? Not that I really need it, however! :-)

Python 2.3.2 (#1, Nov 9 2003, 20:14:22)
[GCC 3.3 20030226 (prerelease) (SuSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 0 in int
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: iterable argument required
>>> isinstance(0, int)
True
>>> int
<type 'int'>
>>>

-- François Pinard http://www.iro.umontreal.ca/~pinard From bob at redivi.com Mon Jan 19 20:18:47 2004 From: bob at redivi.com (Bob Ippolito) Date: Mon Jan 19 20:16:21 2004 Subject: [Python-Dev] Re: 2.2 features In-Reply-To: <20040119233657.GA10668@titan.progiciels-bpi.ca> References: <20040119233657.GA10668@titan.progiciels-bpi.ca> Message-ID: <9910C4D2-4AE6-11D8-A4CC-000A95686CD8@redivi.com> On Jan 19, 2004, at 6:36 PM, François Pinard wrote: > [Guido van Rossum] >> Marcin 'Qrczak' Kowalczyk writes: > >>> Now thow in 'isinstance' as an old way to spell a specific membership >>> test (for types and classes) which doesn't have to be learned about >>> at all. > >> I love it. 'x in type' as a shorthand for isinstance(x, type). >> Checked into CVS! > > The above exchange is dated 2001-07-31, and I'm merely glancing over > some old saved email. The above feature does not seem to be > implemented > in Python 2.3.2, as demonstrated by the script below. Has it been > retracted? Not that I really need it, however! :-) I don't know what happened to this before, but I'd give it a -1.. it breaks some "fun" uses of metaclasses where you have a desire to use __contains__. For example, I use an enumeration class as a container for enumerators..
the enumeration metaclass implements __contains__, __getitem__, and a few other things. Anyways, why not: mytype in type(foo).mro() (and add support for old style classes, if you still use them) -bob From guido at python.org Mon Jan 19 20:16:26 2004 From: guido at python.org (Guido van Rossum) Date: Mon Jan 19 20:16:34 2004 Subject: [Python-Dev] Re: 2.2 features In-Reply-To: Your message of "Mon, 19 Jan 2004 18:36:57 EST." <20040119233657.GA10668@titan.progiciels-bpi.ca> References: <20040119233657.GA10668@titan.progiciels-bpi.ca> Message-ID: <200401200116.i0K1GQt24664@c-24-5-183-134.client.comcast.net> > [Guido van Rossum] > > Marcin 'Qrczak' Kowalczyk writes: > > > > Now thow in 'isinstance' as an old way to spell a specific membership > > > test (for types and classes) which doesn't have to be learned about > > > at all. > > > I love it. 'x in type' as a shorthand for isinstance(x, type). > > Checked into CVS! > > The above exchange is dated 2001-07-31, and I'm merely glancing over > some old saved email. The above feature does not seem to be implemented > in Python 2.3.2, as demonstrated by the script below. Has it been > retracted? Not that I really need it, however! :-) Yes, it was retracted. From the CVS log the next day: ---------------------------- revision 2.16.8.73 date: 2001/08/01 03:56:39; author: gvanrossum; state: Exp; lines: +1 -21 Rip out again the 'x in type' support. Further newsgroup discussion showed it was confusing. E.g. people tried "if x in [int, long]: ...". ---------------------------- In retrospect I think it was too cute... 
--Guido van Rossum (home page: http://www.python.org/~guido/) From jcarlson at uci.edu Mon Jan 19 23:14:57 2004 From: jcarlson at uci.edu (Josiah Carlson) Date: Mon Jan 19 23:17:33 2004 Subject: [Python-Dev] Stackless Python In-Reply-To: <20040119235534.GA27195@omnifarious.org> References: <20040119235534.GA27195@omnifarious.org> Message-ID: <20040119195011.D6AF.JCARLSON@uci.edu> > I'm working on a project (http://www.cakem.net/) that involves > networking. I have to do a lot of decoding of network data into an > internal representation, and encoding internal representations into a > network format. > > The decoding part is going to present a problem, a problem that could > easily be solved by continuations. > > I want to right the code so that it will work when there is no guarantee > that I'll have an entire message before I start decoding it. This means > the parser may have have to stop at any random point in time, and then > be restarted again when more data is available. I see no way to do this > without continuations, or threads, and I refuse to use threads for > something like this. > > For now, I think I will do one of two things. Either build in the > assumption that I will have the entire messages before I start parsing, > or organize parsing as a series of steps that can be aborted in the > middle and then restarted, and use exceptions to track whether the step > was fully completed or not. Sort of a poor man's continuation. > > Anyway, I'm not a Python developer, just a user. I just wanted to add > this to the things you consider when you decide what you intend to do > next with Python. Continuations are important, and AFAIK, the stackless > mod is the only way to get them. Eric, I've taken a look at your protocol, and it looks to be simple enough to just decode the header (maybe partially) and determine (based on the length of the data received) if the entire message has been received yet. 
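The buffer-and-recheck approach Josiah describes can be sketched like so. This is a hedged illustration in modern Python, not code from Eric's project: the 4-byte big-endian length prefix is an assumed toy header standing in for the real wire format.

```python
import struct

class FrameBuffer:
    """Accumulate raw socket reads; emit only complete messages."""
    HEADER_SIZE = 4  # assumed toy header: 4-byte big-endian payload length

    def __init__(self):
        self.buf = b""

    def feed(self, data):
        """Add newly received bytes; return any now-complete messages."""
        self.buf += data
        messages = []
        while len(self.buf) >= self.HEADER_SIZE:
            # re-decode the header on every read; cheap compared to I/O
            (length,) = struct.unpack(">I", self.buf[:self.HEADER_SIZE])
            if len(self.buf) < self.HEADER_SIZE + length:
                break  # header seen, payload incomplete: wait for more data
            messages.append(self.buf[self.HEADER_SIZE:self.HEADER_SIZE + length])
            self.buf = self.buf[self.HEADER_SIZE + length:]
        return messages

fb = FrameBuffer()
print(fb.feed(struct.pack(">I", 5) + b"hel"))  # [] -- message still partial
print(fb.feed(b"lo"))                          # [b'hello']
```

The parser never needs to suspend mid-parse: it simply returns when the buffer is short and re-runs from the top on the next read event, which is the "reparsing the header a few times" cost mentioned below.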
Certainly it does end up reparsing the header a few times, but as long as you only check to see if the message has been received on data read (which is guaranteed to be at least 1 byte for select/poll read events, and is commonly much larger), there shouldn't be too much reparsing. I've actually done the same thing with some nontrivial headers, and have had no problems maxing out a 100mbit connection with a PII-400 and a properly subclassed asyncore.dispatcher. Even though it would be computationally more efficient to use continuations or generators, they are not necessary. You will likely find more overhead with the cryptographic portions of your protocol than you will reparsing the header multiple times. - Josiah From bob at redivi.com Mon Jan 19 23:31:02 2004 From: bob at redivi.com (Bob Ippolito) Date: Mon Jan 19 23:28:34 2004 Subject: [Python-Dev] Stackless Python In-Reply-To: <20040119195011.D6AF.JCARLSON@uci.edu> References: <20040119235534.GA27195@omnifarious.org> <20040119195011.D6AF.JCARLSON@uci.edu> Message-ID: <744E0F78-4B01-11D8-A4CC-000A95686CD8@redivi.com> On Jan 19, 2004, at 11:14 PM, Josiah Carlson wrote: >> I'm working on a project (http://www.cakem.net/) that involves >> networking. I have to do a lot of decoding of network data into an >> internal representation, and encoding internal representations into a >> network format. >> >> The decoding part is going to present a problem, a problem that could >> easily be solved by continuations. You could write your implementation of the network protocol using asyncore (medusa), Twisted, and/or PEAK... all of which are good at asynchronous "stuff" to varying degrees. PEAK has a bunch of new things that make it easy to write asynchronous-acting but synchronous-looking code using generators, these may only be in CVS though. >> Anyway, I'm not a Python developer, just a user. I just wanted to add >> this to the things you consider when you decide what you intend to do >> next with Python. 
Continuations are important, and AFAIK, the >> stackless >> mod is the only way to get them. I agree, but good luck convincing everyone that it's important enough to merge the fork (it's a fork of Python, not a module). BTW, Christian says the 2.3 branch of Stackless is pretty close to ready. -bob From mal at egenix.com Tue Jan 20 04:19:19 2004 From: mal at egenix.com (M.-A. Lemburg) Date: Tue Jan 20 04:20:21 2004 Subject: [Python-Dev] Re: Changing codec lookup order In-Reply-To: <400C5A96.1070604@v.loewis.de> References: <20040118122633.F068625AF47@bonanza.off.ekorp.com> <400A9EB3.6030602@v.loewis.de> <20040118152616.GA56270@i18n.org> <400AF97C.7060609@v.loewis.de> <20040118220108.GA75991@i18n.org> <400BB39E.9040305@egenix.com> <400C5A96.1070604@v.loewis.de> Message-ID: <400CF297.8050803@egenix.com> Martin v. Löwis wrote: > M.-A. Lemburg wrote: > >> I think it's a good idea, to have the aliases dictionary >> checked before trying to do the import. >> >> I'll fix that. Backporting the fix should also be possible, since >> it is not related to any particular codec package. > > No. Instead, it applies uniformly to all codecs - so in some cases, > users may see changes in the behaviour. Could you give an example? The only change that I can think of is that setting up aliases to override builtin codecs would start working and that's more like a bug fix than a new feature. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 19 2004) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free !
:::: From g.ladd at eris.qinetiq.com Tue Jan 20 07:50:40 2004 From: g.ladd at eris.qinetiq.com (Gareth Ladd) Date: Tue Jan 20 07:51:10 2004 Subject: [Python-Dev] isinstance(x, types.UnboundMethodType) feature Message-ID: <400D2420.7040701@eris.qinetiq.com> Hello all, Whilst writing a routine to perform callbacks, I was performing a check to ensure the callback method is callable and bound. After checking "callable(x)" I then check that "x" is bound using "isinstance(x, types.UnboundMethodType)". It appears the latter returns true if "x" is a bound or an unbound class/instance method. Please have a look at the following example. I would expect #3 to return "False".

import types

def hello():
    print "hello"

class C:
    def hello(self):
        print "hello"

c = C()
print isinstance(hello, types.UnboundMethodType)   #1
print isinstance(C.hello, types.UnboundMethodType) #2
print isinstance(c.hello, types.UnboundMethodType) #3

# Check it is actually bound
f = c.hello
f() #4

False
True
True
hello

Am I missing a subtlety of bound/unbound-ness? Thanks in advance, Gareth. From mwh at python.net Tue Jan 20 07:57:00 2004 From: mwh at python.net (Michael Hudson) Date: Tue Jan 20 07:57:03 2004 Subject: [Python-Dev] isinstance(x, types.UnboundMethodType) feature In-Reply-To: <400D2420.7040701@eris.qinetiq.com> (Gareth Ladd's message of "Tue, 20 Jan 2004 12:50:40 +0000") References: <400D2420.7040701@eris.qinetiq.com> Message-ID: <2m4quqpylf.fsf@starship.python.net> Gareth Ladd writes: > Am I missing a subtlety of bound/unbound-ness? Unbound and bound methods have the same type. Cheers, mwh (PS: inappropriate post for python-dev) -- I have long since given up dealing with people who hold idiotic opinions as if they had arrived at them through thinking about them.
-- Erik Naggum, comp.lang.lisp From niemeyer at conectiva.com Tue Jan 20 08:37:13 2004 From: niemeyer at conectiva.com (Gustavo Niemeyer) Date: Tue Jan 20 08:42:57 2004 Subject: [Python-Dev] Re: sre warnings In-Reply-To: References: <20040110044032.GA2901@burma.localdomain> Message-ID: <20040120133713.GA4251@burma.localdomain> I'm back to life, after a long and nice vacation. Sorry for taking a while to answer this. > I think Martin is using a more-recent version of gcc than most people here. > Everyone on Windows (MSVC) sees these warnings, though. I'm using gcc 3.3.2 here. > > Do you see any others? > > Windows doesn't complain about the ones your compiler complains about; it > complains about 17 others, listed here (although the line numbers have > probably changed in the last 3 months ): > > http://mail.python.org/pipermail/python-dev/2003-October/039059.html Ok.. it should give me an idea about where to look. [...] > > Suggestions welcome. > > Upgrade to Windows . Next one please.. :-) > Most of those seem to come from lines of the form > > SRE_LOC_IS_WORD((int) ptr[-1]) : 0; [...] Yes, that's what I reported in my first email. IMO, the compiler should not complain about it, since the char is being cast away. [...] > when the "unsigned char" expansion of SRE_CHAR is in effect. I suppose that > could be repaired by defining SRE_LOC_IS_ALNUM differently depending on how > SRE_CHAR is defined. That's an alternative indeed. [...] > and SRE_CODE is unsigned short or unsigned long, depending on > Py_UNICODE_WIDE. This warning is really irritating. I suppose [...] > would shut it up, but that's sure ugly. I'm trying to avoid using something ugly to avoid an ugly warning. I'll spend some time on it now that I'm back, and relaxed. :-) Thanks.
-- Gustavo Niemeyer http://niemeyer.net From niemeyer at conectiva.com Tue Jan 20 08:40:17 2004 From: niemeyer at conectiva.com (Gustavo Niemeyer) Date: Tue Jan 20 08:45:57 2004 Subject: [Python-Dev] Re: sre warnings In-Reply-To: <3FFFED2D.8060401@v.loewis.de> References: <20040109174051.GA6480@burma.localdomain> <001401c3d6e3$8a4c91e0$ceb99d8d@oemcomputer> <20040109191401.GA8086@burma.localdomain> <3FFF2166.90201@v.loewis.de> <20040110044032.GA2901@burma.localdomain> <3FFFED2D.8060401@v.loewis.de> Message-ID: <20040120134017.GB4251@burma.localdomain> > >Ohh.. I wasn't aware about this case, since I get no errors at all at > >this line. Do you see any others? > > As Tim explains, this is likely due to me using a more recent gcc (3.3). That's the same one I'm using. > I'll attach my current list of warnings below. Thanks. > The signed/unsigned warnings are actually harmless: they cause a problem > only when comparing negative and unsigned numbers (where the negative > number gets converted to unsigned and may actually compare larger than > the positive one). This should not be an issue for SRE - I believe > all the numbers are always non-negative, unless they are the -1 marker > (for which I'm uncertain whether it is ever relationally compared). I'll check it out. Thanks. -- Gustavo Niemeyer http://niemeyer.net From raymond.hettinger at verizon.net Tue Jan 20 09:17:43 2004 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Tue Jan 20 09:18:46 2004 Subject: [Python-Dev] dict.addlist() Message-ID: <001b01c3df60$30a54da0$d118c797@oemcomputer> Here is an idea to kick around: Evolve and eventually replace dict.setdefault with a more specialized method that is clearer, cleaner, easier to use, and faster. d.addlist(k, v) would work like d.setdefault(k, []).append(v) Raymond Hettinger >>> d = dict() >>> for elem in 'apple orange ant otter banana asp onyx boa'.split(): ... k = elem[0] ... d.addlist(k, elem) ...
>>> d {'a': ['apple', 'ant', 'asp'], 'b': ['banana', 'boa'], 'o': ['orange', 'otter', 'onyx']} bookindex = dict() for pageno, page in enumerate(pages): for word in page: bookindex.addlist(word, pageno) From bob at redivi.com Tue Jan 20 09:26:39 2004 From: bob at redivi.com (Bob Ippolito) Date: Tue Jan 20 09:24:12 2004 Subject: [Python-Dev] dict.addlist() In-Reply-To: <001b01c3df60$30a54da0$d118c797@oemcomputer> References: <001b01c3df60$30a54da0$d118c797@oemcomputer> Message-ID: On Jan 20, 2004, at 9:17 AM, Raymond Hettinger wrote: > Here is anidea to kick around: > > Evolve and eventually replace dict.setdefault with a more specialized > method that is clearer, cleaner, easier to use, and faster. > > d.addlist(k, v) would work like d.setdefault(k, > []).append(v) -1 There are other reasons to use setdefault. This one is pretty common though, but I think a more generic solution could be implemented. Perhaps: d.setdefault(k, factory=list).append(v) ? -bob -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 2357 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20040120/09595ead/smime.bin From jacobs at penguin.theopalgroup.com Tue Jan 20 09:25:09 2004 From: jacobs at penguin.theopalgroup.com (Kevin Jacobs) Date: Tue Jan 20 09:25:12 2004 Subject: [Python-Dev] dict.addlist() In-Reply-To: <001b01c3df60$30a54da0$d118c797@oemcomputer> Message-ID: On Tue, 20 Jan 2004, Raymond Hettinger wrote: > Here is anidea to kick around: > > Evolve and eventually replace dict.setdefault with a more specialized > method that is clearer, cleaner, easier to use, and faster. > > d.addlist(k, v) would work like d.setdefault(k, []).append(v) -1: I use .setdefault all the time with non-list default arguments. The dict interface is getting complex enough that adding more clutter will make it even harder to teach optimal idiomatic Python to newbies. 
-Kevin -- Kevin Jacobs - Enterprise Systems Architect | What is an Architect? He designs Consultant to The National Cancer Institute | a house for another to build Dept. of Cancer Epidemiology and Genetics | and someone else to inhabit. V: (301) 954-0726 E: jacobske@mail.nih.gov | -- William Kahan From python at rcn.com Tue Jan 20 09:30:21 2004 From: python at rcn.com (Raymond Hettinger) Date: Tue Jan 20 09:31:17 2004 Subject: [Python-Dev] dict.addlist() In-Reply-To: Message-ID: <003301c3df61$f0147480$d118c797@oemcomputer> [Bob Ippolito] > d.setdefault(k, factory=list).append(v) ? That is somewhat nice and backwards compatible too. Raymond Hettinger From Boris.Boutillier at arteris.net Tue Jan 20 09:34:28 2004 From: Boris.Boutillier at arteris.net (Boris Boutillier) Date: Tue Jan 20 09:34:37 2004 Subject: [Python-Dev] dict.addlist() In-Reply-To: <001b01c3df60$30a54da0$d118c797@oemcomputer> References: <001b01c3df60$30a54da0$d118c797@oemcomputer> Message-ID: <400D3C74.7080109@arteris.net> This also means that we would add:

d.addset(k,v) d.setdefault(k,Set()).add(v)
d.adddict(k,subk,v) d.setdefault(k,{})[subk]=v

I think we can't add a new method for each Python datatype. Raymond Hettinger wrote: >Here is an idea to kick around: > >Evolve and eventually replace dict.setdefault with a more specialized >method that is clearer, cleaner, easier to use, and faster. > > d.addlist(k, v) would work like d.setdefault(k, []).append(v) > > >Raymond Hettinger > > > > >>>>d = dict() >>>>for elem in 'apple orange ant otter banana asp onyx boa'.split(): >>>> >>>> >... k = elem[0] >... d.addlist(k, elem) >... > > >>>>d >>>> >>>> >{'a': ['apple', 'ant', 'asp'], 'b': ['banana', 'boa'], 'o': ['orange', >'otter', >'onyx']} > > I'm sure here you'll be more interested in an addset rather than addlist :).
> >bookindex = dict() >for pageno, page in enumerate(pages): > for word in page: > bookindex.addlist(word, pageno) > > >_______________________________________________ >Python-Dev mailing list >Python-Dev@python.org >http://mail.python.org/mailman/listinfo/python-dev >Unsubscribe: http://mail.python.org/mailman/options/python-dev/boris.boutillier%40arteris.net > > From Paul.Moore at atosorigin.com Tue Jan 20 09:41:19 2004 From: Paul.Moore at atosorigin.com (Moore, Paul) Date: Tue Jan 20 09:41:21 2004 Subject: [Python-Dev] dict.addlist() Message-ID: <16E1010E4581B049ABC51D4975CEDB8802C09BC7@UKDCX001.uk.int.atosorigin.com> From: Raymond Hettinger > [Bob Ippolito] >> d.setdefault(k, factory=list).append(v) ? > > That is somewhat nice and backwards compatible too. -1 How is it more expressive than d.setdefault(k, []).append(v)? As far as I can see, it's longer and contains an extra obscure(ish) term (factory). And I agree with the other posters that other defaults are often useful, and further complicating the dictionary interface isn't particularly helpful. If conciseness is important, def addlist(d, k, v): d.setdefault(k, []).append(v) Paul. From jeremy at alum.mit.edu Tue Jan 20 09:46:02 2004 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Tue Jan 20 09:51:37 2004 Subject: [Python-Dev] dict.addlist() In-Reply-To: <16E1010E4581B049ABC51D4975CEDB8802C09BC7@UKDCX001.uk.int.atosorigin.com> References: <16E1010E4581B049ABC51D4975CEDB8802C09BC7@UKDCX001.uk.int.atosorigin.com> Message-ID: <1074609962.14813.133.camel@localhost.localdomain> On Tue, 2004-01-20 at 09:41, Moore, Paul wrote: > From: Raymond Hettinger > > [Bob Ippolito] > >> d.setdefault(k, factory=list).append(v) ? > > > > That is somewhat nice and backwards compatible too. > > -1 > > How is it more expressive than d.setdefault(k, []).append(v)? As > far as I can see, it's longer and contains an extra obscure(ish) > term (factory). 
One benefit is that it doesn't create an unnecessary empty list when the key is already in the dictionary. > And I agree with the other posters that other defaults are often > useful, and further complicating the dictionary interface isn't > particularly helpful. > > If conciseness is important, > > def addlist(d, k, v): > d.setdefault(k, []).append(v) Nonetheless, I agree that the dict API is already quite large and the benefit of this new/changed method is very small. Jeremy From barry at python.org Tue Jan 20 09:52:05 2004 From: barry at python.org (Barry Warsaw) Date: Tue Jan 20 09:52:17 2004 Subject: [Python-Dev] dict.addlist() In-Reply-To: References: <001b01c3df60$30a54da0$d118c797@oemcomputer> Message-ID: <1074610325.14688.88.camel@anthem> On Tue, 2004-01-20 at 09:26, Bob Ippolito wrote: > On Jan 20, 2004, at 9:17 AM, Raymond Hettinger wrote: > > > Here is anidea to kick around: > > > > Evolve and eventually replace dict.setdefault with a more specialized > > method that is clearer, cleaner, easier to use, and faster. > > > > d.addlist(k, v) would work like d.setdefault(k, > > []).append(v) > > -1 > > There are other reasons to use setdefault. This one is pretty common > though, but I think a more generic solution could be implemented. > > Perhaps: > > d.setdefault(k, factory=list).append(v) ? d.setdefault(k, []).append(v) I'm not sure what any of the other suggestions buy you except avoiding a list instantiation, which doesn't seem like enough to warrant the extra complexity. I use setdefault() quite a bit (big surprise, huh?) and occasionally would like lazy evaluation of the second argument, but it usually doesn't bother me. 
-Barry From barry at python.org Tue Jan 20 09:52:37 2004 From: barry at python.org (Barry Warsaw) Date: Tue Jan 20 09:52:48 2004 Subject: [Python-Dev] dict.addlist() In-Reply-To: References: Message-ID: <1074610357.14688.90.camel@anthem> On Tue, 2004-01-20 at 09:25, Kevin Jacobs wrote: > -1: I use .setdefault all the time with non-list default arguments. > > The dict interface is getting complex enough that adding more clutter will > make it even harder to teach optimal idiomatic Python to newbies. I completely agree with both points. -Barry From pje at telecommunity.com Tue Jan 20 09:56:57 2004 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Jan 20 09:53:26 2004 Subject: [Python-Dev] dict.addlist() In-Reply-To: References: <001b01c3df60$30a54da0$d118c797@oemcomputer> <001b01c3df60$30a54da0$d118c797@oemcomputer> Message-ID: <5.1.0.14.0.20040120095346.024a86b0@mail.telecommunity.com> At 09:26 AM 1/20/04 -0500, Bob Ippolito wrote: >There are other reasons to use setdefault. This one is pretty common >though, but I think a more generic solution could be implemented. > >Perhaps: > >d.setdefault(k, factory=list).append(v) ? +100. :) An excellent replacement for my recurring use of: try: return self._somemapping[key] except: self._somemapping[key] = value = somethingExpensive(key) return value That becomes simply: return self._somemapping.setdefault( key, factory=lambda: somethingExpensive(key) ) From pje at telecommunity.com Tue Jan 20 10:05:41 2004 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Jan 20 10:02:10 2004 Subject: [Python-Dev] dict.addlist() In-Reply-To: <16E1010E4581B049ABC51D4975CEDB8802C09BC7@UKDCX001.uk.int.a tosorigin.com> Message-ID: <5.1.0.14.0.20040120095943.02428a70@mail.telecommunity.com> At 02:41 PM 1/20/04 +0000, Moore, Paul wrote: >From: Raymond Hettinger > > [Bob Ippolito] > >> d.setdefault(k, factory=list).append(v) ? > > > > That is somewhat nice and backwards compatible too. 
> >-1 > >How is it more expressive than d.setdefault(k, []).append(v)? As >far as I can see, it's longer and contains an extra obscure(ish) >term (factory). It's more expressive because it allows lazy evaluation: the list isn't created unless a new one is needed. For lists, that's not such a big deal. Indeed, it's entirely possible that the cost of adding the keyword argument to the function call would outweigh the cost of creating and discarding an emtpy list. But, when using a dictionary as a cache for an expensive computation, this notation is much more compact than writing an if-then or try-except block to do the same thing. Even so, compactness is actually less the point than clarity. The factory idiom is more readable (IMO) because it gets out of the way of the point of the code. It says, "oh by the way, the default for new members of 'd' is to be a new list, and now let's get on with what we're going to do with it." From pje at telecommunity.com Tue Jan 20 10:12:21 2004 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Jan 20 10:08:56 2004 Subject: [Python-Dev] dict.addlist() In-Reply-To: <5.1.0.14.0.20040120095346.024a86b0@mail.telecommunity.com> References: <001b01c3df60$30a54da0$d118c797@oemcomputer> <001b01c3df60$30a54da0$d118c797@oemcomputer> Message-ID: <5.1.0.14.0.20040120100637.02420a90@mail.telecommunity.com> At 09:56 AM 1/20/04 -0500, Phillip J. Eby wrote: >try: > return self._somemapping[key] >except: > self._somemapping[key] = value = somethingExpensive(key) > return value Oops, that should've been except KeyError. Which actually points up another reason to want to use a setdefault mechanism, although admittedly not for the builtin dictionary type. Using exceptions for control flow can mask actual errors occuring within a component being used. Thus, when I create mapping-like objects, I try as much as possible to push "defaulting" into the lower-level mappings, to avoid needing to trap errors and turn them into defaults. 
Sometimes, however, that default is expensive to create, so I actually use a factory argument, and it's often named 'factory'. Hence, my enthusiasm for the suggestion. From fdrake at acm.org Tue Jan 20 10:11:07 2004 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Tue Jan 20 10:11:14 2004 Subject: [Python-Dev] dict.addlist() In-Reply-To: <16E1010E4581B049ABC51D4975CEDB8802C09BC7@UKDCX001.uk.int.atosorigin.com> References: <16E1010E4581B049ABC51D4975CEDB8802C09BC7@UKDCX001.uk.int.atosorigin.com> Message-ID: <16397.17675.247663.995097@sftp.fdrake.net> Regarding d.setdefault(k, factory=list).append(v): Moore, Paul writes: > How is it more expressive than d.setdefault(k, []).append(v)? As > far as I can see, it's longer and contains an extra obscure(ish) > term (factory). The advantage I see is that it delays creation of the object being used as the default value; no list is created unless needed, whereas the setdefault() invocation shown always creates a new list, even if it's going to be thrown away. In many situations, if you're really using a list, this doesn't matter, but in other cases, especially if you're using a more complex object (or one with a more expensive constructor), using the factory makes a lot of sense. The idea that "factory" is an obscure term for Python programmers is scary. > And I agree with the other posters that other defaults are often > useful, and further complicating the dictionary interface isn't > particularly helpful. The basic setdefault() certainly remains useful on its own; I don't think anyone suggested otherwise. My objection is the complication of the dictionary interface; it has a lot of methods now, and avoiding new names with highly specialized meanings is good. -1 for addlist(k, v) -0 for setdefault(k, factory=...) > If conciseness is important, > > def addlist(d, k, v): > d.setdefault(k, []).append(v) Yep. Or whatever variation makes sense in context. -Fred -- Fred L. Drake, Jr. 
PythonLabs at Zope Corporation From exarkun at intarweb.us Tue Jan 20 10:10:05 2004 From: exarkun at intarweb.us (Jp Calderone) Date: Tue Jan 20 10:12:07 2004 Subject: [Python-Dev] dict.addlist() In-Reply-To: <1074609962.14813.133.camel@localhost.localdomain> References: <16E1010E4581B049ABC51D4975CEDB8802C09BC7@UKDCX001.uk.int.atosorigin.com> <1074609962.14813.133.camel@localhost.localdomain> Message-ID: <20040120151005.GA6382@intarweb.us> On Tue, Jan 20, 2004 at 09:46:02AM -0500, Jeremy Hylton wrote: > On Tue, 2004-01-20 at 09:41, Moore, Paul wrote: > > From: Raymond Hettinger > > > [Bob Ippolito] > > >> d.setdefault(k, factory=list).append(v) ? > > > > > > That is somewhat nice and backwards compatible too. > > > > -1 > > > > How is it more expressive than d.setdefault(k, []).append(v)? As > > far as I can see, it's longer and contains an extra obscure(ish) > > term (factory). > > One benefit is that it doesn't create an unnecessary empty list when the > key is already in the dictionary. > This sounds like a benefit if it is faster to not create the empty list? Naive preliminary testing does not indicate that it is, I imagine due to the keyword argument overhead. My test functions: def foo(x, y=None, z=None): pass def f1(): for x in xrange(10000000): foo(10, []) def f2(): for x in xrange(10000000): foo(10, z=list) 7.8 and 8.8 seconds to execute, respectively. Since f1 and f2 aren't C functions, maybe it's not an accurate prediction of how setdefault would behave... but it does seem that object creation overhead is pretty minimal. For other cases, where the overhead is much greater, it might be a different story. My own code shows no examples of that though (actually I seem to only use setdefault with 2 defaults, None and []). Jp From Jack.Jansen at cwi.nl Tue Jan 20 10:14:21 2004 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Tue Jan 20 10:13:22 2004 Subject: [Python-Dev] PEP 11 mistake? 
In-Reply-To: <16392.24824.367536.660812@montanaro.dyndns.org> References: <16392.24824.367536.660812@montanaro.dyndns.org> Message-ID: <52EF862C-4B5B-11D8-9B7E-000A958D1666@cwi.nl> On 16-jan-04, at 23:08, Skip Montanaro wrote: > Looking at the future unsupported platforms I saw this: > > Name: MacOS 9 > Unsupported in: Python 2.4 > Code removed in: Python 2.4 > > Surely the code shouldn't removed until 2.5, right? Too bad, it's gone already. I've been asking for over a year whether anyone had interest in maintaining 2.4 for OS9, with exactly 0 replies. -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From ncoghlan at iinet.net.au Tue Jan 20 10:14:12 2004 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Tue Jan 20 10:18:58 2004 Subject: [Python-Dev] dict.addlist() In-Reply-To: <1074610357.14688.90.camel@anthem> References: <1074610357.14688.90.camel@anthem> Message-ID: <400D45C4.3000008@iinet.net.au> Barry Warsaw wrote: > On Tue, 2004-01-20 at 09:25, Kevin Jacobs wrote: > > >>-1: I use .setdefault all the time with non-list default arguments. >> >>The dict interface is getting complex enough that adding more clutter will >>make it even harder to teach optimal idiomatic Python to newbies. > > I completely agree with both points. As a relative newbie (and not having done much coding in recent months), I have to say it took me a while to figure out what "d.setdefault(k, []).append(v)" actually did (I had previously only encountered dict.getdefault, and the meaning of dict.setdefault was not immediately obvious to me). If I saw "d.addlist(some_variable, some_other_variable)", I certainly would not automatically interpret it as "append some_other_variable to the value keyed by some_variable, creating that value as the empty list if it is not present". 
At least the current approach breaks this into two steps, and gave me a
chance to figure it out without diving into the docs to find out what
the method does.

This does seem to be another example where a 'defaulting dictionary'
with a settable factory method would seem to be useful. Then:

    dd = defaultingdict(factory=list)
    ...other code uses dict...
    dd[k].append(v)

(As others have pointed out, this can't be a keyword argument on
standard dictionaries without interfering with the current automatic
population of the created dictionary with the keyword dictionary)

Regards,
Nick.

--
Nick Coghlan              |  Brisbane, Australia
Email: ncoghlan@email.com |  Mobile: +61 409 573 268

From guido at python.org  Tue Jan 20 10:34:53 2004
From: guido at python.org (Guido van Rossum)
Date: Tue Jan 20 10:34:58 2004
Subject: [Python-Dev] dict.addlist()
In-Reply-To: Your message of "Tue, 20 Jan 2004 09:17:43 EST."
	<001b01c3df60$30a54da0$d118c797@oemcomputer>
References: <001b01c3df60$30a54da0$d118c797@oemcomputer>
Message-ID: <200401201534.i0KFYrk03503@c-24-5-183-134.client.comcast.net>

> Here is an idea to kick around:
>
> Evolve and eventually replace dict.setdefault with a more specialized
> method that is clearer, cleaner, easier to use, and faster.
>
> d.addlist(k, v) would work like d.setdefault(k, []).append(v)
>
>
> Raymond Hettinger
>
>
> >>> d = dict()
> >>> for elem in 'apple orange ant otter banana asp onyx boa'.split():
> ...     k = elem[0]
> ...     d.addlist(k, elem)
> ...
> >>> d
> {'a': ['apple', 'ant', 'asp'], 'b': ['banana', 'boa'], 'o': ['orange',
> 'otter', 'onyx']}
>
>
> bookindex = dict()

(What's wrong with {}?)

> for pageno, page in enumerate(pages):
>     for word in page:
>         bookindex.addlist(word, pageno)

I'm -0 on the idea (where does it stop? when we have reimplemented
Perl?), and -1 on the name (it suggests adding a list).

It *is* a somewhat common idiom, but do we really need to turn all
idioms into method calls?
I think not -- only if the idiom is hard to read in its natural form.

    bookindex = {}
    for pageno, page in enumerate(pages):
        for word in page:
            lst = bookindex.get(word)
            if lst is None:
                bookindex[word] = lst = []
            lst.append(pageno)

works for me.  (I think setdefault() was already a minor mistake -- by
the time I've realized it applies I have already written working code
without it.  And when using setdefault() I always worry about the
waste of the new empty list passed in each time that's ignored most
times.)

If you really want library support for this idiom, I'd propose adding
a higher-level abstraction that represents an index:

    bookindex = Index()
    for pageno, page in enumerate(pages):
        for word in page:
            bookindex.add(word, pageno)

What you're really doing here is inverting an index; maybe that idea
can be leveraged?

    bookindex = Index()
    bookindex.fill(enumerate(pages), lambda page: iter(page))

This would require something like

    class Index:
        def __init__(self):
            self.map = {}
        def fill(self, items, extract):
            for key, subsequence in items:
                for value in extract(subsequence):
                    self.map.setdefault(value, []).append(key)

Hm, maybe it should take a key function instead:

    bookindex = Index()
    bookindex.fill(enumerate(pages), lambda x: iter(x[1]), lambda x: x[0])

with a definition of

    class Index:
        def __init__(self):
            self.map = {}
        def fill(self, seq, getitems, getkey):
            for item in seq:
                key = getkey(item)
                for value in getitems(item):
                    self.map.setdefault(value, []).append(key)

Hmm...  Needs more work...  I don't like using extractor functions.
But I like addlist() even less.
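[Editor's illustration: to make Guido's first Index sketch concrete, here it is exercised against a tiny invented two-page "book" -- the class is copied from his snippet above, and the sample page data is made up purely for this demonstration.]

```python
# Guido's first Index sketch, copied from above, plus a small driver.
class Index:
    def __init__(self):
        self.map = {}

    def fill(self, items, extract):
        # items yields (key, subsequence) pairs; extract pulls the
        # values to be indexed out of each subsequence.
        for key, subsequence in items:
            for value in extract(subsequence):
                self.map.setdefault(value, []).append(key)

pages = [["spam", "eggs"], ["eggs", "ham"]]   # invented sample data
bookindex = Index()
bookindex.fill(enumerate(pages), lambda page: iter(page))
print(bookindex.map)   # {'spam': [0], 'eggs': [0, 1], 'ham': [1]}
```

Note how this inverts the page -> words mapping into word -> pages, which is exactly the "inverting an index" framing used above.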
--Guido van Rossum (home page: http://www.python.org/~guido/) From ncoghlan at iinet.net.au Tue Jan 20 10:39:18 2004 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Tue Jan 20 10:39:25 2004 Subject: [Python-Dev] Re: Accumulation module In-Reply-To: References: <00cf01c3da0e$b3a2e3e0$6e02a044@oemcomputer> Message-ID: <400D4BA6.1070804@iinet.net.au> James Kew wrote: > "Raymond Hettinger" wrote in message > news:00cf01c3da0e$b3a2e3e0$6e02a044@oemcomputer... > >>I'm working a module with some accumulation/reduction/statistical >>formulas: >> >>average(iterable) > > > "Average" is a notoriously vague word. Prefer "mean" -- which opens the door > for "median" and "mode" to join the party. > > I could have used a stats module a few months ago -- feels like a bit of an > omission from the standard library, especially compared against the richness > of the random module. If this basic statistics module makes it into the standard library, perhaps the associated module docs could point to some of the more 'full-featured' statistical packages that are out there? Regards, Nick. -- Nick Coghlan | Brisbane, Australia Email: ncoghlan@email.com | Mobile: +61 409 573 268 From FBatista at uniFON.com.ar Tue Jan 20 11:02:14 2004 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Tue Jan 20 11:05:18 2004 Subject: [Python-Dev] Draft: PEP for imports Message-ID: Aahz wrote: #- Here are the contenders: #- #- from .foo import #- #- from ...foo import I think the following idea maybe is too radical, but well... I don't use packages, but when I read the tutorial first time, I wondered why, if the packages appeared to solve a 'pack into a directory' problem, the names were joined with '.' instead of '/'. Later I found some other uses to the 'unix path' notation. 
For example:

- Module inside a package:
      import package/module

- Relative import:
      import ../../otherpack/webmodules/somemodule

- Importing a personal module when you have a program in several places
  but not enough permission to write in the site-packages directory:
      import /home/facundo/python/modules/myModule
  (you can do this with "import sys" and "sys.path.append('...')", but
  seems an overhead to me).

Regards,

.    Facundo

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20040120/d998a541/attachment.html From ncoghlan at iinet.net.au Tue Jan 20 11:23:37 2004 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Tue Jan 20 11:23:47 2004 Subject: [Python-Dev] Python in 2003 summary In-Reply-To: <20040119144815.GA22016@rogue.amk.ca> References: <20040117171401.GA18383@rogue.amk.ca> <2m3caescu6.fsf@starship.python.net> <20040118142817.GC19930@rogue.amk.ca> <200401190017.i0J0HjO03474@c-24-5-183-134.client.comcast.net> <20040119144815.GA22016@rogue.amk.ca> Message-ID: <400D5609.6080905@iinet.net.au> A.M. Kuchling wrote: > On Sun, Jan 18, 2004 at 04:17:45PM -0800, Guido van Rossum wrote: > >>It's a very different conference than PyCon or >>EuroPython, probably because of the presence of lots of non-Python >>folks (it's the yearly ACCU conference, and they do C, C++, Java and >>other OO languages). The Python offerings are limited in volume but >>(last year at least) were of very high quality. > > > Can I quote that description in the summary, please? It's a good summation. I'm reading the revised version, and it may need to be made clearer that Guido made this comment in 2004 - the way it currently reads, it looks like his last comment is referring to a 2002 conference! Regards, Nick. -- Nick Coghlan | Brisbane, Australia Email: ncoghlan@email.com | Mobile: +61 409 573 268 From martin at v.loewis.de Tue Jan 20 13:01:18 2004 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue Jan 20 13:02:00 2004 Subject: [Python-Dev] Re: Changing codec lookup order In-Reply-To: <400CF297.8050803@egenix.com> References: <20040118122633.F068625AF47@bonanza.off.ekorp.com> <400A9EB3.6030602@v.loewis.de> <20040118152616.GA56270@i18n.org> <400AF97C.7060609@v.loewis.de> <20040118220108.GA75991@i18n.org> <400BB39E.9040305@egenix.com> <400C5A96.1070604@v.loewis.de> <400CF297.8050803@egenix.com> Message-ID: <400D6CEE.7010802@v.loewis.de> M.-A. Lemburg wrote: >> No. 
Instead, it applies uniformly to all codecs - so in some cases, >> users may see changes in the behaviour. > > > Could you give an example ? > > The only change that I can think of is that setting up aliases > to override builtin codecs would start working and that's more > like a bug fix than a new feature. Currently, if somebody defines a codec windows_1252.py, foo.encode("windows_1252") uses this codec. With the proposed change, it uses cp1252.py. To see the behaviour change, no changes to the aliases list is necessary. Regards, Martin From mal at egenix.com Tue Jan 20 13:16:39 2004 From: mal at egenix.com (M.-A. Lemburg) Date: Tue Jan 20 13:17:41 2004 Subject: [Python-Dev] Re: Changing codec lookup order In-Reply-To: <400D6CEE.7010802@v.loewis.de> References: <20040118122633.F068625AF47@bonanza.off.ekorp.com> <400A9EB3.6030602@v.loewis.de> <20040118152616.GA56270@i18n.org> <400AF97C.7060609@v.loewis.de> <20040118220108.GA75991@i18n.org> <400BB39E.9040305@egenix.com> <400C5A96.1070604@v.loewis.de> <400CF297.8050803@egenix.com> <400D6CEE.7010802@v.loewis.de> Message-ID: <400D7087.7040605@egenix.com> Martin v. L?wis wrote: > M.-A. Lemburg wrote: > >>> No. Instead, it applies uniformly to all codecs - so in some cases, >>> users may see changes in the behaviour. >> >> Could you give an example ? >> >> The only change that I can think of is that setting up aliases >> to override builtin codecs would start working and that's more >> like a bug fix than a new feature. > > Currently, if somebody defines a codec windows_1252.py, > foo.encode("windows_1252") uses this codec. With the proposed > change, it uses cp1252.py. To see the behaviour change, no > changes to the aliases list is necessary. Thanks for the example. I agree that we should not backport the change then. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 20 2004) >>> Python/Zope Consulting and Support ... 
http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From martin at v.loewis.de  Tue Jan 20 13:12:28 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue Jan 20 13:22:32 2004
Subject: [Python-Dev] PEP 11 mistake?
In-Reply-To: <52EF862C-4B5B-11D8-9B7E-000A958D1666@cwi.nl>
References: <16392.24824.367536.660812@montanaro.dyndns.org>
	<52EF862C-4B5B-11D8-9B7E-000A958D1666@cwi.nl>
Message-ID: <400D6F8C.9020403@v.loewis.de>

Jack Jansen wrote:
> Too bad, it's gone already.
>
> I've been asking for over a year whether anyone had interest in
> maintaining 2.4 for OS9, with exactly 0 replies.

If you are willing to take the blame :-) PEP 11 would normally recommend
leaving the code in, but disabled, so attempts to use it cause build
errors.  Then wait to see if anybody complains about these build errors,
and ask whether they volunteer.

In this case, anybody volunteering now would have to bring back the
source to maintain first - which may be acceptable in the specific case.

Regards,
Martin

From tjreedy at udel.edu  Tue Jan 20 13:26:53 2004
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue Jan 20 13:26:48 2004
Subject: [Python-Dev] Re: isinstance(x, types.UnboundMethodType) feature
References: <400D2420.7040701@eris.qinetiq.com>
Message-ID: 

"Gareth Ladd" wrote in message
news:400D2420.7040701@eris.qinetiq.com...
> Am I missing a subtlety of bound/unbound-ness?

As of 2.2, C.hello.im_self == None while c.hello.im_self == c.  Presume
still same in 2.3.  This is very much an implementation-internal feature
and not part of the language definition.  (Meaning: it can be changed
without notice.)  I believe this is how repr() tells the difference.

This sort of question is better asked on comp.lang.python.
It is about usage of current python rather than development of future
python.

Terry J. Reedy

From python at rcn.com  Tue Jan 20 13:32:33 2004
From: python at rcn.com (Raymond Hettinger)
Date: Tue Jan 20 13:33:30 2004
Subject: [Python-Dev] dict.addlist()
In-Reply-To: <200401201534.i0KFYrk03503@c-24-5-183-134.client.comcast.net>
Message-ID: <003001c3df83$c582cf60$d118c797@oemcomputer>

[GvR]
> > bookindex = dict()
>
> (What's wrong with {}?)

Nothing at all.  {} is shorter, faster, and everyone understands it.

When teasing out ideas at the interpreter prompt, I tend to use dict()
because I find it easier to edit the line and add some initial values
using dict(one=1, two=2, s='abc').

> works for me.  (I think setdefault() was already a minor mistake -- by
> the time I've realized it applies I have already written working code
> without it.  And when using setdefault() I always worry about the
> waste of the new empty list passed in each time that's ignored most
> times.)
>
> If you really want library support for this idiom ...

Not really.  It was more of a time machine question -- if we had
setdefault() to do over again, what would be done differently:

* Keep setdefault().
* Drop it and make do with get(), try/except, or if k in d.
* Martin's idea for dicts to have an optional factory function for
  defaults.
* Have a specialized method that just supports dicts of lists.

After all the discussions, the best solution to the defaulting problem
appears to be some variant of Martin's general purpose approach:

    d = {}
    d.setfactory(list)
    for k, v in myitems:
        d[k].append(v)          # dict of lists

    d = {}
    d.setfactory(set)
    for v in mydata:
        d[f(v)].add(v)          # partition into equivalence classes

    d = {}
    d.setfactory(int)
    for v in mydata:
        d[v] += 1               # bag

Raymond Hettinger

From fdrake at acm.org  Tue Jan 20 13:37:10 2004
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Tue Jan 20 13:37:19 2004 Subject: [Python-Dev] dict.addlist() In-Reply-To: <003001c3df83$c582cf60$d118c797@oemcomputer> References: <200401201534.i0KFYrk03503@c-24-5-183-134.client.comcast.net> <003001c3df83$c582cf60$d118c797@oemcomputer> Message-ID: <16397.30038.339892.429843@sftp.fdrake.net> Raymond Hettinger writes: > After all the discussions, the best solution to the defaulting problem > appears to be some variant of Martin's general purpose approach: > > d = {} > d.setfactory(list) > for k, v in myitems: > d[k].append(v) # dict of lists Oooh, this could get scary: L = [] L.setfactory(MyClassWithCostlyConstructor) L[0] ;-) -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From bob at redivi.com Tue Jan 20 13:50:43 2004 From: bob at redivi.com (Bob Ippolito) Date: Tue Jan 20 13:48:15 2004 Subject: [Python-Dev] dict.addlist() In-Reply-To: <003001c3df83$c582cf60$d118c797@oemcomputer> References: <003001c3df83$c582cf60$d118c797@oemcomputer> Message-ID: <8D1193C0-4B79-11D8-837E-000A95686CD8@redivi.com> On Jan 20, 2004, at 1:32 PM, Raymond Hettinger wrote: > [GvR] >>> bookindex = dict() >> >> (What's wrong with {}?) > > Nothing at all. {} is shorter, faster, and everyone understands it. > > When teasing out ideas at the interpreter prompt, I tend to use > dict() because I find it easier to edit the line and add some > initial values using dict(one=1, two=2, s='abc'). > >> works for me. (I think setdefault() was already a minor mistake -- by >> the time I've realized it applies I have already written working code >> without it. And when using setdefault() I always worry about the >> waste of the new empty list passed in each time that's ignored most >> times.) >> >> If you really want library support for this idiom ... > > Not really. It was more of a time machine question -- if we had > setdefault() to do over again, what would be done differently: > > * Keep setdefault(). > * Drop it and make do with get(), try/except, or if k in d. 
> * Martin's idea for dicts to have an optional factory function for
>   defaults.
> * Have a specialized method that just supports dicts of lists.

Here's another idea: how about adding a special method, similar in
implementation to __getattr__, but for __getitem__ -- let's say it's
called __getdefaultitem__?  You could then subclass dict, implement
this method, and you'd probably be able to do what you want rather
efficiently.  You may or may not decide to "cache" the value inside
your custom __getdefaultitem__, and if you do, you may or may not want
to change the behavior of __contains__ as well.

    # warning: untested code
    class newdict(dict):
        """Implements the proposed protocol"""
        def __getitem__(self, item):
            try:
                return super(newdict, self).__getitem__(item)
            except KeyError:
                return self.__getdefaultitem__(item)

        def __getdefaultitem__(self, item):
            raise KeyError(repr(item))

    class listdict(newdict):
        def __getdefaultitem__(self, item):
            self[item] = rval = []
            return rval

-bob

From pf_moore at yahoo.co.uk  Tue Jan 20 14:01:41 2004
From: pf_moore at yahoo.co.uk (Paul Moore)
Date: Tue Jan 20 14:01:53 2004
Subject: [Python-Dev] Re: dict.addlist()
References: <16E1010E4581B049ABC51D4975CEDB8802C09BC7@UKDCX001.uk.int.atosorigin.com>
	<16397.17675.247663.995097@sftp.fdrake.net>
Message-ID: 

"Fred L. Drake, Jr." writes:
> The idea that "factory" is an obscure term for Python programmers is
> scary.

As a concept, it's possibly not obscure (although I'm introducing
Python to colleagues at work, who are used to VB and SQL, and for whom
"object" is a fairly novel term!)  But it took me a while to figure out
how it related to what's going on here.  I got it, and it's now
obvious, but that initial "huh?" feeling is probably worth watching
out for :-)

Paul.
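[Editor's illustration: Bob's newdict/listdict sketch above is marked "untested code", but it does run as posted. Here it is reproduced with a small driver appended; the driver data is invented for the demonstration.]

```python
class newdict(dict):
    """Consults __getdefaultitem__ (a la __getattr__) on a missing key."""
    def __getitem__(self, item):
        try:
            return super(newdict, self).__getitem__(item)
        except KeyError:
            return self.__getdefaultitem__(item)

    def __getdefaultitem__(self, item):
        # Base behaviour: no default, so re-raise KeyError.
        raise KeyError(repr(item))

class listdict(newdict):
    def __getdefaultitem__(self, item):
        # Cache the freshly created list so later lookups find it.
        self[item] = rval = []
        return rval

d = listdict()
for k, v in [('a', 1), ('a', 2), ('b', 3)]:
    d[k].append(v)
print(d)   # {'a': [1, 2], 'b': [3]}
```

A plain newdict still raises KeyError on a missing key, so only subclasses that override __getdefaultitem__ change dict behaviour.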
-- This signature intentionally left blank From mwh at python.net Tue Jan 20 14:17:50 2004 From: mwh at python.net (Michael Hudson) Date: Tue Jan 20 14:17:53 2004 Subject: [Python-Dev] Re: isinstance(x, types.UnboundMethodType) feature In-Reply-To: (Terry Reedy's message of "Tue, 20 Jan 2004 13:26:53 -0500") References: <400D2420.7040701@eris.qinetiq.com> Message-ID: <2mwu7mmntt.fsf@starship.python.net> "Terry Reedy" writes: > "Gareth Ladd" wrote in message > news:400D2420.7040701@eris.qinetiq.com... >> Am I missing an subtelty of bound/unbound-ness? > > As of 2.2, C.hello.im_self == None while c.hello.im_self == c. Um, ITYM "As of 1.0.1" (which is the oldest version I could find when I went looking recently). > Presume still same in 2.3. This is very much an implementation internal > feature and not part of language definition. (Meaning: can be changed > without notice.) Doesn't mean it is changed very often, though :-) > I believe this is how repr() tells difference. It is. Cheers, mwh -- LINTILLA: You could take some evening classes. ARTHUR: What, here? LINTILLA: Yes, I've got a bottle of them. Little pink ones. -- The Hitch-Hikers Guide to the Galaxy, Episode 12 From edcjones at erols.com Tue Jan 20 14:28:42 2004 From: edcjones at erols.com (Edward C. Jones) Date: Tue Jan 20 14:32:50 2004 Subject: [Python-Dev] PY_VERSION is undocumented Message-ID: <400D816A.2060309@erols.com> PY_VERSION seems to be undocumented. Is anything else in the API undocumented? From martin at v.loewis.de Tue Jan 20 14:48:38 2004 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue Jan 20 14:50:01 2004 Subject: [Python-Dev] PY_VERSION is undocumented In-Reply-To: <400D816A.2060309@erols.com> References: <400D816A.2060309@erols.com> Message-ID: <400D8616.4010209@v.loewis.de> Edward C. Jones wrote: > PY_VERSION seems to be undocumented. Is anything else in the API > undocumented? Yes, certainly. Why do you ask :-? 
Regards, Martin From barry at python.org Tue Jan 20 15:16:04 2004 From: barry at python.org (Barry Warsaw) Date: Tue Jan 20 15:16:23 2004 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/email/test test_email_codecs.py, 1.4, 1.5 In-Reply-To: <400AF97C.7060609@v.loewis.de> References: <20040118122633.F068625AF47@bonanza.off.ekorp.com> <400A9EB3.6030602@v.loewis.de> <20040118152616.GA56270@i18n.org> <400AF97C.7060609@v.loewis.de> Message-ID: <1074629763.15390.60.camel@anthem> On Sun, 2004-01-18 at 16:24, "Martin v. L?wis" wrote: > Of course, users may have installed JapaneseCodecs with > --without-aliases, in which case the aliases won't be there > (neither automatically, nor after explicitly importing japanese). > I believe mailman use it that way > So people who currently do Actually Mailman 2.1 does this: import japanese import korean import korean.aliases It also does not install the codec packages --without-aliases. -Barry From barry at python.org Tue Jan 20 15:34:22 2004 From: barry at python.org (Barry Warsaw) Date: Tue Jan 20 15:34:37 2004 Subject: Changing codec lookup order. (was: Re: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/email/test test_email_codecs.py, 1.4, 1.5) In-Reply-To: <20040118220108.GA75991@i18n.org> References: <20040118122633.F068625AF47@bonanza.off.ekorp.com> <400A9EB3.6030602@v.loewis.de> <20040118152616.GA56270@i18n.org> <400AF97C.7060609@v.loewis.de> <20040118220108.GA75991@i18n.org> Message-ID: <1074630862.15390.78.camel@anthem> On Sun, 2004-01-18 at 17:01, Hye-Shik Chang wrote: > Now, I admit that CJK codecs are obviously un-backportable to 2.3 > due to this change. ;-) Hmm, sounds like trying to make those changes to the email package for 2.3 is doomed, or at least not worth it if CJK won't work well with 2.3. -Barry From fdrake at acm.org Tue Jan 20 16:04:26 2004 From: fdrake at acm.org (Fred L. Drake, Jr.) 
Date: Tue Jan 20 16:32:55 2004
Subject: [Python-Dev] Re: dict.addlist()
In-Reply-To: 
References: <16E1010E4581B049ABC51D4975CEDB8802C09BC7@UKDCX001.uk.int.atosorigin.com>
	<16397.17675.247663.995097@sftp.fdrake.net>
Message-ID: <16397.38874.429362.52878@sftp.fdrake.net>

Paul Moore writes:
> As a concept, it's possibly not obscure (although I'm introducing
> Python to colleagues at work, who are used to VB and SQL, and for whom
> "object" is a fairly novel term!)

Interesting, and a good point.  I guess it's not familiar to all
programmers, especially where the idiom isn't common.

-Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

From ivano at deakin.edu.au  Tue Jan 20 22:48:37 2004
From: ivano at deakin.edu.au (Ivano Broz)
Date: Tue Jan 20 22:48:47 2004
Subject: [Python-Dev] Issue with HTTP basic proxy authentication in urllib2
Message-ID: <400DF695.8000804@deakin.edu.au>

I have only been scripting in python for about 3 months and have mainly
been performing HTTP requests to cgi scripts.  I have had a lot of
trouble with the POST method and believe I have tracked this down to the
fact that I'm having to perform Basic Proxy authentication.

When the HTTP header for Basic authentication is created in urllib2, the
username and password are encoded using the encodestring() method of
the base64 module.  This adds a newline '\n' to the end of the returned
encoded string (as documented).
From urllib2.py:

    def proxy_open(self, req, proxy, type):
        orig_type = req.get_type()
        type, r_type = splittype(proxy)
        host, XXX = splithost(r_type)
        if '@' in host:
            user_pass, host = host.split('@', 1)
            if ':' in user_pass:
                user, password = user_pass.split(':', 1)
                user_pass = base64.encodestring('%s:%s' % (unquote(user),
                                                           unquote(password)))
            req.add_header('Proxy-authorization', 'Basic ' + user_pass)

This newline then becomes the end of the HTTP proxy authentication
header:

    Proxy-authorization: Basic asdfSADFwaer%asfdas=\n

When the HTTP request is assembled, the required CRLF is added to the
end of each header, along with the one extra to indicate the end of the
request headers.  This would seem OK, except that some HTTP servers
appear to interpret the newline after the proxy authentication header
as an extra blank line in the request, thus missing any data sent
through in the body of the request.  This is not a problem with the GET
method (obviously) or with a POST method which is not sent through an
HTTP proxy server requiring authentication.

I made the very simple addition to urllib2.py of

    user_pass = user_pass.rstrip()

to remove the trailing '\n' from the encoded username/password before
it was added to the HTTP headers, and everything works without a
problem.

I'm not sure if this is an issue which has been raised before.  I
glanced through the summaries of the python-dev mailing list for the
past year and found no mention of it.  Technically I don't see it as a
bug in urllib2.py but an incorrect implementation of HTTP header
parsing by servers (the proxy server may even be the culprit).

Cheers

PS.
If there is any need to discuss this could I please be CC'd as I am not a member of the mailing, Thanks -- Ivano Broz Metabolic Research Unit, Pigdons Road, Waurn Ponds 3217, Victoria Australia Ph: 61 3 52272195 "the money I save won't buy my youth again" From pinard at iro.umontreal.ca Tue Jan 20 23:15:03 2004 From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard) Date: Tue Jan 20 23:15:27 2004 Subject: [Python-Dev] Hot-To Guide for Descriptors - Nits! :-) Message-ID: <20040121041503.GA24386@alcyon.progiciels-bpi.ca> Hello, Raymond. Cc:ing python-dev, so others can bash me as well! :-) This is about `http://users.rcn.com/python/download/Descriptor.htm'. Before anything else, let me thank you for having written this essay! Please also take all my remarks with a grain of salt or two, as I'm far for pretending I really understand all the matters here. * In `Definition and Introduction', a parenthetical note "(ones that subclass `object' or `type')" qualifies the expression "new style objects or classes". Not being an English speaker, I'm a little mixed up with the plural "ones" and the fact I was expecting "subclasses" instead of "subclass", maybe replacing "ones" by "those" would be clearer. * Another point is that the phrasing suggests that the parenthetical note is a definition for what precedes it. In my understanding, the parenthetical note implies what precedes it, but is not implied by it. A new-style instance may well have no base class at all, only given its meta-class is a subtype of `type'. Would it be more precise to state: "... new style objects or classes (those for which the meta-class is a subtype of `type')"? Being sub-classed from object or type is just a way, among others, for identifying `type' as the meta-class; but being sub-classed from object is not really required. * In `Descriptor Protocol', it might help a bit saying immediately a word or two about the `type=None' argument. 
Looking down for some hint, the box holding "def __getattribute__(self, key): ..." uses the statement "return v.__get__(None, self)", which I find confusing respective to the `__get__' prototype, as `obj' corresponds to None, and `type=None' likely corresponds to an instance instead of a type. * In `Invoking Descriptors', it might help a bit insisting on the (presumed) fact that `hasattr' itself does not look for descriptors. For symmetry, it would also be nice having two boxes defining `__getattribute__, one for objects, another for classes, instead of only the second one. I was not sure if the assymetry had a purpose, like maybe telling us that the first case may not be expressed using Python. * A little below is a quick description of `super()'. It says that "The call ... searches ... mro ... for the base class ... immediately preceding ...". Should not it be `following' instead of `preceding'? In case yes, than maybe `A' and `B' might be exchanged for clarity. * In `Properties', the box starting with "class Property(object):" raises a few questions. First is the `doc or ""' to defined `self.__doc__', it does not seem that the ` or ""' exists in Python 2.2.1 nor 2.3.2, the __doc__ really defaults to None. Second is that the argument `objtype=None' to `__get__' is unused, and once more, this does not help understanding its purpose, maybe some explanation would be welcome about why it exists and is unneeded in the context of `property'. Third is that on the last line, "self.fdel(obj, value)" was probably meant to be "self.fdel(obj)". 
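[Editor's illustration: for readers puzzling over the `obj`/`type` arguments of `__get__` discussed in these notes, a tiny descriptor that merely reports what it is passed makes the two call patterns visible. The class names here are invented for the demonstration.]

```python
class Probe(object):
    """Descriptor that reports how __get__ was invoked."""
    def __get__(self, obj, objtype=None):
        # Returns the raw arguments so we can inspect them.
        return (obj, objtype)

class C(object):
    attr = Probe()

c = C()
print(C.attr)   # class access: obj is None, objtype is C
print(c.attr)   # instance access: obj is c, objtype is C
```

This shows why the protocol's second argument matters: on class-level access the descriptor receives no instance (obj is None), only the class.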
--
François Pinard   http://www.iro.umontreal.ca/~pinard

From aleaxit at yahoo.com  Wed Jan 21 04:38:19 2004
From: aleaxit at yahoo.com (Alex Martelli)
Date: Wed Jan 21 04:38:27 2004
Subject: [Python-Dev] dict.addlist()
In-Reply-To: <003001c3df83$c582cf60$d118c797@oemcomputer>
References: <003001c3df83$c582cf60$d118c797@oemcomputer>
Message-ID: <8BCD26A6-4BF5-11D8-8B42-000A95EFAE9E@yahoo.com>

On 2004 Jan 20, at 19:32, Raymond Hettinger wrote:
   ...
> d = {}
> d.setfactory(list)
> for k, v in myitems:
>     d[k].append(v)          # dict of lists
>
> d = {}
> d.setfactory(set)
> for v in mydata:
>     d[f(v)].add(v)          # partition into equivalence classes
>
> d = {}
> d.setfactory(int)
> for v in mydata:
>     d[v] += 1               # bag

Yes, except that a .factory property seems preferable to me to a
.setfactory setter-method (which would have to come with .getfactory or
equivalent if introspection, pickling etc are to work...) except
perhaps for the usual "we don't have a built-in curry" issues (so
.setfactory might carry arguments after the first [callable factory]
one to perform the usual "ad hoc currying" hac^H^H^H idiom).

In fact I'd _love_ this approach, were it not for the fact that in some
use cases I'd like the factory to receive the key as its argument.
E.g.:

    squares_of_ints = {}
    def swe_factory(k):
        assert isinstance(k, (int, long))
        return k*k
    squares_of_ints.setfactory_receiving_key(swe_factory)

Alex

From aahz at pythoncraft.com  Wed Jan 21 10:14:26 2004
From: aahz at pythoncraft.com (Aahz)
Date: Wed Jan 21 10:14:31 2004
Subject: [Python-Dev] Issue with HTTP basic proxy authentication in urllib2
In-Reply-To: <400DF695.8000804@deakin.edu.au>
References: <400DF695.8000804@deakin.edu.au>
Message-ID: <20040121151426.GA6702@panix.com>

On Wed, Jan 21, 2004, Ivano Broz wrote:
>
> I have only been scripting in python for about 3 months
> and have mainly been performing HTTP requests to cgi scripts.
> I have had had a lot of trouble with POST method and believe I have > tracked this down to the fact that I'm having to perform Basic Proxy > authentication. python-dev isn't a good place to discuss this. Please do one of two things: * If you're sure of your bug and your patch, please submit an item on SourceForge. * If you want to discuss this, please post to comp.lang.python and then post an item to SF. (python-dev focuses more on overall development than on small bugfixes like this, although issues will generally move here for consensus if there's been a long debate on c.l.py. SF is essential for keeping track of bugs and patches -- we've lost too many that were only posted to python-dev.) -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ A: No. Q: Is top-posting okay? From python at rcn.com Wed Jan 21 10:15:46 2004 From: python at rcn.com (Raymond Hettinger) Date: Wed Jan 21 10:16:55 2004 Subject: [Python-Dev] RE: Hot-To Guide for Descriptors - Nits! :-) In-Reply-To: <20040121041503.GA24386@alcyon.progiciels-bpi.ca> Message-ID: <004201c3e031$722bd380$d118c797@oemcomputer> > * In `Definition and Introduction', a parenthetical note "(ones > that subclass `object' or `type')" qualifies the expression "new > style objects or classes". Not being an English speaker, I'm a > little mixed up with the plural "ones" and the fact I was expecting > "subclasses" instead of "subclass", maybe replacing "ones" by "those" > would be clearer. Clarified. > * Another point is that the phrasing suggests that the parenthetical > note is a definition for what precedes it. In my understanding, the > parenthetical note implies what precedes it, but is not implied by it. Sounds like an obscure grammar nit (one that doesn't appear in any of my grammar books). I think it reads well enough as is. > A new-style instance may well have no base class at all, only given its > meta-class is a subtype of `type'. Would it be more precise to state: > "... 
new style objects or classes (those for which the meta-class is a > subtype of `type')"? Being sub-classed from object or type is just a > way, among others, for identifying `type' as the meta-class; but being > sub-classed from object is not really required. Nope, new-style is taken to mean objects/classes inheriting from object/type. Meta-class objects are neither new-style nor old-style. While there is room to argue with this arbitrary distinction, it is consistent with Guido's essay and especially relevant to my article because most of the rules don't necessarily apply when meta-classes are used. This is because the machinery for descriptors is embedded in type.__getattribute__ and object.__getattribute__. Override or fail to inherit either of these and all bets are off. > * In `Descriptor Protocol', it might help a bit saying immediately a > word or two about the `type=None' argument. Looking down for some > hint, the box holding "def __getattribute__(self, key): ..." uses > the statement "return v.__get__(None, self)", which I find confusing > respective to the `__get__' prototype, as `obj' corresponds to None, and > `type=None' likely corresponds to an instance instead of a type. I'll keep an open mind on this one but think it is probably fine as it stands. Though the calling pattern may not be beautiful and could be confusing to some, my description accurately represents the signature and cases where an argument is potentially ignored. That is just the way it is. > * In `Invoking Descriptors', it might help a bit insisting on the > (presumed) fact that `hasattr' itself does not look for descriptors. I considered this and other explanations when writing the article. The subject is already complex enough that additional explanations can make it harder to understand rather than easier. Sidetrips can add more confusion than they cure. I see no reason to delve into what hasattr() doesn't do.
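[The `v.__get__(None, self)` pattern being discussed can be seen in a short, self-contained sketch of the descriptor protocol -- illustrative names only, not code from the article or the thread:]

```python
class Verbose:
    """A minimal non-data descriptor: only __get__ is defined."""
    def __get__(self, obj, objtype=None):
        # obj is None when the attribute is reached through the class itself
        if obj is None:
            return "accessed via class"
        return "accessed via instance"

class C:
    attr = Verbose()

print(C.attr)    # __get__(None, C) -> "accessed via class"
print(C().attr)  # __get__(c, C)    -> "accessed via instance"
# hasattr() just performs the lookup and reports success or failure:
print(hasattr(C, "attr"))
```

[The second argument to `__get__` carries the owning class, which is why class-level access passes `None` for the instance slot.]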
> For symmetry, it would also be nice having two boxes defining > `__getattribute__, one for objects, another for classes, instead of only > the second one. I was not sure if the asymmetry had a purpose, like > maybe telling us that the first case may not be expressed using Python. It did have a purpose. I put in pure python code only when it served as an aid to understanding. In this case, the operation of __getattribute__ is more readily described with prose. I do include a C code reference for those wanting all the gory details. If you look at the reference, you'll see that a pure python equivalent would not be succinct or provide any deep insights not covered in the text. Once again, it was a question of whether to break up the flow of the argument/explanation. I like it the way it is. > * A little below is a quick description of `super()'. It says that "The > call ... searches ... mro ... for the base class ... immediately > preceding ...". Should not it be `following' instead of `preceding'? It is a matter of perspective. A is defined before its derived class B and can be said to precede it in the inheritance graph; however, the MRO list would have B ... A ... O so there is a case for the word "following". To keep consistent with Guido's essay, I'll make the change and adopt your convention. > In case yes, then maybe `A' and `B' might be exchanged for clarity. No. A is higher than B on the inheritance hierarchy. Also, the lettering is consistent with Guido's essay. > * In `Properties', the box starting with "class Property(object):" > raises a few questions. First is the `doc or ""' to define > `self.__doc__', it does not seem that the ` or ""' exists in Python > 2.2.1 nor 2.3.2, the __doc__ really defaults to None. Fixed.
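[For readers without the article at hand, the `Property` box being nitpicked is, in outline, a pure-Python equivalent of the builtin along these lines -- a simplified sketch reflecting the fixes agreed above, not the article's exact code:]

```python
class Property:
    "A pure-Python sketch of the builtin property (simplified)."

    def __init__(self, fget=None, fset=None, fdel=None, doc=None):
        self.fget = fget
        self.fset = fset
        self.fdel = fdel
        self.__doc__ = doc  # defaults to None, matching the builtin

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # class-level access returns the descriptor itself
        if self.fget is None:
            raise AttributeError("unreadable attribute")
        return self.fget(obj)

    def __set__(self, obj, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(obj, value)

    def __delete__(self, obj):
        if self.fdel is None:
            raise AttributeError("can't delete attribute")
        self.fdel(obj)  # fdel takes only obj, per the fix above

class Cell:
    def getx(self):
        return self._x
    def setx(self, value):
        self._x = value
    x = Property(getx, setx)

c = Cell()
c.x = 10
print(c.x)  # -> 10, routed through Property.__get__/__set__
```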
> Second is that > the argument `objtype=None' to `__get__' is unused, and once more, > this does not help understanding its purpose, maybe some explanation > would be welcome about why it exists and is unneeded in the context of > `property'. That's the way it is. > Third is that on the last line, "self.fdel(obj, value)" was > probably meant to be "self.fdel(obj)". Fixed. Raymond Hettinger From Jack.Jansen at cwi.nl Wed Jan 21 11:08:04 2004 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Wed Jan 21 11:06:56 2004 Subject: [Python-Dev] Configure option to change Python's name? Message-ID: This idea is only half-baked, so please fire away, or tell me to go ahead, or go hide under a stone. I would like a configure option that changes the name of "python" everywhere: in the executable, in the dynamic library name, the MacOSX framework name, etc. This would allow you to have two Pythons on a system that are completely 100% independent of each other. The idea was triggered by Michael Hudson, who wanted to install a debugging-enabled framework build of Python on MacOSX, which shouldn't interfere with the normal Python (either Apple-installed or user-installed). It turns out this is doable through meticulous hacking of the Makefile after configure (and, for now, ignoring the few hardcoded "Python.framework" references in Lib, but those need fixing anyway). For other unix systems a similar approach would work to isolate a debug-python from the normal production python. But then I thought of another use: this would allow you to have a MacPython 2.3.3 distribution as well as the Apple-installed 2.3 distribution without any interference possible. And there are probably similar uses on other platforms. 
-- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From bob at redivi.com Wed Jan 21 11:17:46 2004 From: bob at redivi.com (Bob Ippolito) Date: Wed Jan 21 11:15:12 2004 Subject: [Python-Dev] Configure option to change Python's name? In-Reply-To: References: Message-ID: <599EE6A8-4C2D-11D8-907F-000A95686CD8@redivi.com> On Jan 21, 2004, at 11:08 AM, Jack Jansen wrote: > This idea is only half-baked, so please fire away, or tell me to go > ahead, or go hide under a stone. > > I would like a configure option that changes the name of "python" > everywhere: in the executable, in the dynamic library name, the MacOSX > framework name, etc. This would allow you to have two Pythons on a > system that are completely 100% independent of each other. > > The idea was triggered by Michael Hudson, who wanted to install a > debugging-enabled framework build of Python on MacOSX, which shouldn't > interfere with the normal Python (either Apple-installed or > user-installed). It turns out this is doable through meticulous > hacking of the Makefile after configure (and, for now, ignoring the > few hardcoded "Python.framework" references in Lib, but those need > fixing anyway). For other unix systems a similar approach would work > to isolate a debug-python from the normal production python. > > But then I thought of another use: this would allow you to have a > MacPython 2.3.3 distribution as well as the Apple-installed 2.3 > distribution without any interference possible. And there are probably > similar uses on other platforms. I'm +100000 on this; it's good for MacPython, python-dev'ers, and it would also help forks like stackless. -bob From amk at amk.ca Wed Jan 21 11:20:31 2004 From: amk at amk.ca (A.M. 
Kuchling) Date: Wed Jan 21 11:25:43 2004 Subject: [Python-Dev] Re: Python in 2003 summary References: <20040117171401.GA18383@rogue.amk.ca> Message-ID: The current version of the 2003 summary is now on python.org at http://www.python.org/topics/2003.html I'm now happy with it, so this is a last call for anything I missed. If no further changes are needed, this afternoon I'll start sending out e-mails to publicize it. Those of you who want to weblog it can now go ahead and do so. Thanks, all! ----amk From ncoghlan at iinet.net.au Wed Jan 21 11:14:23 2004 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Wed Jan 21 11:47:04 2004 Subject: [Python-Dev] dict.addlist() In-Reply-To: <8BCD26A6-4BF5-11D8-8B42-000A95EFAE9E@yahoo.com> References: <003001c3df83$c582cf60$d118c797@oemcomputer> <8BCD26A6-4BF5-11D8-8B42-000A95EFAE9E@yahoo.com> Message-ID: <400EA55F.7030009@iinet.net.au> Alex Martelli wrote: > > On 2004 Jan 20, at 19:32, Raymond Hettinger wrote: > ...
> >> d = {} >> d.setfactory(list) >> for k, v in myitems: >> d[k].append(v) # dict of lists >> >> d = {} >> d.setfactory(set): >> for v in mydata: >> d[f(v)].add(v) # partition into equivalence classes >> >> d = {} >> d.setfactory(int): >> for v in mydata: >> d[k] += 1 # bag > > > Yes, except that a .factory property seems preferable to me to a > .setfactory setter-method (which would have to come with .getfactory or > equivalent if introspection, pickling etc are to work...) except perhaps > for the usual "we don't have a built-in curry" issues (so .setfactory > might carry arguments after the first [callable factory] one to perform > the usual "ad hoc currying" hac^H^H^H idiom). In fact I'd _love_ this > approach, were it not for the fact that in some use cases I'd like the > factory to receive the key as its argument. E.g.: > > squares_of_ints = {} > def swe_factory(k): > assert isinstance(k, (int, long)) > return k*k > squares_of_ints.setfactory_receiving_key(swe_factory) > > > Alex This could be seen as favouring the protocol approach that Bob suggested. In that approach, the 'defaulting' is done by having dict access call __getdefaultitem__ if it exists, and the item being looked up is not found. The signature of __getdefaultitem__ is the same as that for __getitem__. If __getdefaultitem__ isn't found, then the current behaviour of raising KeyError is retained. I suspect the aim of this approach would be to reduce errors by avoiding rewriting the 'try/except KeyError' block every time we wanted a defaulting dictionary. I imagine it could be made faster than the current "k in d" or "try...except KeyError..." idioms, too. The more I think about it, the more I'm leaning towards the class-based approach - the version with 'factory' or 'setfactory' seems to lend itself to too many dangerous usages, especially: d = {} ...do some stuff... [A] d.factory = factory_func ...do some more stuff...
[B] The meaning of d[k] is significantly different in section A than it was in section B. Regards, Nick. -- Nick Coghlan | Brisbane, Australia Email: ncoghlan@email.com | Mobile: +61 409 573 268 From martin at v.loewis.de Wed Jan 21 12:00:04 2004 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed Jan 21 12:03:19 2004 Subject: [Python-Dev] Configure option to change Python's name? In-Reply-To: References: Message-ID: <400EB014.5010408@v.loewis.de> Jack Jansen wrote: > The idea was triggered by Michael Hudson, who wanted to install a > debugging-enabled framework build of Python on MacOSX, which shouldn't > interfere with the normal Python (either Apple-installed or > user-installed). It turns out this is doable through meticulous hacking > of the Makefile after configure (and, for now, ignoring the few > hardcoded "Python.framework" references in Lib, but those need fixing > anyway). For other unix systems a similar approach would work to isolate > a debug-python from the normal production python. Why can't you configure with an entirely different --prefix, such as /just/for/me? Regards, Martin From mchermside at ingdirect.com Wed Jan 21 15:19:22 2004 From: mchermside at ingdirect.com (Chermside, Michael) Date: Wed Jan 21 15:21:00 2004 Subject: [Python-Dev] RE: Hot-To Guide for Descriptors - Nits! :-) Message-ID: <0CFFADBB825C6249A26FDF11C1772AE1024162@ingdexj1.ingdirect.com> François Pinard writes: > Would it be more precise to state: > "... new style objects or classes (those for which the meta-class is a > subtype of `type')"? Being sub-classed from object or type is just a > way, among others, for identifying `type' as the meta-class; but being > sub-classed from object is not really required. Raymond Hettinger responds: > Nope, new-style is taken to mean objects/classes inheriting from > object/type. Meta-class objects are neither new-style nor old-style.
> While there is room to argue with this arbitrary distinction, it is > consistent with Guido's essay and especially relevant to my article > because most of the rules don't necessarily apply when meta-classes > are used. This is because the machinery for descriptors is > embedded in type.__getattribute__ and object.__getattribute__. > Override or fail to inherit either of these and all bets are off. Really? I realize the utility of having a term for objects-with-a-meta-type-of-type, but I had always understood "new-style" to mean things-that-are-not-old-style-classes. I can live with either definition, but the Python community should make sure that we use the term "new-style class" in a consistent fashion. Did everyone else agree with Raymond so François and I are the odd men out, or is there a larger confusion over how we use the term? -- Michael Chermside This email may contain confidential or privileged information. If you believe you have received the message in error, please notify the sender and delete the message without copying or disclosing it. From jcarlson at uci.edu Wed Jan 21 15:22:00 2004 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed Jan 21 15:33:16 2004 Subject: [Python-Dev] dict.addlist() In-Reply-To: <400EA55F.7030009@iinet.net.au> References: <8BCD26A6-4BF5-11D8-8B42-000A95EFAE9E@yahoo.com> <400EA55F.7030009@iinet.net.au> Message-ID: <20040121120433.FB3D.JCARLSON@uci.edu> > d = {} > ...do some stuff... [A] > d.factory = factory_func > ...do some more stuff... [B] > > The meaning of d[k] is significantly different in section A than it was > in section B. For most any reasonable program, previous lines of execution determine the context of later lines. Saying that d[k] in section A is different than d[k] in section B is an assumption that all people make when writing useful programs in most languages. We can say the same thing about the meaning of d[1] in sections A and B of the following snippet.
d = {} d[1] = 1 #A del d[1] #B While the above is not good programming style, it does highlight the fact that the existence of dict.factory (or any equivalent behavior) doesn't remove meaning or expressiveness of the statements using, preceding, or following it. Certainly dict.factory ends up overlapping with a portion of the functionality of dict.setdefault, but it has the potential for reducing the peppering of dict.setdefault(key, value) in some programs. As an aside, when I first started using Python, I thought (before using it) that dict.setdefault had the behavior of what dict.factory is suggested as having now. - Josiah From guido at python.org Wed Jan 21 15:34:46 2004 From: guido at python.org (Guido van Rossum) Date: Wed Jan 21 15:35:54 2004 Subject: [Python-Dev] RE: Hot-To Guide for Descriptors - Nits! :-) In-Reply-To: Your message of "Wed, 21 Jan 2004 15:19:22 EST." <0CFFADBB825C6249A26FDF11C1772AE1024162@ingdexj1.ingdirect.com> References: <0CFFADBB825C6249A26FDF11C1772AE1024162@ingdexj1.ingdirect.com> Message-ID: <200401212034.i0LKYkd08673@c-24-5-183-134.client.comcast.net> > François Pinard writes: > > Would it be more precise to state: > > "... new style objects or classes (those for which the meta-class is a > > subtype of `type')"? Being sub-classed from object or type is just a > > way, among others, for identifying `type' as the meta-class; but being > > sub-classed from object is not really required. > > Raymond Hettinger responds: > > Nope, new-style is taken to mean objects/classes inheriting from > > object/type. Meta-class objects are neither new-style nor old-style. > > While there is room to argue with this arbitrary distinction, it is > > consistent with Guido's essay and especially relevant to my article > > because most of the rules don't necessarily apply when meta-classes > > are used. This is because the machinery for descriptors is > > embedded in type.__getattribute__ and object.__getattribute__.
> > Override or fail to inherit either of these and all bets are off. Michael Chermside > Really? I realize the utility of having a term for objects-with-a- > meta-type-of-type, but I had always understood "new-style" to mean > things-that-are-not-old-style-classes. I can live with either > definition, but the Python community should make sure that we use > the term "new-style class" in a consistent fashion. > > Did everyone else agree with Raymond so François and I are the odd > men out, or is there a larger confusion over how we use the term? I'm with Raymond. Metaclasses that don't derive from 'type' can create objects that are neither fish nor flesh. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Jan 21 15:48:35 2004 From: guido at python.org (Guido van Rossum) Date: Wed Jan 21 15:52:15 2004 Subject: [Python-Dev] RE: Hot-To Guide for Descriptors - Nits! :-) In-Reply-To: Your message of "Wed, 21 Jan 2004 12:34:46 PST." <200401212034.i0LKYkd08673@c-24-5-183-134.client.comcast.net> References: <0CFFADBB825C6249A26FDF11C1772AE1024162@ingdexj1.ingdirect.com> <200401212034.i0LKYkd08673@c-24-5-183-134.client.comcast.net> Message-ID: <200401212048.i0LKmZG08737@c-24-5-183-134.client.comcast.net> > I'm with Raymond. Metaclasses that don't derive from 'type' can > create objects that are neither fish nor flesh. After thinking this through some more, I have to retract that. After all, even classic classes and their instances are derived from object: >>> class C: pass >>> isinstance(C, object) True >>> >>> isinstance(C(), object) True What makes a classic class is one very specific metaclass. What makes a classic instance is a class using that very specific metaclass. Everything else is a new-style class. Perhaps an exception should be made for Zope's ExtensionClasses, which are not classic classes but old-style "extension types", but they will disappear in the future and there's really nothing else like them.
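[The "meta-class is a subtype of `type'" criterion François proposed can be checked directly; a quick sketch in modern syntax, where classic classes no longer exist and `metaclass=` has replaced the old `__metaclass__` hook:]

```python
class Meta(type):
    """A metaclass: any subtype of type."""

class D(metaclass=Meta):
    pass

# D's metaclass derives from type, so D is a class in the full sense:
assert type(D) is Meta
assert issubclass(type(D), type)
assert isinstance(D, type)
# ...and its instances are ordinary objects:
assert isinstance(D(), object)
```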
--Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Wed Jan 21 16:48:02 2004 From: python at rcn.com (Raymond Hettinger) Date: Wed Jan 21 16:49:03 2004 Subject: [Python-Dev] RE: Hot-To Guide for Descriptors - Nits! :-) In-Reply-To: <200401212048.i0LKmZG08737@c-24-5-183-134.client.comcast.net> Message-ID: <002501c3e068$3ee4aba0$37ae2c81@oemcomputer> > > I'm with Raymond. Metaclasses that don't derive from 'type' can > > create objects that are neither fish nor flesh. > > After thinking this through some more, I have to retract that. After > all, even classic classes and their instances are derived from object: > > >>> class C: pass > > >>> isinstance(C, object) > True > >>> > >>> isinstance(C(), object) > True Hmm, that surprises me. After all, writing class C(object): pass makes it new-style. Also dir(C()) or dir(C) does not show any of the methods given by dir(object) or dir(object()). > What makes a classic class is one very specific metaclass. What makes > a classic instance is a class using that very specific metaclass. > Everything else is a new-style class. While true, I think it more pragmatic to have the docs and terminology *not* expressed in terms of meta-classes; rather, it is simpler to focus on the distinction between class C and class C(object), noting that all the new gizmos only work with the latter. The theory is simple, one shouldn't have to understand meta-classes in order to be able to use property, classmethod, staticmethod, super, __slots__, __getattribute__, descriptors, etc. Practically, in all code that doesn't explicitly create meta-classes, the sole trigger between new and old is adding object, list, or other builtin type as a base class. IOW, I think the docs ought to continue using wording like this for property: "Return a property attribute for new-style classes (classes that derive from object)."
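[The gizmos Raymond lists all ride on this descriptor machinery; __slots__, for example, works by creating slot descriptors on the class -- shown here in modern Python, where every class already derives from object:]

```python
class Point:
    __slots__ = ("x", "y")  # creates slot descriptors; no per-instance __dict__

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)
print(p.x, p.y)                 # attribute access goes through the slot descriptors
print(hasattr(p, "__dict__"))   # False: no instance dictionary was allocated
try:
    p.z = 3                     # no slot named "z", so assignment fails
except AttributeError:
    print("new attributes are blocked")
```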
There would be much loss of clarity and understanding with wording like: "Return a property attribute for a class whose metaclass is type or that implements the class semantics of type.__getattribute__ and the instance semantics of object.__getattribute__." For purposes of my article on descriptors, I'll continue with the current approach of ignoring meta-classes. There is enough information presented that meta-class writers will quickly understand why all the new-style gizmos stopped working when they didn't carry forward the semantics of object.__getattribute__ and type.__getattribute__. I think all of the topics are best understood in terms of three groups: classic classes, new-style classes, and roll your own classes. Lumping the latter two together won't help one bit. 'nuff said, Raymond Hettinger From barry at python.org Wed Jan 21 16:55:51 2004 From: barry at python.org (Barry Warsaw) Date: Wed Jan 21 16:55:55 2004 Subject: [Python-Dev] RE: Hot-To Guide for Descriptors - Nits! :-) In-Reply-To: <002501c3e068$3ee4aba0$37ae2c81@oemcomputer> References: <002501c3e068$3ee4aba0$37ae2c81@oemcomputer> Message-ID: <1074722150.27294.55.camel@geddy> On Wed, 2004-01-21 at 16:48, Raymond Hettinger wrote: > IOW, I think the docs ought to continue using wording like this for > property: "Return a property attribute for new-style classes (classes > that derive from object)." I'd love it if we could come up with a different term for these beasts than "new style classes". Some day they won't be new any more, and may in fact be the only kind of classes in Python.
Not that I have any good idea for what they should be called... -Barry From guido at python.org Wed Jan 21 17:03:15 2004 From: guido at python.org (Guido van Rossum) Date: Wed Jan 21 17:03:22 2004 Subject: [Python-Dev] RE: Hot-To Guide for Descriptors - Nits! :-) In-Reply-To: Your message of "Wed, 21 Jan 2004 16:48:02 EST." <002501c3e068$3ee4aba0$37ae2c81@oemcomputer> References: <002501c3e068$3ee4aba0$37ae2c81@oemcomputer> Message-ID: <200401212203.i0LM3FP09011@c-24-5-183-134.client.comcast.net> > > After thinking this through some more, I have to retract that. After > > all, even classic classes and their instances are derive from object: > > > > >>> class C: pass > > > > >>> isinstance(C, object) > > True > > >>> > > >>> isinstance(C(), object) > > True > > Hmm, that surprises me. After all, writing class C(object): pass > makes it new-style. Also dir(C()) or dir(C) does not show any of > the methods given by dir(object) or dir(object()). dir() special-cases classic classes. Classic class-ness and classic instance-ness are really subsumed by the new-style machinery -- there's a specific metaclass that produces classic classes, and those produce classic instances. > > What makes a classic class is one very specific metaclass. What makes > > a classic instance is a class using that very specific metaclass. > > Everything else is a new-style class. > > While true, I think it more pragmatic to have the docs and > terminology *not* expressed in terms of meta-classes; rather, it is > simpler to focus on the distinction between class C and class > C(object), noting that all the new gizmos only work with the latter. That's fine; there's nothing wrong with simply stating that classic classes don't *derive* from object. Note that in my little session I showed isinstance(C, object), which is True, but I didn't show issubclass(C, object), which is False -- it is the latter that makes C not a new-style class! 
(Not necessarily a classic class; it could be an old-style extension type.) > The theory is simple, one shouldn't have to understand meta-classes > in order to be able to use property, classmethod, staticmethod, super, > __slots__, __getattribute__, descriptors, etc. Mostly right. But the fact that a class is also an object makes it unavoidable that the question is asked, and then you better have a good answer prepared. > Practically, in all code that doesn't explicitly create meta-classes, > the sole trigger between new and old is adding object, list, or > other builtin type as a base class. Right. Or playing with __metaclass__, which is rare. > IOW, I think the docs ought to continue using wording like this for > property: "Return a property attribute for new-style classes (classes > that derive from object)." I never said that was wrong. In fact IMO it is right. > There would be much loss of clarity and understanding with wording like: > "Return a property attribute for a class whose metaclass is type or that > implements the class semantics of type.__getattribute__ and the instance > semantics of object.__getattribute__." Of course. > For purposes of my article on descriptors, I'll continue with the > current approach of ignoring meta-classes. There is enough information > presented that meta-class writers will quickly understand why all the > new-style gizmos stopped working when they didn't carry forward the > semantics of object.__getattribute__ and type.__getattribute__. Right. And meta-classes are still head-exploding material. > I think all of the topics are best understood in terms of three groups: > classic classes, new-style classes, and roll your own classes. Lumping > the latter two together won't help one bit. Especially since it can be done using either classic or new-style classes.
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Jan 21 17:03:52 2004 From: guido at python.org (Guido van Rossum) Date: Wed Jan 21 17:03:59 2004 Subject: [Python-Dev] RE: Hot-To Guide for Descriptors - Nits! :-) In-Reply-To: Your message of "Wed, 21 Jan 2004 16:55:51 EST." <1074722150.27294.55.camel@geddy> References: <002501c3e068$3ee4aba0$37ae2c81@oemcomputer> <1074722150.27294.55.camel@geddy> Message-ID: <200401212203.i0LM3qw09035@c-24-5-183-134.client.comcast.net> > I'd love it if we could come up with a different term for these beasts > that "new style classes". Some day they won't be new any more, and may > in fact be the only kind of classes in Python. Not that I have any good > idea for what they should be called... Well, then we'd call them "classes", of course. --Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy at zope.com Wed Jan 21 17:04:27 2004 From: jeremy at zope.com (Jeremy Hylton) Date: Wed Jan 21 17:10:41 2004 Subject: [Python-Dev] RE: Hot-To Guide for Descriptors - Nits! :-) In-Reply-To: <1074722150.27294.55.camel@geddy> References: <002501c3e068$3ee4aba0$37ae2c81@oemcomputer> <1074722150.27294.55.camel@geddy> Message-ID: <1074722666.27007.69.camel@localhost.localdomain> On Wed, 2004-01-21 at 16:55, Barry Warsaw wrote: > On Wed, 2004-01-21 at 16:48, Raymond Hettinger wrote: > > > IOW, I think the docs ought to continue using wording like this for > > property: "Return a property attribute for new-style classes (classes > > that derive from object)." > > I'd love it if we could come up with a different term for these beasts > that "new style classes". Some day they won't be new any more, and may > in fact be the only kind of classes in Python. Not that I have any good > idea for what they should be called... Someday new-style classes will be old enough that they'll just be called classes. Classic classes will be a special case that people will have to explain. 
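[Jeremy's scenario is in fact how things later turned out: in Python 3 every class is new-style by default, as a quick check in modern syntax shows:]

```python
# With no explicit base class, C still derives from object:
class C:
    pass

assert issubclass(C, object)
assert type(C) is type

# ...so the new-style gizmos work without writing C(object):
class D:
    @property
    def answer(self):
        return 42

assert D().answer == 42
```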
I don't think that will happen with current Python, because a class is a classic class unless you perform some explicit step to make it otherwise. If new-style classes were the default, then we could just call them classes. (And then we'd need some special way to construct classic classes.) Perhaps it's unfortunate that we have the names "new" and "classic." There could be users thinking that, like Coke, we'll forget about new classes and go back to classic classes. Jeremy From nas-python at python.ca Wed Jan 21 17:23:04 2004 From: nas-python at python.ca (Neil Schemenauer) Date: Wed Jan 21 17:14:18 2004 Subject: [Python-Dev] RE: Hot-To Guide for Descriptors - Nits! :-) In-Reply-To: <1074722150.27294.55.camel@geddy> References: <002501c3e068$3ee4aba0$37ae2c81@oemcomputer> <1074722150.27294.55.camel@geddy> Message-ID: <20040121222304.GA27900@mems-exchange.org> On Wed, Jan 21, 2004 at 04:55:51PM -0500, Barry Warsaw wrote: > I'd love it if we could come up with a different term for these beasts > that "new style classes". Some day they won't be new any more, and may > in fact be the only kind of classes in Python. Yes. We should also rename 'newcompile.c' to 'astcompile.c' or something else before merging the AST branch. I've been on a chess kick recently and I find it humorous that the era of hyper-modern openings was about 80 years ago. Neil From nbastin at opnet.com Wed Jan 21 17:34:17 2004 From: nbastin at opnet.com (Nick Bastin) Date: Wed Jan 21 17:34:20 2004 Subject: [Python-Dev] PyCore sprint for 2004? Message-ID: This may not be the right place for this, but is there not going to be a PyCore sprint at PyDC 2004? I presume at least *some* people here are planning on attending.. ;-) -- Nick From jeremy at alum.mit.edu Wed Jan 21 17:31:27 2004 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Wed Jan 21 17:37:10 2004 Subject: [Python-Dev] RE: Hot-To Guide for Descriptors - Nits! 
:-) In-Reply-To: <20040121222304.GA27900@mems-exchange.org> References: <002501c3e068$3ee4aba0$37ae2c81@oemcomputer> <1074722150.27294.55.camel@geddy> <20040121222304.GA27900@mems-exchange.org> Message-ID: <1074724287.27007.73.camel@localhost.localdomain> On Wed, 2004-01-21 at 17:23, Neil Schemenauer wrote: > On Wed, Jan 21, 2004 at 04:55:51PM -0500, Barry Warsaw wrote: > > I'd love it if we could come up with a different term for these beasts > > than "new style classes". Some day they won't be new any more, and may > > in fact be the only kind of classes in Python. > > Yes. We should also rename 'newcompile.c' to 'astcompile.c' or > something else before merging the AST branch. I've been on a chess > kick recently and I find it humorous that the era of hyper-modern > openings was about 80 years ago. I had originally planned to just call it compile.c after the merge, but kept it separate so I could also have the original compiler on the same branch. astcompile.c sounds better. The computing literature has got lots of papers with new in the title, but most of the papers are old. Jeremy From guido at python.org Wed Jan 21 17:37:33 2004 From: guido at python.org (Guido van Rossum) Date: Wed Jan 21 17:37:38 2004 Subject: [Python-Dev] PyCore sprint for 2004? In-Reply-To: Your message of "Wed, 21 Jan 2004 17:34:17 EST." References: Message-ID: <200401212237.i0LMbXn09176@c-24-5-183-134.client.comcast.net> > This may not be the right place for this, but is there not going to be > a PyCore sprint at PyDC 2004? I presume at least *some* people here > are planning on attending.. ;-) I will be there, and so will Raymond Hettinger and Brett Cannon. I'm short on time before then, though, so I hope someone else will take the lead of organizing troops.
Raymond once mailed me a list of possible topics to sprint on, maybe he can try to add it to the Wiki once more: http://www.python.org/cgi-bin/moinmoin/SprintPlan2004 --Guido van Rossum (home page: http://www.python.org/~guido/) From Jack.Jansen at cwi.nl Wed Jan 21 17:39:32 2004 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Wed Jan 21 17:39:43 2004 Subject: [Python-Dev] PEP 11 mistake? In-Reply-To: <400D6F8C.9020403@v.loewis.de> References: <16392.24824.367536.660812@montanaro.dyndns.org> <52EF862C-4B5B-11D8-9B7E-000A958D1666@cwi.nl> <400D6F8C.9020403@v.loewis.de> Message-ID: On 20-jan-04, at 19:12, Martin v. Löwis wrote: > Jack Jansen wrote: >> Too bad, it's gone already. >> I've been asking for over a year whether anyone had interest in >> maintaining >> 2.4 for OS9, with exactly 0 replies. > > If you are willing to take the blame :-) Definitely. > PEP 11 would normally recommend to leave the code in, but disable it, > so > attempts to use it cause build errors. Then wait if anybody complains > about seeing these build errors, and ask whether they volunteer. > > In this case, anybody volunteering now would have to bring back the > source to maintain first - which may be acceptable in the specific > case. *If* someone shows up, *and* they demonstrate the ability to build and maintain 2.3.X for MacOS9 I will gladly help them to revive 2.4 for MacOS9. -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From greg at cosc.canterbury.ac.nz Wed Jan 21 19:11:35 2004 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed Jan 21 19:11:47 2004 Subject: [Python-Dev] Re: Tracebacks into C code In-Reply-To: <4001B56A.2020607@v.loewis.de> Message-ID: <200401220011.i0M0BZf06880@oma.cosc.canterbury.ac.nz> Martin: > Edward C. Jones wrote: > > Should some traceback-through-C capability be put in the Python API? > > But there is traceback-through-C capability in the Python API! Is there?
Can you point me towards some documentation? Anything that could reduce the contortions Pyrex currently performs to achieve this would be welcome... Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From aahz at pythoncraft.com Wed Jan 21 19:19:41 2004 From: aahz at pythoncraft.com (Aahz) Date: Wed Jan 21 19:19:45 2004 Subject: [Python-Dev] PyCore sprint for 2004? In-Reply-To: References: Message-ID: <20040122001941.GA27940@panix.com> On Wed, Jan 21, 2004, Nick Bastin wrote: > > This may not be the right place for this, but is there not going to be > a PyCore sprint at PyDC 2004? I presume at least *some* people here > are planning on attending.. ;-) I'm planning to be there, but haven't decided whether to sprint on core or reST. Most likely after my experience with trying to work with Twisted last year, whichever group seems more organized.... -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ A: No. Q: Is top-posting okay? From aahz at pythoncraft.com Wed Jan 21 19:22:53 2004 From: aahz at pythoncraft.com (Aahz) Date: Wed Jan 21 19:23:00 2004 Subject: [Python-Dev] Configure option to change Python's name? In-Reply-To: <400EB014.5010408@v.loewis.de> References: <400EB014.5010408@v.loewis.de> Message-ID: <20040122002253.GB27940@panix.com> On Wed, Jan 21, 2004, "Martin v. L?wis" wrote: > Jack Jansen wrote: >> >>The idea was triggered by Michael Hudson, who wanted to install a >>debugging-enabled framework build of Python on MacOSX, which shouldn't >>interfere with the normal Python (either Apple-installed or >>user-installed). It turns out this is doable through meticulous hacking >>of the Makefile after configure (and, for now, ignoring the few >>hardcoded "Python.framework" references in Lib, but those need fixing >>anyway). 
For other unix systems a similar approach would work to isolate >>a debug-python from the normal production python. > > Why can't you configure with an entirely different --prefix, such as > /just/for/me? That only works if you want to provide the full path to the executable each time you use it (or if you are willing to set up a link or alias that sits in the PATH that does what Jack suggests). I'm somewhere between +0 and +1; the former is because I worry a bit about dealing with people who do this when they don't know what they're doing. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ A: No. Q: Is top-posting okay? From martin at v.loewis.de Wed Jan 21 19:34:33 2004 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed Jan 21 19:34:45 2004 Subject: [Python-Dev] Re: Tracebacks into C code In-Reply-To: <200401220011.i0M0BZf06880@oma.cosc.canterbury.ac.nz> References: <200401220011.i0M0BZf06880@oma.cosc.canterbury.ac.nz> Message-ID: <400F1A99.6070806@v.loewis.de> Greg Ewing wrote: >>But there is traceback-through-C capability in the Python API! > > > Is there? Can you point me towards some documentation? It's undocumented. See pyexpat.c for an example. > Anything that could reduce the contortions Pyrex currently > performs to achieve this would be welcome... That is unlikely. The mere fact that Pyrex made it possible through contortions demonstrates that the capability is in the API. Regards, Martin From martin at v.loewis.de Wed Jan 21 19:38:11 2004 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed Jan 21 19:39:22 2004 Subject: [Python-Dev] Configure option to change Python's name? In-Reply-To: <20040122002253.GB27940@panix.com> References: <400EB014.5010408@v.loewis.de> <20040122002253.GB27940@panix.com> Message-ID: <400F1B73.8070500@v.loewis.de> Aahz wrote: >>Why can't you configure with an entirely different --prefix, such as >>/just/for/me? 
> > > That only works if you want to provide the full path to the executable > each time you use it (or if you are willing to set up a link or alias > that sits in the PATH that does what Jack suggests). Right. I see no problems with creating symlinks, though. > I'm somewhere between +0 and +1; the former is because I worry a bit > about dealing with people who do this when they don't know what they're > doing. I'm more concerned with the maintenance burden that the feature adds. There are many places that need attention, those places become unreadable after the change, it will take ages to discover them all, and the feature will break every few weeks when a new such place is added. Regards, Martin From aahz at pythoncraft.com Wed Jan 21 20:14:51 2004 From: aahz at pythoncraft.com (Aahz) Date: Wed Jan 21 20:15:01 2004 Subject: [Python-Dev] Draft: PEP for imports In-Reply-To: References: Message-ID: <20040122011450.GA25795@panix.com> On Tue, Jan 20, 2004, Batista, Facundo wrote: > > I don't use packages, but when I read the tutorial first time, I wondered > why, if the packages appeared to solve a 'pack into a directory' problem, > the names were joined with '.' instead of '/'. Later I found some other uses > to the 'unix path' notation. > > For example: > > - Module inside a package: import package/module > > - Relative import: import ../../otherpack/webmodules/somemodule Haven't seen any support for this; in the absence of support, I'll have to assume that it's a non-starter. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ A: No. Q: Is top-posting okay? From pinard at iro.umontreal.ca Wed Jan 21 20:24:57 2004 From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard) Date: Wed Jan 21 20:25:56 2004 Subject: [Python-Dev] RE: Hot-To Guide for Descriptors - Nits!
:-) In-Reply-To: <002501c3e068$3ee4aba0$37ae2c81@oemcomputer> References: <200401212048.i0LKmZG08737@c-24-5-183-134.client.comcast.net> <002501c3e068$3ee4aba0$37ae2c81@oemcomputer> Message-ID: <20040122012457.GA3123@alcyon.progiciels-bpi.ca> [Raymond Hettinger] > Hmm, that surprises me. After all, writing class C(object): pass > makes it new-style. This is only because there is no __metaclass__ assignment within the class. In this case, according to: http://www.python.org/2.2.3/descrintro.html in the `Metaclasses' section, under "How is M determined?", we read: "... if there is at least one base class, its metaclass is used". The type of object is type itself. If there is no base class, then "if there's a global variable named __metaclass__, it is used." So my guess would be that adding a mere __metaclass__ = type in global scope for a module would make all classes be "new-style", without the need to subclass them from object explicitly. Experimenting a bit with these things, I noticed that `object.__new__' may be called from a class `__new__', and `object.__init__' the same, without that class needing to be subclassed from object. This does not help me getting away from the feeling that new-style classes do not need to be sub-classed from object. Another point is that in Python 3.0, it is unlikely that Guido forces everybody to subclass all classes from object, for these classes to work. I would not think that being sub-classed from object, at least explicitly in the class declaration, is a proper characterisation of new-style classes. This is one way of course, but not the necessary one. I was reading the "How-to ... Descriptors" as asserting this. > While true, I think it more pragmatic to have the docs and terminology > *not* expressed in terms of meta-classes; While docs and terminology need to be pragmatic, it is important that they could be dependable.
One thing I would like to do is experimenting with a lot of already written Python code in the context of new-style classes, just to see how solid things would be in that context. I suspect nothing will break, but we have to experiment with it. My guess is that merely adding: __metaclass__ = type at the beginning of each module is sufficient for the experiment. It is simpler adding one line per module than to edit all base class declarations for adding (object) to them. I am not tempted to add them without at least understanding that they are really necessary. Adding at random would be symptomatic of my lack of grasp on the implied matters. In one other message, you assert that "All bets are off!" if I do not sub-class from object, but are you really damn fully sure of this? :-) It looks a bit like an argument of authority, and maybe more a guess than knowledge (no offence intended, of course). I just want to make sure I really got a definitive reply on this, and preferably, one that I can understand enough to repeat it myself with conviction! :-) > IOW, I think the docs ought to continue using wording like this for > property: "Return a property attribute for new-style classes (classes > that derive from object)." Yes, as long as this wording does not mislead the reader into thinking untrue things. Now, I'm not sure what `derive' really means. I read it as meaning `derive by explicit sub-classing', but it might imply something else. Without a precise meaning, the sentence is not so meaningful. The classic metatype itself is an object, you know, so one could say that classic classes also derive from object. > I think all of the topics are best understood in terms of three groups: > classic classes, new-style classes, and roll your own classes. Lumping > the latter two together won't help one bit. Unless there are only two groups, in which case "lumping" would just get nearer the truth.
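[Editor's note: the metaclass-determination rule quoted above from descrintro ("if there is at least one base class, its metaclass is used") can be checked directly. The sketch below uses the modern `metaclass` keyword syntax rather than the Python 2 `__metaclass__` attribute under discussion, and the class names are made up for illustration:]

```python
class Meta(type):
    """A do-nothing metaclass, just to make the lookup visible."""
    pass

class Base(metaclass=Meta):
    pass

class Derived(Base):
    # No metaclass is named here: per the rule quoted above,
    # the metaclass of the base class is used.
    pass

# The base's metaclass propagated to the subclass.
assert type(Derived) is Meta
# Any class whose metaclass is (a subclass of) type is "new-style":
assert issubclass(Derived, object)
```

[This also illustrates why a module-level default can only affect classes with no bases: bases take precedence when the metaclass is determined.]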
I surely have to study a lot more, descriptors in particular, before I can make more sense of all this. Once again, I apologise for any wrong assertion I could have made above, or elsewhere, I do not intend to hurt or contradict anybody, but only to develop a more solid understanding. -- François Pinard http://www.iro.umontreal.ca/~pinard From goodger at python.org Wed Jan 21 20:33:56 2004 From: goodger at python.org (David Goodger) Date: Wed Jan 21 20:33:59 2004 Subject: [Python-Dev] Re: PyCore sprint for 2004? In-Reply-To: <20040122001941.GA27940@panix.com> References: <20040122001941.GA27940@panix.com> Message-ID: <400F2884.5020102@python.org> Aahz wrote: > I'm planning to be there, but haven't decided whether to sprint on > core or reST. There are four days planned for sprints. Is it expected that all sprints will last for four days, or will we have two sets of two-day sprints, or a mixture of two-day and four-day sprints? As a prospective sprint coach (see http://www.python.org/cgi-bin/moinmoin/DocutilsSprint), all input is welcome. -- David Goodger From tim.one at comcast.net Wed Jan 21 20:36:07 2004 From: tim.one at comcast.net (Tim Peters) Date: Wed Jan 21 20:36:11 2004 Subject: [Python-Dev] RE: Hot-To Guide for Descriptors - Nits! :-) In-Reply-To: <1074722150.27294.55.camel@geddy> Message-ID: [Barry Warsaw] > ... > I'd love it if we could come up with a different term for these beasts > than "new style classes". Then let me suggest ectomorphic classes, as opposed to the previous endomorphic classes. Roll-your-own classes can be called mesomorphic classes. Extension classes can be called paleomorphic classes. Then there's a precise term for each kind, and bystanders are automatically informed by the very terminology used that their lives are too short to worry about the distinctions being drawn .
From pinard at iro.umontreal.ca Wed Jan 21 20:55:43 2004 From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard) Date: Wed Jan 21 20:56:07 2004 Subject: [Python-Dev] Re: Hot-To Guide for Descriptors - Nits! :-) In-Reply-To: <004201c3e031$722bd380$d118c797@oemcomputer> References: <20040121041503.GA24386@alcyon.progiciels-bpi.ca> <004201c3e031$722bd380$d118c797@oemcomputer> Message-ID: <20040122015543.GB3123@alcyon.progiciels-bpi.ca> [Raymond Hettinger] > Clarified. [...] Fixed. [...] (etc.) Thanks, Raymond, for all those. > This is because the machinery for descriptors is embedded in > type.__getattribute__ and object.__getattribute__. Override or fail > to inherit either of these and all bets are off. I think I understand better what you are saying. Maybe part of my problem is that I'm not sure how `__getattribute__' is itself fetched. While I know you do not want to speak about metatypes, it might be that whenever you write `type.__getattribute__', you really mean the `__getattribute__' found in the dict of the metatype (or is it?); while when you write `object.__getattribute__', you really mean the `__getattribute__' found by scanning the base classes of the current class, and `object' always when there is no base class for a "new-style" class. These are sensible hypotheses, yet I'm not sure, of course. Maybe that failing to say the full truth, _if_ this is the case, might raise questions more than calm them! :-) > > * In `Descriptor Protocol', it might help a bit saying immediately a > > word or two about the `type=None' argument. Looking down for some > > hint, the box holding "def __getattribute__(self, key): ..." uses > > the statement "return v.__get__(None, self)", which I find confusing > > respective to the `__get__' prototype, as `obj' corresponds to None, > > and `type=None' likely corresponds to an instance instead of a type. > I'll keep an open mind on this one but think it is probably fine as it > stands.
Though the calling pattern may not be beautiful and could be > confusing to some, my description accurately represents the signature > and cases where an argument is potentially ignored. That is just the > way it is. Just adding your own quote above might help the How-To. About: descr.__get__(self, obj, type=None) --> value I really think it would help readers like me saying that the pattern is not beautiful, that `obj' may be None, and that `type' may be something else than a type, and this is the way it is. It would invite the reader to swallow the snake, instead of fighting it. > > * In `Invoking Descriptors', it might help a bit insisting on the > > (presumed) fact that `hasattr' itself does not look for descriptors. > I see no reason to delve into what hasattr() doesn't do. One has to memorise a few rules about the order in which the lookups are done for finding an attribute, and unless `hasattr' is more used as pseudo-code than Python code, my fear is that `hasattr' does not look at the same place, nor in the same order, as the few rules above. Maybe there is no danger of any confusion, maybe there is, I surely wondered. > > Second is that the argument `objtype=None' to `__get__' is unused, > > and once more, this does not help understanding its purpose, maybe > > some explanation would be welcome about why it exists and is > > unneeded in the context of `property'. > That's the way it is. I would not doubt it, but my suggestion was that the How-To could offer more help for readers to understand or find some logic in the way it is, enough to get a grasp and assimilate the notion. If there is logic to present, merely saying that it should be assimilated without understanding might help readers like me at not stumbling on the matter for too long. I hope you read my criticism or requests as constructive...
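[Editor's note: the `obj is None` case that makes the `__get__` signature look odd can be observed with a few lines; the descriptor and class names below are hypothetical:]

```python
class Probe(object):
    """Minimal descriptor that just reports how __get__ was called."""
    def __get__(self, obj, objtype=None):
        return (obj, objtype)

class C(object):
    attr = Probe()

c = C()
# Access through an instance: obj is the instance, objtype its class.
assert c.attr == (c, C)
# Access through the class itself: obj is None, and objtype still
# tells the descriptor which class was used -- the case discussed above.
assert C.attr == (None, C)
```

[The `objtype` argument thus exists so class-level access still carries useful information even when there is no instance.]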
-- François Pinard http://www.iro.umontreal.ca/~pinard From jeremy at alum.mit.edu Wed Jan 21 21:01:15 2004 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Wed Jan 21 21:06:59 2004 Subject: [Python-Dev] Re: PyCore sprint for 2004? In-Reply-To: <400F2884.5020102@python.org> References: <20040122001941.GA27940@panix.com> <400F2884.5020102@python.org> Message-ID: <1074736872.27007.75.camel@localhost.localdomain> On Wed, 2004-01-21 at 20:33, David Goodger wrote: > Aahz wrote: > > I'm planning to be there, but haven't decided whether to sprint on > > core or reST. > > There are four days planned for sprints. Is it expected that all > sprints will last for four days, or will we have two sets of two-day > sprints, or a mixture of two-day and four-day sprints? A mixture of the two. Which would you prefer for the reST sprint? Jeremy From jeremy at alum.mit.edu Wed Jan 21 21:05:02 2004 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Wed Jan 21 21:10:45 2004 Subject: [Python-Dev] RE: Hot-To Guide for Descriptors - Nits! :-) In-Reply-To: <20040122012457.GA3123@alcyon.progiciels-bpi.ca> References: <200401212048.i0LKmZG08737@c-24-5-183-134.client.comcast.net> <002501c3e068$3ee4aba0$37ae2c81@oemcomputer> <20040122012457.GA3123@alcyon.progiciels-bpi.ca> Message-ID: <1074737102.27007.78.camel@localhost.localdomain> On Wed, 2004-01-21 at 20:24, François Pinard wrote: > If there is no base class, then "if there's a global variable named > __metaclass__, it is used." So my guess would be that adding a mere > > __metaclass__ = type > > in global scope for a module would make all classes be "new-style", > without the need to subclass them from object explicitly. Unless you define a class that inherits from a base class defined in another module. __metaclass__ does not apply when there are base classes. I prefer to inherit from object, because it is more explicit, and avoids possible confusion when some classes have classic bases.
Jeremy From pinard at iro.umontreal.ca Wed Jan 21 21:11:30 2004 From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard) Date: Wed Jan 21 21:11:59 2004 Subject: [Python-Dev] RE: Hot-To Guide for Descriptors - Nits! :-) In-Reply-To: References: <1074722150.27294.55.camel@geddy> Message-ID: <20040122021130.GA3919@alcyon.progiciels-bpi.ca> [Tim Peters] > Extension classes can be called paleomorphic classes. [...] :-) -- Fran?ois Pinard http://www.iro.umontreal.ca/~pinard From goodger at python.org Wed Jan 21 21:15:11 2004 From: goodger at python.org (David Goodger) Date: Wed Jan 21 21:15:15 2004 Subject: [Python-Dev] Re: PyCore sprint for 2004? In-Reply-To: <1074736872.27007.75.camel@localhost.localdomain> References: <20040122001941.GA27940@panix.com> <400F2884.5020102@python.org> <1074736872.27007.75.camel@localhost.localdomain> Message-ID: <400F322F.5000504@python.org> > On Wed, 2004-01-21 at 20:33, David Goodger wrote: >> There are four days planned for sprints. Is it expected that all >> sprints will last for four days, or will we have two sets of two-day >> sprints, or a mixture of two-day and four-day sprints? Jeremy Hylton wrote: > A mixture of the two. Which would you prefer for the rest sprint? I'd *prefer* to have everyone work on Docutils for the full 4 days! But I'd also like to be a good neighbor and share the wealth. I think it's up to the sprinters, who will sign up for the sprint(s) they want to participate in. BTW, in the 2003 sprint plan (section 1.2 of http://www.python.org/cgi-bin/moinmoin/SprintPlan), you were listed as the contact. Are you again for 2004 (http://www.python.org/cgi-bin/moinmoin/SprintPlan2004)? -- David Goodger http://python.net/~goodger For hire: http://python.net/~goodger/cv From jcarlson at uci.edu Wed Jan 21 21:30:21 2004 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed Jan 21 21:33:26 2004 Subject: [Python-Dev] RE: Hot-To Guide for Descriptors - Nits! 
:-) In-Reply-To: References: <1074722150.27294.55.camel@geddy> Message-ID: <20040121182714.D8EF.JCARLSON@uci.edu> > Then let me suggest ectomorphic classes, as opposed to the previous > endomorphic classes. Roll-your-own classes can be called mesomorphic > classes. Extension classes can be called paleomorphic classes. Then > there's a precise term for each kind, and bystanders are automically > informed by the very terminology used that their lives are too short to > worry about the distinctions being drawn . I'd be +1 on this being placed in the documentation, just to give readers a chuckle every time they read about classes. A little humor in the docs never hurt. - Josiah From pinard at iro.umontreal.ca Wed Jan 21 21:33:38 2004 From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard) Date: Wed Jan 21 21:34:08 2004 Subject: [Python-Dev] RE: Hot-To Guide for Descriptors - Nits! :-) In-Reply-To: <1074737102.27007.78.camel@localhost.localdomain> References: <200401212048.i0LKmZG08737@c-24-5-183-134.client.comcast.net> <002501c3e068$3ee4aba0$37ae2c81@oemcomputer> <20040122012457.GA3123@alcyon.progiciels-bpi.ca> <1074737102.27007.78.camel@localhost.localdomain> Message-ID: <20040122023338.GA4084@alcyon.progiciels-bpi.ca> [Jeremy Hylton] > On Wed, 2004-01-21 at 20:24, Fran?ois Pinard wrote: > > If there is no base class, then "if there's a global variable named > > __metaclass__, it is used." So my guess would be that adding a mere > > __metaclass__ = type > > in global scope for a module would make all classes be "new-style", > > without the need to subclass them from object explicitly. > Unless you define a class that inherits from a base class defined in > another module. I thought about this of course. This is no problem in our case, since these other modules will have their `__metaclass__ = type' themselves. > I prefer to inherit from object, because it is more explicit, and > avoids possible confusion when some classes have classic bases. 
A `__metaclass__ = type' at the beginning of a module is explicit enough to my eyes. If I can demonstrate that this does not harm, it will become a convention here to include this line in each and every module for all in-house projects, and we should be done with classic classes. Unless we sub-class classic classes from the Python library, I do not know how frequently we do that. By the way :-), I presume there are plans for the Python library to progressively switch to new-style classes whenever possible? If not, should they be? -- Fran?ois Pinard http://www.iro.umontreal.ca/~pinard From jeremy at alum.mit.edu Wed Jan 21 22:28:51 2004 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Wed Jan 21 22:34:36 2004 Subject: [Python-Dev] Re: PyCore sprint for 2004? In-Reply-To: <400F322F.5000504@python.org> References: <20040122001941.GA27940@panix.com> <400F2884.5020102@python.org> <1074736872.27007.75.camel@localhost.localdomain> <400F322F.5000504@python.org> Message-ID: <1074742130.27007.81.camel@localhost.localdomain> On Wed, 2004-01-21 at 21:15, David Goodger wrote: > > On Wed, 2004-01-21 at 20:33, David Goodger wrote: > >> There are four days planned for sprints. Is it expected that all > >> sprints will last for four days, or will we have two sets of two-day > >> sprints, or a mixture of two-day and four-day sprints? > > Jeremy Hylton wrote: > > A mixture of the two. Which would you prefer for the rest sprint? > > I'd *prefer* to have everyone work on Docutils for the full 4 days! > But I'd also like to be a good neighbor and share the wealth. I think > it's up to the sprinters, who will sign up for the sprint(s) they > want to participate in. Fair enough. I'm going to ask the sprint coaches to choose the days they want, then we'll see if the space works out. I suspect it's possible to get an extra room for the weekend days if there is a lot of demand. The primary constraint, then, is how many days participants are up for. I hope you'll help round up participants. 
Then you can take a straw poll. > BTW, in the 2003 sprint plan (section 1.2 of > http://www.python.org/cgi-bin/moinmoin/SprintPlan), you were listed as > the contact. Are you again for 2004 > (http://www.python.org/cgi-bin/moinmoin/SprintPlan2004)? Yes. Jeremy From ncoghlan at iinet.net.au Wed Jan 21 22:57:36 2004 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Wed Jan 21 23:11:23 2004 Subject: [Python-Dev] dict.addlist() In-Reply-To: <20040121120433.FB3D.JCARLSON@uci.edu> References: <8BCD26A6-4BF5-11D8-8B42-000A95EFAE9E@yahoo.com> <400EA55F.7030009@iinet.net.au> <20040121120433.FB3D.JCARLSON@uci.edu> Message-ID: <400F4A30.10108@iinet.net.au> Josiah Carlson wrote: >>The meaning of d[k] is significantly different in section A than it was >>in section B. > > For most any reasonable program, previous lines of execution determine > the context of later lines. True. I guess it was more a matter of dictionaries acquiring a piece of 'magical state' which had fairly profound non-local effects on their behaviour. The dictionaries 'default value' just seems to be something that shouldn't change, whereas the actual data stored in the dictionary changes all the time. Still, I can't even convince myself that there's actually anything to be concerned about there, so that's the last I'll say about it. Regards, Nick. -- Nick Coghlan | Brisbane, Australia Email: ncoghlan@email.com | Mobile: +61 409 573 268 From ncoghlan at iinet.net.au Thu Jan 22 00:26:31 2004 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Thu Jan 22 00:31:47 2004 Subject: [Python-Dev] Draft: PEP for imports In-Reply-To: <20040122011450.GA25795@panix.com> References: <20040122011450.GA25795@panix.com> Message-ID: <400F5F07.1000404@iinet.net.au> Aahz wrote: > On Tue, Jan 20, 2004, Batista, Facundo wrote: > >>I don't use packages, but when I read the tutorial first time, I wondered >>why, if the packages appeared to solve a 'pack into a directory' problem, >>the names were joined with '.' instead of '/'. 
Later I found some other uses >>to the 'unix path' notation. >> >>For example: >> >>- Module inside a package: import package/module >> >>- Relative import: import ../../otherpack/webmodules/somemodule > > > Haven't seen any support for this; in the absence of support, I'll have > to assume that it's a non-starter. It has some appeal to me, as when we use the standard 'dot' notation, Python is invoking some magic on our behalf to search several 'standard' locations (the Python standard library, site-packages, sys.path, current directory(?) are the locations I am aware of). I believe the proposal is to drop the current directory from that list of locations searched, which leaves the unresolved question of how to spell an explicit request for a relative import. Since it _is_ the file system hierarchy we're traversing in a relative import, why _not_ use a Unix-style relative path? So: from ./mypy import stuff from ../mypkg2/mypy2 import other_stuff (I don't see any reason to change the proposal to require the 'from ... import ...' arrangement when doing relative imports) To import an entire module from the current directory, it may be enough to allow a trailing slash: from ./ import mypy Certainly, it would be easy to understand - and the slash characters would hopefully be enough to signal the import machinery that a relative import is being requested. Regards, Nick. -- Nick Coghlan | Brisbane, Australia Email: ncoghlan@email.com | Mobile: +61 409 573 268 From tanzer at swing.co.at Thu Jan 22 02:31:00 2004 From: tanzer at swing.co.at (Christian Tanzer) Date: Thu Jan 22 02:31:34 2004 Subject: [Python-Dev] Re: Hot-To Guide for Descriptors - Nits! :-) In-Reply-To: Your message of "Wed, 21 Jan 2004 20:55:43 EST." <20040122015543.GB3123@alcyon.progiciels-bpi.ca> Message-ID: François Pinard wrote: > > This is because the machinery for descriptors is embedded in > > type.__getattribute__ and object.__getattribute__.
Override or fail > > to inherit either of these and all bets are off. > > I think I understand better what you are saying. Maybe part of my > problem is that I'm not sure how `__getattribute__' is itself fetched. > While I know you do not want to speak about metatypes, it might be > that whenever you write `type.__getattribute__', you really mean the > `__getattribute__' found in the dict of the metatype (or is it?); > while when you write `object.__getattribute__', you really mean the > `__getattribute__' found by scanning the base classes of the current > class, and `object' always when there is no base class for a "new-style" > class. As always, a little experiment at the interactive prompt is instructive:

Python 2.3.3 (#1, Dec 21 2003, 09:30:26)
[GCC 2.95.4 20011002 (Debian prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> class A :
...     __metaclass__ = type
...
>>> A
<class '__main__.A'>
>>> A.__bases__
(<type 'object'>,)
>>> class B :
...     pass
...
>>> B.__bases__
()

IMHO, it does make sense to present a simplified view in introductory documentation, but I also think that there should be footnotes explaining where white lies are told and giving hints of the full picture with all the attendant complexity. -- Christian Tanzer http://www.c-tanzer.at/ From nextstepn at katamail.com Thu Jan 22 04:54:56 2004 From: nextstepn at katamail.com (NeXTstep) Date: Thu Jan 22 04:56:23 2004 Subject: [Python-Dev] [leadership/opensource] invitation to online survey Message-ID: <009201c3e0cd$f26700a0$0701a8c0@babayaro> Dear all, I have just put online a survey addressing the topic of "good leadership in the open-source environment". Basically, my objective is to identify the personal conceptions of good leadership that reside in the minds of the contributors, in terms of leaders' _behaviors_ and _characteristics_. What is a good open-source project leader, from the contributor's point of view? To what extent, those personal beliefs are shared among developers?
Can the contributor's national cultural belonging and level of experience in contributing to open-source projects explain such differences in their idea of what a good leader is? I would really appreciate your participation to the survey! Contribution (completely anonymous) consists in rating a list of statements that may be used to describe the behaviors of an open-source leader. It will take around ten minutes - there aren't any "time-consuming" open ended questions;) Following, the link to the survey: http://freeonlinesurveys.com/rendersurvey.asp?id=49776 If you are interested, I will not miss to email you a link to the final report, when ready ;) (approx, a couple of months from now) Thank you in advance, and don't hesitate to contact me if you have any questions or comments :) Gianluca Bosco g.bosco@GETRIDOFTHISinwind.it Denmark Technical University Department of manufacturing engineering and management P.s. Please, feel free to spread the voice about this survey, if you think it's worth :) From paul-python at svensson.org Thu Jan 22 05:42:49 2004 From: paul-python at svensson.org (Paul Svensson) Date: Thu Jan 22 05:43:03 2004 Subject: [Python-Dev] Draft: PEP for imports In-Reply-To: <400F5F07.1000404@iinet.net.au> References: <20040122011450.GA25795@panix.com> <400F5F07.1000404@iinet.net.au> Message-ID: <20040122053136.I24155@familjen.svensson.org> Aahz wrote: > On Tue, Jan 20, 2004, Batista, Facundo wrote: > >>- Module inside a package: import package/module >> >>- Relative import: import ../../otherpack/webmodules/somemodule > > Haven't seen any support for this; in the absence of support, I'll have > to assume that it's a non-starter. In my case, I had a case of usenet nod syndrome the first time I saw this. I certainly hope it's not a non-starter -- it's the only new import syntax I've seen proposed so far that I've found immediately blatantly obvious. 
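[Editor's note: for comparison with the slash-style spellings in this thread, the dot-based relative form from the draft imports PEP can be exercised by building a throwaway package on disk; all file and package names below are made up for illustration:]

```python
import os
import sys
import tempfile

# Lay out a tiny package so the relative import has a context to run in.
root = tempfile.mkdtemp()
pkg = os.path.join(root, "mypkg")
os.mkdir(pkg)
open(os.path.join(pkg, "__init__.py"), "w").close()
with open(os.path.join(pkg, "helper.py"), "w") as f:
    f.write("VALUE = 42\n")
with open(os.path.join(pkg, "main.py"), "w") as f:
    # The leading dot names the current package, much as ./ names
    # the current directory in the slash-based proposal.
    f.write("from . import helper\nresult = helper.VALUE\n")

sys.path.insert(0, root)
from mypkg import main
assert main.result == 42
```

[Either spelling makes the relative lookup explicit; the difference is whether the separator mirrors attribute access (dots) or filesystem paths (slashes).]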
+1 /Paul From mwh at python.net Thu Jan 22 06:07:08 2004 From: mwh at python.net (Michael Hudson) Date: Thu Jan 22 06:07:11 2004 Subject: [Python-Dev] Configure option to change Python's name? In-Reply-To: (Jack Jansen's message of "Wed, 21 Jan 2004 17:08:04 +0100") References: Message-ID: <2md69cmecj.fsf@starship.python.net> Jack Jansen writes: > This idea is only half-baked, so please fire away, or tell me to go > ahead, or go hide under a stone. > > I would like a configure option that changes the name of "python" > everywhere: in the executable, in the dynamic library name, the MacOSX > framework name, etc. This would allow you to have two Pythons on a > system that are completely 100% independent of each other. > > The idea was triggered by Michael Hudson, who wanted to install a > debugging-enabled framework build of Python on MacOSX, which shouldn't > interfere with the normal Python (either Apple-installed or > user-installed). It turns out this is doable through meticulous > hacking of the Makefile after configure (and, for now, ignoring the > few hardcoded "Python.framework" references in Lib, but those need > fixing anyway). Can I take this to mean that you got it to work? I've given up for the moment on the grounds that I wanted a working Python install... I like the idea, but I'm not sure it has applicability outside MacOS X. I have some configure hacks that change the name of the framework if you combine --enable-framework and --with-pydebug, but haven't put them to the test fully yet. Cheers, mwh -- Python enjoys making tradeoffs that drive *someone* crazy . -- Tim Peters, comp.lang.python From mwh at python.net Thu Jan 22 06:08:42 2004 From: mwh at python.net (Michael Hudson) Date: Thu Jan 22 06:08:45 2004 Subject: [Python-Dev] Configure option to change Python's name?
In-Reply-To: <400EB014.5010408@v.loewis.de> ( =?iso-8859-1?q?Martin_v._L=F6wis's_message_of?= "Wed, 21 Jan 2004 18:00:04 +0100")
References: <400EB014.5010408@v.loewis.de>
Message-ID: <2m8yk0me9x.fsf@starship.python.net>

"Martin v. Löwis" writes:

> Jack Jansen wrote:
>> The idea was triggered by Michael Hudson, who wanted to install a
>> debugging-enabled framework build of Python on MacOSX, which
>> shouldn't interfere with the normal Python (either Apple-installed
>> or user-installed). It turns out this is doable through meticulous
>> hacking of the Makefile after configure (and, for now, ignoring the
>> few hardcoded "Python.framework" references in Lib, but those need
>> fixing anyway). For other unix systems a similar approach would work
>> to isolate a debug-python from the normal production python.
>
> Why can't you configure with an entirely different --prefix, such as
> /just/for/me?

I think this would be tricky on OS X, which looks for frameworks in fixed places. It really would be nice to have the framework have a different name. But this may be an OS specific problem.

Cheers,
mwh

-- 
... so the notion that it is meaningful to pass pointers to memory objects into which any random function may write random values without having a clue where they point, has _not_ been debunked as the sheer idiocy it really is. -- Erik Naggum, comp.lang.lisp

From theller at python.net Thu Jan 22 08:46:40 2004
From: theller at python.net (Thomas Heller)
Date: Thu Jan 22 08:46:46 2004
Subject: ctypes anecdote (was: Re: [Python-Dev] The os module, unix and win32)
In-Reply-To: <200401082227.i08MRL621946@c-24-5-183-134.client.comcast.net> (Guido van Rossum's message of "Thu, 08 Jan 2004 14:27:21 -0800")
References: <6.0.1.1.2.20040108214934.0221ac98@torment.chelsea.private> <048a01c3d635$e6e08370$6401fea9@YODA> <200401082227.i08MRL621946@c-24-5-183-134.client.comcast.net>
Message-ID:

Guido van Rossum writes:

>> Another possibility: we'd get a lot more mileage out of simply
>> adding ctypes to the standard install. That would solve the
>> immediate popen problem, let us get rid of _winreg entirely (and
>> replace it with a pure Python version), and make future Win32
>> problems easier to solve.
>
> I don't know that a Python version of _winreg using ctypes would be
> preferable over a C version. I'd expect it to be slower, less
> readable than the C version, and more susceptible to the possibility
> of causing segfaults. (As for speed, I actually have a windows
> registry application where speed of access is important.)

Late followup, a potentially interesting anecdote: I was challenged to find out how much slower ctypes-coded registry calls would be than the _winreg calls.
So I naively wrote this code:

"""
import _winreg, time
from ctypes import *

if 1:
    start = time.time()
    i = 0
    name = c_buffer(256)
    cchName = sizeof(name)
    while 0 == windll.advapi32.RegEnumKeyA(_winreg.HKEY_CLASSES_ROOT,
                                           i, name, cchName):
        i += 1
    print i, time.time() - start

if 1:
    start = time.time()
    i = 0
    try:
        while 1:
            _winreg.EnumKey(_winreg.HKEY_CLASSES_ROOT, i)
            i += 1
    except WindowsError:
        pass
    print i, time.time() - start
"""

Would you have guessed the result?

"""
5648 0.0700001716614
5648 73.7660000324
"""

So the *ctypes* version was about 1000 times faster!

The comparison is not really fair: the 'name' buffer creation must be inside the loop, of course. The real cause is that the _winreg version does a call to RegQueryInfoKey() before calling RegEnumKey(), to find out the size of the largest subkey's name, and this (internally in Windows) probably does have to look at each subkey. So the loop above probably has quadratic runtime behaviour, which explains the bad performance.

To get on-topic for python-dev again, it seems there is much room for improvement, at least for the _winreg.EnumKey function.

Thomas

From guido at python.org Thu Jan 22 10:02:09 2004
From: guido at python.org (Guido van Rossum)
Date: Thu Jan 22 10:02:20 2004
Subject: [Python-Dev] PyCon price schedule correction
In-Reply-To: Your message of "Mon, 19 Jan 2004 09:50:31 PST."
Message-ID: <200401221502.i0MF29812035@c-24-5-183-134.client.comcast.net>

In my previous announcement, I wrote:

> This is a reminder that the deadline for early bird registration for
> PyCon DC 2004 is February 1, 2004. Early bird registration is $175;
> after that, it will be $200 through March 17, then $250 at the door.

Unfortunately this gave a too rose-colored view of the not-so-early registration fees. The correct price schedule is:

- Early bird (through Feb 1)   $175
- Normal reg (until March 17)  $250  <-- correction!
- Walk-ins                     $300  <-- correction!
To register, visit:

http://www.pycon.org/dc2004/register/

I urge everybody coming to the conference to register ASAP -- save yourself or your employer $75!!! You have until February 1st.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From pinard at iro.umontreal.ca Thu Jan 22 10:06:12 2004
From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard)
Date: Thu Jan 22 10:06:35 2004
Subject: [Python-Dev] Re: Hot-To Guide for Descriptors - Nits! :-)
In-Reply-To:
References: <20040122015543.GB3123@alcyon.progiciels-bpi.ca>
Message-ID: <20040122150612.GA17476@alcyon.progiciels-bpi.ca>

[Christian Tanzer]

> François Pinard wrote:
> > [...] found by scanning the base classes of the current class, and
> > `object' always when there is no base class for a "new-style" class.

> As always, a little experiment at the interactive prompt is
> instructive: [...]

Indeed. One more tiny experiment is also enlightening:

Python 2.3.3 (#1, Jan 21 2004, 22:36:17)
>>> __metaclass__ = type
>>> class A: pass
...
>>> A.__bases__
(<type 'object'>,)
>>> A.__mro__
(<class '__main__.A'>, <type 'object'>)
>>>

(I merely added __mro__ to your example) which seems to confirm that `object' is always implied if the metaclass is `type', exactly as if it was explicitly listed as a base.

Unless the above experiment just reveals some unspecified behaviour, which just happens not to be in line with Guido's intent? That is possible, yet it seems unlikely. Only Guido really knows! :-)

Maybe we should not overly state that one ought to explicitly sub-class a class from `object' to get a new-style class, nor suggest that not doing so might yield broken classes with unpredictable behaviour. Of course, we have been educated to read "class A(object):" as the standard way to flag `A' as being new-style, but if I could avoid adding such "(object)" everywhere in the code, and merely declare __metaclass__ once per module, it would look much neater to me. Taste varies :-).

The more I look at it, the more it seems my fears were not really founded.
Thanks to all those who participated in clarifying this little issue.

-- 
François Pinard http://www.iro.umontreal.ca/~pinard

From aahz at pythoncraft.com Thu Jan 22 12:01:04 2004
From: aahz at pythoncraft.com (Aahz)
Date: Thu Jan 22 12:01:22 2004
Subject: [Python-Dev] RE: Hot-To Guide for Descriptors - Nits! :-)
In-Reply-To: <002501c3e068$3ee4aba0$37ae2c81@oemcomputer>
References: <200401212048.i0LKmZG08737@c-24-5-183-134.client.comcast.net> <002501c3e068$3ee4aba0$37ae2c81@oemcomputer>
Message-ID: <20040122170104.GA10769@panix.com>

On Wed, Jan 21, 2004, Raymond Hettinger wrote:
>
> I think all of the topics are best understood in terms of three groups:
> classic classes, new-style classes, and roll your own classes. Lumping
> the latter two together won't help one bit.

While that's technically true, most of the idiomatic uses of metaclasses that I've seen *do* result in the creation of "normal" new-style classes. We don't want to obscure that, either, IMO.

-- 
Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/
A: No.
Q: Is top-posting okay?

From python at rcn.com Thu Jan 22 13:27:28 2004
From: python at rcn.com (Raymond Hettinger)
Date: Thu Jan 22 13:28:26 2004
Subject: [Python-Dev] RE: Hot-To Guide for Descriptors - Nits! :-)
In-Reply-To: <20040122170104.GA10769@panix.com>
Message-ID: <000f01c3e115$64711040$9ab99d8d@oemcomputer>

> > I think all of the topics are best understood in terms of three groups:
> > classic classes, new-style classes, and roll your own classes. Lumping
> > the latter two together won't help one bit.

[Aahz]
> While that's technically true, most of the idiomatic uses of metaclasses
> that I've seen *do* result in the creation of "normal" new-style
> classes. We don't want to obscure that, either, IMO.

Okay, I've added a paragraph to the descriptor guide:

""" The details above show that the mechanism for descriptors is embedded in the __getattribute__() methods for object, type, and super.
Classes inherit this machinery when they derive from object or if they have a meta-class providing similar functionality. Likewise, classes can turn-off descriptor invocation by overriding __getattribute__(). """ Raymond Hettinger From tim.one at comcast.net Thu Jan 22 14:20:15 2004 From: tim.one at comcast.net (Tim Peters) Date: Thu Jan 22 14:20:18 2004 Subject: [Python-Dev] Oodles of warnings in cjkcodecs Message-ID: Under MSVC 6, tons of new warnings are generated for these new files. Then all the new tests fail on Win98SE, with errors like: LookupError: unknown encoding: shift_jisx0213 New warnings: multibytecodec.c _cp932.c C:\Code\python\Modules\cjkcodecs\_cp932.c(23) : warning C4244: '=' : conversion from 'unsigned short ' to 'unsigned char ', possible loss of data _cp949.c C:\Code\python\Modules\cjkcodecs\_cp949.c(21) : warning C4244: '=' : conversion from 'unsigned short ' to 'unsigned char ', possible loss of data _cp950.c C:\Code\python\Modules\cjkcodecs\_cp950.c(22) : warning C4244: '=' : conversion from 'unsigned short ' to 'unsigned char ', possible loss of data _euc_jisx0213.c _euc_jp.c C:\Code\python\Modules\cjkcodecs\_euc_jisx0213.c(47) : warning C4761: integral size mismatch in argument; conversion supplied C:\Code\python\Modules\cjkcodecs\_euc_jisx0213.c(54) : warning C4761: integral size mismatch in argument; conversion supplied C:\Code\python\Modules\cjkcodecs\_euc_jisx0213.c(57) : warning C4761: integral size mismatch in argument; conversion supplied C:\Code\python\Modules\cjkcodecs\_euc_jp.c(21) : warning C4244: '=' : conversion from 'unsigned short ' to 'unsigned char ', possible loss of data _euc_kr.c C:\Code\python\Modules\cjkcodecs\_euc_kr.c(20) : warning C4244: '=' : conversion from 'unsigned short ' to 'unsigned char ', possible loss of data _gb18030.c _gb2312.c C:\Code\python\Modules\cjkcodecs\_gb2312.c(20) : warning C4244: '=' : conversion from 'unsigned short ' to 'unsigned char ', possible loss of data _gbk.c 
C:\Code\python\Modules\cjkcodecs\_gbk.c(22) : warning C4244: '=' : conversion from 'unsigned short ' to 'unsigned char ', possible loss of data _hz.c C:\Code\python\Modules\cjkcodecs\_hz.c(39) : warning C4244: '=' : conversion from 'unsigned short ' to 'unsigned char ', possible loss of data C:\Code\python\Modules\cjkcodecs\_hz.c(42) : warning C4244: '=' : conversion from 'unsigned short ' to 'unsigned char ', possible loss of data _iso2022_jp.c C:\Code\python\Modules\cjkcodecs\iso2022common.h(189) : warning C4018: '>=' : signed/unsigned mismatch C:\Code\python\Modules\cjkcodecs\iso2022common.h(196) : warning C4018: '<' : signed/unsigned mismatch C:\Code\python\Modules\cjkcodecs\_iso2022_jp.c(52) : warning C4244: '=' : conversion from 'unsigned short ' to 'unsigned char ', possible loss of data C:\Code\python\Modules\cjkcodecs\_iso2022_jp.c(64) : warning C4244: '=' : conversion from 'unsigned short ' to 'unsigned char ', possible loss of data C:\Code\python\Modules\cjkcodecs\_iso2022_jp.c(80) : warning C4244: '=' : conversion from 'unsigned short ' to 'unsigned char ', possible loss of data C:\Code\python\Modules\cjkcodecs\_iso2022_jp.c(104) : warning C4244: '=' : conversion from 'unsigned short ' to 'unsigned char ', possible loss of data _iso2022_jp_1.c C:\Code\python\Modules\cjkcodecs\iso2022common.h(189) : warning C4018: '>=' : signed/unsigned mismatch C:\Code\python\Modules\cjkcodecs\iso2022common.h(196) : warning C4018: '<' : signed/unsigned mismatch C:\Code\python\Modules\cjkcodecs\_iso2022_jp_1.c(54) : warning C4244: '=' : conversion from 'unsigned short ' to 'unsigned char ', possible loss of data C:\Code\python\Modules\cjkcodecs\_iso2022_jp_1.c(66) : warning C4244: '=' : conversion from 'unsigned short ' to 'unsigned char ', possible loss of data C:\Code\python\Modules\cjkcodecs\_iso2022_jp_1.c(82) : warning C4244: '=' : conversion from 'unsigned short ' to 'unsigned char ', possible loss of data C:\Code\python\Modules\cjkcodecs\_iso2022_jp_1.c(113) : 
warning C4244: '=' : conversion from 'unsigned short ' to 'unsigned char ', possible loss of data _iso2022_jp_2.c C:\Code\python\Modules\cjkcodecs\iso2022common.h(189) : warning C4018: '>=' : signed/unsigned mismatch C:\Code\python\Modules\cjkcodecs\iso2022common.h(196) : warning C4018: '<' : signed/unsigned mismatch C:\Code\python\Modules\cjkcodecs\_iso2022_jp_2.c(60) : warning C4244: '=' : conversion from 'unsigned short ' to 'unsigned char ', possible loss of data C:\Code\python\Modules\cjkcodecs\_iso2022_jp_2.c(72) : warning C4244: '=' : conversion from 'unsigned short ' to 'unsigned char ', possible loss of data C:\Code\python\Modules\cjkcodecs\_iso2022_jp_2.c(88) : warning C4244: '=' : conversion from 'unsigned short ' to 'unsigned char ', possible loss of data C:\Code\python\Modules\cjkcodecs\_iso2022_jp_2.c(143) : warning C4244: '=' : conversion from 'unsigned short ' to 'unsigned char ', possible loss of data _iso2022_jp_3.c C:\Code\python\Modules\cjkcodecs\iso2022common.h(189) : warning C4018: '>=' : signed/unsigned mismatch C:\Code\python\Modules\cjkcodecs\iso2022common.h(196) : warning C4018: '<' : signed/unsigned mismatch _iso2022_jp_ext.c C:\Code\python\Modules\cjkcodecs\_iso2022_jp_3.c(84) : warning C4761: integral size mismatch in argument; conversion supplied C:\Code\python\Modules\cjkcodecs\_iso2022_jp_3.c(91) : warning C4761: integral size mismatch in argument; conversion supplied C:\Code\python\Modules\cjkcodecs\_iso2022_jp_3.c(94) : warning C4761: integral size mismatch in argument; conversion supplied C:\Code\python\Modules\cjkcodecs\iso2022common.h(189) : warning C4018: '>=' : signed/unsigned mismatch C:\Code\python\Modules\cjkcodecs\iso2022common.h(196) : warning C4018: '<' : signed/unsigned mismatch C:\Code\python\Modules\cjkcodecs\_iso2022_jp_ext.c(52) : warning C4244: '=' : conversion from 'unsigned short ' to 'unsigned char ', possible loss of data C:\Code\python\Modules\cjkcodecs\_iso2022_jp_ext.c(64) : warning C4244: '=' : conversion 
from 'unsigned short ' to 'unsigned char ', possible loss of data C:\Code\python\Modules\cjkcodecs\_iso2022_jp_ext.c(80) : warning C4244: '=' : conversion from 'unsigned short ' to 'unsigned char ', possible loss of data C:\Code\python\Modules\cjkcodecs\_iso2022_jp_ext.c(113) : warning C4244: '=' : conversion from 'unsigned short ' to 'unsigned char ', possible loss of data _iso2022_kr.c C:\Code\python\Modules\cjkcodecs\iso2022common.h(189) : warning C4018: '>=' : signed/unsigned mismatch C:\Code\python\Modules\cjkcodecs\_iso2022_kr.c(46) : warning C4244: '=' : conversion from 'unsigned short ' to 'unsigned char ', possible loss of data C:\Code\python\Modules\cjkcodecs\_iso2022_kr.c(50) : warning C4244: '=' : conversion from 'unsigned short ' to 'unsigned char ', possible loss of data _johab.c C:\Code\python\Modules\cjkcodecs\_johab.c(47) : warning C4244: '=' : conversion from 'unsigned short ' to 'unsigned char ', possible loss of data _shift_jis.c C:\Code\python\Modules\cjkcodecs\_shift_jis.c(35) : warning C4244: '=' : conversion from 'unsigned short ' to 'unsigned char ', possible loss of data _shift_jisx0213.c C:\Code\python\Modules\cjkcodecs\_shift_jisx0213.c(36) : warning C4244: '=' : conversion from 'unsigned short ' to 'unsigned char ', possible loss of data mapdata_ja_JP.c C:\Code\python\Modules\cjkcodecs\_shift_jisx0213.c(51) : warning C4761: integral size mismatch in argument; conversion supplied C:\Code\python\Modules\cjkcodecs\_shift_jisx0213.c(58) : warning C4761: integral size mismatch in argument; conversion supplied C:\Code\python\Modules\cjkcodecs\_shift_jisx0213.c(61) : warning C4761: integral size mismatch in argument; conversion supplied mapdata_ko_KR.c mapdata_zh_CN.c mapdata_zh_TW.c _big5.c C:\Code\python\Modules\cjkcodecs\_big5.c(21) : warning C4244: '=' : conversion from 'unsigned short ' to 'unsigned char ', possible loss of data From pinard at iro.umontreal.ca Thu Jan 22 14:30:02 2004 From: pinard at iro.umontreal.ca 
(=?iso-8859-1?Q?Fran=E7ois?= Pinard)
Date: Thu Jan 22 14:31:23 2004
Subject: [Python-Dev] RE: Hot-To Guide for Descriptors - Nits! :-)
In-Reply-To: <000f01c3e115$64711040$9ab99d8d@oemcomputer>
References: <20040122170104.GA10769@panix.com> <000f01c3e115$64711040$9ab99d8d@oemcomputer>
Message-ID: <20040122193002.GA21026@alcyon.progiciels-bpi.ca>

[Raymond Hettinger]

> """ The details above show that the mechanism for descriptors is
> embedded in the __getattribute__() methods for object, type, and super.
> Classes inherit this machinery when they derive from object or if they
> have a meta-class providing similar functionality. Likewise, classes can
> turn-off descriptor invocation by overriding __getattribute__().
> """

Nice! :-)

-- 
François Pinard http://www.iro.umontreal.ca/~pinard

From gerrit at nl.linux.org Thu Jan 22 15:03:44 2004
From: gerrit at nl.linux.org (Gerrit Holl)
Date: Thu Jan 22 15:06:01 2004
Subject: [Python-Dev] PEP 8 addition: Preferred property style?
Message-ID: <20040122200344.GA3185@nl.linux.org>

Hello,

currently, PEP 8 does not have any advice about the preferred style in defining properties. There are only two places in the standard library where they are used (socket.py and xml/dom/minicompat.py), so that doesn't provide much information either. Personally, I don't like the namespace cluttering caused by defining __get, __set and/or __del. Since I saw someone[0] on c.l.py using it, I always use this style:

    def mtime():
        doc = "Modification time"

        def get(self):
            return os.path.getmtime(str(self))

        def set(self, t):
            return os.utime(str(self), (self.atime, t))

        return get, set, None, doc
    mtime = property(*mtime())

I like it, because it's very readable for me, and doesn't clutter the namespace. Two questions about this style:

- Is it tolerable to use it in the standard library (PrePEP)?
- Should it be advised for or against in PEP 8, or neither?
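The idiom above can be exercised outside Gerrit's path class with a small self-contained sketch; the Temperature class below is invented purely for illustration (modern syntax), not taken from any message in this thread:

```python
# Self-contained sketch of the throwaway-function property idiom from
# the message above.  Temperature is a hypothetical example class.
class Temperature(object):
    def __init__(self):
        self._celsius = 0.0

    def fahrenheit():
        doc = "Temperature in degrees Fahrenheit"

        def get(self):
            return self._celsius * 9.0 / 5.0 + 32.0

        def set(self, value):
            self._celsius = (value - 32.0) * 5.0 / 9.0

        # The tuple unpacks straight into property(fget, fset, fdel, doc).
        return get, set, None, doc
    fahrenheit = property(*fahrenheit())

t = Temperature()
t.fahrenheit = 212.0   # routed through set(); stores 100.0 Celsius
```

Note how the name `fahrenheit` is rebound from the throwaway function to the property object at class-definition time, so the helper never survives in the class namespace, which is exactly the "no cluttering" point being argued.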
Gerrit

[0] http://tinyurl.com/2rydd

From guido at python.org Thu Jan 22 15:11:11 2004
From: guido at python.org (Guido van Rossum)
Date: Thu Jan 22 15:11:21 2004
Subject: [Python-Dev] PEP 8 addition: Preferred property style?
In-Reply-To: Your message of "Thu, 22 Jan 2004 21:03:44 +0100." <20040122200344.GA3185@nl.linux.org>
References: <20040122200344.GA3185@nl.linux.org>
Message-ID: <200401222011.i0MKBBB13078@c-24-5-183-134.client.comcast.net>

> currently, PEP 8 does not have any advice about the preferred style
> in defining properties. There are only two places in the standard
> library where they are used (socket.py and xml/dom/minicompat.py), so
> that doesn't provide much information either. Personally, I don't like the
> namespace cluttering caused by defining __get, __set and/or __del. Since
> I saw someone[0] on c.l.py using it, I always use this style:
>
>     def mtime():
>         doc = "Modification time"
>
>         def get(self):
>             return os.path.getmtime(str(self))
>
>         def set(self, t):
>             return os.utime(str(self), (self.atime, t))
>
>         return get, set, None, doc
>     mtime = property(*mtime())
>
> I like it, because it's very readable for me, and doesn't clutter the
> namespace. Two questions about this style:
>
> - Is it tolerable to use it in the standard library (PrePEP)?
> - Should it be advised for or against in PEP 8, or neither?

I don't like it, and I'd rather not see this become a standard idiom. Personally, I don't see the issue with the namespace cluttering (what's bad about it?) and I actually like that the getter/setter functions are also explicitly usable (hey, you might want to use one of these as the key argument to sort :-).

My objection against the def is that it requires a lot of sophistication to realize that this is a throw-away function, called just once to return argument values for the property call. It might also confuse static analyzers that aren't familiar with this pattern.
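The spelling Guido favors here, with the getter and setter as ordinary methods that stay independently usable, might look like the following sketch. The PathInfo class and its method names are hypothetical, not from any message above:

```python
# Hypothetical sketch of the conventional property spelling Guido
# describes: plain methods passed to property(), each still callable
# on its own (e.g. as a sort key).
import os

class PathInfo(object):
    def __init__(self, path):
        self.path = path

    def _get_mtime(self):
        return os.path.getmtime(self.path)

    def _set_mtime(self, t):
        # Keep the access time unchanged; set only the modification time.
        os.utime(self.path, (os.path.getatime(self.path), t))

    mtime = property(_get_mtime, _set_mtime, None, "Modification time")

# The getter remains usable directly, e.g.:
#     newest_first = sorted(infos, key=PathInfo._get_mtime, reverse=True)
```

The "namespace cluttering" cost is that `_get_mtime` and `_set_mtime` remain as attributes of the class; the benefit, as Guido notes, is that they stay explicitly callable.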
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gerrit at nl.linux.org Thu Jan 22 16:21:38 2004
From: gerrit at nl.linux.org (Gerrit Holl)
Date: Thu Jan 22 16:30:36 2004
Subject: [Python-Dev] PEP 8 addition: Preferred property style?
In-Reply-To: <200401222011.i0MKBBB13078@c-24-5-183-134.client.comcast.net>
References: <20040122200344.GA3185@nl.linux.org> <200401222011.i0MKBBB13078@c-24-5-183-134.client.comcast.net>
Message-ID: <20040122212138.GA3482@nl.linux.org>

Hi,

[Gerrit Holl]
> > def mtime():
> >     doc = "Modification time"
> >
> >     def get(self):
> >         return os.path.getmtime(str(self))
> >
> >     def set(self, t):
> >         return os.utime(str(self), (self.atime, t))
> >
> >     return get, set, None, doc
> > mtime = property(*mtime())
> >
> > I like it, because it's very readable for me, and doesn't clutter the
> > namespace. Two questions about this style:
> >
> > - Is it tolerable to use it in the standard library (PrePEP)?
> > - Should it be advised for or against in PEP 8, or neither?

[Guido van Rossum]
> I don't like it, and I'd rather not see this become a standard idiom.

OK.

> Personally, I don't see the issue with the namespace cluttering
> (what's bad about it?) and I actually like that the getter/setter
> functions are also explicitly usable (hey, you might want to use one
> of these as the key argument to sort :-).
>
> My objection against the def is that it requires a lot of
> sophistication to realize that this is a throw-away function, called
> just once to return argument values for the property call. It might
> also confuse static analyzers that aren't familiar with this pattern.

I was thinking of it as analogous to classmethod() and staticmethod(), but it isn't, of course. The getter/setter are still accessible through the property object itself, but I get your point.

I think there is a problem with the current way of defining properties. A property commonly consists of several parts. These parts form one object.
But the parts are not grouped by indentation. In my opinion, readability would be enhanced if they were. One way to do this is adding syntax - discussed about a year ago with no conclusion. Another way is to use a class with a metaclass - not clean, because you aren't creating a class after all. I thought this would be a good compromise, but it has similar disadvantages.

Do you agree that there is a problem here? Should Guido van Rossum's opinion be included in PEP 8?

yours,
Gerrit.

From python at rcn.com Thu Jan 22 18:17:51 2004
From: python at rcn.com (Raymond Hettinger)
Date: Thu Jan 22 18:18:51 2004
Subject: [Python-Dev] PEP 8 addition: Preferred property style?
In-Reply-To: <20040122212138.GA3482@nl.linux.org>
Message-ID: <001101c3e13d$f791fba0$7cbd958d@oemcomputer>

[Gerrit]
> I think there is a problem with the current way of defining properties.
> A property commonly consists of several parts. These parts form one
> object. But the parts are not grouped by indentation. In my opinion,
> readability would be enhanced if they were. One way to do this is
> adding syntax - discussed about a year ago with no conclusion. Another
> way is to use a class with a metaclass - not clean, because you aren't
> creating a class after all. I thought this would be a good compromise,
> but it has similar disadvantages.
>
> Do you agree that there is a problem here?

I don't agree. The get/set/del functions can be defined anywhere (even outside the class definition). It is the line where property() is called that everything comes together, and that works out fine, both in theory and practice.

not-everything-needs-special-syntax-ly yours,

Raymond Hettinger

From tjreedy at udel.edu Thu Jan 22 20:46:52 2004
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu Jan 22 20:47:00 2004
Subject: [Python-Dev] Re: PEP 8 addition: Preferred property style?
References: <20040122200344.GA3185@nl.linux.org>
Message-ID:

"Gerrit Holl" wrote in message news:20040122200344.GA3185@nl.linux.org...
> Hello,
>
> currently, PEP 8 does not have any advice about the preferred style
> in defining properties. There are only two places in the standard
> library where they are used (socket.py and xml/dom/minicompat.py), so
> that doesn't provide much information either. Personally, I don't like the
> namespace cluttering caused by defining __get, __set and/or __del. Since
> I saw someone[0] on c.l.py using it, I always use this style:
>
>     def mtime():
>         doc = "Modification time"
>
>         def get(self):
>             return os.path.getmtime(str(self))
>
>         def set(self, t):
>             return os.utime(str(self), (self.atime, t))
>
>         return get, set, None, doc
>     mtime = property(*mtime())
>
> I like it, because it's very readable for me, and doesn't clutter the
> namespace. Two questions about this style:

The 'cute' is a mental burden for the naive reader. For compactness and no extra name bindings, I might prefer

    def mtime(self, t):  # temp binding to the setter
        return os.utime(str(self), (self.atime, t))  # 'return expr' is atypical
    mtime = property(lambda self: os.path.getmtime(str(self)),
                     mtime, None, "Modification time")

But I like 'def _set():...', with 'mtime =...' followed either by 'del _set' or another 'def _set():...' even better.

Terry J. Reedy

From perky at i18n.org Fri Jan 23 09:42:06 2004
From: perky at i18n.org (Hye-Shik Chang)
Date: Fri Jan 23 09:41:12 2004
Subject: [Python-Dev] Oodles of warnings in cjkcodecs
In-Reply-To:
References:
Message-ID: <20040123144206.GA76699@i18n.org>

On Thu, Jan 22, 2004 at 02:20:15PM -0500, Tim Peters wrote:
> Under MSVC 6, tons of new warnings are generated for these new files. Then
> all the new tests fail on Win98SE, with errors like:
>
> LookupError: unknown encoding: shift_jisx0213
>
> New warnings:
>
> multibytecodec.c
> _cp932.c
> C:\Code\python\Modules\cjkcodecs\_cp932.c(23) : warning C4244: '=' :
> conversion from 'unsigned short ' to 'unsigned char ', possible loss of data
[snip]

Okay. It's fixed in CVS.
If you find any other warnings not fixed by this, please let me know.

BTW, I got a traceback on committing. (maybe from a cvsmail script?)

Checking in _big5.c;
/cvsroot/python/python/dist/src/Modules/cjkcodecs/_big5.c,v <-- _big5.c
new revision: 1.2; previous revision: 1.1
done
Checking in _cp932.c;
[snip]
new revision: 1.2; previous revision: 1.1
done
Traceback (most recent call last):
  File "/cvsroot/python/CVSROOT/syncmail", line 329, in ?
    main()
  File "/cvsroot/python/CVSROOT/syncmail", line 322, in main
    blast_mail(subject, people, specs[1:], contextlines, fromhost)
  File "/cvsroot/python/CVSROOT/syncmail", line 224, in blast_mail
    conn.connect(MAILHOST, MAILPORT)
  File "/usr/lib/python2.2/smtplib.py", line 276, in connect
    for res in socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM):
socket.gaierror: (-2, 'Name or service not known')

Hye-Shik

From barry at python.org Fri Jan 23 09:50:29 2004
From: barry at python.org (Barry Warsaw)
Date: Fri Jan 23 09:50:39 2004
Subject: [Python-Dev] Oodles of warnings in cjkcodecs
In-Reply-To: <20040123144206.GA76699@i18n.org>
References: <20040123144206.GA76699@i18n.org>
Message-ID: <1074869428.2263.8.camel@anthem>

On Fri, 2004-01-23 at 09:42, Hye-Shik Chang wrote:
> BTW, I got a traceback on committing. (maybe from a cvsmail script?)

I recently got a message from SF saying they upgraded the Python on the cvs machine to 2.2.3. I'm wondering if they also turned off smtp on that machine (syncmail's MAILHOST variable points to localhost via its empty string value). I've seen this same error, or reports of this on several projects, and have a message into SF to find out what's up.

-Barry

From kbk at shore.net Fri Jan 23 11:00:18 2004
From: kbk at shore.net (Kurt B. Kaiser)
Date: Fri Jan 23 11:00:38 2004
Subject: [Python-Dev] Oodles of warnings in cjkcodecs
In-Reply-To: <1074869428.2263.8.camel@anthem> (Barry Warsaw's message of "Fri, 23 Jan 2004 09:50:29 -0500")
References: <20040123144206.GA76699@i18n.org> <1074869428.2263.8.camel@anthem>
Message-ID: <87d69ad59p.fsf@hydra.localdomain>

Barry Warsaw writes:

> I recently got a message from SF saying they upgraded the Python on the
> cvs machine to 2.2.3. I'm wondering if they also turned off smtp on
> that machine (syncmail's MAILHOST variable points to localhost via its
> empty string value). I've seen this same error, or reports of this on
> several projects and have a message into SF to find out what's up.

Hm, no checkin messages yesterday, but I see Fred just got one through 10:30 EST so maybe it's OK now.

-- KBK

From fdrake at acm.org Fri Jan 23 11:08:16 2004
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri Jan 23 11:08:25 2004
Subject: [Python-Dev] Oodles of warnings in cjkcodecs
In-Reply-To: <87d69ad59p.fsf@hydra.localdomain>
References: <20040123144206.GA76699@i18n.org> <1074869428.2263.8.camel@anthem> <87d69ad59p.fsf@hydra.localdomain>
Message-ID: <16401.18160.56039.687475@sftp.fdrake.net>

Kurt B. Kaiser writes:

> Hm, no checkin messages yesterday, but I see Fred just got one through
> 10:30 EST so maybe it's OK now.

SourceForge changed some of their mail configuration, and we just worked out the fix. My commit this morning was really to test that it actually worked. ;-)

-Fred

-- 
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

From barry at python.org Fri Jan 23 11:08:52 2004
From: barry at python.org (Barry Warsaw)
Date: Fri Jan 23 11:08:59 2004
Subject: [Python-Dev] Oodles of warnings in cjkcodecs
In-Reply-To: <87d69ad59p.fsf@hydra.localdomain>
References: <20040123144206.GA76699@i18n.org> <1074869428.2263.8.camel@anthem> <87d69ad59p.fsf@hydra.localdomain>
Message-ID: <1074874132.2263.27.camel@anthem>

On Fri, 2004-01-23 at 11:00, Kurt B.
Kaiser wrote: > Barry Warsaw writes: > > > I recently got a message from SF saying they upgraded the Python on the > > cvs machine to 2.2.3. I'm wondering if they also turned off smtp on > > that machine (syncmail's MAILHOST variable points to localhost via its > > empty string value). I've seen this same error, or reports of this on > > several projects and have a message into SF to find out what's up. > > Hm, no checkin messages yesterday, but I see Fred just got one through > 10:30 EST so maybe its OK now. We just fixed it. -Barry From tim.one at comcast.net Fri Jan 23 11:16:39 2004 From: tim.one at comcast.net (Tim Peters) Date: Fri Jan 23 11:16:50 2004 Subject: [Python-Dev] Oodles of warnings in cjkcodecs In-Reply-To: <1074869428.2263.8.camel@anthem> Message-ID: [Barry] > I recently got a message from SF saying they upgraded the Python on > the cvs machine to 2.2.3. I'm wondering if they also turned off smtp > on that machine (syncmail's MAILHOST variable points to localhost via > its empty string value). I've seen this same error, or reports of > this on several projects and have a message into SF to find out > what's up. I bet this recent blurb on SF's site status page is relevant: (2004-01-23 06:31:27 - Project CVS Service) Due to the recent OS upgrade on the SourceForge.net developer CVS server, it will be necessary for any per-project syncmail script installations to make a one-line change to their syncmail script. The central copy of the syncmail script which is used by a number of projects has already been modified in this manner. To function properly, the MAILHOST line should be hardcoded to MAILHOST = 'localhost'. Additional information on syncmail may be found in Enhancing Project CVS Services Using Scripts; the SourceForge.net team maintains a central copy of the syncmail script which your project may use. So did they upgrade to Windows ? Having to say 'localhost' instead of using an empty string is universal across Windows flavors. 
From fdrake at acm.org Fri Jan 23 11:21:23 2004 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Fri Jan 23 11:32:35 2004 Subject: [Python-Dev] Oodles of warnings in cjkcodecs In-Reply-To: References: <1074869428.2263.8.camel@anthem> Message-ID: <16401.18947.674641.969055@sftp.fdrake.net> Tim Peters writes: > I bet this recent blurb on SF's site status page is relevant: Yes, very relevant. It was added when we brought the matter to their attention. ;-) -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From kbk at shore.net Fri Jan 23 11:59:35 2004 From: kbk at shore.net (Kurt B. Kaiser) Date: Fri Jan 23 11:59:42 2004 Subject: [Python-Dev] Oodles of warnings in cjkcodecs In-Reply-To: (Tim Peters's message of "Fri, 23 Jan 2004 11:16:39 -0500") References: Message-ID: <87isj2bnyg.fsf@hydra.localdomain> "Tim Peters" writes: > I bet this recent blurb on SF's site status page is relevant: Thanks! Hopefully they will email admins. heading off to fix IDLEfork syncmail.... -- KBK From fdrake at acm.org Fri Jan 23 12:06:17 2004 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Fri Jan 23 12:06:26 2004 Subject: [Python-Dev] Oodles of warnings in cjkcodecs In-Reply-To: <87isj2bnyg.fsf@hydra.localdomain> References: <87isj2bnyg.fsf@hydra.localdomain> Message-ID: <16401.21641.241874.592766@sftp.fdrake.net> Kurt B. Kaiser writes: > Thanks! Hopefully they will email admins. > > heading off to fix IDLEfork syncmail.... I've sent a note to the cvs-syncmail-talk list, but I haven't seen it show up in my mailbox yet. lists.sourceforge.net seems to be *really* slow right now. ;-( -Fred -- Fred L. Drake, Jr. 
PythonLabs at Zope Corporation From niemeyer at conectiva.com Fri Jan 23 12:01:30 2004 From: niemeyer at conectiva.com (Gustavo Niemeyer) Date: Fri Jan 23 12:07:26 2004 Subject: [Python-Dev] Re: PEP 326 now online In-Reply-To: <20040109114039.FB88.JCARLSON@uci.edu> References: <20040107104457.0F01.JCARLSON@uci.edu> <20040109174051.GA6480@burma.localdomain> <20040109114039.FB88.JCARLSON@uci.edu> Message-ID: <20040123170130.GA10485@burma.localdomain> [has this message hit the list? I haven't received it] > > - All given examples in the PEP are easily rewritable with existent > > logic (and I disagree that the new method would be easier to > > understand); > > You are free to disagree that two objects, which clearly state that they > are either the largest or smallest objects, are not clear. I don't know > how to address your concern. Both of them. > > - I can't think about any real usage cases for such object which > > couldn't be easily done without it; > > It is not whether "if we don't have it we can do it easy", it is about > "if we have it, we can do it easier". The Queue module, just recently Understood, but not agreed. > Arguably, there aren't any usage cases where None is necessary. But yet > we have None. [...] Please, let's not ignore how these usage cases differ in wideness. > > - The "One True Large Object" isn't "True Large" at all, since depending > > on the comparison order, another object might belive itself to be > > larger than this object. If this was to be implemented as a supported > > feature, Python should special case it internally to support the > > "True" in the given name. > > I don't know how difficult modifying the comparison function to test > whether or not one object in the comparison is the universal max or min, It's not hard at all. It's just a matter of being worth. > so cannot comment. I'm also don't know if defining a __rcmp__ method > would be sufficient. How would two objects with rcmp react? 
> > - It's possible to implement that object with a couple of lines, as > > you've shown; > > I don't see how the length of the implementation has anything to do with > how useful it is. Let me explain. If you have code that is useful very occasionally and is implemented with three or four lines of code, the standard library is not the place for it. > I (and others) have provided examples of where they would be useful. > If you care to see more, we can discuss this further off-list. The best place to put useful examples is in the PEP itself. If you have further examples of this, please include them there. So far I haven't seen any examples which are worth such an implementation, as I've presented. > > - Any string is already a maximum object for any int/long comparison > > (IOW, do "cmp.high = 'My One True Large Object'" and you're done). > > That also ignores the idea of an absolute minimum. There are two parts Neither of these points is isolated. They just sum to each other to make my own opinion. > of the proposal, a Maximum and a Minimum. The existance of which has > already been agreed upon as being useful. Whether they are in the I'd like to be presented with some case where None is not enough, preferably with some other argument besides "that's one or two lines shorter", since having dozens of "shorter features" contradicts the Python opposition to the Perl model. > > - Your Dijkstra example is a case of abusement of tuple's sorting > > behavior. If you want readable code as you suggest, try implementing > > a custom object with a comparison operator, instead of writting > > "foo = (a,b,c,d)", and "(a,b,c,d) = foo", and "foo[x][y]" all over > > the place. > > I don't see how using the features of a data structure already in the > standard language is "abuse". Perhaps I should have created my own [...] Sorry. I'll try to explain that with softer words to avoid polluting our discussion.
You're trying to introduce a new structure in the language just to turn these usage cases into slightly more readable code (in your opinion) but at the same time you don't want to use the features already in the language which would turn your own example into more readable code and would kill the need for the new presented structures. > custom class that held all the relevant information, included a __cmp__ > method, added attribute access and set rutines ...OR... maybe it was > reasonable that I used a tuple and saved people the pain of looking > through twice as much source, just to see a class, that has nothing to > do with the PEP. Using examples which could be written in a clearer way inside a PEP which is proposing a structure for better readability of code is not a good way to convince people. -- Gustavo Niemeyer http://niemeyer.net From barry at python.org Fri Jan 23 12:18:55 2004 From: barry at python.org (Barry Warsaw) Date: Fri Jan 23 12:19:09 2004 Subject: [Python-Dev] Oodles of warnings in cjkcodecs In-Reply-To: <16401.21641.241874.592766@sftp.fdrake.net> References: <87isj2bnyg.fsf@hydra.localdomain> <16401.21641.241874.592766@sftp.fdrake.net> Message-ID: <1074878335.2263.53.camel@anthem> On Fri, 2004-01-23 at 12:06, Fred L. Drake, Jr. wrote: > Kurt B. Kaiser writes: > > Thanks! Hopefully they will email admins. > > > > heading off to fix IDLEfork syncmail.... > > I've sent a note to the cvs-syncmail-talk list, but I haven't seen it > show up in my mailbox yet. lists.sourceforge.net seems to be *really* > slow right now. ;-( Along with everything else on SF.
:/ -Barry From jcarlson at uci.edu Fri Jan 23 14:42:21 2004 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri Jan 23 14:53:24 2004 Subject: [Python-Dev] Re: PEP 326 now online In-Reply-To: <20040123170130.GA10485@burma.localdomain> References: <20040109114039.FB88.JCARLSON@uci.edu> <20040123170130.GA10485@burma.localdomain> Message-ID: <20040123094842.1BB0.JCARLSON@uci.edu> > [has this message hit the list? I haven't received it] [the message hit the list on the afternoon of January 9] > > You are free to disagree that two objects, which clearly state that they > > are either the largest or smallest objects, are not clear. I don't know > > how to address your concern. > > Both of them. Both of what? Are both the objects not clear? Again, you are free to believe that objects that self-document are not clear. Would a better name or location suffice? I'm open for suggestions. > > It is not whether "if we don't have it we can do it easy", it is about > > "if we have it, we can do it easier". The Queue module, just recently > > Understood, but not agreed. Having the objects can make certain kinds of algorithms easier and clearer to implement. Not having them is what we've had for years. The only question is where is a location we can put them without polluting a random namespace with objects that don't fit. Math is out (Min and Max aren't floating point numbers, which you can read more about in the PEP), cmp.Min/Max are awkward, min.Min and max.Max seem strange, operator.Min/Max seems like a reasonable location, but no one has any feedback on that one yet. Again, I'm open for suggestions. > > Arguably, there aren't any usage cases where None is necessary. But yet > > we have None. > [...] > > Please, let's not ignore how these usage cases differ in wideness. Imagine for a moment that None didn't exist. You could create a singleton instance of an object that represents the same idea as None and make its boolean false. 
It would take <10 lines, and do /everything/ that None does. Take a look at the first few examples in the 'Max Examples' section of the PEP: http://www.python.org/peps/pep-0326.html#max-examples Notice the reduction in code from using only numbers, to using None, to using Max? Notice how each implementation got clearer? That is what the PEP is about, making code clearer. > > so cannot comment. I'm also don't know if defining a __rcmp__ method > > would be sufficient. > > How would two objects with rcmp react? A better question would be, "How would the comparison between two objects behave if the object on the left has __cmp__, and the object on the right has __rcmp__?" Checking the docs, I notice that __rcmp__ is no longer supported, and hasn't been for a few releases, so never mind. An even better question would be, "How would two objects with __cmp__ react?" Checking a few examples, it is whoever is on the left side of the comparison operator. > > > - It's possible to implement that object with a couple of lines, as > > > you've shown; > > > > I don't see how the length of the implementation has anything to do with > > how useful it is. > > Let me explain. If you have code that is useful very ocasionally and is > implemented with three or four lines of code, the standard library is > not the place for it. The entire itertools module is filled with examples where just a few lines of code can save time and effort in re-coding something that should be there in the first place. The PEP argues the case that a minimum and maximum value should be placed somewhere accessable. As stated in the PEP: "Independent implementations of the Min/Max concept by users desiring such functionality are not likely to be compatible, and certainly will produce inconsistent orderings. The following examples seek to show how inconsistent they can be." (read the PEP for the examples) > > I (and others) have provided examples of where they would be useful. 
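The "<10 lines" singleton idea is easy to demonstrate. Here is a minimal sketch of Min/Max-style extremes — not the PEP's reference implementation, and written with rich-comparison methods rather than the __cmp__ spelling discussed above, so it runs on later Pythons:

```python
class _Extreme(object):
    """Singleton that compares greater (or less) than every other object."""
    def __init__(self, is_max):
        self._is_max = is_max
    def __gt__(self, other):
        return self._is_max and self is not other
    def __lt__(self, other):
        return (not self._is_max) and self is not other
    def __ge__(self, other):
        return self._is_max or self is other
    def __le__(self, other):
        return (not self._is_max) or self is other
    def __eq__(self, other):
        return self is other
    def __repr__(self):
        return 'Max' if self._is_max else 'Min'

Max = _Extreme(True)
Min = _Extreme(False)
```

With these, Max wins any comparison regardless of operand order: when an int is on the left, its comparison against Max returns NotImplemented, so Python falls back to Max's reflected method, and Max still decides the result.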
> > If you care to see more, we can discuss this further off-list. > > The best place to put useful examples is in the PEP itself. If you have > further examples of this, please include there. So far I haven't seen > any examples which are worth such implementation, as I've presented. The only thing you've "presented" so far is that you think that the objects are fundamentally useless. On the other hand, no less than 10 people here on python-dev and c.l.py (I didn't even announce it there) have voiced that the objects themselves are useful if given a proper name and location. > > of the proposal, a Maximum and a Minimum. The existance of which has > > already been agreed upon as being useful. Whether they are in the > > I'd like to be presented with some case where None is not enough, > preferably with some other argument besides "that's one or two lines > shorter", since having dozens of "shorter features" contradicts the > Python oposition to the Perl model. Both list comprehensions (Python 2.0) and generator expressions (Python 2.4) have been introduced to reduce code length. Heck, generators themselves reduce the length of code required to create an iterator object. I remember creating iterators before generators existed in the base language, talk about a pain in the ass. > > > - Your Dijkstra example is a case of abusement of tuple's sorting > > > behavior. If you want readable code as you suggest, try implementing > > > a custom object with a comparison operator, instead of writting > > > "foo = (a,b,c,d)", and "(a,b,c,d) = foo", and "foo[x][y]" all over > > > the place. > > > > I don't see how using the features of a data structure already in the > > standard language is "abuse". Perhaps I should have created my own > [...] > > Sorry. I'll try to explain that with softer words to avoid polluting our > discussion. 
> > You're trying to introduce a new structure in the language just to turn > these usage cases into slightly more readable code (in your opinion) > but at the same time you don't want to use the features already in the > language which would turn your own example into more readable code and > would kill the need for the new presented structures. The minimum and maximum objects are not structures. They are singleton instances with specific behavior when compared against other objects. No more, no less. Hiding the special cases inside a class doesn't remove the special cases, it just moves them somewhere else. I could have certainly just given the class, which would have contained __cmp__ functions that looked something like this:

class DijkstraSPElement_Max:
    #initialization
    def __cmp__(self, other):
        return cmp(self.distance, other.distance)

class DijkstraSPElement_None:
    #initialization
    def __cmp__(self, other):
        if self.distance is None:
            if other.distance is None:
                return 0
            return 1
        elif other.distance is None:
            return -1
        return cmp(self.distance, other.distance)

Hey, it makes a good case for the PEP; maybe I'll stick it in there if I have time this afternoon. Thank you for the suggestion. > > custom class that held all the relevant information, included a __cmp__ > method, added attribute access and set rutines ...OR... maybe it was > reasonable that I used a tuple and saved people the pain of looking > through twice as much source, just to see a class, that has nothing to > do with the PEP. > > Using examples which are better written in a more clear way inside a PEP > which is purposing a structure for better readability of code is not a > good way to convince people. Are you saying that I should place comments describing what the algorithm is doing inside the examples? Perhaps.
Thank you for the ideas and suggestions, - Josiah From nbastin at opnet.com Fri Jan 23 17:01:35 2004 From: nbastin at opnet.com (Nick Bastin) Date: Fri Jan 23 17:01:39 2004 Subject: [Python-Dev] Hotshot Message-ID: Does anybody know the current state of hotshot? I read on some of the twisted mailing lists a while back that someone tried it but had some problems (can't remember what off the top of my head...have to search the archives), and was wondering if it was regarded as complete (and if there was any documentation that talks about how the code coverage aspect is supposed to be used). -- Nick From fdrake at acm.org Fri Jan 23 17:29:23 2004 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Fri Jan 23 17:29:33 2004 Subject: [Python-Dev] Hotshot In-Reply-To: References: Message-ID: <16401.41027.950480.346339@sftp.fdrake.net> Nick Bastin writes: > Does anybody know the current state of hotshot? I read on some of the > twisted mailing lists a while back that someone tried it but had some > problems (can't remember what off the top of my head...have to search > the archives), and was wondering if it was regarded as complete (and if > there was any documentation that talks about how the code coverage > aspect is supposed to be used). HotShot has had some attention, and has been found useful, but I don't think of it as really finished. Issues have been reported relating to making measurements when threads are involved, but I've not had any time available for it. There is some documentation in the Library Reference: http://www.python.org/doc/current/lib/module-hotshot.html What's in the standard library contains the low-level profiler and a hotshot.stats module that allows the same kind of analysis and reporting as the pstats module supports for the older profile module. There is support for collecting the information needed to measure coverage, but no higher-level tools to make it easy to use. That's the case for the per-line performance measurements as well. 
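The collect/load/sort/report cycle described here is the same one the pstats module exposes for the older profile module. A minimal sketch of that flow — shown with the later cProfile module for illustration, since it shares pstats' reporting interface; hotshot's own API differs (it writes a log file that hotshot.stats loads):

```python
import cProfile
import io
import pstats

def work():
    # Toy function to profile.
    return sum(i * i for i in range(1000))

prof = cProfile.Profile()
prof.enable()
work()
prof.disable()

buf = io.StringIO()
stats = pstats.Stats(prof, stream=buf)
stats.sort_stats("time", "calls")   # sort keys pstats supports
stats.print_stats()
report = buf.getvalue()             # the familiar pstats listing
```

The analysis and reporting half of this pipeline is exactly what hotshot.stats reuses; only the collection side is hotshot-specific.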
Any improvements to the analysis and reporting tools, or documentation, would be quite welcome! -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From batlogg at solution2u.net Fri Jan 23 17:47:25 2004 From: batlogg at solution2u.net (Jodok Batlogg) Date: Fri Jan 23 17:47:33 2004 Subject: [Python-Dev] test_poll fails when building python 2.3.3 on osx Message-ID: <1D4CD545-4DF6-11D8-8229-000A95D06D16@solution2u.net> hi, i'm trying to build py2.3.3 on osx (10.3) with gcc3.3 (and 3.1) but test_poll always fails. any idea? ./python.exe Lib/test/test_poll.py Running poll test 1 This is a test. This is a test. This is a test. This is a test. This is a test. This is a test. This is a test. This is a test. This is a test. This is a test. This is a test. This is a test. Traceback (most recent call last): File "Lib/test/test_poll.py", line 171, in ? test_poll1() File "Lib/test/test_poll.py", line 65, in test_poll1 poll_unit_tests() File "Lib/test/test_poll.py", line 77, in poll_unit_tests r = p.poll() select.error: (9, 'Bad file descriptor') thanks jodok batlogg solution2u.net gmbh ? hof 4 ? a-6861 alberschwende fon +43 5579 85777-65 ? fax -77 ? mobil +43 699 11841546 http://solution2u.net/ ? batlogg@solution2u.net have a nice day From nbastin at opnet.com Fri Jan 23 17:52:28 2004 From: nbastin at opnet.com (Nick Bastin) Date: Fri Jan 23 17:52:28 2004 Subject: [Python-Dev] Hotshot In-Reply-To: <16401.41027.950480.346339@sftp.fdrake.net> References: <16401.41027.950480.346339@sftp.fdrake.net> Message-ID: On Jan 23, 2004, at 5:29 PM, Fred L. Drake, Jr. wrote: > HotShot has had some attention, and has been found useful, but I don't > think of it as really finished. Issues have been reported relating to > making measurements when threads are involved, but I've not had any > time available for it. 
> > There is some documentation in the Library Reference: > > http://www.python.org/doc/current/lib/module-hotshot.html > > What's in the standard library contains the low-level profiler and a > hotshot.stats module that allows the same kind of analysis and > reporting as the pstats module supports for the older profile module. There was some note in a presentation you gave quite a while ago that HotShot could profile C extension functions, but I believe that would have required a change to the main python interpreter loop. Does it actually allow the profiling of extension functions, and was it via some previously existing mechanism in python, or did the interpreter change? (at least the last time I was in eval_frame, CALL_FUNCTION wasn't profiled if PyCFunction_Check returned true - I actually modified this at one point except in cases where fast_cfunction was used, but at least in 2.2.2 the standard distribution didn't seem to have the ability to profile extension functions). -- Nick From fdrake at acm.org Fri Jan 23 18:34:37 2004 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Fri Jan 23 18:34:47 2004 Subject: [Python-Dev] Hotshot In-Reply-To: References: <16401.41027.950480.346339@sftp.fdrake.net> Message-ID: <16401.44941.810970.351551@sftp.fdrake.net> Nick Bastin writes: > There was some note in a presentation you gave quite a while ago that > HotShot could profile C extension functions, but I believe that would > have required a change to the main python interpreter loop. Does it It would have required an interpreter change which, I believe, was never made. There may be a patch for this on SourceForge; I don't remember. You can search the patch and bug trackers for "hotshot" to see what's there; I afraid I don't remember the specific changes that were involved. 
The mechanism discussed at the time would have allowed measuring the time spent in calls to PyCFunction objects (extension functions), but would not have provided per-line information for those functions. The goal was more to subtract the time spent in C code from the time allocated to the Python function itself than to actually provide measurement of the C function itself (though some information falls out of that naturally). -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From nbastin at opnet.com Fri Jan 23 18:44:27 2004 From: nbastin at opnet.com (Nick Bastin) Date: Fri Jan 23 18:44:27 2004 Subject: [Python-Dev] Hotshot In-Reply-To: <16401.44941.810970.351551@sftp.fdrake.net> References: <16401.41027.950480.346339@sftp.fdrake.net> <16401.44941.810970.351551@sftp.fdrake.net> Message-ID: <14FA083A-4DFE-11D8-B47B-000393CBDF94@opnet.com> On Jan 23, 2004, at 6:34 PM, Fred L. Drake, Jr. wrote: > Nick Bastin writes: >> There was some note in a presentation you gave quite a while ago that >> HotShot could profile C extension functions, but I believe that would >> have required a change to the main python interpreter loop. Does it > > It would have required an interpreter change which, I believe, was > never made. There may be a patch for this on SourceForge; I don't > remember. You can search the patch and bug trackers for "hotshot" to > see what's there; I afraid I don't remember the specific changes that > were involved. I can certainly contribute a patch that would allow this to occur - I've modified the interpreter to allow this in 2.2.2 (I still use the old profiler, but the patch isn't profiler specific, of course) in most cases, but I felt that the patch was relatively hackish, and it didn't work for fast_cfunction. That being said, I can contribute it and maybe somebody will be interested in making it work in all cases..
:-) > The mechanism discussed at the time would have allowed measuring the > time spent in calls to PyCFunction objects (extension functions), but > would not have provided per-line information for those functions. Of course, it would be assumed that any kind of line coverage analysis would not work in extension functions. Someone could provide a simple preprocessor for extension module code that would allow them to be built in such a way to provide that information (as many commercial tools do), but that's a bit out of the scope of what I'm interested in at the moment. -- Nick From bob at redivi.com Fri Jan 23 19:19:11 2004 From: bob at redivi.com (Bob Ippolito) Date: Fri Jan 23 19:16:34 2004 Subject: [Python-Dev] test_poll fails when building python 2.3.3 on osx In-Reply-To: <1D4CD545-4DF6-11D8-8229-000A95D06D16@solution2u.net> References: <1D4CD545-4DF6-11D8-8229-000A95D06D16@solution2u.net> Message-ID: On Jan 23, 2004, at 5:47 PM, Jodok Batlogg wrote: > i'm trying to build py2.3.3 on osx (10.3) with gcc3.3 (and 3.1) but > test_poll always fails. The poll interface is emulated in OS X 10.3, the Darwin kernel doesn't actually work like that. It might be broken, but in practice that doesn't really matter because poll is very rarely used. From the header file: This file, and the accompanying "poll.c", implement the System V poll(2) system call for BSD systems (which typically do not provide poll()). Poll() provides a method for multiplexing input and output on multiple open file descriptors; in traditional BSD systems, that capability is provided by select(). While the semantics of select() differ from those of poll(), poll() can be readily emulated in terms of select() -- which is how this function is implemented. In short, I wouldn't worry about it. It's probably a bug on Apple's part, but I'd be surprised if it would ever cause you a real problem.
I've never seen Python code in the wild that uses select.poll() as the default I/O multiplexing mechanism (they all use select.select() instead). -bob From oussoren at cistron.nl Sat Jan 24 02:28:16 2004 From: oussoren at cistron.nl (Ronald Oussoren) Date: Sat Jan 24 02:28:20 2004 Subject: [Python-Dev] test_poll fails when building python 2.3.3 on osx In-Reply-To: References: <1D4CD545-4DF6-11D8-8229-000A95D06D16@solution2u.net> Message-ID: On 24 jan 2004, at 1:19, Bob Ippolito wrote: > > On Jan 23, 2004, at 5:47 PM, Jodok Batlogg wrote: > >> i'm trying to build py2.3.3 on osx (10.3) with gcc3.3 (and 3.1) but >> test_poll always fails. > > The poll interface is emulated in OS X 10.3, the Darwin kernel doesn't > actually work like that. It might be broken, but in practice that > doesn't really matter because poll is very rarely used. > > From the header file: > This file, and the accompanying "poll.c", implement the System V > poll(2) system call for BSD systems (which typically do not provide > poll()). Poll() provides a method for multiplexing input and output > on multiple open file descriptors; in traditional BSD systems, that > capability is provided by select(). While the semantics of select() > differ from those of poll(), poll() can be readily emulated in terms > of select() -- which is how this function is implemented. > > In short, I wouldn't worry about it. It's probably a bug on Apple's > part, but I'd be surprised if it would ever cause you a real problem. > I've never seen Python code in the wild that uses select.poll() as the > default I/O multiplexing mechanism (they all use select.select() > instead).
C code that implements the same test as the failing test in test_poll also fails, so yet this seems to be a bug in OSX. I haven't seen the specs for poll() though, it might also be a bug in the testcase :-). Whichever it may be, this is a harmless test failure. Ronald From ncoghlan at iinet.net.au Sat Jan 24 02:48:54 2004 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Sat Jan 24 02:52:17 2004 Subject: [Python-Dev] test_poll fails when building python 2.3.3 on osx In-Reply-To: References: <1D4CD545-4DF6-11D8-8229-000A95D06D16@solution2u.net> Message-ID: <40122366.1030604@iinet.net.au> Ronald Oussoren wrote: > > C code that implements the same test as the failing test in test_poll > also fails, so yet this seems to be a bug in OSX. I haven't seen the > specs for poll() though, it might also be a bug in the testcase :-). > Whichever it may be, this is a harmless test failure. Should it become a skip/expected failure for OSX then? Regards, Nick. -- Nick Coghlan | Brisbane, Australia Email: ncoghlan@email.com | Mobile: +61 409 573 268 From bob at redivi.com Sat Jan 24 03:06:10 2004 From: bob at redivi.com (Bob Ippolito) Date: Sat Jan 24 03:03:41 2004 Subject: [Python-Dev] test_poll fails when building python 2.3.3 on osx In-Reply-To: <40122366.1030604@iinet.net.au> References: <1D4CD545-4DF6-11D8-8229-000A95D06D16@solution2u.net> <40122366.1030604@iinet.net.au> Message-ID: <2C14A6EF-4E44-11D8-BCF3-000A95686CD8@redivi.com> On Jan 24, 2004, at 2:48 AM, Nick Coghlan wrote: > Ronald Oussoren wrote: >> C code that implements the same test as the failing test in test_poll >> also fails, so yet this seems to be a bug in OSX. I haven't seen the >> specs for poll() though, it might also be a bug in the testcase :-). >> Whichever it may be, this is a harmless test failure. > > Should it become a skip/expected failure for OSX then? That's not a terrible idea, but as Ronald said, maybe someone should make sure the test itself is correct first? 
If it's skipped/expected, then surely nobody will look further into it ;) -bob From walter at livinglogic.de Sat Jan 24 07:03:01 2004 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Sat Jan 24 07:03:09 2004 Subject: [Python-Dev] Hotshot In-Reply-To: <16401.41027.950480.346339@sftp.fdrake.net> References: <16401.41027.950480.346339@sftp.fdrake.net> Message-ID: <40125EF5.90005@livinglogic.de> Fred L. Drake, Jr. wrote: > Nick Bastin writes: > > Does anybody know the current state of hotshot? I read on some of the > > twisted mailing lists a while back that someone tried it but had some > > problems (can't remember what off the top of my head...have to search > > the archives), and was wondering if it was regarded as complete (and if > > there was any documentation that talks about how the code coverage > > aspect is supposed to be used). > > HotShot has had some attention, and has been found useful, but I don't > think of it as really finished. What's missing from hotshot is a little script that makes it possible to use hotshot from the command line (just like profile.py), i.e. something like this:

import sys, os.path, hotshot, hotshot.stats

prof = hotshot.Profile("hotshot.prof")
filename = sys.argv[1]
del sys.argv[0]
sys.path.insert(0, os.path.dirname(filename))
prof.run("execfile(%r)" % filename)
prof.close()
stats = hotshot.stats.load("hotshot.prof")
stats.sort_stats("time", "calls")
stats.print_stats()

The biggest problem I had with hotshot is the filesize. I was using the above script to profile a script which normally runs for about 10-15 minutes. After ca. 20 minutes the size of hotshot.prof was over 1 gig. Is there any possibility to reduce the filesize?
Bye, Walter Dörwald From skip at pobox.com Sat Jan 24 14:59:41 2004 From: skip at pobox.com (Skip Montanaro) Date: Sat Jan 24 14:59:48 2004 Subject: [Python-Dev] Hotshot In-Reply-To: <40125EF5.90005@livinglogic.de> References: <16401.41027.950480.346339@sftp.fdrake.net> <40125EF5.90005@livinglogic.de> Message-ID: <16402.52909.543920.394048@montanaro.dyndns.org> Walter> What's missing from hotshot is a little script that makes it Walter> possible to use hotshot from the command line (just like Walter> profile.py), i.e. something like this: ... Walter, Any objection if I check in the attached modification of your script as .../Tools/scripts/hotshotmain.py? It doesn't seem to work with the -p flag. Unfortunately, I have to take my son to hockey practice, so I can't debug that problem right now. Skip -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/octet-stream Size: 1608 bytes Desc: hotshot main program Url : http://mail.python.org/pipermail/python-dev/attachments/20040124/1c52d8ec/attachment.obj From ncoghlan at iinet.net.au Sat Jan 24 18:13:39 2004 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Sat Jan 24 18:16:41 2004 Subject: [Python-Dev] Hotshot In-Reply-To: <16402.52909.543920.394048@montanaro.dyndns.org> References: <16401.41027.950480.346339@sftp.fdrake.net> <40125EF5.90005@livinglogic.de> <16402.52909.543920.394048@montanaro.dyndns.org> Message-ID: <4012FC23.7090707@iinet.net.au> Skip Montanaro wrote: > Any objection if I check in the attached modification of your script as > .../Tools/scripts/hotshotmain.py? It doesn't seem to work with the -p > flag. Unfortunately, I have to take my son to hockey practice, so I can't > debug that problem right now.
From modified script:

=======================
profile = PROFILE
for opt, arg in opts:
    if opt in ("-h", "--help"):
        usage()
        return 0
    if opt in ("-p", "--profile"):
        profile = arg
        return 0
========================

Should that second 'return 0' be there? Regards, Nick. -- Nick Coghlan | Brisbane, Australia Email: ncoghlan@email.com | Mobile: +61 409 573 268 From nbastin at opnet.com Sat Jan 24 19:04:51 2004 From: nbastin at opnet.com (Nick Bastin) Date: Sat Jan 24 19:04:58 2004 Subject: [Python-Dev] Performance regression tests? Message-ID: <18DA3013-4ECA-11D8-8A4C-000393CBDF94@opnet.com> Does there exist a set of interpreter performance regression tests? For example, I've modified the interpreter to allow the profiling of C extension functions, and want to make sure that the performance impact on non-profiled applications is negligible. Does a test suite for this already exist (basically, something that calls a very lightweight C extension function a few hundred thousand times)? -- Nick From aahz at pythoncraft.com Sat Jan 24 19:18:32 2004 From: aahz at pythoncraft.com (Aahz) Date: Sat Jan 24 19:18:36 2004 Subject: [Python-Dev] Performance regression tests? In-Reply-To: <18DA3013-4ECA-11D8-8A4C-000393CBDF94@opnet.com> References: <18DA3013-4ECA-11D8-8A4C-000393CBDF94@opnet.com> Message-ID: <20040125001832.GA17462@panix.com> On Sat, Jan 24, 2004, Nick Bastin wrote: > > Does there exist a set of interpreter performance regression tests? > For example, I've modified the interpreter to allow the profiling of C > extension functions, and want to make sure that the performance impact > on non-profiled applications is negligible. Does a test suite for this > already exist (basically, something that calls a very lightweight C > extension function a few hundred thousand times)? Pystone is the main thing. There's now also ParrotBench.
-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "The joy of coding Python should be in seeing short, concise, readable classes that express a lot of action in a small amount of clear code -- not in reams of trivial code that bores the reader to death." --GvR From tim.one at comcast.net Sat Jan 24 22:25:10 2004 From: tim.one at comcast.net (Tim Peters) Date: Sat Jan 24 22:25:20 2004 Subject: [Python-Dev] Oodles of warnings in cjkcodecs In-Reply-To: <20040123144206.GA76699@i18n.org> Message-ID: [Hye-Shik Chang, on compiler warnings in cjkcodecs] > Okay. It's fixed in CVS. If you find other warnings not fixed by > this, please let me know. I wasn't able to get the new files from SourceForge before tonight, and all the warnings under MSVC6 went away when I finally got them. Even better, these tests pass now. Thank you! From skip at pobox.com Sat Jan 24 22:41:00 2004 From: skip at pobox.com (Skip Montanaro) Date: Sat Jan 24 22:41:22 2004 Subject: [Python-Dev] Performance regression tests? In-Reply-To: <20040125001832.GA17462@panix.com> References: <18DA3013-4ECA-11D8-8A4C-000393CBDF94@opnet.com> <20040125001832.GA17462@panix.com> Message-ID: <16403.15052.180604.163884@montanaro.dyndns.org> >> Does there exist a set of interpreter performance regression tests? aahz> Pystone is the main thing. There's now also ParrotBench. Also, MAL's pybench. Skip From skip at pobox.com Sat Jan 24 22:35:37 2004 From: skip at pobox.com (Skip Montanaro) Date: Sat Jan 24 23:33:33 2004 Subject: [Python-Dev] Hotshot In-Reply-To: <4012FC23.7090707@iinet.net.au> References: <16401.41027.950480.346339@sftp.fdrake.net> <40125EF5.90005@livinglogic.de> <16402.52909.543920.394048@montanaro.dyndns.org> <4012FC23.7090707@iinet.net.au> Message-ID: <16403.14729.424627.586488@montanaro.dyndns.org> >> Unfortunately, I have to take my son to hockey practice, so I can't >> debug that problem right now.
Nick> From modified script: Nick> ======================= Nick> profile = PROFILE Nick> for opt, arg in opts: Nick> if opt in ("-h", "--help"): Nick> usage() Nick> return 0 Nick> if opt in ("-p", "--profile"): Nick> profile = arg Nick> return 0 Nick> ======================== Nick> Should that second 'return 0' be there? Ah, yeah. Cut-n-paste bug. Thx, Skip From aahz at pythoncraft.com Sun Jan 25 12:17:46 2004 From: aahz at pythoncraft.com (Aahz) Date: Sun Jan 25 12:17:50 2004 Subject: [Python-Dev] PyCon core sprint? Message-ID: <20040125171745.GA16073@panix.com> Guido posted a while back that he didn't have time to organize a sprint. Anyone want to volunteer? -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "The joy of coding Python should be in seeing short, concise, readable classes that express a lot of action in a small amount of clear code -- not in reams of trivial code that bores the reader to death." --GvR From jeremy at alum.mit.edu Sun Jan 25 13:37:55 2004 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Sun Jan 25 13:45:27 2004 Subject: [Python-Dev] PyCon core sprint? In-Reply-To: <20040125171745.GA16073@panix.com> References: <20040125171745.GA16073@panix.com> Message-ID: <1075055875.24805.69.camel@localhost.localdomain> On Sun, 2004-01-25 at 12:17, Aahz wrote: > Guido posted a while back that he didn't have time to organize a sprint. > Anyone want to volunteer? Here's my suggestion: If you are interested in coming for a topic-to-be-determined core sprint, please sign up in the Wiki: http://www.python.org/cgi-bin/moinmoin/SprintPlan2004 We can find an organizer and topics once we know who is coming. Jeremy From guido at python.org Sun Jan 25 16:59:16 2004 From: guido at python.org (Guido van Rossum) Date: Sun Jan 25 16:59:24 2004 Subject: [Python-Dev] Draft: PEP for imports In-Reply-To: Your message of "Mon, 19 Jan 2004 16:45:29 EST." 
<20040119214528.GA9767@panix.com> References: <20040119214528.GA9767@panix.com> Message-ID: <200401252159.i0PLxGj18366@c-24-5-183-134.client.comcast.net> > Here's the draft PEP for imports. I'll wait a few days for comments > before sending to the PEPmeister. I've seen a few suggestions of using slashes instead of dots, enabling the use of ../ for relative import. I believe these are misguided. This would blur the distinction between the filesystem and the package namespace; next, people would be expecting to be able to say from /home/guido/lib/python import neatTricks and then all hell breaks loose: what would the package name for neatTricks be? Should it be a toplevel module, 'neatTricks'? Or should it be called 'home.guido.lib.python.neatTricks'? Neither is acceptable. So please don't include that in the PEP -- the idea is dead on arrival. [...] > Here are the contenders: > > * One from Guido: > > :: > > from .foo import > > and > > :: > > from ...foo import > > These two forms have a couple of different suggested semantics. One > semantic is to make each dot represent one level. There have been > many complaints about the difficulty of counting dots. Another > option is to only allow one level of relative import. That misses a > lot of functionality, and people still complained about missing the > dot in the one-dot form. The final option is to define an algorithm > for finding relative modules and packages; the objection here is > "Explicit is better than implicit". The PEP should specify the algorithm proposed (which is to search up from the containing package until foo is found). [...] > * Finally, some people dislike the way you have to change ``import`` > to ``from ... import`` when you want to dig inside a package. They > suggest completely rewriting the ``import`` syntax: > > :: > > from MODULE import NAMES as RENAME searching HOW > > or > > :: > > import NAMES as RENAME from MODULE searching HOW > [from NAMES] [in WHERE] import ...
Nobody seems to have questioned this syntax. But what does NAMES stand for? The current renaming syntax lets you write e.g. import foo as bar, spam as ham It doesn't make sense to write import foo, spam as one_name and import foo, spam as bar, ham is definitely worse IMO (not to mention that this already has a meaning today). So then the two questions about the first example are: - how strongly does the "searching HOW" clause bind? Do you write import foo as bar searching XXX, spam as ham searching XXX or import foo as bar, spam as ham searching XXX ??? Also, the PEP should mention the choices for HOW (I don't recall what they are). And your syntax should show which parts are optional, e.g. [from MODULE] import NAME [as RENAME], NAME [as RENAME], ... [searching HOW] I'm totally lost with the second alternative. Maybe examples of both should be given, unless you want to abandon these right away (fine with me :-). Anyway, stick it in the PEP archive already. You can still edit it later. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From kbk at shore.net Sun Jan 25 23:16:04 2004 From: kbk at shore.net (Kurt B. Kaiser) Date: Sun Jan 25 23:16:08 2004 Subject: [Python-Dev] Weekly Python Bug/Patch Summary Message-ID: <87y8rvz6nv.fsf@hydra.localdomain> Skip has kindly agreed to pass the maintenance of the report on to me. I'll give it a shot for a while. Please let me know if you see any problems and/or if you have suggestions for improvements. Thanks to Skip for creating and maintaining this report over the last couple of years, and thanks to Jeremy and Skip for the code I've been able to build on! -- KBK From kbk at shore.net Sun Jan 25 23:16:59 2004 From: kbk at shore.net (Kurt B.
Kaiser) Date: Sun Jan 25 23:17:10 2004 Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <87ptd7z6mc.fsf@hydra.localdomain> Patch / Bug Summary ___________________ Patches : 241 open ( +4) / 2299 closed ( +1) / 2540 total ( +5) Bugs : 677 open (+18) / 3870 closed ( +8) / 4547 total (+26) RFE : 124 open ( -2) / 116 closed ( +5) / 240 total ( +3) New / Reopened Patches ______________________ Fix typo in usage message(prechm.py) (2004-01-20) http://python.org/sf/880552 opened by George Yoshida incorrect end-of-message handling in mailbox.BabylMailbox (2004-01-20) http://python.org/sf/880621 opened by KAJIYAMA, Tamito logging says to call shutdown - make it more likely (2004-01-21) http://python.org/sf/881676 opened by Jim Jewett look in libbsd.a also for openpty and forkpty (2004-01-21) http://python.org/sf/881820 opened by David Dyck dynamic execution profiling vs opcode prediction (2004-01-25) http://python.org/sf/884022 opened by Andrew I MacIntyre Patches Closed ______________ add link to DDJ's article (2004-01-05) http://python.org/sf/871402 closed by fdrake New / Reopened Bugs ___________________ socket line buffering (2004-01-18) http://python.org/sf/879399 opened by Jean M. Brouwers Segmentantion Fault in bsddb (2004-01-20) http://python.org/sf/880531 opened by Rafael Ugolini RedHat RPM for 2.3.2 fails PIL install, unlike source Py (2004-01-20) http://python.org/sf/880634 opened by Gerry Kirk "ez" format code for ParseTuple() (2004-01-20) http://python.org/sf/880951 opened by Jon Willeke Float infinity unpicklable with BINFLOAT-using protocols (2004-01-20) http://python.org/sf/880990 opened by Jp Calderone 2.3.3 make fails build posix_openpty' (2004-01-21) http://python.org/sf/880996 opened by David Dyck Overflow in Python Profiler (2004-01-21) http://python.org/sf/881261 opened by Jakob Schiøtz Shelve slow after 7/8000 key (2004-01-21) http://python.org/sf/881522 opened by Marco Beri Error in obmalloc.
(2004-01-21) http://python.org/sf/881641 opened by Aki Tossavainen xml.sax.handler.ContentHandler missing argument (2004-01-21) http://python.org/sf/881707 opened by James Kew configure warning / sys/un.h: present but cannot be compiled (2004-01-21) http://python.org/sf/881765 opened by David Dyck 2.3.3 hangs in test_fork1 and test_threadedtempfile (2004-01-21) http://python.org/sf/881768 opened by David Dyck 2.4a0 build fails in Modules/signalmodule.c (2004-01-21) http://python.org/sf/881812 opened by David Dyck type of Py_UNICODE depends on ./configure options (2004-01-22) http://python.org/sf/881861 opened by Biehunko Michael Error in obmalloc. (2004-01-21) http://python.org/sf/881641 reopened by mwh socket's makefile file object doesn't work with timeouts. (2004-01-22) http://python.org/sf/882297 opened by Jeremy Fincher boolobject.h documentation missing from Python/C API (2004-01-22) http://python.org/sf/882332 opened by Greg Chapman bundlebuilder --lib mishandles frameworks (2004-01-23) http://python.org/sf/883283 opened by Russell Owen quopri encoding & Unicode (2004-01-24) http://python.org/sf/883466 opened by Stuart Bishop python crash in pyexpat's XmlInitUnknownEncodingNS (2004-01-24) http://python.org/sf/883495 opened by Matthias Klose locale specific problem in test_strftime.py (2004-01-24) http://python.org/sf/883604 opened by George Yoshida distutils doesn't see user-directories set via VisStudio 7.1 (2004-01-25) http://python.org/sf/883969 opened by Mark Hammond Possible incorrect code generation (2004-01-24) http://python.org/sf/883987 opened by Bill Rubenstein opcode prediction distorts dynamic execution profile stats (2004-01-25) http://python.org/sf/884020 opened by Andrew I MacIntyre MIMEMultipart _subparts constructor unusable (2004-01-25) http://python.org/sf/884030 opened by Amit Aronovitch metaclasses and __unicode__ (2004-01-25) http://python.org/sf/884064 opened by Christoph Simon OSATerminology is broken (2004-01-25) 
http://python.org/sf/884085 opened by Bob Ippolito Bugs Closed ___________ logging handlers raise exception on level (2004-01-13) http://python.org/sf/876421 closed by vsajip Zipfile archive name can't be unicode (2004-01-16) http://python.org/sf/878120 closed by ssmmhh Segmentantion Fault in bsddb (2004-01-20) http://python.org/sf/880531 closed by niemeyer pwd and grp modules error msg improvements (2004-01-16) http://python.org/sf/878572 closed by bwarsaw Float infinity unpicklable with BINFLOAT-using protocols (2004-01-20) http://python.org/sf/880990 closed by tim_one Small problems with the csv module's documentation (2003-08-30) http://python.org/sf/797853 closed by montanaro Error in obmalloc. (2004-01-21) http://python.org/sf/881641 closed by mwh 2.3.3 hangs in test_fork1 and test_threadedtempfile (2004-01-21) http://python.org/sf/881768 closed by dcd Possible incorrect code generation (2004-01-24) http://python.org/sf/883987 closed by nnorwitz New / Reopened RFE __________________ Add ResolveFinderAliases to macostools module (2004-01-21) http://python.org/sf/881824 opened by Paul Donovan Enable command line exit in IDLE. (2004-01-22) http://python.org/sf/882171 opened by Raymond Hettinger Parsing userID:passwd in http_proxy (2004-01-22) http://python.org/sf/882397 opened by Marcin J. 
Kraszewski RFE Closed __________ Thai encoding alias for 'cp874' (2003-12-05) http://python.org/sf/854511 closed by loewis builtin save() function available interactively only (2003-12-01) http://python.org/sf/852222 closed by rhettinger All numbers are bitarrays too (2003-11-21) http://python.org/sf/846568 closed by rhettinger Requesting Mock Object support for unittest.TestCase (2003-03-22) http://python.org/sf/708125 closed by purcell Parsing userID:passwd in http_proxy (2004-01-22) http://python.org/sf/882397 closed by sjoerd From jeremy at alum.mit.edu Mon Jan 26 00:01:20 2004 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Mon Jan 26 00:13:06 2004 Subject: [Python-Dev] namespace for generator expressions Message-ID: <1075093277.24805.101.camel@localhost.localdomain> I'm happy to see progress on the generator expressions implementation, but I think the specification of namespaces, which is just a sketch, might be simplified. The details begin by saying that a generator expression is equivalent to creating an anonymous generator function and calling it. Good! I think that will match an experienced programmer's intuition about how it works. This simple model seems to be at odds with the later suggestion that all free variables are bound using default argument values. An anonymous generator function would just have free variables, bound by the usual rules. The PEP suggests that a binding using default argument values is what you want. It's hard to argue one way or another without compelling examples, but I think this is a case of special cases aren't special enough to break the rules. Are there any other cases where an expression is evaluated using non-current bindings? The only near comparison is with list comprehensions, which aren't lazy, but do follow the normal rules. Any code that depends on what binding is used for a free variable is going to be obscure. 
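The contrast being drawn here between early and late binding of free variables can be made concrete with a small sketch using a generator function and the default-argument trick; the function names are illustrative, not from the PEP:

```python
x = 1

def late_bound():
    # normal closure rule: 'x' is looked up when the body actually runs
    yield x

def early_bound(x=x):
    # default-argument trick: the value of 'x' is captured at
    # definition time and shadows the global inside the body
    yield x

g1, g2 = late_bound(), early_bound()
x = 2  # rebind the free variable before either generator runs
print(next(g1))  # -> 2: late binding sees the rebound value
print(next(g2))  # -> 1: early binding froze the original value
```

The question in the thread is which of these two behaviors a generator expression's free variables should follow.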
The example in the PEP is contrived, of course, but I can't imagine any code where it wouldn't be easy to achieve the desired result by using a new variable that is not rebound after the generator expression is defined. Do you think there is any significant gain in expressiveness? The PEP says that no examples have been found where it would be better to use the normal rule. On the other hand, it doesn't show any examples where it's obviously better to use the proposed rule. Another angle on this concern is the extra complexity it adds to the language reference and to other explanations of the language's name binding model (thinking of course materials, books, etc). The new rule adds non-trivial complexity to those rules, because we've got to distinguish two kinds of code blocks that follow different rules for free variables. Jeremy From walter at livinglogic.de Mon Jan 26 07:59:36 2004 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Mon Jan 26 07:59:46 2004 Subject: [Python-Dev] Hotshot In-Reply-To: <16402.52909.543920.394048@montanaro.dyndns.org> References: <16401.41027.950480.346339@sftp.fdrake.net> <40125EF5.90005@livinglogic.de> <16402.52909.543920.394048@montanaro.dyndns.org> Message-ID: <40150F38.704@livinglogic.de> Skip Montanaro wrote: > Walter> What's missing from hotshot is a little script that makes it > Walter> possible to use hotshot from the command line (just like > Walter> profile.py), i.e. something like this: > > ... > > Walter, > > Any objection if I check in the attached modification of your script as > .../Tools/scripts/hotshotmain.py? It doesn't seem to work with the -p > flag. Unfortunately, I have to take my son to hockey practice, so I can't > debug that problem right now. Maybe this script should use optparse instead of getopt?
Bye, Walter Dörwald From skip at pobox.com Mon Jan 26 08:47:24 2004 From: skip at pobox.com (Skip Montanaro) Date: Mon Jan 26 08:47:34 2004 Subject: [Python-Dev] Hotshot In-Reply-To: <40150F38.704@livinglogic.de> References: <16401.41027.950480.346339@sftp.fdrake.net> <40125EF5.90005@livinglogic.de> <16402.52909.543920.394048@montanaro.dyndns.org> <40150F38.704@livinglogic.de> Message-ID: <16405.6764.113143.290678@montanaro.dyndns.org> >> Any objection if I check in the attached modification of your script >> as .../Tools/scripts/hotshotmain.py? It doesn't seem to work with >> the -p flag. Unfortunately, I have to take my son to hockey >> practice, so I can't debug that problem right now. Walter> Maybe this script should use optparse instead of getopt? How would optparse be better than getopt, especially for simple use such as this? Skip From walter at livinglogic.de Mon Jan 26 10:19:28 2004 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Mon Jan 26 10:19:35 2004 Subject: [Python-Dev] Hotshot In-Reply-To: <16405.6764.113143.290678@montanaro.dyndns.org> References: <16401.41027.950480.346339@sftp.fdrake.net> <40125EF5.90005@livinglogic.de> <16402.52909.543920.394048@montanaro.dyndns.org> <40150F38.704@livinglogic.de> <16405.6764.113143.290678@montanaro.dyndns.org> Message-ID: <40153000.4010407@livinglogic.de> Skip Montanaro wrote: > >> Any objection if I check in the attached modification of your script > >> as .../Tools/scripts/hotshotmain.py? It doesn't seem to work with > >> the -p flag. Unfortunately, I have to take my son to hockey > >> practice, so I can't debug that problem right now. > > Walter> Maybe this script should use optparse instead of getopt? > > How would optparse be better than getopt, especially for simple use such as > this? I'd say that all new stuff should use optparse.
Bye, Walter Dörwald From guido at python.org Mon Jan 26 10:40:34 2004 From: guido at python.org (Guido van Rossum) Date: Mon Jan 26 10:40:41 2004 Subject: [Python-Dev] Hotshot In-Reply-To: Your message of "Mon, 26 Jan 2004 07:47:24 CST." <16405.6764.113143.290678@montanaro.dyndns.org> References: <16401.41027.950480.346339@sftp.fdrake.net> <40125EF5.90005@livinglogic.de> <16402.52909.543920.394048@montanaro.dyndns.org> <40150F38.704@livinglogic.de> <16405.6764.113143.290678@montanaro.dyndns.org> Message-ID: <200401261540.i0QFeY821535@c-24-5-183-134.client.comcast.net> > Walter> Maybe this script should use optparse instead of getopt? > > How would optparse be better than getopt, especially for simple use > such as this? Optparse is *always* better than getopt, once you're used to it. Even for a single option it's usually fewer lines of code. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Jan 26 10:51:13 2004 From: guido at python.org (Guido van Rossum) Date: Mon Jan 26 10:51:19 2004 Subject: [Python-Dev] namespace for generator expressions In-Reply-To: Your message of "Mon, 26 Jan 2004 00:01:20 EST." <1075093277.24805.101.camel@localhost.localdomain> References: <1075093277.24805.101.camel@localhost.localdomain> Message-ID: <200401261551.i0QFpDs21676@c-24-5-183-134.client.comcast.net> > I'm happy to see progress on the generator expressions implementation, > but I think the specification of namespaces, which is just a sketch, > might be simplified. Ouch! Where were you when this PEP was discussed on python-dev? I was originally strongly in your camp, but Tim and several others convinced me that in every single case where a generator expression has a free variable, you want early binding, not late.
--Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy at alum.mit.edu Mon Jan 26 10:52:21 2004 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Mon Jan 26 10:58:28 2004 Subject: [Python-Dev] namespace for generator expressions In-Reply-To: <200401261551.i0QFpDs21676@c-24-5-183-134.client.comcast.net> References: <1075093277.24805.101.camel@localhost.localdomain> <200401261551.i0QFpDs21676@c-24-5-183-134.client.comcast.net> Message-ID: <1075132341.24805.128.camel@localhost.localdomain> On Mon, 2004-01-26 at 10:51, Guido van Rossum wrote: > > I'm happy to see progress on the generator expressions implementation, > > but I think the specification of namespaces, which is just a sketch, > > might be simplified. > > Ouch! Where were you when this PEP was discussed on python-dev? I > was originally strongly in your camp, but Tim and several others > convinced me that in every single case where a generator expression > has a free variable, you want early binding, not late. I guess I was busy with other things. I'm sorry I missed the discussion, but if it was at all substantial, the PEP didn't do a good job of capturing it. Jeremy From barry at python.org Mon Jan 26 11:36:48 2004 From: barry at python.org (Barry Warsaw) Date: Mon Jan 26 11:37:26 2004 Subject: [Python-Dev] Hotshot In-Reply-To: <200401261540.i0QFeY821535@c-24-5-183-134.client.comcast.net> References: <16401.41027.950480.346339@sftp.fdrake.net> <40125EF5.90005@livinglogic.de> <16402.52909.543920.394048@montanaro.dyndns.org> <40150F38.704@livinglogic.de> <16405.6764.113143.290678@montanaro.dyndns.org> <200401261540.i0QFeY821535@c-24-5-183-134.client.comcast.net> Message-ID: <1075135007.20208.6.camel@anthem> On Mon, 2004-01-26 at 10:40, Guido van Rossum wrote: > > Walter> Maybe this script should use optparse instead of getopt? > > > > How would optparse be better than getopt, especially for simple use > > such as this? > > Optparse is *always* better than getopt, once you're used to it. 
Even > for a single option it's usually fewer lines of code. And with Optparse it's much easier to add/remove options since each option's spec and handling is in one place, instead of spread out in several places. I do have a few minor nits with optparse though[1]. Even so, +1 on using Optparse for all new scripts. -Barry [1] e.g http://sourceforge.net/tracker/?group_id=5470&atid=105470&func=detail&aid=842213 From skip at pobox.com Mon Jan 26 11:40:59 2004 From: skip at pobox.com (Skip Montanaro) Date: Mon Jan 26 11:41:18 2004 Subject: [Python-Dev] Hotshot In-Reply-To: <200401261540.i0QFeY821535@c-24-5-183-134.client.comcast.net> References: <16401.41027.950480.346339@sftp.fdrake.net> <40125EF5.90005@livinglogic.de> <16402.52909.543920.394048@montanaro.dyndns.org> <40150F38.704@livinglogic.de> <16405.6764.113143.290678@montanaro.dyndns.org> <200401261540.i0QFeY821535@c-24-5-183-134.client.comcast.net> Message-ID: <16405.17179.666968.327225@montanaro.dyndns.org> >> How would optparse be better than getopt, especially for simple use >> such as this? Guido> Optparse is *always* better than getopt, once you're used to it. Guido> Even for a single option it's usually fewer lines of code. Is this a pronouncement? A quick grep suggests that so far only Tools/scripts/diff.py uses optparse. I converted the script to use optparse. It will take some getting used to (just like getopt did). So, is this (hotshotmain.py) a useful standalone script? Given that hotshot is a package instead of a module I don't think it's possible to just tack a main() on to an existing module. 
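As a rough illustration of the kind of conversion being discussed, here is a single -p/--profile option handled in both styles; the option names and default are modeled on the quoted hotshotmain script, but the surrounding functions are hypothetical:

```python
import getopt
import optparse

PROFILE = "hotshot.prof"  # assumed default, as in the quoted script

def parse_with_getopt(argv):
    # getopt style: manual loop, manual default, manual dispatch
    profile = PROFILE
    opts, args = getopt.getopt(argv, "p:", ["profile="])
    for opt, arg in opts:
        if opt in ("-p", "--profile"):
            profile = arg
    return profile, args

def parse_with_optparse(argv):
    # optparse style: the option's spec, default, and help live in one place
    parser = optparse.OptionParser()
    parser.add_option("-p", "--profile", default=PROFILE,
                      help="profile data file (default: %default)")
    options, args = parser.parse_args(argv)
    return options.profile, args

print(parse_with_getopt(["-p", "out.prof", "script.py"]))
print(parse_with_optparse(["-p", "out.prof", "script.py"]))
# both -> ('out.prof', ['script.py'])
```

optparse also generates --help output for free, which is where it pulls ahead even for one option.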
Skip From barry at python.org Mon Jan 26 11:57:53 2004 From: barry at python.org (Barry Warsaw) Date: Mon Jan 26 11:58:11 2004 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib optparse.py, 1.5, 1.6 In-Reply-To: References: Message-ID: <1075136273.20208.8.camel@anthem> On Mon, 2004-01-26 at 11:42, fdrake@projects.sourceforge.net wrote: > Update of /cvsroot/python/python/dist/src/Lib > In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv14137 > > Modified Files: > optparse.py > Log Message: > don't wrap lines too late by default > closes SF bug #842213 Were you going to back port this to 2.3? -Barry From guido at python.org Mon Jan 26 11:57:44 2004 From: guido at python.org (Guido van Rossum) Date: Mon Jan 26 11:58:17 2004 Subject: [Python-Dev] PyCon Reminder: Early bird reg deadline 2/1 Message-ID: <200401261657.i0QGviR22138@c-24-5-183-134.client.comcast.net> Info ---- This is a reminder that the deadline for early bird registration for PyCon DC 2004 is February 1, 2004. Early bird registration is $175; after that, it will be $250 through March 17, then $300 at the door. (Note that a week ago I announced the wrong prices for non-early bird. :-) To register, visit: http://www.pycon.org/dc2004/register/ Plug ---- PyCon is the most important conference in the year for me. It brings together the largest group of Python users and developers imaginable. I come to PyCon for many reasons: to meet with other core Python developers face to face; to hear about the exciting things that users from all over the world are doing with Python; to interact in person with Python users of all types, from newbies to veterans of many years; to hear everybody's feedback on where Python is and where they'd like it to go; and perhaps most of all to continue to encourage this wonderful community by my presence, words and actions to grow, reach out to new users of all kinds, and keep listening to each other. 
I hope everyone who comes to the conference will return from it with a new or renewed feeling of excitement about Python, whether they are developers, sophisticated users, beginners, or even skeptical passers-by. The Python community includes everyone, from grade schoolers just learning about computer science to renowned scientists interested in using the best tools and business people looking for a secret weapon. I'm looking forward to seeing you all at PyCon! Background ---------- PyCon is a community-oriented conference targeting developers (both those using Python and those working on the Python project). It gives you opportunities to learn about significant advances in the Python development community, to participate in a programming sprint with some of the leading minds in the Open Source community, and to meet fellow developers from around the world. The organizers work to make the conference affordable and accessible to all. DC 2004 will be held March 24-26, 2004 in Washington, D.C. The keynote speaker is Mitch Kapor of the Open Source Applications Foundation (http://www.osafoundation.org/). There will be a four-day development sprint before the conference. We're looking for volunteers to help run PyCon. If you're interested, subscribe to http://mail.python.org/mailman/listinfo/pycon-organizers Don't miss any PyCon announcements! Subscribe to http://mail.python.org/mailman/listinfo/pycon-announce You can discuss PyCon with other interested people by subscribing to http://mail.python.org/mailman/listinfo/pycon-interest The central resource for PyCon DC 2004 is http://www.pycon.org/ --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Jan 26 11:55:45 2004 From: guido at python.org (Guido van Rossum) Date: Mon Jan 26 11:58:24 2004 Subject: [Python-Dev] Hotshot In-Reply-To: Your message of "Mon, 26 Jan 2004 10:40:59 CST." 
<16405.17179.666968.327225@montanaro.dyndns.org> References: <16401.41027.950480.346339@sftp.fdrake.net> <40125EF5.90005@livinglogic.de> <16402.52909.543920.394048@montanaro.dyndns.org> <40150F38.704@livinglogic.de> <16405.6764.113143.290678@montanaro.dyndns.org> <200401261540.i0QFeY821535@c-24-5-183-134.client.comcast.net> <16405.17179.666968.327225@montanaro.dyndns.org> Message-ID: <200401261655.i0QGtjj22096@c-24-5-183-134.client.comcast.net> > >> How would optparse be better than getopt, especially for > >> simple use such as this? > > Guido> Optparse is *always* better than getopt, once you're used > Guido> to it. Even for a single option it's usually fewer lines > Guido> of code. > > Is this a pronouncement? A quick grep suggests that so far only > Tools/scripts/diff.py uses optparse. It's not a pronouncement -- it's just a recommendation. I see no need to start converting existing code to optparse -- but for all new code, I recommend it. --Guido van Rossum (home page: http://www.python.org/~guido/) From marktrussell at btopenworld.com Mon Jan 26 12:15:38 2004 From: marktrussell at btopenworld.com (Mark Russell) Date: Mon Jan 26 12:15:31 2004 Subject: [Python-Dev] Hotshot In-Reply-To: <200401261655.i0QGtjj22096@c-24-5-183-134.client.comcast.net> References: <16401.41027.950480.346339@sftp.fdrake.net> <40125EF5.90005@livinglogic.de> <16402.52909.543920.394048@montanaro.dyndns.org> <40150F38.704@livinglogic.de> <16405.6764.113143.290678@montanaro.dyndns.org> <200401261540.i0QFeY821535@c-24-5-183-134.client.comcast.net> <16405.17179.666968.327225@montanaro.dyndns.org> <200401261655.i0QGtjj22096@c-24-5-183-134.client.comcast.net> Message-ID: <1075137338.812.10.camel@localhost> On Mon, 2004-01-26 at 16:55, Guido van Rossum wrote: > It's not a pronouncement -- it's just a recommendation. I see no need > to start converting existing code to optparse -- but for all new code, > I recommend it. 
Maybe the documentation should point people in the direction of optparse (e.g. by moving the optparse section ahead of getopt and putting a note like: This module exists primarily for backwards compatibility - the optparse module is recommended for all new code. at the top of the getopt section). Mark From fdrake at acm.org Mon Jan 26 12:10:08 2004 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Mon Jan 26 12:22:09 2004 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib optparse.py, 1.5, 1.6 In-Reply-To: <1075136273.20208.8.camel@anthem> References: <1075136273.20208.8.camel@anthem> Message-ID: <16405.18928.55314.169376@sftp.fdrake.net> Barry Warsaw writes: > Were you going to back port this to 2.3? No, but I've no objection if you do. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From barry at python.org Mon Jan 26 13:42:30 2004 From: barry at python.org (Barry Warsaw) Date: Mon Jan 26 13:42:40 2004 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib optparse.py, 1.5, 1.6 In-Reply-To: <16405.18928.55314.169376@sftp.fdrake.net> References: <1075136273.20208.8.camel@anthem> <16405.18928.55314.169376@sftp.fdrake.net> Message-ID: <1075142549.20208.18.camel@anthem> On Mon, 2004-01-26 at 12:10, Fred L. Drake, Jr. wrote: > Barry Warsaw writes: > > Were you going to back port this to 2.3? > > No, but I've no objection if you do. Cool, done. I just didn't want to step on your toes. -Barry From guido at python.org Mon Jan 26 13:59:58 2004 From: guido at python.org (Guido van Rossum) Date: Mon Jan 26 14:00:05 2004 Subject: [Python-Dev] Hotshot In-Reply-To: Your message of "Mon, 26 Jan 2004 17:15:38 GMT." 
<1075137338.812.10.camel@localhost> References: <16401.41027.950480.346339@sftp.fdrake.net> <40125EF5.90005@livinglogic.de> <16402.52909.543920.394048@montanaro.dyndns.org> <40150F38.704@livinglogic.de> <16405.6764.113143.290678@montanaro.dyndns.org> <200401261540.i0QFeY821535@c-24-5-183-134.client.comcast.net> <16405.17179.666968.327225@montanaro.dyndns.org> <200401261655.i0QGtjj22096@c-24-5-183-134.client.comcast.net> <1075137338.812.10.camel@localhost> Message-ID: <200401261900.i0QIxwS22975@c-24-5-183-134.client.comcast.net> > On Mon, 2004-01-26 at 16:55, Guido van Rossum wrote: > > It's not a pronouncement -- it's just a recommendation. I see no need > > to start converting existing code to optparse -- but for all new code, > > I recommend it. > > Maybe the documentation should point people in the direction of optparse > (e.g. by moving the optparse section ahead of getopt and putting a note > like: > > This module exists primarily for backwards compatibility - the > optparse module is recommended for all new code. > > at the top of the getopt section). This sounds awfully close to deprecation of getopt. I don't want to go that far -- I don't want people to worry that getopt might go away. It won't. --Guido van Rossum (home page: http://www.python.org/~guido/) From niemeyer at conectiva.com Mon Jan 26 15:31:22 2004 From: niemeyer at conectiva.com (Gustavo Niemeyer) Date: Mon Jan 26 15:37:27 2004 Subject: [Python-Dev] Re: PEP 326 now online In-Reply-To: <20040123094842.1BB0.JCARLSON@uci.edu> References: <20040109114039.FB88.JCARLSON@uci.edu> <20040123170130.GA10485@burma.localdomain> <20040123094842.1BB0.JCARLSON@uci.edu> Message-ID: <20040126203122.GA6867@burma.localdomain> [...] > The only question is where is a location we can put them without > polluting a random namespace with objects that don't fit. 
Math is out > (Min and Max aren't floating point numbers, which you can read more > about in the PEP), cmp.Min/Max are awkward, min.Min and max.Max seem > strange, operator.Min/Max seems like a reasonable location, but no one > has any feedback on that one yet. Again, I'm open for suggestions. My suggestion is to not introduce these objects at all, and if they're going to be introduced, there should be internal support for them in the interpreter, or they're meaningless. Every object with a custom cmp method would have to take care not to be greater than your objects. > > > Arguably, there aren't any usage cases where None is necessary. But yet > > > we have None. > > [...] > > > > Please, let's not ignore how these usage cases differ in wideness. > > Imagine for a moment that None didn't exist. You could create a > singleton instance of an object that represents the same idea as None > and make its boolean false. It would take <10 lines, and do /everything/ > that None does. [...] Please, let's not ignore how these usage cases differ in wideness. > Take a look at the first few examples in the 'Max Examples' section of > the PEP: http://www.python.org/peps/pep-0326.html#max-examples > Notice the reduction in code from using only numbers, to using None, to > using Max? Notice how each implementation got clearer? That is what > the PEP is about, making code clearer. Comments about the "Max Examples": - If the objects in the sequence implement their own cmp operator, you can't be sure your example will work. That turns these structures (call them as you want) into something useless. - "Max" is a possible return from this function. It means the code using your "findmin_Max" will have to "if foo == Max" somewhere, so it kills your argument of less code as well. [...] > An even better question would be, "How would two objects with __cmp__ > react?" Checking a few examples, it is whoever is on the left side of > the comparison operator. Exactly. That's the point. 
Your "Top" value is not really "Top", unless every object with custom comparison methods take care about it. [...] > The entire itertools module is filled with examples where just a few > lines of code can save time and effort in re-coding something that > should be there in the first place. The PEP argues the case that a > minimum and maximum value should be placed somewhere accessable. Understood. > As stated in the PEP: > "Independent implementations of the Min/Max concept by users desiring > such functionality are not likely to be compatible, and certainly will > produce inconsistent orderings. The following examples seek to show how > inconsistent they can be." (read the PEP for the examples) Independent implementations are not compatible for the same reason why your implementation is not compatible with the rest of the world. IOW, an indepent implementation would not respect your Max, just like any object with a custom comparison may not respect it. [...] > The only thing you've "presented" so far is that you think that the > objects are fundamentally useless. On the other hand, no less than 10 Sorry, but you're blind. Review my messages. > people here on python-dev and c.l.py (I didn't even announce it there) > have voiced that the objects themselves are useful if given a proper > name and location. The fact that other people agree with your suggestion doesn't affect my own opinion. [...] > > You're trying to introduce a new structure in the language just to turn > > these usage cases into slightly more readable code (in your opinion) > > but at the same time you don't want to use the features already in the > > language which would turn your own example into more readable code and > > would kill the need for the new presented structures. > > The minimum and maximum objects are not structures. They are singleton > instances with specific behavior when compared against other objects. > No more, no less. Heh.. pointless. 
> Hiding the special cases inside a class doesn't remove the special
> cases, it just moves them somewhere else. I could have certainly just
> given the class, which would have contained __cmp__ functions that
> looked something like this:
>
> class DijkstraSPElement_Max:
>     #initialization
>     def __cmp__(self, other):
>         return cmp(self.distance, other.distance)
>
> class DijkstraSPElement_None:
>     #initialization
>     def __cmp__(self, other):
>         if self.distance is None:
>             if other.distance is None:
>                 return 0
>             return 1
>         elif other.distance is None:
>             return -1
>         return cmp(self.distance, other.distance)
>
> Hey, it makes a good case for the PEP; maybe I'll stick it in there if I
> have time this afternoon. Thank you for the suggestion.

Ouch! It's getting worse. :-)

Put *that* example in your PEP, please:

class Node:
    def __init__(self, node, parent=None, distance=None, visited=False):
        self.node = node
        self.parent = parent
        self.distance = distance
        self.visited = visited

    def __cmp__(self, other):
        pair = self.distance, other.distance
        if None in pair:
            return cmp(*pair)*-1
        return cmp(*pair)

def DijkstraSP_table(graph, S, T):
    table = {}
    for node in graph.iterkeys():
        table[node] = Node(node)
    table[S] = Node(S, distance=0)
    cur = min(table.values())
    while (not cur.visited) and cur.distance != None:
        cur.visited = True
        for cdist, child in graph[cur.node]:
            ndist = cur.distance+cdist
            childnode = table[child]
            if not childnode.visited and \
               (childnode.distance is None or ndist < childnode.distance):
                childnode.distance = ndist
                childnode.parent = cur.node
        cur = min(table.values())
    if not table[T].visited:
        return None
    cur = T
    path = [T]
    while table[cur].parent is not None:
        path.append(table[cur].parent)
        cur = path[-1]
    path.reverse()
    return path

[...]
> > Using examples which are better written in a more clear way inside a PEP
> > which is proposing a structure for better readability of code is not a
> > good way to convince people.
>
> Are you saying that I should place comments describing what the
> algorithm is doing inside the examples? Perhaps.

No, I meant you should do something towards what I've presented above.

-- 
Gustavo Niemeyer
http://niemeyer.net

From tdelaney at avaya.com Mon Jan 26 16:50:04 2004
From: tdelaney at avaya.com (Delaney, Timothy C (Timothy))
Date: Mon Jan 26 16:54:15 2004
Subject: [Python-Dev] Hotshot
Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DE011DAA6D@au3010avexu1.global.avaya.com>

> From: Walter Dörwald
>
> The biggest problem I had with hotshot is the filesize. I was
> using the above script to profile a script which normally
> runs for about 10-15 minutes. After ca. 20 minutes the size
> of hotshot.prof was over 1 gig. Is there any possibility to
> reduce the filesize?

In my coverage tool (which I *still* haven't managed to get permission
to release!!!) I was storing arrays of file numbers (indexes into a
list), line numbers (into the file) and type (line, exception, etc).
Every thousand or so entries, I would take the arrays, do a `tostring()`
then compress the string. Then the 3 sets of data were written to file
as a binary pickle.

At the end of the run, a tuple was pickled for each file consisting of
things like the fully-qualified module name, the filename, number of
lines, the compressed source code and info about the members in the
module (e.g. type, starting line, end line, etc - useful for finding in
the source code). I also included the (compressed) stacktrace of any
exception which ended the coverage run.

This resulted in multi-megabyte files being reduced to a few kilobytes,
without a major memory or performance penalty, and all the data I needed
to display the results, i.e. no need for the coverage to be displayed on
the same computer. The compression wasn't as good as if I did larger
chunks, but the difference wasn't particularly significant - maybe a few
percentage points.
Also, since the chunks being compressed were fairly small, they didn't hurt the performance much. It does make the extractor much more complex, but I thought it was worth it. Tim Delaney From tdelaney at avaya.com Mon Jan 26 17:02:17 2004 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Mon Jan 26 17:06:04 2004 Subject: [Python-Dev] namespace for generator expressions Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DE011DAA80@au3010avexu1.global.avaya.com> > From: Guido van Rossum > > > I'm happy to see progress on the generator expressions > implementation, > > but I think the specification of namespaces, which is just a sketch, > > might be simplified. > > Ouch! Where were you when this PEP was discussed on python-dev? I > was originally strongly in your camp, but Tim and several others > convinced me that in every single case where a generator expression > has a free variable, you want early binding, not late. I think I agree with Jeremy. I was originally in the early-binding camp, but I think we're better off trusting the programmer. How often is a generator expression not going to be evaluated almost immediately? I guess, when they're passed to a function. But even in that case, how often are the bindings going to change? Except in pathological cases, they won't. Anyway, got a meeting to go to :( Tim Delaney From pje at telecommunity.com Mon Jan 26 17:36:19 2004 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon Jan 26 17:32:29 2004 Subject: [Python-Dev] namespace for generator expressions In-Reply-To: <338366A6D2E2CA4C9DAEAE652E12A1DE011DAA80@au3010avexu1.glob al.avaya.com> Message-ID: <5.1.0.14.0.20040126172420.03425dd0@mail.telecommunity.com> At 09:02 AM 1/27/04 +1100, Delaney, Timothy C (Timothy) wrote: > > From: Guido van Rossum > > > > > I'm happy to see progress on the generator expressions > > implementation, > > > but I think the specification of namespaces, which is just a sketch, > > > might be simplified. > > > > Ouch! 
Where were you when this PEP was discussed on python-dev? I
> > was originally strongly in your camp, but Tim and several others
> > convinced me that in every single case where a generator expression
> > has a free variable, you want early binding, not late.
>
>I think I agree with Jeremy. I was originally in the early-binding camp,
>but I think we're better off trusting the programmer.

The intent is that listcomps should be "safely" replaceable with genexprs
anywhere that an iterator is acceptable in place of a list.

>How often is a generator expression not going to be evaluated almost
>immediately? I guess, when they're passed to a function. But even in that
>case, how often are the bindings going to change? Except in pathological
>cases, they won't.

A trivial example is:

iterators = []
for i in range(5):
    iterators.append(x*2 for x in range(i))

print map(list, iterators)

If a listcomp is used, you get:

[[], [0], [0,2], [0,2,4], [0,2,4,6]]

If genexprs do late binding, you get:

[[0,2,4,6], [0,2,4,6], [0,2,4,6], [0,2,4,6], [0,2,4,6]]

which is a significant difference in semantics. To make this code work
with a late binding genexpr, it becomes necessary to create a function
definition, thus negating the benefit of using the genexpr in the first
place.

As to how common this is, ask how often you use a list comprehension
inside of another loop, and then ask how often you'd use a genexpr
instead if it was available. Finally, ask yourself whether you're likely
to remember that you need to totally rewrite the structure if you decide
to use a genexpr instead. :)

(Of course, I also often forget that free variables don't bind early in
nested functions! It always seems strange to me that I have to write a
function that returns a function in order to just define a function with
bound variables. Indeed, it's hard to think of a time where I've *ever*
wanted late binding of variables in a nested function.)
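The early/late distinction Phillip describes can be sketched with plain
nested generator functions, no genexpr syntax needed (the helper names
below are made up for illustration; this is not anyone's proposed
implementation):

```python
def make_doublers_late(n):
    """Each generator reads i when it finally runs (late binding)."""
    gens = []
    for i in range(n):
        def g():
            # i is looked up lazily, when the generator body executes,
            # so every generator sees the loop's final value of i.
            for x in range(i):
                yield x * 2
        gens.append(g())
    return [list(it) for it in gens]

def make_doublers_early(n):
    """Capture i at definition time via a default argument (early binding)."""
    gens = []
    for i in range(n):
        def g(i=i):  # the default binds the current value of i
            for x in range(i):
                yield x * 2
        gens.append(g())
    return [list(it) for it in gens]

# make_doublers_early(5) -> [[], [0], [0, 2], [0, 2, 4], [0, 2, 4, 6]]
# make_doublers_late(5)  -> [[0, 2, 4, 6]] * 5, since i is already 4 by
#                           the time any generator body runs
```

The `i=i` default-argument trick is the standard workaround for exactly
the late-binding surprise Phillip mentions in his closing parenthetical.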
From skip at pobox.com Mon Jan 26 17:42:22 2004
From: skip at pobox.com (Skip Montanaro)
Date: Mon Jan 26 17:42:45 2004
Subject: [Python-Dev] Hotshot
In-Reply-To: <338366A6D2E2CA4C9DAEAE652E12A1DE011DAA6D@au3010avexu1.global.avaya.com>
References: <338366A6D2E2CA4C9DAEAE652E12A1DE011DAA6D@au3010avexu1.global.avaya.com>
Message-ID: <16405.38862.629934.745988@montanaro.dyndns.org>

>> The biggest problem I had with hotshot is the filesize.

Tim> In my coverage tool (which I *still* haven't managed to get
Tim> permission to release!!!) I was storing arrays of file numbers
Tim> (indexes into a list), line numbers (into the file) and type (line,
Tim> exception, etc). Every thousand or so entries, I would take the
Tim> arrays, do a `tostring()` then compress the string. Then the 3 sets
Tim> of data were written to file as a binary pickle.

It seems to me it might be simpler to just write the profile file through
the gzip module and teach the hotshot.stats.load() function to recognize
such files and uncompress accordingly.

Skip

From jeremy at alum.mit.edu Mon Jan 26 19:51:34 2004
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon Jan 26 19:57:41 2004
Subject: [Python-Dev] namespace for generator expressions
In-Reply-To: <5.1.0.14.0.20040126172420.03425dd0@mail.telecommunity.com>
References: <5.1.0.14.0.20040126172420.03425dd0@mail.telecommunity.com>
Message-ID: <1075164693.24805.191.camel@localhost.localdomain>

On Mon, 2004-01-26 at 17:36, Phillip J.
Eby wrote:
> The intent is that listcomps should be "safely" replaceable with genexprs
> anywhere that an iterator is acceptable in place of a list.
>
> >How often is a generator expression not going to be evaluated almost
> >immediately? I guess, when they're passed to a function. But even in that
> >case, how often are the bindings going to change? Except in pathological
> >cases, they won't.
>
> A trivial example is:

Are there non-trivial examples? The PEP suggests that they exist, but
doesn't provide any.

> iterators = []
> for i in range(5):
>     iterators.append(x*2 for x in range(i))
>
> print map(list, iterators)
>
> If a listcomp is used, you get:
>
> [[], [0], [0,2], [0,2,4], [0,2,4,6]]
>
> If genexprs do late binding, you get:
>
> [[0,2,4,6], [0,2,4,6], [0,2,4,6], [0,2,4,6], [0,2,4,6]]

Note that Armin Rigo suggested a small change here a few weeks ago that
makes the list comp and the gen expr behave the same way. The range(i)
part of the gen expr is evaluated at the point of definition. The gen
expr, thus, translates to:

def f(it):
    for x in it:
        yield x * 2

iterators.append(f(range(i)))

As a result of this change, the only new scope is for the body of the
target expression. I don't know how that affects the earlier examples.

BTW is there good terminology for describing the parts of a list comp or
gen expr? How about the iterator, the conditions, and the expression?

Jeremy

From pje at telecommunity.com Mon Jan 26 21:10:09 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon Jan 26 21:06:22 2004
Subject: [Python-Dev] namespace for generator expressions
In-Reply-To: <1075164693.24805.191.camel@localhost.localdomain>
References: <5.1.0.14.0.20040126172420.03425dd0@mail.telecommunity.com>
 <5.1.0.14.0.20040126172420.03425dd0@mail.telecommunity.com>
Message-ID: <5.1.0.14.0.20040126210358.0305c610@mail.telecommunity.com>

At 07:51 PM 1/26/04 -0500, Jeremy Hylton wrote:
>On Mon, 2004-01-26 at 17:36, Phillip J.
Eby wrote:
> > The intent is that listcomps should be "safely" replaceable with genexprs
> > anywhere that an iterator is acceptable in place of a list.
> >
> > >How often is a generator expression not going to be evaluated almost
> > >immediately? I guess, when they're passed to a function. But even in that
> > >case, how often are the bindings going to change? Except in pathological
> > >cases, they won't.
> >
> > A trivial example is:
>
>Are there non-trivial examples? The PEP suggests that they exist, but
>doesn't provide any.

http://mail.python.org/pipermail/python-dev/2003-October/039323.html

I'd have posted a link to this first, but it took a while to track it
down.

> > iterators = []
> > for i in range(5):
> >     iterators.append(x*2 for x in range(i))
> >
> > print map(list, iterators)
> >
> > If a listcomp is used, you get:
> >
> > [[], [0], [0,2], [0,2,4], [0,2,4,6]]
> >
> > If genexprs do late binding, you get:
> >
> > [[0,2,4,6], [0,2,4,6], [0,2,4,6], [0,2,4,6], [0,2,4,6]]
>
>Note that Armin Rigo suggested a small change here a few weeks ago that
>makes the list comp and the gen expr behave the same way.

It's a start, but it doesn't address Tim's example.

>BTW is there good terminology for describing the parts of a list comp or
>gen expr? How about the iterator, the conditions, and the expression?

Sounds good to me. My contrived example demonstrated early binding for
the iterator. Tim's shows early binding for the conditions. I think
it's also straightforward to see that early binding for the expression
is similarly desirable.

If you're looking for more arguments, I'd suggest looking at all of Tim
Peters' posts for October under the subject "accumulator display syntax".
:) From jeremy at alum.mit.edu Mon Jan 26 21:04:30 2004 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Mon Jan 26 21:10:39 2004 Subject: [Python-Dev] namespace for generator expressions In-Reply-To: <5.1.0.14.0.20040126210358.0305c610@mail.telecommunity.com> References: <5.1.0.14.0.20040126172420.03425dd0@mail.telecommunity.com> <5.1.0.14.0.20040126172420.03425dd0@mail.telecommunity.com> <5.1.0.14.0.20040126210358.0305c610@mail.telecommunity.com> Message-ID: <1075169069.24805.194.camel@localhost.localdomain> On Mon, 2004-01-26 at 21:10, Phillip J. Eby wrote: > If you're looking for more arguments, I'd suggest looking at all of Tim > Peters' posts for October under the subject "accumulator display syntax". :) I'll try, but I was hoping the PEP author would help out here. An important part of the PEP process is documenting the rationale for a decision. I'm not opposed to reading through all the messages, but it's going to take a lot more time than reading a PEP. Jeremy From raymond.hettinger at verizon.net Mon Jan 26 21:19:55 2004 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Mon Jan 26 21:21:02 2004 Subject: [Python-Dev] Collections Message-ID: <002d01c3e47c$0e69d3c0$e841fea9@oemcomputer> The collections module is in the sandbox and is almost ready for primetime. If anyone has some time to look at it, I would appreciate it if you would email me with any comments or suggestions. Right now, it only has one type, deque(), which has a simple API consisting of append(), appendleft(), pop(), popleft(), and clear(). It also supports __len__(), __iter__(), __copy__(), and __reduce__(). It is fast, threadsafe, easy-to-use, and memory friendly (no calls to realloc() or memmove()). The sandbox includes unittests, a performance measurement tool, and a pure python equivalent. I've tried it out with the standard library and it drops in seamlessly with Queue.py, mutex.py, threading.py, shlex.py, and pydoc.py. 
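A quick sketch of the intended usage, based on the API described above
(illustrative only; the values are arbitrary):

```python
from collections import deque

d = deque()
d.append(1)          # add on the right end
d.append(2)
d.appendleft(0)      # add on the left end
first = d.popleft()  # removes and returns 0; O(1) at either end
last = d.pop()       # removes and returns 2
remaining = list(d)  # __iter__ support: what's left is [1]
d.clear()            # empty the deque in one call
```

Appending on one end and popping from the other is exactly the FIFO
pattern Queue.py needs, which is presumably why it drops in so easily.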
I looked at implementing this as a package, but that was overkill. If future additions need to be in a separate sourcefile, it is simple enough to load them into the collections namespace. Raymond Hettinger To run the sandbox code, type: python setup.py build install and then at the interactive prompt: from collections import deque -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20040126/796b5365/attachment.html From tim.one at comcast.net Mon Jan 26 22:51:04 2004 From: tim.one at comcast.net (Tim Peters) Date: Mon Jan 26 22:51:07 2004 Subject: [Python-Dev] namespace for generator expressions In-Reply-To: <1075164693.24805.191.camel@localhost.localdomain> Message-ID: [Jeremy] > Are there non-trivial examples? The PEP suggests that they exist, but > doesn't provide any. Well, this example *should* be both easy and natural -- but it turns out to be a disaster on two distinct counts if late binding is used: http://mail.python.org/pipermail/python-dev/2003-October/039323.html I'll repeat one bit from http://mail.python.org/pipermail/python-dev/2003-October/039328.html too: Whenever I've written a list-of-generators, or in the recent example a generator pipeline, I have found it semantically necessary, without exception so far, to capture the bindings of the variables whose bindings wouldn't otherwise be invariant across the life of the generator. [If] it turns out that this is always, or nearly almost always, the case, across future examples too, then it would just be goofy not to implement generator expressions that way ("well, yes, the implementation does do a wrong thing in every example we had, but what you're not seeing is that the explanation would have been a line longer had the implementation done a useful thing instead" ). > Note that Armin Rigo suggested a small change here a few weeks ago > that makes the list comp and the gen expr behave the same way. 
> The range(i) part of the gen expr is evaluated at the point of
> definition. The gen expr, thus, translates to:
>
> def f(it):
>     for x in it:
>         yield x * 2
>
> iterators.append(f(range(i)))
>
> As a result of this change, the only new scope is for the body of the
> target expression. I don't know how that affects the earlier
> examples.

The example in the first link above is:

pipe = source
for p in predicates:
    # add a filter over the current pipe, and call that the new pipe
    pipe = e for e in pipe if p(e)

"p" and "pipe" both vary, and disaster ensues if the bindings (for both)
in effect at the time of each binding (to "pipe") aren't used when the
last generator in the chain (the final binding of "pipe") is provoked
into delivering results.

If Armin's suggestion transforms that to

pipe = source
for p in predicates:
    def g(it):
        for e in it:
            if p(e):
                yield e
    pipe = g(pipe)

then the binding of "p" is left bound to predicates[-1] in all the
intermediate generators.

> BTW is there good terminology for describing the parts of a list comp
> or gen expr? How about the iterator, the conditions, and the
> expression?

Probably more descriptive than body, mind and soul. Fine by me!

From jcarlson at uci.edu Mon Jan 26 19:47:43 2004
From: jcarlson at uci.edu (Josiah Carlson)
Date: Mon Jan 26 23:02:24 2004
Subject: [Python-Dev] Re: PEP 326 now online
In-Reply-To: <20040126203122.GA6867@burma.localdomain>
References: <20040123094842.1BB0.JCARLSON@uci.edu>
 <20040126203122.GA6867@burma.localdomain>
Message-ID: <20040126131713.FBF6.JCARLSON@uci.edu>

[...]
> My suggestion is to not introduce these objects at all, and if they're
> going to be introduced, there should be internal support for them in the
> interpreter, or they're meaningless. Every object with a custom cmp
> method would have to take care not to be greater than your objects.

Interpretation: You are -1 on the PEP in general.
If it does get introduced, then you are +1 on there being a special case
in the interpreter for when Min or Max are in the comparison.

If given the proper values for PyMinObject and PyMaxObject (quite the
large 'if given'), it is a 5-line modification to PyObject_Compare in
object.c:

int PyObject_Compare(PyObject *v, PyObject *w)
{
    PyTypeObject *vtp;
    int result;

    if (v == NULL || w == NULL) {
        PyErr_BadInternalCall();
        return -1;
    }
    if (v == w)
        return 0;
    vtp = v->ob_type;
>   if (v == PyMinObject || w == PyMaxObject)
>       return -1;
>   else if (v == PyMaxObject || w == PyMinObject)
>       return 1;
>   else
    if (Py_EnterRecursiveCall(" in cmp"))
        return -1;
    result = do_cmp(v, w);
    Py_LeaveRecursiveCall();
    return result < 0 ? -1 : result;
}

Unfortunately the above seems to imply that Min and Max would need to be
builtins (I may be wrong here, please correct me if I am), and new
builtins have been frowned upon by most everyone since the beginning.

[...]
> > Take a look at the first few examples in the 'Max Examples' section of
> > the PEP: http://www.python.org/peps/pep-0326.html#max-examples
> > Notice the reduction in code from using only numbers, to using None, to
> > using Max? Notice how each implementation got clearer? That is what
> > the PEP is about, making code clearer.
>
> Comments about the "Max Examples":
>
> - If the objects in the sequence implement their own cmp operator, you
>   can't be sure your example will work. That turns these structures
>   (call them as you want) into something useless.

Those who want to use Max and/or Min, and want to implement their own
cmp operator - which uses non-standard behavior when comparing against
Max and/or Min - may have to deal with special cases involving Max
and/or Min. This makes sense because it is the same deal with any value
or object you want to result in non-standard behavior. If people want a
special case, then they must write the special case.

> - "Max" is a possible return from this function.
It means the code using
> your "findmin_Max" will have to "if foo == Max" somewhere, so it kills
> your argument of less code as well.

No, it doesn't. min(a, Max) will always return a.

I should have included a test for empty sequences as an argument in
order to differentiate between empty sequences and sequences that have
(0, None, Max) as their actual minimum values (in the related code
snippets). This results in the simplification of the (0, None) examples
into one, but introduces an index variable and sequence lookups for the
general case:

def findmin_General(seq):
    if not len(seq):
        raise TypeError("Sequence is empty")
    cur = seq[0]
    for i in xrange(1, len(seq)):
        cur = min(seq[i], cur)
    return cur

def findmin_Max(seq):
    if not len(seq):
        raise TypeError("Sequence is empty")
    cur = Max
    for obj in seq:
        cur = min(obj, cur)
    return cur

Now they both have the same number of lines, but I find the second one a
bit clearer due to its lack of sequence indexing.

> [...]
> > An even better question would be, "How would two objects with __cmp__
> > react?" Checking a few examples, it is whoever is on the left side of
> > the comparison operator.
>
> Exactly. That's the point. Your "Top" value is not really "Top", unless
> every object with custom comparison methods take care about it.

Certainly anyone who wants to use Max/Min may have to take care to use
it properly. How is this any different from the way we program with any
other special values?

Just to point things out, when comparing two standard python data types
(int, float, None, dict, list, tuple, str), cmp calls the left data
type's cmp operator. When comparing an object that is not a standard
python data type to anything else, cmp calls the leftmost non-standard
object's existent cmp operator. The described behavior looks to be
consistent with current CVS for object.c, in which the various cmp
functions are defined. Perhaps this behavior should be documented in the
customization documentation.

[...]
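For illustration, here is a rough user-level sketch of comparison
singletons along the lines being discussed. It is written with
rich-comparison methods rather than the __cmp__ style of the thread, and
the class and attribute names are made up; as Gustavo points out, a
pure-Python version like this cannot defend itself against arbitrary
custom comparison methods, which is exactly what the interpreter patch
would be for:

```python
class _Extreme(object):
    """Singleton comparing below (sign=-1) or above (sign=+1) everything.

    Mixed comparisons such as ``5 < Max`` work through reflection: int's
    __lt__ returns NotImplemented, so Python falls back to Max.__gt__(5).
    Equality is identity-based (the default), so Max == Max only.
    """
    def __init__(self, name, sign):
        self._name = name
        self._sign = sign

    def __lt__(self, other):
        return self._sign < 0 and other is not self

    def __gt__(self, other):
        return self._sign > 0 and other is not self

    def __le__(self, other):
        return self._sign < 0 or other is self

    def __ge__(self, other):
        return self._sign > 0 or other is self

    def __repr__(self):
        return self._name

Max = _Extreme('Max', +1)
Min = _Extreme('Min', -1)
```

With this sketch, ``min(obj, Max)`` always returns ``obj`` (so the
findmin_Max idiom above works), and ``sorted`` puts Min first and Max
last without any special-casing in the calling code.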
> > As stated in the PEP:
> > "Independent implementations of the Min/Max concept by users desiring
> > such functionality are not likely to be compatible, and certainly will
> > produce inconsistent orderings. The following examples seek to show how
> > inconsistent they can be." (read the PEP for the examples)
>
> Independent implementations are not compatible for the same reason why
> your implementation is not compatible with the rest of the world. IOW,
> an independent implementation would not respect your Max, just like any
> object with a custom comparison may not respect it.

One of 3 cases would occur:

1. If Max and Min are in the standard distribution, then the people who
   use it would write code that is compatible with it (ok).
2. Those that have no need for such extreme values will never write code
   that /could/ be incompatible with it (ok).
3. Those that don't know about Max/Min, or write their own
   implementations that overlap with included Python functionality,
   would not be supported (ok) (I hope the reasons for this are obvious;
   if they are not, I will clarify).

In any case, the PEP describes the behavior and argues about the
creation of the "One True Implementation of the One True Maximum Value
and One True Minimum Value", and including it in the standard Python
distribution in a reasonable location, with a name that is intuitive.

> [...]
> > The only thing you've "presented" so far is that you think that the
> > objects are fundamentally useless. On the other hand, no less than 10
>
> Sorry, but you're blind. Review my messages.

My mistake, you've also said that the examples were not good enough,
have recently given modifications to the examples that may make them
better, and have stated that if Max/Min were to be included, they should
have interpreter special cases.

> The fact that other people agree with your suggestion doesn't affect
> my own opinion.

Indeed it doesn't.
I was attempting to point out that though you think (in general) that
the inclusion of Max and Min values in Python is useless, others (beyond
myself) find that they do have uses, and would actually use them. See
this thread (ugly url ahead):
http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&safe=off&threadm=mailman.667.1074799564.12720.python-list%40python.org&rnum=1&prev=/groups%3Fhl%3Den%26lr%3D%26ie%3DISO-8859-1%26safe%3Doff%26q%3D326%26btnG%3DGoogle%2BSearch%26meta%3Dgroup%253Dcomp.lang.python.*

[...]
> > Hey, it makes a good case for the PEP; maybe I'll stick it in there if I
> > have time this afternoon. Thank you for the suggestion.
>
> Ouch! It's getting worse. :-)
>
> Put *that* example in your PEP, please:
> [...]

I like your optimization of comparisons to None, but the real difference
between using None and Max in the node elements is the following:

class DijkstraSPElement_Max:
    #initialization
    def __cmp__(self, other):
        return cmp(self.distance, other.distance)

class DijkstraSPElement_None:
    #initialization
    def __cmp__(self, other):
        pair = self.distance, other.distance
        if None in pair:
            return cmp(*pair)*-1
        return cmp(*pair)

I'll modify the Dijkstra example to include the objects. Really, they
show that for structures that have a comparison key and some additional
meta-information attached, Max and Min still win (if only because they
don't create and search a two-tuple).

Out of curiosity, why did you use cmp(*pair)*-1 and not -cmp(*pair)?

 - Josiah

From fdrake at acm.org Mon Jan 26 23:05:33 2004
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Mon Jan 26 23:05:43 2004 Subject: [Python-Dev] Hotshot In-Reply-To: <16405.38862.629934.745988@montanaro.dyndns.org> References: <338366A6D2E2CA4C9DAEAE652E12A1DE011DAA6D@au3010avexu1.global.avaya.com> <16405.38862.629934.745988@montanaro.dyndns.org> Message-ID: <16405.58253.84956.701712@sftp.fdrake.net> Skip Montanaro writes: > It seems to me it might be simpler to just write the profile file through > the gzip module and teach the hotshot.stats.load() function to recognize > such files and uncompress accordingly. The problem with this is that the low-level log reader and writer are in C rather than Python (using fopen(), fwrite(), etc.). We could probably work out a reader that gets input buffers from an arbitrary Python file object, and maybe we could handle writing that way, but that does change the cost of each write. HotShot tries to compensate for that, but it's unclear how successful that is. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From guido at python.org Mon Jan 26 23:11:10 2004 From: guido at python.org (Guido van Rossum) Date: Mon Jan 26 23:11:17 2004 Subject: [Python-Dev] namespace for generator expressions In-Reply-To: Your message of "Mon, 26 Jan 2004 19:51:34 EST." <1075164693.24805.191.camel@localhost.localdomain> References: <5.1.0.14.0.20040126172420.03425dd0@mail.telecommunity.com> <1075164693.24805.191.camel@localhost.localdomain> Message-ID: <200401270411.i0R4BA024001@c-24-5-183-134.client.comcast.net> > > A trivial example is: > > Are there non-trivial examples? The PEP suggests that they exist, but > doesn't provide any. 
> > iterators = []
> > for i in range(5):
> >     iterators.append(x*2 for x in range(i))
> >
> > print map(list, iterators)
> >
> > If a listcomp is used, you get:
> >
> > [[], [0], [0, 2], [0, 2, 4], [0, 2, 4, 6]]
> >
> > If genexprs do late binding, you get:
> >
> > [[0, 2, 4, 6], [0, 2, 4, 6], [0, 2, 4, 6], [0, 2, 4, 6], [0, 2, 4, 6]]

> Note that Armin Rigo suggested a small change here a few weeks ago that
> makes the list comp and the gen expr behave the same way.

I may be mistaken, but I had always assumed it should be the way Armin suggested.

> The range(i) part of the gen expr is evaluated at the point of
> definition. The gen expr, thus, translates to:
>
> def f(it):
>     for x in it:
>         yield x * 2
>
> iterators.append(f(range(i)))
>
> As a result of this change, the only new scope is for the body of the
> target expression.

I don't know how that affects the earlier examples. There were definitely examples along the lines of using 'i' in the target expression.

> BTW is there good terminology for describing the parts of a list comp or
> gen expr? How about the iterator, the conditions, and the expression?

Expression is too generic; I suggest target expression.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From jcarlson at uci.edu Mon Jan 26 20:32:16 2004 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue Jan 27 00:36:25 2004 Subject: [Python-Dev] PEP 326 (quick location possibility) Message-ID: <20040126164752.FBFB.JCARLSON@uci.edu>

What does everyone think of the following?

A few people have suggested that the Max and Min objects be available as the result of calls to max() and min(); this would solve the location and name problem, and shouldn't break any old code.

Current behavior of min and max:
1. without arguments raises a TypeError
2. with a single non-sequence argument raises a TypeError
3. with an empty sequence as an argument raises a ValueError
4. with one argument that is a non-empty sequence returns the min or max of the sequence
5.
with more than one argument returns the min or max of *args

If we assume that no one is calling min or max without arguments in order to raise a TypeError (which seems like a reasonable assumption), then replacing the TypeError exception for behavior 1 with the following seems reasonable:

min() -> Min
max() -> Max

It is also easy to wrap the current min/max functions in order to return the proper objects.

_min = min
def min(*args):
    if args:
        return _min(*args)
    return Min

If such behavior is desired, it would probably make sense to alter the behavior of min_max() in bltinmodule.c to return pointers to the objects (line 1094-1095 replaced with what is below):

else if (!PyArg_UnpackTuple(args, (op==Py_LT) ? "min" : "max", 1, 1, &v)) {
    v = (op==Py_LT) ? PyMinObject : PyMaxObject;
    Py_INCREF(v);
    return v;
}

By extension, it would make sense to alter the objects Min and Max to have their string representation be 'min()' and 'max()' respectively.

The above 5-line modification, coupled with the 5-line modification to PyObject_Compare (as posted in my previous post to python-dev, which shows the replacement of line 758 in object.c with 5 others), and a minimal C implementation of the Min and Max objects (hopefully less than ~50 lines), could bring the conceptual Min and Max objects into reality. They would also invariably be smaller or larger than all other objects, due to modifications to the interpreter. They would additionally not pollute any namespace (only being accessible via min() or max()), and be virtually self-documenting (min() returns the global minimum, max() returns the global maximum).

What does everyone think?
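For reference, the wrapper idea above can be emulated entirely in pure Python. The `_Extreme` helper below and its rich-comparison methods are illustrative stand-ins of my own, not the C-level objects with interpreter support that the PEP actually proposes:

```python
class _Extreme:
    """Sentinel that compares above (is_max=True) or below everything else."""
    def __init__(self, is_max, name):
        self.is_max = is_max
        self.name = name
    def __lt__(self, other):
        # Min is smaller than everything except itself; Max is smaller than nothing.
        return (not self.is_max) and other is not self
    def __gt__(self, other):
        return self.is_max and other is not self
    def __eq__(self, other):
        return other is self
    def __hash__(self):
        return id(self)
    def __repr__(self):
        return self.name

Min = _Extreme(False, 'min()')
Max = _Extreme(True, 'max()')

_min = min
def min(*args):
    # Behaviors 2-5 are unchanged; only the zero-argument case differs.
    if args:
        return _min(*args)
    return Min
```

With this sketch, min() returns the Min sentinel, min(a, Max) always returns a, and the sentinels sort below/above everything else, which matches the behavior argued for above.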
Thanks for reading,
 - Josiah

5-line replace of 758 in object.c included below for reference:

if (v == PyMinObject || w == PyMaxObject)
    return -1;
else if (v == PyMaxObject || w == PyMinObject)
    return 1;
else if (Py_EnterRecursiveCall(" in cmp"))

From jeremy at alum.mit.edu Tue Jan 27 02:02:25 2004 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Tue Jan 27 02:08:36 2004 Subject: [Python-Dev] PEP 326 (quick location possibility) In-Reply-To: <20040126164752.FBFB.JCARLSON@uci.edu> References: <20040126164752.FBFB.JCARLSON@uci.edu> Message-ID: <1075186945.24805.223.camel@localhost.localdomain>

On Mon, 2004-01-26 at 20:32, Josiah Carlson wrote:
> What does everyone think of the following?
>
> A few people have suggested that the Max and Min objects be available as
> the result of calls to max() and min(), this would solve the location
> and name problem, and shouldn't break any old code.
>
> Current behavior of min and max:
> 1. without arguments raises a TypeError
> 2. with a single non-sequence argument raises a TypeError
> 3. with an empty sequence as an argument raises a ValueError
> 4. with one argument that is a non-empty sequence returns the min or max
>    of the sequence
> 5. with more than one argument returns the min or max of *args
>
> If we assume that no one is calling min or max without arguments in
> order to raise a TypeError (which seems like a reasonable assumption),
> then replacing the TypeError exception for behavior 1 with the following
> seems reasonable:
> min() -> Min
> max() -> Max

I don't like it at all. min() (and max() ...) is supposed to return the smallest item from its arguments. Calling it with no arguments does not mean that we're trying to find the value that would be smaller than anything we passed to it, if we had passed something to it. It's just a programmer error.

I haven't made any comment on the PEP 326 thread, so I'll offer my two cents and leave it at that.
I can't recall ever needing a Min/Max object in eight years of Python programming. I admit its utility in some circumstances. It seems like a fine candidate for the Python Cookbook; perhaps if it sees widespread use we could consider putting it in the standard library one day. (And I'd call them bottom and top rather than min and max to avoid confusion with the functions.)

Jeremy

From gerrit at nl.linux.org Tue Jan 27 06:15:38 2004 From: gerrit at nl.linux.org (Gerrit Holl) Date: Tue Jan 27 06:20:26 2004 Subject: [Python-Dev] PEP 326 (quick location possibility) In-Reply-To: <20040126164752.FBFB.JCARLSON@uci.edu> References: <20040126164752.FBFB.JCARLSON@uci.edu> Message-ID: <20040127111538.GA2916@nl.linux.org>

Josiah Carlson wrote:
> Current behavior of min and max:
> 1. without arguments raises a TypeError
> 2. with a single non-sequence argument raises a TypeError
> 3. with an empty sequence as an argument raises a ValueError
> 4. with one argument that is a non-empty sequence returns the min or max
>    of the sequence
> 5. with more than one argument returns the min or max of *args

To be more precise: not sequence but iterable:

>>> def f():
...     yield 1; yield 2; yield 3
...
>>> max(f())
3

> If we assume that no one is calling min or max without arguments in
> order to raise a TypeError (which seems like a reasonable assumption),

I'm not sure. Suppose I collect a number of things in a collection and pass it to max with *: max(*mycollection). If, by error, mycollection is empty, I want a TypeError, not a silent failure because it suddenly started returning an object. Of course, I shouldn't use the '*' form then, but I think it's a realistic scenario.

> then replacing the TypeError exception for behavior 1 with the following
> seems reasonable:
> min() -> Min
> max() -> Max

I don't think I like this solution. I do like the idea of min/max, and had a recent use case when writing an Interval type, but I'd prefer a builtin type with a good name.
I think 'extreme' would be a good name, where 'extreme(True)' would return an object behaving like Highest and 'extreme(False)' would return an object behaving like Lowest.

yours,
Gerrit.

--
246. If a man hire an ox, and he break its leg or cut the ligament of its neck, he shall compensate the owner with ox for ox.
-- 1780 BC, Hammurabi, Code of Law
--
PrePEP: Builtin path type http://people.nl.linux.org/~gerrit/creaties/path/pep-xxxx.html
Asperger's Syndrome - a personal approach: http://people.nl.linux.org/~gerrit/english/

From tjreedy at udel.edu Tue Jan 27 06:41:24 2004 From: tjreedy at udel.edu (Terry Reedy) Date: Tue Jan 27 06:41:21 2004 Subject: [Python-Dev] Re: PEP 326 (quick location possibility) References: <20040126164752.FBFB.JCARLSON@uci.edu> Message-ID:

"Josiah Carlson" wrote in message news:20040126164752.FBFB.JCARLSON@uci.edu...
> What does everyone think of the following?
> By extension, it would make sense to alter the objects Min and Max to
> have their string representation be 'min()' and 'max()' respectively.

This strikes me as plausible for the following reason: int() and list(), for instance, instead of raising TypeError as some would expect or even want, return their additive identity elements. You are proposing that min() and max() return their respective comparative identity elements. There will not, however, be a literal for these, nor a builtin name.

Terry J. Reedy

From guido at python.org Tue Jan 27 08:08:45 2004 From: guido at python.org (Guido van Rossum) Date: Tue Jan 27 08:08:49 2004 Subject: [Python-Dev] PEP 326 (quick location possibility) In-Reply-To: Your message of "Mon, 26 Jan 2004 17:32:16 PST." <20040126164752.FBFB.JCARLSON@uci.edu> References: <20040126164752.FBFB.JCARLSON@uci.edu> Message-ID: <200401271308.i0RD8jQ24906@c-24-5-183-134.client.comcast.net>

> What does everyone think of the following?
[making min() and max() called without args return the extremes]

I've already commented on this before. It is still unacceptable.
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jepler at unpythonic.net Tue Jan 27 08:32:43 2004 From: jepler at unpythonic.net (Jeff Epler) Date: Tue Jan 27 08:32:47 2004 Subject: [Python-Dev] Re: PEP 326 now online In-Reply-To: <20040126131713.FBF6.JCARLSON@uci.edu> References: <20040123094842.1BB0.JCARLSON@uci.edu> <20040126203122.GA6867@burma.localdomain> <20040126131713.FBF6.JCARLSON@uci.edu> Message-ID: <20040127133243.GE24260@unpythonic.net>

On Mon, Jan 26, 2004 at 04:47:43PM -0800, Josiah Carlson wrote:
> This results in the simplification of the (0, None) examples into one,
> but introduces an index variable and sequence lookups for the general
> case:
>
> def findmin_General(seq):
>     if not len(seq):
>         raise TypeError("Sequence is empty")
>     cur = seq[0]
>     for i in xrange(1, len(seq)):
>         cur = min(seq[i], cur)
>     return cur
>
> def findmin_Max(seq):
>     if not len(seq):
>         raise TypeError("Sequence is empty")
>     cur = Max
>     for obj in seq:
>         cur = min(obj, cur)
>     return cur
>
> Now they both have the same number of lines, but I find the second one
> a bit clearer due to its lack of sequence indexing.

I'd avoid the sequence indexing by writing the following:

def findmin_Iter(seq):
    seq = iter(seq)
    cur = seq.next()
    for i in seq:
        cur = min(i, cur)
    return cur

This code will raise StopIteration, not TypeError, if the sequence is empty.

Jeff
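A small variant of the iterator-based function above keeps min()'s behavior of raising ValueError for an empty sequence instead of letting StopIteration escape. This is a sketch: the name `findmin_iter` is illustrative, and the modern `next(it)` spelling is used in place of `seq.next()`:

```python
def findmin_iter(seq):
    # Take the first element as the starting point; report an empty
    # sequence the same way min([]) reports it.
    it = iter(seq)
    try:
        cur = next(it)
    except StopIteration:
        raise ValueError("sequence is empty")
    for obj in it:
        cur = min(obj, cur)
    return cur
```

This version works on any iterable (including generators) and needs no sequence indexing and no sentinel value at all.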
From ark-mlist at att.net Tue Jan 27 11:08:55 2004 From: ark-mlist at att.net (Andrew Koenig) Date: Tue Jan 27 11:08:42 2004 Subject: [Python-Dev] Re: PEP 326 (quick location possibility) In-Reply-To: Message-ID: <001001c3e4ef$dd9e4680$6402a8c0@arkdesktop> > > By extension, it would make sense to alter the objects Min and Max to > > have their string representation be 'min()' and 'max()' respectively. > > This strikes me as plausible for the following reason: int() and list(), > for instance, instead of raising TypeError as some would expect or even > want, return their additive identity elements. You are proposing that > min() and max() return their respective comparitive identity elements. Unfortunately, no. The identity element for min is Max, and the identity element for max is Min. From ark-mlist at att.net Tue Jan 27 11:10:40 2004 From: ark-mlist at att.net (Andrew Koenig) Date: Tue Jan 27 11:10:26 2004 Subject: [Python-Dev] PEP 326 (quick location possibility) In-Reply-To: <20040126164752.FBFB.JCARLSON@uci.edu> Message-ID: <001101c3e4f0$1c25d030$6402a8c0@arkdesktop> > If we assume that no one is calling min or max without arguments in > order to raise a TypeError (which seems like a reasonable assumption), > then replacing the TypeError exception for behavior 1 with the following > seems reasonable: > min() -> Min > max() -> Max I consider that behavior Unreasonable. What's reasonable is min() -> Max max() -> Min From guido at python.org Tue Jan 27 11:55:56 2004 From: guido at python.org (Guido van Rossum) Date: Tue Jan 27 11:56:03 2004 Subject: [Python-Dev] PEP 326 (quick location possibility) In-Reply-To: Your message of "Tue, 27 Jan 2004 11:10:40 EST." 
<001101c3e4f0$1c25d030$6402a8c0@arkdesktop> References: <001101c3e4f0$1c25d030$6402a8c0@arkdesktop> Message-ID: <200401271655.i0RGtuh25326@c-24-5-183-134.client.comcast.net>

> > If we assume that no one is calling min or max without arguments in
> > order to raise a TypeError (which seems like a reasonable assumption),
> > then replacing the TypeError exception for behavior 1 with the following
> > seems reasonable:
> > min() -> Min
> > max() -> Max
>
> I consider that behavior Unreasonable. What's reasonable is
>
> min() -> Max
> max() -> Min

That it is possible to disagree about whether min() should return Min or Max is enough to veto this idea.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From niemeyer at conectiva.com Tue Jan 27 17:05:34 2004 From: niemeyer at conectiva.com (Gustavo Niemeyer) Date: Tue Jan 27 17:11:40 2004 Subject: [Python-Dev] Re: PEP 326 now online In-Reply-To: <20040126131713.FBF6.JCARLSON@uci.edu> References: <20040123094842.1BB0.JCARLSON@uci.edu> <20040126203122.GA6867@burma.localdomain> <20040126131713.FBF6.JCARLSON@uci.edu> Message-ID: <20040127220534.GA6886@burma.localdomain>

> [...]
> > My suggestion is to not introduce these objects at all, and if they're
> > going to be introduced, there should be internal support for them in the
> > interpreter, or they're meaningless. Every object with a custom cmp
> > method would have to take care not to be greater than your objects.
>
> Interpretation:
> You are -1 on the PEP in general. If it does get introduced, then you
> are +1 on there being a special case in the interpreter for when Min or
> Max are in the comparison. That's it.
> If given the proper values for PyMinObject and PyMaxObject (quite the
> large 'if given'), it is a 5-line modification to PyObject_Compare in
> object.c:
[...]

I know it's easy, I just don't think it's worth it. Anyway, you've explained it in the above sentence.
> > Comments about the "Max Examples": > > > > - If the objects in the sequence implement their own cmp operator, you > > can't be sure your example will work. That turns these structures > > (call them as you want) into something useless. > > > Those who want to use Max and/or Min, and want to implement their own > cmp operator - which uses non-standard behavior when comparing against Max > and/or Min - may have to deal with special cases involving Max and/or > Min. > > This makes sense because it is the same deal with any value or object > you want to result in non-standard behavior. If people want a special > case, then they must write the special case. What I mean is that I'd have to review every generic class I ever wrote which is supposed to be compatible with the standard library to check if they're compatible with Min/Max. I don't like this approach, but that's just my opinion. > > - "Max" is a possible return from this function. It means the code using > > your "findmin_Max" will have to "if foo == Max" somewhere, so it kills > > your argument of less code as well. > > No, it doesn't. > min(a, Max) will always return a. Interesting, you say that it doesn't, but... > I should have included a test for empty sequences as an argument in > order to differentiate between empty sequences and sequences that have > (0, None, Max) as their actual minimum values (in the related code > snippets). > > This results in the simplification of the (0, None) examples into one, > but introduces an index variable and sequence lookups for the general > case: [...] > Now they both have the same number of lines, but I find the second one > a bit clearer due its lack of sequence indexing. ... you actually agree. :-) > > [...] > > > An even better question would be, "How would two objects with __cmp__ > > > react?" Checking a few examples, it is whoever is on the left side of > > > the comparison operator. > > > > Exactly. That's the point. 
Your "Top" value is not really "Top", unless > > every object with custom comparison methods take care about it. > > Certainly anyone who wants to use Max/Min may have to take care to use > it properly. How is this any different from the way we program with any > other special values? I don't want to use them, but I'd have to be careful so that other people might use them. Again, I don't like this approach. [...] > One of 3 cases would occur: > 1. If Max and Min are in the standard distribution, then the people who > use it, would write code that is compatible with it (ok). Wrong. People which use Max/Min might be unaware that they must take care when building __cmp__ so that their classes are compatible with Min/Max. > 2. Those that have no need for such extreme values will never write code > that /could/ be incompatible with it (ok). Wrong. Those that have no need for such extreme values will have even more chances to do that. > 3. Those that don't know about Max/Min, or write their own > implementations that overlap with included Python functionality, would > not be supported (ok) (I hope the reasons for this are obvious, if they > are not, I will clarify). Same case. > In any case, the PEP describes the behavior and argues about the > creation of the "One True Implementation of the One True Maximum Value > and One True Minimum Value", and including it in the standard Python > distribution in a reasonable location, with a name that is intuitive. Ok, and I thank you for you patience defending the matter. Opinions diverge, and the divergences should be explained. Once everyone is aware about the issues, someone will have to decide if it is implemented or not (I'm glad it's not my job :-). > I like your optimization of comparisons to None, but the real difference > between using None and Max in the node elements are the following: [...] Oops.. the real difference is exposed with any alternative implementation. 
Introducing code which "emulates" your own scheme is pretty tendentious.

> Out of curiosity, why did you use cmp(*pair)*-1 and not -cmp(*pair)?

Because that's the first implementation that came into my mind.

--
Gustavo Niemeyer
http://niemeyer.net

From python at rcn.com Tue Jan 27 17:10:34 2004 From: python at rcn.com (Raymond Hettinger) Date: Tue Jan 27 17:12:03 2004 Subject: [Python-Dev] PEP 326 (quick location possibility) In-Reply-To: <20040126164752.FBFB.JCARLSON@uci.edu> Message-ID: <006401c3e522$62dceb80$e841fea9@oemcomputer>

> What does everyone think?
>
> Thanks for reading,
> - Josiah
>
> 5-line replace of 758 in object.c included below for reference:
> if (v == PyMinObject || w == PyMaxObject)
>     return -1;
> else if (v == PyMaxObject || w == PyMinObject)
>     return 1;
> else if (Py_EnterRecursiveCall(" in cmp"))

-1 on *any* modification of the existing builtin functions. It is best to keep the proposal pure and de-coupled by just having a pair of objects that return always-under or always-over in a comparison.

If you return to that foundation, the open questions are:

* putting it someplace useful, yet out of the way, without introducing any weirdness (if there's nothing else quite like it, it is probably a bad idea).

* determining that it is actually worth it. Jeremy never needed it in eight years, though if it had been available I'll bet he would have used it. If the need arises only once a year, then it is probably too rare (that threshold varies with the hardness of the problem, i.e. you may only need the logging module once per year, but when you need it, there is no cheap, well thought-out, category-killing alternative). In this case, the problem is so immediately solvable that it hardly counts as even a minor speed bump. My sense is that Jeremy is on target with his suggestion that this be a recipe.
We've certainly reached the point where more time and effort has been spent discussing it than could ever be saved by the introduction of the min/max constants. To date, the arguments tend to emphasize purity over practicality, and the original motivation may have been replaced by an understandable desire to push a PEP through the acceptance stage. IOW, "we've come this far and suffered so much" is not really a good reason for pressing on. Raymond Hettinger From eppstein at ics.uci.edu Tue Jan 27 18:39:06 2004 From: eppstein at ics.uci.edu (David Eppstein) Date: Tue Jan 27 18:39:11 2004 Subject: [Python-Dev] Re: PEP 326 (quick location possibility) References: <20040126164752.FBFB.JCARLSON@uci.edu> Message-ID: In article <20040126164752.FBFB.JCARLSON@uci.edu>, Josiah Carlson wrote: > A few people have suggested that the Max and Min objects be available as > the result of calls to max() and min(), this would solve the location > and name problem, and shouldn't break any old code. I would probably use min() and max() over someobscuremodule.UniversalLowerBoundObject and someobscuremodule.UniversalUpperBoundObject. I very much like the idea of PEP 326 (making proper universal min and max objects instead of encouraging those of us who know about it to use None or forcing us to make half-working versions ourselves), and I feel strongly that (if something like PEP 326 is accepted) min() and max() should return the Min and Max objects, as should min([]) and max([]). All that said, I don't think making the actual names of these objects be min() and max() would lead to the most readable code, and I'm still hopeful that more readable names can be found. Smallest and Largest, maybe? -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. 
of California, Irvine, School of Information & Computer Science

From eppstein at ics.uci.edu Tue Jan 27 18:41:23 2004 From: eppstein at ics.uci.edu (David Eppstein) Date: Tue Jan 27 18:50:38 2004 Subject: [Python-Dev] Re: PEP 326 (quick location possibility) References: <001001c3e4ef$dd9e4680$6402a8c0@arkdesktop> Message-ID:

In article <001001c3e4ef$dd9e4680$6402a8c0@arkdesktop>, "Andrew Koenig" wrote:

> Unfortunately, no. The identity element for min is Max, and the identity
> element for max is Min.

Oops, you're right! Then, definitely, min() and max() should not be the official names of the universal min and max objects.

--
David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. of California, Irvine, School of Information & Computer Science

From andrew.luscombe at tenixsolutions.com Tue Jan 27 21:17:01 2004 From: andrew.luscombe at tenixsolutions.com (Luscombe Andrew) Date: Tue Jan 27 21:17:18 2004 Subject: [Python-Dev] extended if and while statements Message-ID: <6000255C3751D711BBE20008C719D5BD123236@lmtmelsas03.lmt.com.au>

This e-mail describes a suggestion for an extension to the "if" and "while" statements so that programmers have an option to use them like switch statements. The resulting statements seem to me to read more like natural language than similar constructs with switch statements.

The following examples illustrate the basic idea:

if book_count() ..
    == 1: purchase_9_books()
    == 2: purchase_8_books()
    < 5: purchase_5_books()
else:
    rearrange_bookshelves()

which is pretty much like a switch statement and would be logically equivalent to the following:

nBooks = book_count()
if nBooks == 1:
    purchase_9_books()
elif nBooks == 2:
    purchase_8_books()
elif nBooks < 5:
    purchase_5_books()
else:
    rearrange_bookshelves()

and the following shows a similar thing for the while statement:

while book_count() ..
    < 10:
        purchase_book()
        put_book_on_shelf()
    > 20:
        put_in_storage( book[-1] )
        rearrange_bookshelves()

which would be logically equivalent to the following:

while 1:
    nBooks = book_count()
    if nBooks < 10:
        purchase_book()
        put_book_on_shelf()
    elif nBooks > 20:
        put_in_storage( book[-1] )
        rearrange_bookshelves()
    else:
        break

The ".." punctuation placed where a ":" would normally be indicates the extended usage, and means that the result of the expression evaluated immediately after the "while" or "if" will be prepended to each of the following partial conditional statements, with the first true one causing the statements following it to be executed. The original punctuation for "if" and "while" statements with its original meaning would be retained.

There would be a number of options to choose in implementing the extension. For example the ".." symbol might be a ":-" or something else. But a more interesting choice would be whether or not to allow an "else" within a while loop as follows, and whether or not to continue looping when the "else" path is taken:

while book_count() ..
    < 10:
        purchase_book()
        put_book_on_shelf()
    > 20:
        put_in_storage( book[-1] )
        rearrange_bookshelves()
else:
    drink_a_beer()

perhaps by default the else path should break, but if it ends in a "continue" then it should keep looping. The reasoning behind this is that due to the natural language meaning of the word "while" the loop should repeat while one of the conditions is true, and this is not the case in the else path. So an else within the loop would do nothing different from an else immediately after the while loop unless it has a continue - in which case it would keep looping, and there had better be a break in one of the other paths.
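For comparison, the looping dispatch described above can be approximated in today's Python with a table of (predicate, action) pairs. Everything below (`dispatch_while`, the book helpers) is an illustrative sketch, not part of the proposal:

```python
def dispatch_while(get_value, cases):
    """Re-evaluate get_value(); run the first matching action, else stop."""
    while True:
        value = get_value()
        for predicate, action in cases:
            if predicate(value):
                action()
                break
        else:
            break  # no condition matched: leave the loop

# Mirror of the book-count example: buy books until the shelf holds 10.
books = [0] * 5
dispatch_while(
    lambda: len(books),
    [
        (lambda n: n < 10, lambda: books.append(0)),   # purchase_book()
        (lambda n: n > 20, lambda: books.pop()),       # put_in_storage()
    ],
)
```

The for/else in `dispatch_while` plays the role the proposal gives to the implicit "no condition matched" exit of the extended while.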
Of course, it would also be possible to not allow this type of "else" statement, and then if you want such functionality you could either make it out of separate ifs and whiles, or achieve a similar result by using a comparison with an extreme value as the last condition, for example:

while book_count() ..
    < 10:
        purchase_book()
        put_book_on_shelf()
    > 20:
        put_in_storage( book[-1] )
        rearrange_bookshelves()
    > -1:
        drink_a_beer()

Or, you might not care about any conflict with the natural language meaning of the word while, and just make the else path not break by default.

Another similar choice is whether or not to have some code after the conditional code that is executed after any of the conditional statements, for example:

while book_count() ..
    < 10:
        purchase_book()
        put_book_on_shelf()
    > 20:
        put_in_storage( book[-1] )
        rearrange_bookshelves()
    drink_a_beer()

which would be equivalent to:

while 1:
    nBooks = book_count()
    if nBooks < 10:
        purchase_book()
        put_book_on_shelf()
    elif nBooks > 20:
        put_in_storage( book[-1] )
        rearrange_bookshelves()
    else:
        break
    drink_a_beer()

Another thing that some people might see as an issue is that a two-ended range condition with current operators and functions might not run very efficiently. For example:

if book_count() ..
    in range(1,6): purchase_book()
    == 6: drink_a_beer()

One way around this would be if the interpreter/compiler is able to recognise such structures and compile an efficient version. I guess it could do this when compiling the "in" operator followed by the range() function anywhere in the program, but then it's not really an "if" statement issue anymore.

I've had a go at writing a formal definition of the extended statements:

if <expression> .. (<condition>: <suite>)* [else: <suite>]
while <expression> .. (<condition>: <suite>)* [<suite>] [else: <suite>]

This e-mail is only intended to be read by the named recipient. It may contain information which is confidential, proprietary or the subject of legal privilege.
If you are not the intended recipient you must delete this e-mail and may not use any information contained in it. Legal privilege is not waived because you have read this e-mail.

From python at rcn.com Tue Jan 27 23:07:42 2004 From: python at rcn.com (Raymond Hettinger) Date: Tue Jan 27 23:09:02 2004 Subject: [Python-Dev] extended if and while statements In-Reply-To: <6000255C3751D711BBE20008C719D5BD123236@lmtmelsas03.lmt.com.au> Message-ID: <009c01c3e554$478c5aa0$e841fea9@oemcomputer>

> if book_count() ..
>     == 1: purchase_9_books()
>     == 2: purchase_8_books()
>     < 5: purchase_5_books()
> else:
>     rearrange_bookshelves()

When the action clause is listed on a separate line (as it should be, and as it must be for multi-line blocks), the result is not as pretty and shows that an additional level of indentation is required with almost no pay-off:

if book_count() ..
    == 1:
        purchase_9_books()
    == 2:
        purchase_8_books()
    < 5:
        purchase_5_books()
else:
    rearrange_bookshelves()

I don't see this as a win over the following equivalent:

> nBooks = book_count()
> if nBooks == 1:
>     purchase_9_books()
> elif nBooks == 2:
>     purchase_8_books()
> elif nBooks < 5:
>     purchase_5_books()
> else:
>     rearrange_bookshelves()

I recommend teasing out the idea on comp.lang.python. There will be no shortage of criticisms on every possible variation. There have been many suggestions for a switch statement and a couple of them were elegant and pythonic.
Here is one variant:

select book_count():
    case 1: purchase_9_books()
    case 2: purchase_8_books()
    case < 5: purchase_5_books()
    else: rearrange_bookshelves()

Here is a link to one of the many threads on ideas for python switch-case statements:

http://groups.google.com/groups?th=1fafecbd71f20d27

Raymond Hettinger

From martin at v.loewis.de Wed Jan 28 01:03:44 2004
From: martin at v.loewis.de (Martin v. Löwis)
Date: Wed Jan 28 01:04:01 2004
Subject: [Python-Dev] extended if and while statements
In-Reply-To: <6000255C3751D711BBE20008C719D5BD123236@lmtmelsas03.lmt.com.au>
References: <6000255C3751D711BBE20008C719D5BD123236@lmtmelsas03.lmt.com.au>
Message-ID: <401750C0.8020401@v.loewis.de>

Luscombe Andrew wrote:
> This e-mail describes a suggestion for an extension to the "if" and "while"
> statements so that programmers have an option to use them like switch
> statements.
> The resulting statements seem to me to read more like natural
> language than similar constructs with switch statements.

See PEP 275.

Martin

From jcarlson at uci.edu Wed Jan 28 02:05:21 2004
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed Jan 28 05:58:52 2004
Subject: [Python-Dev] PEP 326 (quick location possibility)
In-Reply-To: <006401c3e522$62dceb80$e841fea9@oemcomputer>
References: <20040126164752.FBFB.JCARLSON@uci.edu> <006401c3e522$62dceb80$e841fea9@oemcomputer>
Message-ID: <20040127214512.0659.JCARLSON@uci.edu>

Dear python-dev,

I took a pass through the archives of January, just to get an actual number on what people think of the ideas in the PEP. At the end of this email is a listing in +/- 1/0 format (FYI, I'm not counting myself in these), using only the latest +/- 1/0 comments from publicly posted emails to either python-dev or c.l.py.

The 4 that are against (-1) the PEP being included with Python or in the standard library, are /developers of/ Python.
The 12 that are in favor (+1) of the PEP being included with Python or in the standard library, are /users of/ Python.

I'll not attempt to analyze the reasons behind this split in opinion, for various reasons. I will state that the likelihood of it being accepted, with such strong core developer dislike for the PEP, is likely represented by some low probability that is arbitrarily close to 0, but not 'None', not 'not None', perhaps 'not not None', but not any other such foolishness (come on, it's a joke, laugh).

If there is a core Python developer who thinks that this functionality should be included with Python (accessible via some method on a builtin, name, attribute, etc.), I'm sure we'd like to hear from you. If there does not exist a core Python developer who thinks that this functionality should be included with Python, I'll spend the next few days (maybe a week) updating the PEP, include some final words, and include a reasonably* named pair of objects in a reasonably named module for people to use if they want to. At some point in the future, if there exists support from the user community, we could discuss its inclusion/exclusion in the standard library.

I hope this is reasonable to everyone with an opinion on PEP 326. Thank you for all the feedback, criticisms, and suggestions,
 - Josiah

Feelings on inclusion into Python or the standard library somewhere (name of objects and location preferences vary; if I mistook your opinion, my apologies, etc.):

+1:
    Terry J. Reedy
    Andrew P. Lentvorski, Jr.
    David Eppstein
    Bob Ippolito
    Evan Simpson
    Andrew Koenig
    Gary Robinson (from c.l.py)
    John Roth (from c.l.py)
    A. Lloyd Flanagan (from c.l.py)
    Jeremy Fincher (from c.l.py)
    Andrew Koenig (from c.l.py)
    Erik Max Francis (from c.l.py)
+0:
    John Hazen
    Robert Brewer
    Michael Chermside
0:
    Phillip J. Eby
-0:
    Gerrit Holl
-1:
    Guido van Rossum
    Gustavo Niemeyer
    Jeremy Hylton
    Raymond Hettinger

* Reasonable means arbitrary but meaningful, likely the result of choosing the most positively voted for in comments from people in python-dev and c.l.py.

From anthony at interlink.com.au Wed Jan 28 07:13:24 2004
From: anthony at interlink.com.au (Anthony Baxter)
Date: Wed Jan 28 07:13:36 2004
Subject: [Python-Dev] PEP 326 (quick location possibility)
In-Reply-To: <20040127214512.0659.JCARLSON@uci.edu>
Message-ID: <20040128121324.BFD2925B353@bonanza.off.ekorp.com>

FWIW, as both a developer and user of Python, I'm -1 on this. I've been coding in Python for over 10 years now, and I can't think of a single project or piece of code that would have been significantly easier with this construct. Trade off the very limited usefulness against not bolting everything into the standard python library (or worse yet, builtins) and it's an easy choice.

All imho, of course.

Anthony

--
Anthony Baxter
It's never too late to have a happy childhood.

From Paul.Moore at atosorigin.com Wed Jan 28 07:58:16 2004
From: Paul.Moore at atosorigin.com (Moore, Paul)
Date: Wed Jan 28 07:58:18 2004
Subject: [Python-Dev] PEP 326 (quick location possibility)
Message-ID: <16E1010E4581B049ABC51D4975CEDB8802C09C23@UKDCX001.uk.int.atosorigin.com>

From: Anthony Baxter
> FWIW, as both a developer and user of Python, I'm -1 on this.

Agreed. I've been quiet on this one as I don't have anything much to add, but if you're counting responses, I'm a -1.
Paul

From barry at python.org Wed Jan 28 08:07:46 2004
From: barry at python.org (Barry Warsaw)
Date: Wed Jan 28 08:07:50 2004
Subject: [Python-Dev] PEP 326 (quick location possibility)
In-Reply-To: <16E1010E4581B049ABC51D4975CEDB8802C09C23@UKDCX001.uk.int.atosorigin.com>
References: <16E1010E4581B049ABC51D4975CEDB8802C09C23@UKDCX001.uk.int.atosorigin.com>
Message-ID: <1075295265.18907.18.camel@anthem>

On Wed, 2004-01-28 at 07:58, Moore, Paul wrote:
> From: Anthony Baxter
> > FWIW, as both a developer and user of Python, I'm -1 on this.
>
> Agreed. I've been quiet on this one as I don't have anything much
> to add, but if you're counting responses, I'm a -1.

Me too. -1.

pep-10-ly y'rs,
-Barry

From mrussell at verio.net Wed Jan 28 08:12:40 2004
From: mrussell at verio.net (Mark Russell)
Date: Wed Jan 28 08:12:41 2004
Subject: [Python-Dev] PEP 326 (quick location possibility)
In-Reply-To: <16E1010E4581B049ABC51D4975CEDB8802C09C23@UKDCX001.uk.int.atosorigin.com>
References: <16E1010E4581B049ABC51D4975CEDB8802C09C23@UKDCX001.uk.int.atosorigin.com>
Message-ID: <1075295560.769.9.camel@localhost>

On Wed, 2004-01-28 at 12:58, Moore, Paul wrote:
> Agreed. I've been quiet on this one as I don't have anything much
> to add, but if you're counting responses, I'm a -1.

Same here - as a python user I'm -1 on adding clutter to the library, and -1000 on changing the behaviour of min() and max() (if I pass an empty sequence to these functions I *want* an exception to draw my attention to the fact that I haven't thought about that case). I haven't said anything up to now because I've been hoping this thread will die and I didn't want to prolong it.

Mark

From guido at python.org Wed Jan 28 08:12:58 2004
From: guido at python.org (Guido van Rossum)
Date: Wed Jan 28 08:13:04 2004
Subject: [Python-Dev] PEP 326 (quick location possibility)
In-Reply-To: Your message of "Tue, 27 Jan 2004 23:05:21 PST."
<20040127214512.0659.JCARLSON@uci.edu>
References: <20040126164752.FBFB.JCARLSON@uci.edu> <006401c3e522$62dceb80$e841fea9@oemcomputer> <20040127214512.0659.JCARLSON@uci.edu>
Message-ID: <200401281312.i0SDCw227952@c-24-5-183-134.client.comcast.net>

> I took a pass through the archives of January, just to get an actual
> number on what people think of the ideas in the PEP.

Dear Josiah,

This has lasted long enough. Python is not a democracy. So here's a BDFL Pronouncement.

PEP 326 is REJECTED, because:

- The usefulness of the feature is marginal;
- it is easily implemented when you really need it; and
- we can't agree on a good way to spell it.

Please don't bring this up in python-dev. It's not worth whining about. Please focus your energy on something else. I've seen good code and good ideas from you; I would be sorry if you wasted more energy on a lost cause. Not to mention the wasted time of everyone else arguing a lost cause.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From walter at livinglogic.de Wed Jan 28 08:55:38 2004
From: walter at livinglogic.de (Walter Dörwald)
Date: Wed Jan 28 08:55:50 2004
Subject: [Python-Dev] Hotshot
In-Reply-To: <16405.58253.84956.701712@sftp.fdrake.net>
References: <338366A6D2E2CA4C9DAEAE652E12A1DE011DAA6D@au3010avexu1.global.avaya.com> <16405.38862.629934.745988@montanaro.dyndns.org> <16405.58253.84956.701712@sftp.fdrake.net>
Message-ID: <4017BF5A.3060107@livinglogic.de>

Fred L. Drake, Jr. wrote:
> Skip Montanaro writes:
> > It seems to me it might be simpler to just write the profile file through
> > the gzip module and teach the hotshot.stats.load() function to recognize
> > such files and uncompress accordingly.
>
> The problem with this is that the low-level log reader and writer are
> in C rather than Python (using fopen(), fwrite(), etc.).
> We could
> probably work out a reader that gets input buffers from an arbitrary
> Python file object, and maybe we could handle writing that way, but
> that does change the cost of each write. HotShot tries to compensate
> for that, but it's unclear how successful that is.

OK, I've checked both the old profile.py and the new hotshotmain.py (Thanks for checking it in, Skip) with one of my scripts. The script runs for:

real    0m6.192s
user    0m6.010s
sys     0m0.160s

when running standalone. Using the old profile.py I get the following:

real    0m46.892s
user    0m43.430s
sys     0m0.970s

Running with the new hotshotmain.py gives the following run times:

real    1m6.873s
user    1m5.220s
sys     0m0.840s

The size of hotshot.prof is 5104590 bytes. After gzipping it with "gzip -9" the size of hotshot.prof drops to 562467 bytes. So gzipping might help, but dropping filesize from 1 GB to 100 MB still doesn't sound so convincing. And I wonder what would happen to run time.

Bye,
Walter Dörwald

From skip at pobox.com Wed Jan 28 09:05:34 2004
From: skip at pobox.com (Skip Montanaro)
Date: Wed Jan 28 09:05:44 2004
Subject: [Python-Dev] PEP 326 (quick location possibility)
In-Reply-To: <20040128121324.BFD2925B353@bonanza.off.ekorp.com>
References: <20040127214512.0659.JCARLSON@uci.edu> <20040128121324.BFD2925B353@bonanza.off.ekorp.com>
Message-ID: <16407.49582.839967.556840@montanaro.dyndns.org>

    Anthony> FWIW, as both a developer and user of Python, I'm -1 on this.
    Anthony> I've been coding in Python for over 10 years now, and I can't
    Anthony> think of a single project or piece of code that would have been
    Anthony> significantly easier with this construct.

I'm with Anthony. I've been programming in Python for a similar amount of time and never even considered such a beast might be useful.
Skip

From nas-python at python.ca Wed Jan 28 10:26:01 2004
From: nas-python at python.ca (Neil Schemenauer)
Date: Wed Jan 28 10:16:59 2004
Subject: [Python-Dev] PEP 326 (quick location possibility)
In-Reply-To: <20040127214512.0659.JCARLSON@uci.edu>
References: <20040126164752.FBFB.JCARLSON@uci.edu> <006401c3e522$62dceb80$e841fea9@oemcomputer> <20040127214512.0659.JCARLSON@uci.edu>
Message-ID: <20040128152601.GA19424@mems-exchange.org>

-1, as a developer and a user.

  Neil

From Andrew.MacKeith at ABAQUS.com Wed Jan 28 10:31:22 2004
From: Andrew.MacKeith at ABAQUS.com (Andrew MacKeith)
Date: Wed Jan 28 10:31:49 2004
Subject: [Python-Dev] PEP 326 (quick location possibility)
References: <20040126164752.FBFB.JCARLSON@uci.edu> <006401c3e522$62dceb80$e841fea9@oemcomputer> <20040127214512.0659.JCARLSON@uci.edu>
Message-ID: <4017D5CA.1070309@ABAQUS.com>

Josiah Carlson wrote:
> Dear python-dev,
> The 12 that are in favor (+1) of the PEP being included with Python or
> in the standard library, are /users of/ Python.

-1 and I'm a user.

From skip at pobox.com Wed Jan 28 11:22:06 2004
From: skip at pobox.com (Skip Montanaro)
Date: Wed Jan 28 11:33:37 2004
Subject: [Python-Dev] Would GPL on python-mode.el be a problem?
Message-ID: <16407.57774.541583.6592@montanaro.dyndns.org>

Not too long ago, Matthias Klose submitted an enormous patch (1370 lines) to python-mode.el (only 3804 lines itself):

    https://sourceforge.net/tracker/index.php?func=detail&aid=875596&group_id=86916&atid=581351

I decided to start dissecting it today and immediately ran into this comment by Dave Love (fx@gnu.org):

    ;; This file contains significant bits of GPL'd code, so it must be
    ;; under the GPL. -- fx

During the python-mode project's short life, I've copied python-mode.el over to the Python project's Misc directory a few times. I thought I'd better ask if GPL-ness would present a problem. I have always remained blissfully unaware of licensing details, so I have no grasp of what the ramifications are.
Thx,

Skip

From ark-mlist at att.net Wed Jan 28 11:49:12 2004
From: ark-mlist at att.net (Andrew Koenig)
Date: Wed Jan 28 11:49:01 2004
Subject: [Python-Dev] Would GPL on python-mode.el be a problem?
In-Reply-To: <16407.57774.541583.6592@montanaro.dyndns.org>
Message-ID: <000001c3e5be$a8bb4ce0$6402a8c0@arkdesktop>

> During the python-mode project's short life, I've copied python-mode.el over
> to the Python project's Misc directory a few times. I thought I'd better
> ask if GPL-ness would present a problem. I have always remained blissfully
> unaware of licensing details, so I have no grasp of what the ramifications
> are.

From http://www.gnu.org/licenses/gpl-faq.html#MereAggregation:

Mere aggregation of two programs means putting them side by side on the same CD-ROM or hard disk. We use this term in the case where they are separate programs, not parts of a single program. In this case, if one of the programs is covered by the GPL, it has no effect on the other program.

Combining two modules means connecting them together so that they form a single larger program. If either part is covered by the GPL, the whole combination must also be released under the GPL--if you can't, or won't, do that, you may not combine them.

What constitutes combining two parts into one program? This is a legal question, which ultimately judges will decide. We believe that a proper criterion depends both on the mechanism of communication (exec, pipes, rpc, function calls within a shared address space, etc.) and the semantics of the communication (what kinds of information are interchanged).

If the modules are included in the same executable file, they are definitely combined in one program. If modules are designed to run linked together in a shared address space, that almost surely means combining them into one program.

By contrast, pipes, sockets and command-line arguments are communication mechanisms normally used between two separate programs.
So when they are used for communication, the modules normally are separate programs. But if the semantics of the communication are intimate enough, exchanging complex internal data structures, that too could be a basis to consider the two parts as combined into a larger program.

From barry at python.org Wed Jan 28 11:59:38 2004
From: barry at python.org (Barry Warsaw)
Date: Wed Jan 28 11:59:50 2004
Subject: [Python-Dev] Would GPL on python-mode.el be a problem?
In-Reply-To: <16407.57774.541583.6592@montanaro.dyndns.org>
References: <16407.57774.541583.6592@montanaro.dyndns.org>
Message-ID: <1075309177.2635.22.camel@geddy>

On Wed, 2004-01-28 at 11:22, Skip Montanaro wrote:
> Not too long ago, Matthias Klose submitted an enormous patch (1370 lines) to
> python-mode.el (only 3804 lines itself):
>
>     https://sourceforge.net/tracker/index.php?func=detail&aid=875596&group_id=86916&atid=581351
>
> I decided to start dissecting it today and immediately ran into this comment
> by Dave Love (fx@gnu.org):
>
>     ;; This file contains significant bits of GPL'd code, so it must be
>     ;; under the GPL. -- fx

I saw that too, and also blissfully ignored it.

> During the python-mode project's short life, I've copied python-mode.el over
> to the Python project's Misc directory a few times. I thought I'd better
> ask if GPL-ness would present a problem. I have always remained blissfully
> unaware of licensing details, so I have no grasp of what the ramifications
> are.

They are, of course, of tarpit proportions. :)

I'd start by asking Dave Love for a deeper analysis of exactly which parts of python-mode he thinks are derived from GPL'd software.

-Barry

From guido at python.org Wed Jan 28 12:07:57 2004
From: guido at python.org (Guido van Rossum)
Date: Wed Jan 28 12:08:06 2004
Subject: [Python-Dev] Would GPL on python-mode.el be a problem?
In-Reply-To: Your message of "Wed, 28 Jan 2004 10:22:06 CST."
<16407.57774.541583.6592@montanaro.dyndns.org>
References: <16407.57774.541583.6592@montanaro.dyndns.org>
Message-ID: <200401281707.i0SH7vM28595@c-24-5-183-134.client.comcast.net>

> Not too long ago, Matthias Klose submitted an enormous patch (1370 lines) to
> python-mode.el (only 3804 lines itself):
>
>     https://sourceforge.net/tracker/index.php?func=detail&aid=875596&group_id=86916&atid=581351
>
> I decided to start dissecting it today and immediately ran into this comment
> by Dave Love (fx@gnu.org):
>
>     ;; This file contains significant bits of GPL'd code, so it must be
>     ;; under the GPL. -- fx
>
> During the python-mode project's short life, I've copied python-mode.el over
> to the Python project's Misc directory a few times. I thought I'd better
> ask if GPL-ness would present a problem. I have always remained blissfully
> unaware of licensing details, so I have no grasp of what the ramifications
> are.

Serious.

Do not copy this into Python. I would recommend rejecting the patch for python-mode.el as well (isn't that project also using the PSF license?).

--Guido van Rossum (home page: http://www.python.org/~guido/)

From skip at pobox.com Wed Jan 28 12:21:31 2004
From: skip at pobox.com (Skip Montanaro)
Date: Wed Jan 28 12:22:55 2004
Subject: [Python-Dev] Would GPL on python-mode.el be a problem?
In-Reply-To: <200401281707.i0SH7vM28595@c-24-5-183-134.client.comcast.net>
References: <16407.57774.541583.6592@montanaro.dyndns.org> <200401281707.i0SH7vM28595@c-24-5-183-134.client.comcast.net>
Message-ID: <16407.61339.47664.712860@montanaro.dyndns.org>

    >> I have always remained blissfully unaware of licensing details, so I
    >> have no grasp of what the ramifications are.

    Guido> Serious.

    Guido> Do not copy this into Python.

Okay.

    Guido> I would recommend rejecting the patch for python-mode.el as well
    Guido> (isn't that project also using the PSF license?).
Made easier by the fact that it is a humongous patch implementing many different things and doesn't apply cleanly.

Thx,

Skip
For your reference, here are the SMTP envelope originator and headers from your email: >From python-dev@python.org ------------------------- BEGIN HEADERS ----------------------------- Received: from python.org (unknown [200.17.97.66]) by interprise.techwave.com.br (Postfix) with ESMTP id AF3DC18B99A for ; Wed, 28 Jan 2004 20:10:13 -0200 (BRST) From: python-dev@python.org To: suporte@techwave.com.br Subject: Date: Wed, 28 Jan 2004 20:14:23 -0200 MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_NextPart_000_0012_E5F333C6.A1807F9D" X-Priority: 3 X-MSMail-Priority: Normal Message-Id: <20040128221013.AF3DC18B99A@interprise.techwave.com.br> -------------------------- END HEADERS ------------------------------ From ulaymond at daydreamer.com Wed Jan 28 21:38:28 2004 From: ulaymond at daydreamer.com (Loraine Grove) Date: Wed Jan 28 18:36:00 2004 Subject: [Python-Dev] Fast generic solution better than VIAAGRRA_1 Message-ID: OFFSHORE PHARMACY WITH NO QUESTIONS! Super Viagra - CIALIS is Here Dubbed "The Weekend Pill" * LONGER LIFE Up to 36 hours compared to 4 hours for viagra * ACTS FASTER from as little as 15 minutes compared to 60 minutes for viagra * BEST SEX EVER solid erection and only whenever you need it. With cialis YOU are in control "Cialis works in 15 minutes and lasts for 36 hours giving you strong healthy erections whenever you want or need them!" NO QUESTIONS. JUST PAY & WE SHIP. COMPLETE PRIVACY & CONFIDENTIATLITY. 128-bit encrypted website for maximum security. We also have an assortment of other drugs such as soma, valium, xanax, meridia, etc. http://deletion.instrhh.com/m001p/index.php?id=m0015 No more of this sort of material. Honoured in 24-48 hours. 
http://instrhh.com/m001p/byebye.html ravish steeplebush saponify turmoil dissuade conakry ascetic neutral pontificate copybook grid ancestral 3 5 From nobody at netapp.com Wed Jan 28 19:07:07 2004 From: nobody at netapp.com (Auto-Reply Daemon for matt) Date: Wed Jan 28 19:07:11 2004 Subject: [Python-Dev] Your mail to matt@netapp.com Message-ID: <200401290007.i0T077EB016140@frejya.corp.netapp.com> This is either an invalid user name or the person you have sent mail to, 'matt', is no longer employed with Network Appliance. Please take a moment to check your mailing list and update if necessary. Regrettably, we have no forwarding information for 'matt'. Your message has NOT been forwarded. Please update your records and try again. Please Contact HELPDESK at X6466 for any inquiries. Thank you, IT Department [the body of your message is below] >From python-dev@python.org Wed Jan 28 16:07:07 2004 >Received: from psmtp.com (frejya [10.57.157.119]) > by frejya.corp.netapp.com (8.12.9/8.12.9/NTAP-1.5) with SMTP id i0T073Di016124 > for ; Wed, 28 Jan 2004 16:07:07 -0800 (PST) >Received: from source ([10.254.252.22]) by frejya ([10.57.157.119]) with SMTP; > Wed, 28 Jan 2004 16:07:03 PST >Received: from python.org (cslab-07.union.edu [149.106.62.207]) > by mx02.netapp.com (8.12.10/8.12.10/NTAP-1.4) with ESMTP id i0T072Bo017306 > for ; Wed, 28 Jan 2004 16:07:02 -0800 (PST) >Message-Id: <200401290007.i0T072Bo017306@mx02.netapp.com> >From: python-dev@python.org >To: matt@netapp.com >Subject: Hello >Date: Wed, 28 Jan 2004 19:06:58 -0500 >MIME-Version: 1.0 >Content-Type: multipart/mixed; > boundary="----=_NextPart_000_0010_50A90073.2A6F9AC8" >X-Priority: 3 >X-MSMail-Priority: Normal >X-pstn-levels: (C:80.0762 M:99.4056 P:95.9108 R:95.9108 S: 7.2366 ) >X-pstn-settings: 3 (1.0000:1.0000) pmCr >X-pstn-addresses: from > >This is a multi-part message in MIME format. 
From tim.one at comcast.net Wed Jan 28 22:31:33 2004
From: tim.one at comcast.net (Tim Peters)
Date: Wed Jan 28 22:31:38 2004
Subject: [Python-Dev] PEP 326 (quick location possibility)
In-Reply-To: <20040127214512.0659.JCARLSON@uci.edu>
Message-ID:

[Josiah Carlson]
> ...
> If there is a core Python developer who thinks that this functionality
> should be included with Python (accessible via some method on a
> builtin, name, attribute, etc.), I'm sure we'd like to hear from you.

It doesn't appear to matter any more, but I did.
It's not needed often enough to be a builtin, and ideas for confusing the existing meanings of min() and max() were rightfully dead on arrival, so finding a sensible place for sensible names is a real problem. The functionality of "a biggest" and "a smallest" object is A-OK by me, though, and something I'd use.

Indeed, I had a bug within the past few years in a find-the-minimum search loop, using an initial value 10x larger than anything I thought I'd ever see. A few months later, the inputs in one case happened *all* to be larger than that, so it returned the bogus initial value as if it were a sensible result. Hilarity ensued. There are few enough bugs in my code at this age that I take very seriously any principled gimmick that could prevent repeating one.

One thing I haven't needed is an object guaranteed to compare "between" any two others <wink>.

From python at rcn.com Thu Jan 29 10:16:55 2004
From: python at rcn.com (Raymond Hettinger)
Date: Thu Jan 29 10:18:59 2004
Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Doc Makefile.deps, 1.115, 1.116
In-Reply-To:
Message-ID: <003c01c3e67b$0b8160a0$e841fea9@oemcomputer>

> Log Message:
> update dependency information

Thanks.

Raymond

From python at rcn.com Thu Jan 29 10:24:27 2004
From: python at rcn.com (Raymond Hettinger)
Date: Thu Jan 29 10:25:41 2004
Subject: [Python-Dev] PEP 326 (quick location possibility)
In-Reply-To:
Message-ID: <004301c3e67b$fc28bd00$e841fea9@oemcomputer>

> Indeed, I had a bug within the past few years in a find-the-minimum search
> loop, using an initial value 10x larger than anything I thought I'd ever
> see. A few months later, the inputs in one case happened *all* to be larger
> than that, so it returned the bogus initial value as if it were a sensible
> result. Hilarity ensued. There are few enough bugs in my code at this age
> that I take very seriously any principled gimmick that could prevent
> repeating one.

Alex Martelli had proposed extending the key= idea to min() and max(). The idea is to let those functions completely encapsulate the logic of searching for minimum and maximum elements:

    bestplayer = min(players, key=attrgetter('points'))

This would work equally well with other objective functions.
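A minimal sketch of the idiom, with made-up player data standing in for the players in Raymond's example (min() and max() did later grow exactly this key= argument, in Python 2.5):

```python
from operator import attrgetter

class Player:
    def __init__(self, name, points):
        self.name = name
        self.points = points

# Made-up data; any objects with a 'points' attribute would do.
players = [Player("ann", 31), Player("bob", 7), Player("cid", 19)]

# key= encapsulates the objective function: no sentinel value is needed,
# and an empty sequence still raises ValueError instead of silently
# returning a bogus initial value.
bestplayer = max(players, key=attrgetter('points'))
worstplayer = min(players, key=attrgetter('points'))
print(bestplayer.name, worstplayer.name)
```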
Raymond Hettinger From tim.one at comcast.net Thu Jan 29 11:04:00 2004 From: tim.one at comcast.net (Tim Peters) Date: Thu Jan 29 11:04:05 2004 Subject: [Python-Dev] PEP 326 (quick location possibility) In-Reply-To: <004301c3e67b$fc28bd00$e841fea9@oemcomputer> Message-ID: [Tim] >> ... >> There are few enough bugs in my code at this age that I take very >> seriously any principled gimmick that could prevent repeating one. [Raymond] > Alex Martelli had proposed extending the key= idea to min() and max(). > The idea is to let those functions completely encapsulate the logic > of searching for minimum and maximum elements: > > bestplayer = min(players, key=attrgetter('points')) > > This would work equally well with other objective functions. Different issue. I didn't have "a sequence". The loop was more like:

    global_minimum = made_up_value_presumed_to_be_unreachably_large
    while mote_to_look_at:
        do a ton of computation, yielding a candidate
        if score(candidate) < global_minimum:
            global_minimum = score(candidate)
            do a ton of stuff to prune the search based
            on the new (so far) local minimum
    return global_minimum

It's picking the made_up_value_presumed_to_be_unreachably_large that's a brittle hack, and is specifically the source of the bug I mentioned. Adding new twists to min() wouldn't have made any difference to that. From skip at pobox.com Thu Jan 29 11:30:02 2004 From: skip at pobox.com (Skip Montanaro) Date: Thu Jan 29 11:32:18 2004 Subject: [Python-Dev] Raymond's next task is...
Message-ID: <16409.13578.497557.262449@montanaro.dyndns.org> Raymond seems lately to like converting slow Python modules into peppy extension modules. My nomination for his next endeavor is Lib/sre*.py, especially sre_compile.py and sre_parse.py. Skip From aahz at pythoncraft.com Thu Jan 29 11:55:56 2004 From: aahz at pythoncraft.com (Aahz) Date: Thu Jan 29 11:56:08 2004 Subject: [Python-Dev] PEP 326 (quick location possibility) In-Reply-To: References: <004301c3e67b$fc28bd00$e841fea9@oemcomputer> Message-ID: <20040129165556.GA14295@panix.com> On Thu, Jan 29, 2004, Tim Peters wrote: > > Different issue. I didn't have "a sequence". The loop was more like: > > global_minimum = made_up_value_presumed_to_be_unreachably_large > while mote_to_look_at: > do a ton of computation, yielding a candidate > if score(candidate) < global_minimum: > global_minimum = score(candidate) > do a ton of stuff to prune the search based > on the new (so far) local minimum > return global_minimum > > It's picking the made_up_value_presumed_to_be_unreachably_large that's > a brittle hack, and is specifically the source of the bug I mentioned. > Adding new twists to min() wouldn't have made any difference to that. I'm curious: why you didn't use None as the initial value or use some other hack to avoid initializing with a specific number? -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "The joy of coding Python should be in seeing short, concise, readable classes that express a lot of action in a small amount of clear code -- not in reams of trivial code that bores the reader to death." 
--GvR From ark-mlist at att.net Thu Jan 29 12:28:06 2004 From: ark-mlist at att.net (Andrew Koenig) Date: Thu Jan 29 12:27:49 2004 Subject: [Python-Dev] PEP 326 (quick location possibility) In-Reply-To: <20040129165556.GA14295@panix.com> Message-ID: <001a01c3e68d$42219460$6402a8c0@arkdesktop> > > global_minimum = made_up_value_presumed_to_be_unreachably_large > > while mote_to_look_at: > > do a ton of computation, yielding a candidate > > if score(candidate) < global_minimum: > > global_minimum = score(candidate) > > do a ton of stuff to prune the search based > > on the new (so far) local minimum > > return global_minimum > I'm curious: why you didn't use None as the initial value or use some > other hack to avoid initializing with a specific number? Perhaps None is unreachably small, rather than unreachably large? From tim.one at comcast.net Thu Jan 29 12:52:50 2004 From: tim.one at comcast.net (Tim Peters) Date: Thu Jan 29 12:52:52 2004 Subject: [Python-Dev] PEP 326 (quick location possibility) In-Reply-To: <20040129165556.GA14295@panix.com> Message-ID: [Aahz] > I'm curious: why you didn't use None as the initial value or use some > other hack to avoid initializing with a specific number? In 2.3, None compares less than anything else; the search loop required that the initial value compare larger than anything else (although I wouldn't have been comfortable with relying on that None compares smaller either, since that's non-obvious version-specific behavior). I've since tended to write these kinds of loops as: global_min = None ... if global_min is None or score(candidate) < global_min: global_min = score(candidate) do stuff appropriate for a new local minimum but it's easy to forget the "global_min is None" clause -- in which case, *because* None compares less than everything else, the "if" test never passes. A conventional spelling for universal smallest and largest objects would allow for a very clean and clear solution. 
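Tim's defensive idiom, made concrete as a runnable sketch (the score() function is an invented stand-in for his "ton of computation"):

```python
def score(candidate):
    # Invented stand-in for the expensive computation; any total
    # ordering of scores works.
    return (candidate - 3) ** 2

def find_minimum(candidates):
    global_min = None
    for candidate in candidates:
        s = score(candidate)
        # The "global_min is None" clause is the easy-to-forget part:
        # it lets the first real score win, instead of comparing it
        # against an arbitrary made-up sentinel.
        if global_min is None or s < global_min:
            global_min = s
            # ...pruning based on the new local minimum would go here...
    return global_min

print(find_minimum([0, 1, 2, 5]))  # 1 (from candidate 2)
```

Returning None for an empty search is then a natural, explicit "no result" rather than a bogus sentinel leaking out.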
I'm reminded that we introduced "print >> f" partly because most people made mistakes in details when trying to temporarily rebind sys.stdout "by hand". Complex search loops are brittle in the details too. So I'll let Barry propose that a bare ">>" mean "biggest" and a bare "<<" mean "smallest" . >>> << < >> True From theller at python.net Thu Jan 29 13:02:13 2004 From: theller at python.net (Thomas Heller) Date: Thu Jan 29 13:02:28 2004 Subject: [Python-Dev] PEP 326 (quick location possibility) In-Reply-To: (Tim Peters's message of "Thu, 29 Jan 2004 12:52:50 -0500") References: Message-ID: > A conventional spelling for universal smallest and largest objects would > allow for a very clean and clear solution. I'm reminded that we introduced > "print >> f" partly because most people made mistakes in details when trying > to temporarily rebind sys.stdout "by hand". Complex search loops are > brittle in the details too. So I'll let Barry propose that a bare ">>" mean > "biggest" and a bare "<<" mean "smallest" . > >>>> << < >> > True Well, then we should add "==" and "<>" also. 
I'm sure disambiguities in the grammar can be resolved somehow ;-) >>> == == == True >>> <> <> <> False Thomas From skip at pobox.com Thu Jan 29 13:31:35 2004 From: skip at pobox.com (Skip Montanaro) Date: Thu Jan 29 13:32:18 2004 Subject: [Python-Dev] PEP 326 (quick location possibility) In-Reply-To: References: <20040129165556.GA14295@panix.com> Message-ID: <16409.20871.547345.759086@montanaro.dyndns.org> Tim> In 2.3, None compares less than anything else; Perhaps a stupid observation, but empty lists, dicts, strings and tuples appear to compare larger than any int or float: >>> lst = [(), [], {}, "", 1e308, sys.maxint, 0, -sys.maxint, -1e308, None] >>> lst.sort() >>> lst.reverse() >>> lst [(), '', [], {}, 1e+308, 2147483647, 0, -2147483647, -1e+308, None] >>> absmax = lst[0] >>> absmin = lst[-1] >>> print absmax () >>> print absmin None I realize this is implementation-dependent (and I'm sure Tim already knows all this stuff). People wanting absolute min and max objects could use something like the above to generate such stuff when needed. pep-326-not-really-needed-ly, y'rs, Skip From barry at python.org Thu Jan 29 13:37:42 2004 From: barry at python.org (Barry Warsaw) Date: Thu Jan 29 13:37:44 2004 Subject: [Python-Dev] PEP 326 (quick location possibility) In-Reply-To: References: Message-ID: <1075401461.17818.83.camel@geddy> On Thu, 2004-01-29 at 12:52, Tim Peters wrote: > So I'll let Barry propose that a bare ">>" mean > "biggest" and a bare "<<" mean "smallest" . If only PEP 666 were free, I'd be all over that. 
-Barry From tim.one at comcast.net Thu Jan 29 13:49:52 2004 From: tim.one at comcast.net (Tim Peters) Date: Thu Jan 29 13:49:55 2004 Subject: [Python-Dev] PEP 326 (quick location possibility) In-Reply-To: <16409.20871.547345.759086@montanaro.dyndns.org> Message-ID: [Skip] > Perhaps a stupid observation, but empty lists, dicts, strings and > tuples appear to compare larger than any int or float: Why do you presume it's numbers that are getting compared in a min/max search loop? Sometimes it's tuples, sometimes lists, sometimes strings, sometimes instances of a user-defined class (etc). > ... > I realize this is implementation-dependent (and I'm sure Tim already > knows all this stuff). People wanting absolute min and max objects > could use something like the above to generate such stuff when needed. Sorry, they can't. If it so happens that every string compares less than every tuple -- you really expect sensible code to be written that way? The brittleness compounds if you try. Extending that example, it so happens that every Unicode string compares *larger* than every tuple, so apart from the inherent obscurity of relying on accidental crap like that, it can break when your input data changes a little. From aahz at pythoncraft.com Thu Jan 29 14:28:22 2004 From: aahz at pythoncraft.com (Aahz) Date: Thu Jan 29 14:28:29 2004 Subject: [Python-Dev] PEP 326 (quick location possibility) In-Reply-To: References: <20040129165556.GA14295@panix.com> Message-ID: <20040129192822.GA17665@panix.com> On Thu, Jan 29, 2004, Tim Peters wrote: > [Aahz] >> >> I'm curious: why you didn't use None as the initial value or use some >> other hack to avoid initializing with a specific number? > > In 2.3, None compares less than anything else; the search loop required that > the initial value compare larger than anything else (although I wouldn't > have been comfortable with relying on that None compares smaller either, > since that's non-obvious version-specific behavior).
> > I've since tended to write these kinds of loops as: > > global_min = None > ... > > if global_min is None or score(candidate) < global_min: > global_min = score(candidate) > do stuff appropriate for a new local minimum That's precisely what I was suggesting, yes. > but it's easy to forget the "global_min is None" clause -- in which case, > *because* None compares less than everything else, the "if" test never > passes. That's not a problem I've run into, and I don't see it cropping up all that often on c.l.py. (To be precise, I don't have any memories of it cropping up.) -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "The joy of coding Python should be in seeing short, concise, readable classes that express a lot of action in a small amount of clear code -- not in reams of trivial code that bores the reader to death." --GvR From skip at pobox.com Thu Jan 29 14:28:17 2004 From: skip at pobox.com (Skip Montanaro) Date: Thu Jan 29 14:28:38 2004 Subject: [Python-Dev] PEP 326 (quick location possibility) In-Reply-To: References: <16409.20871.547345.759086@montanaro.dyndns.org> Message-ID: <16409.24273.189619.370115@montanaro.dyndns.org> Tim> [Skip] >> Perhaps a stupid observation, but empty lists, dicts, strings and >> tuples appear to compare larger than any int or float: Tim> Why do you presume it's numbers that are getting compared in a Tim> min/max search loop? Sorry, false assumption. I suspect it's the most common case. Skip From tim.one at comcast.net Thu Jan 29 14:36:58 2004 From: tim.one at comcast.net (Tim Peters) Date: Thu Jan 29 14:37:01 2004 Subject: [Python-Dev] PEP 326 (quick location possibility) In-Reply-To: <20040129192822.GA17665@panix.com> Message-ID: >> if global_min is None or score(candidate) < global_min: ... [Aahz] > That's not a problem I've run into, and I don't see it cropping up all > that often on c.l.py. (To be precise, I don't have any memories of it > cropping up.) 
No, most search loops you see on c.l.py don't bother with this at all, and are brittle as a result. That's the point. The need for additional "thing is None" kinds of clauses is something people learn the hard way, after they've been burned by making a bad guess for an extreme initial value (or more likely what turns out to be bad *eventually* -- passes all their initial tests, but screws up later when input changes). From tim.one at comcast.net Thu Jan 29 16:56:17 2004 From: tim.one at comcast.net (Tim Peters) Date: Thu Jan 29 16:56:40 2004 Subject: [Python-Dev] PEP 326 (quick location possibility) In-Reply-To: Message-ID: [Thomas Heller] > Well, then we should add "==" and "<>" also. Alas, <> will be introduced as a deprecated spelling, despite Barry's protests. > I'm sure disambiguities in the grammar can be resolved somehow ;-) > > >>> == == == > True > >>> <> <> <> > False I'm quite sure the last line should be True. <> compares not equal to everything, including itself, while == compares equal to everything. Then >>> <> == == might be a nice way to spell random.choice([True, False]); or maybe it should raise a new ParadoxError. We should leave *something* for Guido to decide here . From barry at python.org Thu Jan 29 17:23:32 2004 From: barry at python.org (Barry Warsaw) Date: Thu Jan 29 17:23:34 2004 Subject: [Python-Dev] PEP 326 (quick location possibility) In-Reply-To: References: Message-ID: <1075415011.17818.165.camel@geddy> On Thu, 2004-01-29 at 16:56, Tim Peters wrote: > [Thomas Heller] > > Well, then we should add "==" and "<>" also. > > Alas, <> will be introduced as a deprecated spelling, despite Barry's > protests. C'mon, Just! Let's beat him up until he gives. Then again, Py3K is so far away, I'll probably be hacking so much C#, my soul will be sucked dry and I won't care. 
-Barry From python at rcn.com Thu Jan 29 17:32:55 2004 From: python at rcn.com (Raymond Hettinger) Date: Thu Jan 29 17:34:37 2004 Subject: [Python-Dev] PEP 326 (quick location possibility) In-Reply-To: Message-ID: <00c201c3e6b7$e3a8ccc0$e841fea9@oemcomputer> > [Thomas Heller] > > Well, then we should add "==" and "<>" also. > > Alas, <> will be introduced as a deprecated spelling, despite Barry's > protests. > > > I'm sure disambiguities in the grammar can be resolved somehow ;-) > > > > >>> == == == > > True > > >>> <> <> <> > > False [Tim] > I'm quite sure the last line should be True. <> compares not equal to > everything, including itself, while == compares equal to everything. Then > > >>> <> == == > > might be a nice way to spell random.choice([True, False]); or maybe it > should raise a new ParadoxError. We should leave *something* for Guido to > decide here . I suspect Jeremy will want a write-up about whether there should be early or late evaluation: >>> == == == >>> _.eval() True Raymond Hettinger From bsder at allcaps.org Thu Jan 29 19:46:30 2004 From: bsder at allcaps.org (Andrew P. Lentvorski, Jr.) Date: Thu Jan 29 19:39:29 2004 Subject: [Python-Dev] PEP 326 (quick location possibility) In-Reply-To: <20040129192822.GA17665@panix.com> References: <20040129165556.GA14295@panix.com> <20040129192822.GA17665@panix.com> Message-ID: <20040129161953.C21991@mail.allcaps.org> On Thu, 29 Jan 2004, Aahz wrote: > > I've since tended to write these kinds of loops as: > > > > global_min = None > > ... > > > > if global_min is None or score(candidate) < global_min: > > global_min = score(candidate) > > do stuff appropriate for a new local minimum > > That's precisely what I was suggesting, yes. The code is still brittle. If score(candidate) returns None, hilarity ensues. None happens to compare less than everything; consequently, that None slides into global_min and obliterates the previous results. 
The next value wipes out the None because None is the initial sentinel. global_min now has a value which isn't minimal, but the error is hidden. The errors may even be random depending upon whether the real minimum comes before or after the None gets returned. (Gee, do ya think I've seen this before?) Even worse, None is the most likely value to appear in case of a bug since it is the default return value for a function which doesn't explicitly return a value. If magnitude comparisons involving None raised some kind of Exception, this would not be an issue. -a From skip at pobox.com Thu Jan 29 20:26:06 2004 From: skip at pobox.com (Skip Montanaro) Date: Thu Jan 29 20:26:11 2004 Subject: [Python-Dev] PEP 326 (quick location possibility) In-Reply-To: <20040129161953.C21991@mail.allcaps.org> References: <20040129165556.GA14295@panix.com> <20040129192822.GA17665@panix.com> <20040129161953.C21991@mail.allcaps.org> Message-ID: <16409.45742.896933.104080@montanaro.dyndns.org> >> > if global_min is None or score(candidate) < global_min: >> > global_min = score(candidate) >> > do stuff appropriate for a new local minimum >> That's precisely what I was suggesting, yes. Andrew> The code is still brittle. If score(candidate) returns None, Andrew> hilarity ensues. Presumably the return value of score() is under control of the author of the loop who can guarantee that it won't return None. Skip From tim.one at comcast.net Thu Jan 29 21:46:29 2004 From: tim.one at comcast.net (Tim Peters) Date: Thu Jan 29 21:46:32 2004 Subject: [Python-Dev] PEP 326 (quick location possibility) In-Reply-To: <20040129161953.C21991@mail.allcaps.org> Message-ID: [Andrew P. Lentvorski, Jr.] > The code is still brittle. If score(candidate) returns None, hilarity > ensues. > > ... > If magnitude comparisons involving None raised some kind of Exception, > this would not be an issue. I agree, but that's also a different issue. 
I think people get tripped up more often by the fact that int < string is always true, and they'd get more relief from that class of bug if mixing ints with strings in (at least < <= > and >=) comparisons raised TypeError. That plain None evaluates as false in a Boolean context is also a source of nasty bugs. Even Guido has been bitten by that. Code like if not thing: ... when thing is initialized to None is common in some codebases, and can easily do a wrong thing when the *intent* was to say "if thing has been bound to a value other than None", i.e. if the actual intent was if thing is not None: ... Viewing None as "something that's not really there", it's neither true nor false, so an exception would make sense to me here too. But those are all incompatible changes. Adding some new singleton objects would not have been. From barry at python.org Thu Jan 29 22:25:58 2004 From: barry at python.org (Barry Warsaw) Date: Thu Jan 29 22:26:01 2004 Subject: [Python-Dev] PEP 326 (quick location possibility) In-Reply-To: <00c201c3e6b7$e3a8ccc0$e841fea9@oemcomputer> References: <00c201c3e6b7$e3a8ccc0$e841fea9@oemcomputer> Message-ID: <1075433157.22708.15.camel@geddy> On Thu, 2004-01-29 at 17:32, Raymond Hettinger wrote: > >>> == == == > > >>> _.eval() > True Clearly we need a new Maybe or Whatever builtin. -Barry From FBatista at uniFON.com.ar Fri Jan 30 07:52:52 2004 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Fri Jan 30 07:54:52 2004 Subject: [Python-Dev] PEP 327: Decimal Data Type Message-ID: I'm proud to announce that the PEP for Decimal Data Type is now published under the python.org structure: http://www.python.org/peps/pep-0327.html This wouldn't have been possible without the help from Alex Martelli, Aahz, Tim Peters, David Goodger and c.l.p itself. After the pre-PEP roundups the features are almost established. There is no agreement yet on how to create a Decimal from a float, in both explicit and implicit constructions.
I depend on settling that to finish the test cases and actually start to work on the code. I'll appreciate any feedback. Thank you all in advance. . Facundo
From arigo at tunes.org Fri Jan 30 08:32:53 2004 From: arigo at tunes.org (Armin Rigo) Date: Fri Jan 30 08:35:45 2004 Subject: [Python-Dev] Draft: PEP for imports In-Reply-To: <200401252159.i0PLxGj18366@c-24-5-183-134.client.comcast.net> References: <20040119214528.GA9767@panix.com> <200401252159.i0PLxGj18366@c-24-5-183-134.client.comcast.net> Message-ID: <20040130133253.GA6338@vicky.ecs.soton.ac.uk> Hello Guido, On Sun, Jan 25, 2004 at 01:59:16PM -0800, Guido van Rossum wrote: > from /home/guido/lib/python import neatTricks Keeping the amount of magic behind import as low as possible seems very important, because they are not a minor feature but something that every beginner must reasonably understand; I've already seen it as an obstacle. The above statement has the advantage of looking obvious; but in addition to the package name problem there is the fact that directory names are not always valid Python identifiers. A last try: import neatTricks in "/home/guido/lib/python" # no package import package.module in "/home/guido/lib/python" # package import foo in "." # relative import from neatTricks in "../cmds" import a, b, c s=os.path.join("some", "where"); import foo in s # expression path where the semantics would be to search sys.path if and only if no 'in' clause is specified. ('in' doesn't sound quite right...) Armin From FBatista at uniFON.com.ar Fri Jan 30 09:17:24 2004 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Fri Jan 30 09:19:29 2004 Subject: [Python-Dev] Draft: PEP for imports Message-ID: Armin Rigo wrote: #- import neatTricks in "/home/guido/lib/python" # no package #- import package.module in "/home/guido/lib/python" # package #- import foo in "." 
# #- relative import #- from neatTricks in "../cmds" import a, b, c #- s=os.path.join("some", "where"); import foo in s # #- expression path #- #- where the semantics would be to search sys.path if and only #- if no 'in' clause #- is specified. ('in' doesn't sound quite right...) I'm +1 on this alternative. What about "at" instead of "in"? . Facundo From python at discworld.dyndns.org Fri Jan 30 09:27:34 2004 From: python at discworld.dyndns.org (Charles Cazabon) Date: Fri Jan 30 09:23:33 2004 Subject: [Python-Dev] Draft: PEP for imports In-Reply-To: <20040130133253.GA6338@vicky.ecs.soton.ac.uk>; from arigo@tunes.org on Fri, Jan 30, 2004 at 01:32:53PM +0000 References: <20040119214528.GA9767@panix.com> <200401252159.i0PLxGj18366@c-24-5-183-134.client.comcast.net> <20040130133253.GA6338@vicky.ecs.soton.ac.uk> Message-ID: <20040130082734.A6453@discworld.dyndns.org> Armin Rigo wrote: > > import neatTricks in "/home/guido/lib/python" # no package > import package.module in "/home/guido/lib/python" # package [...] > where the semantics would be to search sys.path if and only if no 'in' clause > is specified. ('in' doesn't sound quite right...) "at" more closely matches the apparent usage above, i.e.: from neatTricks at "/path/to/dir" import bar It's a little less rosy as: import neatTricks at "/home/guido/lib/python"
Unfortunately the best grammatical match would be "from", but I'm positive that would be objectionable to someone: from neatTricks from "/path/to/dir" import TrickyDick Charles -- ----------------------------------------------------------------------- Charles Cazabon GPL'ed software available at: http://www.qcc.ca/~charlesc/software/ ----------------------------------------------------------------------- From mcherm at mcherm.com Fri Jan 30 10:06:21 2004 From: mcherm at mcherm.com (Michael Chermside) Date: Fri Jan 30 10:06:27 2004 Subject: [Python-Dev] PEP 327: Decimal Data Type Message-ID: <1075475181.401a72ed15fc5@mcherm.com> Facundo Batista writes: > I'm proud to announce that the PEP for Decimal Data Type is now published > http://www.python.org/peps/pep-0327.html VERY nice work here. Here's my 2 cents: (1) You propose conversion from floats via: Decimal(1.1, 2) == Decimal('1.1') Decimal(1.1, 16) == Decimal('1.1000000000000001') Decimal(1.1) == Decimal('110000000000000008881784197001252...e-51') I think that we'd do even better to ommit the second use. People who really want to convert floats exactly can easily write "Decimal(1.1, 60)". But hardly anyone wants to convert floats exactly, while lots of newbies would forget to include the second parameter. I'd say just make Decimal(someFloat) raise a TypeError with a helpful message about how you need that second parameter when using floats. (2) For adding a Decimal and a float, you write: I propose to allow the interaction with float, making an exact conversion and raising ValueError if exceeds the precision in the current context (this is maybe too tricky, because for example with a precision of 9, Decimal(35) + 1.2 is OK but Decimal(35) + 1.1 raises an error). I suppose that would be all right, but I think I'm with Aahz on this one... require explicit conversion. It prevents newbie errors, and non-newbies can provide the functionality extremely easily. 
Also, we can always change our minds to allow addition with floats if we initially release with that raising an exception. But if we ever release a version of Python where Decimal and float can be added, we'll be stuck supporting it forever. Really, that's all I came up with. This is great, and I'm looking forward to using it. I would, though, be interested in a couple more syntax-related details: (a) What's the syntax for changing the context? I'd think we'd want a "pushDecimalContext()" and "popDecimalContext()" sort of approach, since most well-behaved routines will want to restore their caller's context. (b) How about querying to determine a thread's current context? I don't have any use cases, but it would seem peculiar not to provide it. (c) Given a Decimal object, is there a straightforward way to determine its coefficient and exponent? Methods named .precision() and .exponent() might do the trick. -- Michael Chermside From guido at python.org Fri Jan 30 10:29:19 2004 From: guido at python.org (Guido van Rossum) Date: Fri Jan 30 10:29:24 2004 Subject: [Python-Dev] Draft: PEP for imports In-Reply-To: Your message of "Fri, 30 Jan 2004 13:32:53 GMT." <20040130133253.GA6338@vicky.ecs.soton.ac.uk> References: <20040119214528.GA9767@panix.com> <200401252159.i0PLxGj18366@c-24-5-183-134.client.comcast.net> <20040130133253.GA6338@vicky.ecs.soton.ac.uk> Message-ID: <200401301529.i0UFTJS02766@c-24-5-183-134.client.comcast.net> > Hello Guido, Hello Armin, > > from /home/guido/lib/python import neatTricks > > Keeping the amount of magic behind import as low as possible seems > very important, because they are not a minor feature but something > that every beginner must reasonably understand; I've already seen it > as an obstacle. The above statement has the advantage of looking > obvious; but in addition to the package name problem there is the > fact that directory names are not always valid Python identifiers. 
> A last try: > > import neatTricks in "/home/guido/lib/python" # no package > import package.module in "/home/guido/lib/python" # package > import foo in "." # relative import > from neatTricks in "../cmds" import a, b, c > s=os.path.join("some", "where"); import foo in s # expression path > > where the semantics would be to search sys.path if and only if no > 'in' clause is specified. ('in' doesn't sound quite right...) The 'from package at import module' syntax proposed later reads better, and the new keyword 'at' could be handled syntactically the same way as 'as'. I'm still -0 on this feature. You say that every beginner must learn import. This is indeed true. But shouldn't they learn how to set the path right rather than learn how to import by filename? Setting the path is one of the things that everybody must learn about import anyway. In any software endeavor, there comes a time when a piece of code is released with a reference to a file name that is only valid on the author's machine. I think we shouldn't encourage beginners to make such mistakes! --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Jan 30 10:40:57 2004 From: guido at python.org (Guido van Rossum) Date: Fri Jan 30 10:41:12 2004 Subject: [Python-Dev] LAST CHANCE! PyCon Early bird registration ends Sunday Message-ID: <200401301540.i0UFevo02863@c-24-5-183-134.client.comcast.net> If you're planning to come to PyCon DC 2004, make sure to REGISTER by Sunday, Feb 1. The early bird registration fee is only $175 ($125 for students). Monday it goes up to $250!!! To register, visit: http://www.pycon.org/dc2004/register/ Speakers, you too!!! DC 2004 will be held March 24-26, 2004 in Washington, D.C. The keynote speaker is Mitch Kapor of the Open Source Applications Foundation (http://www.osafoundation.org/). There will be a four-day development sprint before the conference. 
For all PyCon information: http://www.pycon.org/ --Guido van Rossum (home page: http://www.python.org/~guido/) From gerrit at nl.linux.org Fri Jan 30 11:32:39 2004 From: gerrit at nl.linux.org (Gerrit Holl) Date: Fri Jan 30 12:55:38 2004 Subject: [Python-Dev] Draft: PEP for imports In-Reply-To: <20040130082734.A6453@discworld.dyndns.org> References: <20040119214528.GA9767@panix.com> <200401252159.i0PLxGj18366@c-24-5-183-134.client.comcast.net> <20040130133253.GA6338@vicky.ecs.soton.ac.uk> <20040130082734.A6453@discworld.dyndns.org> Message-ID: <20040130163239.GA4921@nl.linux.org> Charles Cazabon wrote: > "at" more closely matches the apparent usage above, i.e.: > > from neatTricks at "/path/to/dir" import bar > > It's a little less rosy as: > > import neatTricks at "/home/guido/lib/python" In my opinion, "import foo at bar" reads too much like "import foo as bar".
just my 2 mucents, Gerrit. -- 31. If he hire it out for one year and then return, the house, garden, and field shall be given back to him, and he shall take it over again. -- 1780 BC, Hammurabi, Code of Law -- PrePEP: Builtin path type http://people.nl.linux.org/~gerrit/creaties/path/pep-xxxx.html Asperger's Syndrome - a personal approach: http://people.nl.linux.org/~gerrit/english/ From python at discworld.dyndns.org Fri Jan 30 13:40:26 2004 From: python at discworld.dyndns.org (Charles Cazabon) Date: Fri Jan 30 13:36:29 2004 Subject: [Python-Dev] Draft: PEP for imports In-Reply-To: <20040130163239.GA4921@nl.linux.org>; from gerrit@nl.linux.org on Fri, Jan 30, 2004 at 05:32:39PM +0100 References: <20040119214528.GA9767@panix.com> <200401252159.i0PLxGj18366@c-24-5-183-134.client.comcast.net> <20040130133253.GA6338@vicky.ecs.soton.ac.uk> <20040130082734.A6453@discworld.dyndns.org> <20040130163239.GA4921@nl.linux.org> Message-ID: <20040130124026.B8424@discworld.dyndns.org> Gerrit Holl wrote: > > > > import neatTricks at "/home/guido/lib/python" > > In my opinion, "import foo at bar" reads too much like "import foo as bar". Agreed. I'm -0 on the whole idea, but if it's wanted badly enough, "fromdir" might be less confusing than "in" or "at". 
Charles -- ----------------------------------------------------------------------- Charles Cazabon GPL'ed software available at: http://www.qcc.ca/~charlesc/software/ ----------------------------------------------------------------------- From exarkun at intarweb.us Fri Jan 30 09:26:13 2004 From: exarkun at intarweb.us (Jp Calderone) Date: Fri Jan 30 16:16:48 2004 Subject: [Python-Dev] Draft: PEP for imports In-Reply-To: <20040130133253.GA6338@vicky.ecs.soton.ac.uk> References: <20040119214528.GA9767@panix.com> <200401252159.i0PLxGj18366@c-24-5-183-134.client.comcast.net> <20040130133253.GA6338@vicky.ecs.soton.ac.uk> Message-ID: <20040130142613.GA29255@intarweb.us> On Fri, Jan 30, 2004 at 01:32:53PM +0000, Armin Rigo wrote: > Hello Guido, > > On Sun, Jan 25, 2004 at 01:59:16PM -0800, Guido van Rossum wrote: > > from /home/guido/lib/python import neatTricks > > Keeping the amount of magic behind import as low as possible seems very > important, because they are not a minor feature but something that every > beginner must reasonably understand; I've already seen it as an obstacle. > The above statement has the advantage of looking obvious; but in addition to > the package name problem there is the fact that directory names are not always > valid Python identifiers. A last try: > > import neatTricks in "/home/guido/lib/python" # no package > import package.module in "/home/guido/lib/python" # package > import foo in "." # relative import > from neatTricks in "../cmds" import a, b, c > s=os.path.join("some", "where"); import foo in s # expression path > An experienced Python programmer could probably use this to some advantage, but I think many beginners would use it without realizing it will very easily result in a piece of software which is completely undistributable. 
They may save some time up front by not having to understand how the rest of the import system works, but will it be more time than they have to spend later trying to fix their app so it runs on systems other than the one it was developed on? Jp From barry at python.org Fri Jan 30 16:30:55 2004 From: barry at python.org (Barry Warsaw) Date: Fri Jan 30 17:15:48 2004 Subject: [Python-Dev] Proposed: change to posting rules for python-dev Message-ID: <1075498254.1075.243.camel@anthem> There's been some talk off-list about changing the posting rules for python-dev, such that non-member post attempts are automatically rejected (i.e. bounced). This means you'd have to be a member to post to the list. I think that's not an inappropriate rule for python-dev. taking-the-pulse-of-the-list-ly y'rs, -Barry From chrisandannreedy at comcast.net Fri Jan 30 18:00:14 2004 From: chrisandannreedy at comcast.net (Chris Reedy) Date: Fri Jan 30 18:00:22 2004 Subject: [Python-Dev] Draft: PEP for imports In-Reply-To: <20040130133253.GA6338@vicky.ecs.soton.ac.uk> References: <20040119214528.GA9767@panix.com> <200401252159.i0PLxGj18366@c-24-5-183-134.client.comcast.net> <20040130133253.GA6338@vicky.ecs.soton.ac.uk> Message-ID: <401AE1FD.7050809@comcast.net> Armin Rigo wrote: > import neatTricks in "/home/guido/lib/python" # no package > import package.module in "/home/guido/lib/python" # package > import foo in "." # relative import > from neatTricks in "../cmds" import a, b, c > s=os.path.join("some", "where"); import foo in s # expression path > >where the semantics would be to search sys.path if and only if no 'in' clause >is specified. ('in' doesn't sound quite right...) > Armin - Thinking about the semantics of the above, including corner cases, I find myself wanting to make the semantics of: in "<dir>" to be equivalent to: sys.path.insert(0, "<dir>") del sys.path[0] # assuming the import doesn't modify sys.path That doesn't seem to be quite equivalent to what you've proposed.
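[Editorial aside: the bracketing semantics Chris describes can be sketched in a few lines. `import_in` is a made-up helper name, and `importlib` stands in here for whatever machinery a real `in` clause would use; this is an illustration of the proposal's semantics, not anyone's actual implementation.]

```python
import importlib  # modern stand-in for the import machinery
import os
import sys
import tempfile

def import_in(name, path):
    """Hypothetical emulation of the proposed `import name in "<dir>"`:
    prepend the directory, import, then restore sys.path (assuming,
    as Chris notes, that the import itself doesn't modify sys.path)."""
    sys.path.insert(0, path)
    try:
        return importlib.import_module(name)
    finally:
        del sys.path[0]

# Demo: write a throwaway module into a temp directory and import it
# from there, leaving sys.path exactly as it was.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "neat_tricks_demo.py"), "w") as f:
    f.write("VALUE = 42\n")
mod = import_in("neat_tricks_demo", demo_dir)
```

Note that, as Chris points out, this leaves open what happens to sys.modules and to imports performed by the imported module itself.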
On the other hand, that seems to provide at least semi-reasonable answers to questions about what happens to sys.modules, what happens if sys.modules['package'] is defined when importing package.module, what's the correct search path when a module being imported does an import, etc. Having said that, that makes me feel negatively about the proposal, since you could do it yourself directly. Chris From Jack.Jansen at cwi.nl Fri Jan 30 18:02:30 2004 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Fri Jan 30 18:02:41 2004 Subject: [Python-Dev] Proposed: change to posting rules for python-dev In-Reply-To: <1075498254.1075.243.camel@anthem> References: <1075498254.1075.243.camel@anthem> Message-ID: <61972A24-5378-11D8-830C-000A27B19B96@cwi.nl> On 30-jan-04, at 22:30, Barry Warsaw wrote: > There's been some talk off-list about changing the posting rules for > python-dev, such that non-member post attempts are automatically > rejected (i.e. bounced). This means you'd have to be a member to post > to the list. I think that's not an inappropriate rule for python-dev. Personally I don't see that python-dev has a problem with non-member postings. There's the occasional post that should have been sent to python-list, but that happens only once a week or so, and usually there's only one "wrong list" reply. I tend to prefer members-only only for lists where the signal-noise ratio drops below 90% or less. Even with the current virusstorm python-dev seems to keep doing pretty good. 
-- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From pyth at trillke.net Fri Jan 30 18:04:38 2004 From: pyth at trillke.net (Holger Krekel) Date: Fri Jan 30 18:04:45 2004 Subject: [Python-Dev] Proposed: change to posting rules for python-dev In-Reply-To: <1075498254.1075.243.camel@anthem> References: <1075498254.1075.243.camel@anthem> Message-ID: <20040130230438.GB27733@solar.trillke.net> Barry Warsaw wrote: > There's been some talk off-list about changing the posting rules for > python-dev, such that non-member post attempts are automatically > rejected (i.e. bounced). Wouldn't this mean sending back mail to spammers? > This means you'd have to be a member to post > to the list. I think that's not an inappropriate rule for python-dev. Does the off-list reasoning have to do with spam messages? cheers, holger From chrisandannreedy at comcast.net Fri Jan 30 18:07:47 2004 From: chrisandannreedy at comcast.net (Chris and Ann Reedy) Date: Fri Jan 30 18:07:55 2004 Subject: [Python-Dev] Proposed: change to posting rules for python-dev In-Reply-To: <1075498254.1075.243.camel@anthem> References: <1075498254.1075.243.camel@anthem> Message-ID: <401AE3C3.2040203@comcast.net> Barry Warsaw wrote: Barry - I recently subscribed to the list after reading from the web site for several months. I wonder if the proposed policy will end up cutting off postings from people who follow the list but aren't otherwise subscribed, and if that's good or bad. Chris >There's been some talk off-list about changing the posting rules for >python-dev, such that non-member post attempts are automatically >rejected (i.e. bounced). This means you'd have to be a member to post >to the list.
> >taking-the-pulse-of-the-list-ly y'rs, >-Barry > From tim.one at comcast.net Fri Jan 30 18:30:10 2004 From: tim.one at comcast.net (Tim Peters) Date: Fri Jan 30 18:30:18 2004 Subject: [Python-Dev] Proposed: change to posting rules for python-dev In-Reply-To: <401AE3C3.2040203@comcast.net> Message-ID: [Chris and Ann Reedy] > I recently subscribed to the list after reading from the web site > for several months. I wonder if the proposed policy will end up > cutting off postings from people who follow the list but aren't > otherwise subscibed, It would, but you could subscribe and then immediately disable email delivery for your subscription (so you'd still be a list member, and so allowed to post, but wouldn't get any email from the list). > and if that's good or bad. It depends on which posts get bounced . I get so many spams, worms, etc, that I wouldn't notice the difference if python-dev stopped leaking them. Presumably other people would notice. From arigo at tunes.org Fri Jan 30 18:40:33 2004 From: arigo at tunes.org (Armin Rigo) Date: Fri Jan 30 18:43:23 2004 Subject: [Python-Dev] Draft: PEP for imports In-Reply-To: <20040130142613.GA29255@intarweb.us> References: <20040119214528.GA9767@panix.com> <200401252159.i0PLxGj18366@c-24-5-183-134.client.comcast.net> <20040130133253.GA6338@vicky.ecs.soton.ac.uk> <20040130142613.GA29255@intarweb.us> Message-ID: <20040130234033.GA31124@vicky.ecs.soton.ac.uk> Hello, On Fri, Jan 30, 2004 at 09:26:13AM -0500, Jp Calderone wrote: > > import neatTricks in "/home/guido/lib/python" # no package > > import package.module in "/home/guido/lib/python" # package > > import foo in "." 
# relative import > > from neatTricks in "../cmds" import a, b, c > > s=os.path.join("some", "where"); import foo in s # expression path > > An experienced Python programmer could probably use this to some > advantage, but I think many beginners would use it without realizing it will > very easily result in a piece of software which is completely > undistributable. Point taken. As far as I can see I believe that the whole module system probably needs some more in-depth rethinking at some point. For the beginner the package system is baffling, and for the advanced user it lacks precise control over what is imported from where, for the cases where it is really needed. [tentative development] In the multi-directory projects I am familiar with, I've seen two cases: * a stand-alone application that has to deal with file-system-level directories anyway to access data files; this uses sys.path hacks all around to make sure the needed directories are accessible when needed. It would have been immediately useful to be able to write "import gamesrv fromdir '../common'". It would also have been nice to be able to open a file relative to the current module. (In this case relative file and module names are good.) * a big package named after the application, in which we try to make all imports absolute. This is cleaner apart from the need to set up sys.path initially from user-runnable scripts that are not at the root (which is the case of all the test_*.py). The needed hackery can be nicely isolated in a file 'autopath.py' that is copied into each directory, so that each user script starts with "import autopath". I never find myself using "import package.module" but always "from package import module". Actually I still have to be convinced that packages are a Good Thing. As far as I am concerned, sys.path hacks are the only case in which I miss a Python feature sufficiently to have to copy similar custom code all around my projects.
A more radical point of view is that these sys.path hacks are here not because of a missing feature, but on the contrary to work around the way Python tries to isolate me from the "messiness out there" (the file system) by mapping it into a neat language construct (packages). Trying to map a user-controlled structure over another, different one is a recipe for eventual confusion, however good the design is. I think I'd prefer to have just a reasonably natural way to talk about modules -- and files in general -- relative to the current module or the standard library. This is only a vague idea, but I believe it would fit better into (at least my view of) Python's philosophy: keep the language simple by avoiding abstraction layers that are not "natural", i.e. that do some kind of remapping of concepts, as opposed to just exposing a higher-level view of the same concepts. A bientot, Armin. From guido at python.org Fri Jan 30 18:52:06 2004 From: guido at python.org (Guido van Rossum) Date: Fri Jan 30 18:52:12 2004 Subject: [Python-Dev] Proposed: change to posting rules for python-dev In-Reply-To: Your message of "Fri, 30 Jan 2004 16:30:55 EST." <1075498254.1075.243.camel@anthem> References: <1075498254.1075.243.camel@anthem> Message-ID: <200401302352.i0UNq6m04057@c-24-5-183-134.client.comcast.net> > There's been some talk off-list about changing the posting rules for > python-dev, such that non-member post attempts are automatically > rejected (i.e. bounced). This means you'd have to be a member to post > to the list. I think that's not an inappropriate rule for python-dev. I don't see a need. What is this trying to prevent? --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Fri Jan 30 18:54:19 2004 From: pje at telecommunity.com (Phillip J.
Eby) Date: Fri Jan 30 18:54:25 2004 Subject: [Python-Dev] Draft: PEP for imports In-Reply-To: <20040130234033.GA31124@vicky.ecs.soton.ac.uk> References: <20040130142613.GA29255@intarweb.us> <20040119214528.GA9767@panix.com> <200401252159.i0PLxGj18366@c-24-5-183-134.client.comcast.net> <20040130133253.GA6338@vicky.ecs.soton.ac.uk> <20040130142613.GA29255@intarweb.us> Message-ID: <5.1.1.6.0.20040130184813.02ba4860@telecommunity.com> At 11:40 PM 1/30/04 +0000, Armin Rigo wrote: >A more radical point of view is that these sys.path hacks are here not because >of a missing feature, but on the contrary to work around the way Python tries >to isolate me from the "messiness out there" (the file system) by mapping it >into a neat language construct (packages). Without that, various kinds of "hacks" such as importing from DLLs/.so's, zipfiles, frozen modules, py2exe, PYTHONPATH, and the like, would not be practical. These use cases alone are enough to make me -1 on adding explicit paths into the module system from within code. I'm *much* more interested in making non-module (but "read-only") files available through the package/module abstraction, than in breaking the abstraction to specify directories. For example, in PEAK I've implemented both a utility function ('fileNearModule("some.module","relative/path")') and a URL scheme ('pkgfile:some.module/relative/path') to support this. From skip at pobox.com Fri Jan 30 19:21:16 2004 From: skip at pobox.com (Skip Montanaro) Date: Fri Jan 30 19:21:20 2004 Subject: [Python-Dev] Proposed: change to posting rules for python-dev In-Reply-To: <1075498254.1075.243.camel@anthem> References: <1075498254.1075.243.camel@anthem> Message-ID: <16410.62716.370529.741040@montanaro.dyndns.org> Barry> There's been some talk off-list about changing the posting rules Barry> for python-dev, such that non-member post attempts are Barry> automatically rejected (i.e. bounced). Wasn't that how things worked when the list was first formed? 
If so, what was the rationale for the change to the current policy? Skip From fdrake at acm.org Fri Jan 30 19:32:18 2004 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Fri Jan 30 19:32:27 2004 Subject: [Python-Dev] Proposed: change to posting rules for python-dev In-Reply-To: <16410.62716.370529.741040@montanaro.dyndns.org> References: <1075498254.1075.243.camel@anthem> <16410.62716.370529.741040@montanaro.dyndns.org> Message-ID: <200401301932.18086.fdrake@acm.org> On Friday 30 January 2004 07:21 pm, Skip Montanaro wrote: > Wasn't that how things worked when the list was first formed? If so, > what was the rationale for the change to the current policy? Yes, it was. I don't remember why it was changed; we were probably accused of exclusivity. That's how these things often happen. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From aahz at pythoncraft.com Fri Jan 30 19:41:10 2004 From: aahz at pythoncraft.com (Aahz) Date: Fri Jan 30 19:41:24 2004 Subject: [Python-Dev] Proposed: change to posting rules for python-dev In-Reply-To: <16410.62716.370529.741040@montanaro.dyndns.org> References: <1075498254.1075.243.camel@anthem> <16410.62716.370529.741040@montanaro.dyndns.org> Message-ID: <20040131004110.GA19627@panix.com> On Fri, Jan 30, 2004, Skip Montanaro wrote: > > Barry> There's been some talk off-list about changing the posting rules > Barry> for python-dev, such that non-member post attempts are > Barry> automatically rejected (i.e. bounced). > > Wasn't that how things worked when the list was first formed? If so, what > was the rationale for the change to the current policy? Not quite. Historically, python-dev was a by-invite list, which definitely led to charges of exclusivity; however, anyone could *post* to the list. Barry's proposed policy would allow anyone to subscribe, but only subscribers could post. I'm overall +0, speaking as the person who most often responds to inappropriate posts. 
-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "The joy of coding Python should be in seeing short, concise, readable classes that express a lot of action in a small amount of clear code -- not in reams of trivial code that bores the reader to death." --GvR From kbk at shore.net Fri Jan 30 19:44:02 2004 From: kbk at shore.net (Kurt B. Kaiser) Date: Fri Jan 30 19:44:06 2004 Subject: [Python-Dev] Proposed: change to posting rules for python-dev In-Reply-To: (Tim Peters's message of "Fri, 30 Jan 2004 18:30:10 -0500") References: Message-ID: <877jz90wx9.fsf@hydra.localdomain> "Tim Peters" writes: > It would, but you could subscribe and then immediately disable email > delivery for your subscription (so you'd still be a list member, and > so allowed to post, but wouldn't get any email from the list). This is an excellent method that I've used for a number of lists. For people who just want to post occasionally, this works very well, and I don't think it's asking too much to have them subscribe. A problem arises when people try to post from Google Groups. There may be some lists where we want to encourage that, and they (c.l.p) would need to stay open. > It depends on which posts get bounced . I get so many spams, > worms, etc, that I wouldn't notice the difference if python-dev > stopped leaking them. Presumably other people would notice. Yeah, but you want them for the spam file :-) Presumably, if you turned on your Spambayes you wouldn't see any. The real problem is the archives. On a low volume list, like Idle-dev, the spam completely drowns the ham unless it's moderated out. Once in the archive, it's an annoyance for all time. The lists are screening direct spam pretty well. The current problem is all the MTA bounces complaining about the "virus you sent" or "no such user". Moderating the lists is taking a fair amount of admin time right now. At least during storms, it seems reasonable to bounce non-subscribers automatically.
The bounce will tell them what to do to get posted. -- KBK From barry at python.org Fri Jan 30 20:15:04 2004 From: barry at python.org (Barry Warsaw) Date: Fri Jan 30 20:15:31 2004 Subject: [Python-Dev] Draft: PEP for imports In-Reply-To: <20040130234033.GA31124@vicky.ecs.soton.ac.uk> References: <20040119214528.GA9767@panix.com> <200401252159.i0PLxGj18366@c-24-5-183-134.client.comcast.net> <20040130133253.GA6338@vicky.ecs.soton.ac.uk> <20040130142613.GA29255@intarweb.us> <20040130234033.GA31124@vicky.ecs.soton.ac.uk> Message-ID: <1075511703.1075.332.camel@anthem> On Fri, 2004-01-30 at 18:40, Armin Rigo wrote: > * a stand-alone application that has to deal with file-system-level > directories anyway to access data files; this uses sys.path hacks all around > to make sure the needed directories are accessible when needed. It would have > been immediately useful to be able to write "import gamesrv fromdir > '../common'". It would also have been nice to be able to open a file relative > to the current module. (In this case relative file and module names are > good.) I don't use path hacks to find data files, but I have gotten into the habit of making data file directories packages when they live inside the source tree. Then I can just do from my.package import data myfile = os.path.join(os.path.dirname(data.__file__), 'myfile.xml') > * a big package named after the application, in which we try to make all > imports absolute. This is cleaner apart from the need to setup sys.path > initially from user-runnable scripts that are not at the root (which is the > case of all the test_*.py). The needed hackery can be nicely isolated in a > file 'autopath.py' that is copied into each directory, so that each user > script starts with "import autopath". This is definitely a common idiom. I call the file path.py and yes, every top level script must import this before it imports any of the application's modules. 
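[Editorial aside: Barry's data-files-as-packages idiom above, spelled out end to end. The package name `myapp` and the file name are invented for the demo; the point is that `data.__file__` locates the data directory wherever the package is installed.]

```python
import os
import sys
import tempfile

# Build a throwaway source tree: myapp/data/ is an ordinary package
# whose directory happens to hold a data file.
base = tempfile.mkdtemp()
datadir = os.path.join(base, "myapp", "data")
os.makedirs(datadir)
for d in (os.path.join(base, "myapp"), datadir):
    open(os.path.join(d, "__init__.py"), "w").close()
with open(os.path.join(datadir, "myfile.xml"), "w") as f:
    f.write("<doc/>")

sys.path.insert(0, base)
from myapp import data  # the idiom: import the data directory as a package

# data.__file__ is the package's __init__.py, so its dirname is the
# data directory -- no sys.path hacks or hard-coded paths needed.
myfile = os.path.join(os.path.dirname(data.__file__), "myfile.xml")
```

The win over a hard-coded path is that the lookup moves with the package, whether it is run from the source tree or installed.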
We've been doing this kind of thing for years, and while I very often wish there was something we could put in the standard library, or hack we could add to the executable, there have really never been any elegant-enough solutions to the problem. > I never find myself using "import package.module" but always "from package > import module". Actually I still have to be convinced that packages are a > Good Thing. As far as I am concerned, sys.path hacks are the only case in > which I miss a Python feature sufficiently to have to copy similar custom code > all around my projects. Two different things. Packages are a huge win IMO because they allow for larger organizing principles. Flat import space was really painful before packages came along. Things like distutils wouldn't be feasible without them, and we'd be arguing about naming conventions much more if we didn't have them. That's aside from the path hackery necessary for top level application scripts. I use "from package import module" all the time and I think Python's rules here are just about right. Personally, my only complaint is the default-relative import rule. > A more radical point of view is that these sys.path hacks are here not because > of a missing feature, but on the contrary to work around the way Python tries > to isolate me from the "messiness out there" (the file system) by mapping it > into a neat language construct (packages). Trying to map a user-controlled > structure over another, different one is a recipe for eventual confusion, > however good the design is. I disagree. I really like the layer of abstraction Python imposes here and I'd hate to see that taken away. I'm -1 on any proposal to use (forward) slashes in import statements. To me, it's way too system dependent (wait 'til the Windows people push for backslashes). I also like the level of indirection, control, and flexibility that sys.path gives you.
I agree it would be nice to not have to write autopath.py modules, but that only affects a handful of files in an otherwise large application. Other than relative imports, the only other thing I occasionally miss in Python's import semantics is the fact that more than one file system space can't be mapped into one package namespace without trickery. E.g. if in "import my.package.sub1" and "import my.package.sub2", the directories sub1 and sub2 didn't have to live under a directory called my/package. > I think I'd prefer to have just a reasonably natural way to talk about modules > -- and files in general -- relative to the current module or the standard > library. This is only a vague idea, but I believe it would fit better into > (at least my view of) Python's philosophy: keep the language simple by > avoiding abstraction layers that are not "natural", i.e. that do some kind of > remapping of concepts, as opposed to just expose a higher-level view of the > same concepts. I disagree that Python's current rules are not natural. I think they're very natural, modulo a few nits here and there. I definitely don't want to live closer to the metal in the import statement. -Barry From pje at telecommunity.com Fri Jan 30 20:25:19 2004 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Fri Jan 30 20:25:31 2004 Subject: [Python-Dev] Draft: PEP for imports In-Reply-To: <1075511703.1075.332.camel@anthem> References: <20040130234033.GA31124@vicky.ecs.soton.ac.uk> <20040119214528.GA9767@panix.com> <200401252159.i0PLxGj18366@c-24-5-183-134.client.comcast.net> <20040130133253.GA6338@vicky.ecs.soton.ac.uk> <20040130142613.GA29255@intarweb.us> <20040130234033.GA31124@vicky.ecs.soton.ac.uk> Message-ID: <5.1.1.6.0.20040130202430.02034e30@telecommunity.com> At 08:15 PM 1/30/04 -0500, Barry Warsaw wrote: >Other than relative imports, the only other thing I occasionally miss in >Python's import semantics is the fact that more than one file system >space can't be mapped into one package namespace without trickery. E.g. >if in "import my.package.sub1" and "import my.package.sub2", the >directories sub1 and sub2 didn't have to live under a directory called >my/package. Didn't Guido write a 'pkgutil' module for this? Or am I confusing what he wrote with what you want? :) From barry at python.org Fri Jan 30 20:32:32 2004 From: barry at python.org (Barry Warsaw) Date: Fri Jan 30 20:32:42 2004 Subject: [Python-Dev] Proposed: change to posting rules for python-dev In-Reply-To: <877jz90wx9.fsf@hydra.localdomain> References: <877jz90wx9.fsf@hydra.localdomain> Message-ID: <1075512752.1075.350.camel@anthem> On Fri, 2004-01-30 at 19:44, Kurt B. Kaiser wrote: > Moderating the lists is taking a fair amount of admin time right now. > At least during storms, it seems reasonable to bounce non-subscribers > automatically. The bounce will tell them what to do to get posted. That's really the biggest motive for my suggestion. OTOH, python-dev at the moment doesn't have a ton of held messages, because the list was actually misconfigured to accept all non-member posts until about 4 hours ago. What's there now is all unwanted (spam, virus bounces, etc). Ideally, Mailman would have a challenge-response system for non-member posters, but I don't want to get into that here. 
Right now, Jeremy and I are the only admins and neither of us has much time to be diligent in clearing the queue. So maybe let's turn this around: if I can get 3 volunteers to be python-dev moderators, we'll leave the current policy of holding non-member postings for approval. Otherwise, we'll auto reject them. Email me directly if you're interested. -Barry From barry at python.org Fri Jan 30 20:35:01 2004 From: barry at python.org (Barry Warsaw) Date: Fri Jan 30 20:35:08 2004 Subject: [Python-Dev] Proposed: change to posting rules for python-dev In-Reply-To: <20040131004110.GA19627@panix.com> References: <1075498254.1075.243.camel@anthem> <16410.62716.370529.741040@montanaro.dyndns.org> <20040131004110.GA19627@panix.com> Message-ID: <1075512900.1075.354.camel@anthem> On Fri, 2004-01-30 at 19:41, Aahz wrote: > Not quite. Historically, python-dev was a by-invite list, which > definitely led to charges of exclusivity; however, anyone could *post* > to the list. Barry's proposed policy would allow anyone to subscribe, > but only subscribers could post. Right! -Barry From barry at python.org Fri Jan 30 20:39:08 2004 From: barry at python.org (Barry Warsaw) Date: Fri Jan 30 20:39:17 2004 Subject: [Python-Dev] Draft: PEP for imports In-Reply-To: <5.1.1.6.0.20040130202430.02034e30@telecommunity.com> References: <20040130234033.GA31124@vicky.ecs.soton.ac.uk> <20040119214528.GA9767@panix.com> <200401252159.i0PLxGj18366@c-24-5-183-134.client.comcast.net> <20040130133253.GA6338@vicky.ecs.soton.ac.uk> <20040130142613.GA29255@intarweb.us> <20040130234033.GA31124@vicky.ecs.soton.ac.uk> <5.1.1.6.0.20040130202430.02034e30@telecommunity.com> Message-ID: <1075513148.1075.357.camel@anthem> On Fri, 2004-01-30 at 20:25, Phillip J. 
Eby wrote: > At 08:15 PM 1/30/04 -0500, Barry Warsaw wrote: > > >Other than relative imports, the only other thing I occasionally miss in > >Python's import semantics is the fact that more than one file system > >space can't be mapped into one package namespace without trickery. E.g. > >if in "import my.package.sub1" and "import my.package.sub2", the > >directories sub1 and sub2 didn't have to live under a directory called > >my/package. > > Didn't Guido write a 'pkgutil' module for this? Or am I confusing what he > wrote with what you want? :) He did! I totally spaced on that. Okay, Python packages are more close to perfect than I thought. :) Maybe, though, it's time to get rid of this warning from http://www.python.org/doc/current/lib/module-pkgutil.html Warning: This is an experimental module. It may be withdrawn or completely changed up to and including the release of Python 2.3 beta 1. :) -Barry From tim.one at comcast.net Fri Jan 30 20:59:42 2004 From: tim.one at comcast.net (tim.one@comcast.net) Date: Fri Jan 30 20:59:45 2004 Subject: [Python-Dev] Proposed: change to posting rules for python-dev Message-ID: <013120040159.2910.6a04@comcast.net> [Skip] >> Wasn't that how things worked when the list was first formed? If so, >> what was the rationale for the change to the current policy? [Barry] > Yes, it was. I don't remember why it was changed; we were probably accused > of exclusivity. That's how these things often happen. I think we're confusing distinct gimmicks: at the start, anyone could post to python-dev, but a subscription request had to be approved by the list admin. Only the latter has changed. I read Barry as asking about changing the former this time, while leaving the latter alone.
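[Editorial aside: the pkgutil mechanism Barry and Phillip discuss works by extending a package's __path__ with every matching directory on the search path, so one package namespace can span several sys.path entries. A self-contained sketch follows; the tree layout, the `my.package` name, and the `WHERE` attribute are invented for the demo.]

```python
import os
import sys
import tempfile

# The two-line idiom that goes in each __init__.py of the split package.
INIT_SRC = (
    "from pkgutil import extend_path\n"
    "__path__ = extend_path(__path__, __name__)\n"
)

def make_tree(base, modname):
    """Create <base>/my/package/<modname>.py, with extend_path-style
    __init__.py files so 'my.package' can span several sys.path entries."""
    pkgdir = os.path.join(base, "my", "package")
    os.makedirs(pkgdir)
    for d in (os.path.join(base, "my"), pkgdir):
        with open(os.path.join(d, "__init__.py"), "w") as f:
            f.write(INIT_SRC)
    with open(os.path.join(pkgdir, modname + ".py"), "w") as f:
        f.write("WHERE = %r\n" % base)

# Two separate trees, each contributing one submodule to my.package.
t1, t2 = tempfile.mkdtemp(), tempfile.mkdtemp()
make_tree(t1, "sub1")
make_tree(t2, "sub2")
sys.path[:0] = [t1, t2]

# sub1 and sub2 live under different directories, yet import as one package.
from my.package import sub1, sub2
```

So the directories sub1 and sub2 need not live under the same my/package directory, which is exactly the trickery Barry alludes to.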
From barry at python.org  Fri Jan 30 21:04:28 2004
From: barry at python.org (Barry Warsaw)
Date: Fri Jan 30 21:04:40 2004
Subject: [Python-Dev] Proposed: change to posting rules for python-dev
In-Reply-To: <013120040159.2910.6a04@comcast.net>
References: <013120040159.2910.6a04@comcast.net>
Message-ID: <1075514668.1075.375.camel@anthem>

On Fri, 2004-01-30 at 20:59, tim.one@comcast.net wrote:
> [Skip]
> >> Wasn't that how things worked when the list was first formed?  If so,
> >> what was the rationale for the change to the current policy?
>
> [Barry]
> > Yes, it was.  I don't remember why it was changed; we were probably
> > accused of exclusivity.  That's how these things often happen.

That wern't me.  I think it was Fred.

> I think we're confusing distinct gimmicks:  at the start, anyone could
> post to python-dev, but a subscription request had to be approved by
> the list admin.  Only the latter has changed.  I read Barry as asking
> about changing the former this time, while leaving the latter alone.

Yep.

-Barry

From pje at telecommunity.com  Fri Jan 30 21:20:33 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri Jan 30 21:20:40 2004
Subject: [Python-Dev] Draft: PEP for imports
In-Reply-To: <1075513148.1075.357.camel@anthem>
References: <5.1.1.6.0.20040130202430.02034e30@telecommunity.com>
	<20040130234033.GA31124@vicky.ecs.soton.ac.uk>
	<20040119214528.GA9767@panix.com>
	<200401252159.i0PLxGj18366@c-24-5-183-134.client.comcast.net>
	<20040130133253.GA6338@vicky.ecs.soton.ac.uk>
	<20040130142613.GA29255@intarweb.us>
	<20040130234033.GA31124@vicky.ecs.soton.ac.uk>
	<5.1.1.6.0.20040130202430.02034e30@telecommunity.com>
Message-ID: <5.1.1.6.0.20040130211856.02559ec0@telecommunity.com>

At 08:39 PM 1/30/04 -0500, Barry Warsaw wrote:
>On Fri, 2004-01-30 at 20:25, Phillip J. Eby wrote:
> > At 08:15 PM 1/30/04 -0500, Barry Warsaw wrote:
> >
> > >Other than relative imports, the only other thing I occasionally miss in
> > >Python's import semantics is the fact that more than one file system
> > >space can't be mapped into one package namespace without trickery.  E.g.
> > >if in "import my.package.sub1" and "import my.package.sub2", the
> > >directories sub1 and sub2 didn't have to live under a directory called
> > >my/package.
> >
> > Didn't Guido write a 'pkgutil' module for this?  Or am I confusing what he
> > wrote with what you want? :)
>
>He did!  I totally spaced on that.  Okay, Python packages are more close
>to perfect than I thought. :)

Obviously, the time machine still works in California, but now with added
cross-continent spatial relocation, allowing him to write 'pkgutil' not
only some time ago, but on the opposite coast as well. :)

From tim.one at comcast.net  Fri Jan 30 22:50:29 2004
From: tim.one at comcast.net (Tim Peters)
Date: Fri Jan 30 22:50:36 2004
Subject: [Python-Dev] Proposed: change to posting rules for python-dev
In-Reply-To: <1075514668.1075.375.camel@anthem>
Message-ID: 

Tim attributes something to Barry:

> [Barry]
>> Yes, it was.  I don't remember why it was changed; we were probably
>> accused of exclusivity.  That's how these things often happen.

Someone also claiming to be "Barry" retorts:

[Barry]
> That wern't me.  I think it was Fred.

If you scan the headers of "Fred"'s message, you'll find 13 in there.
That's Barry's secret number, so Barry must have sent it.  Fred's secret
number is 68, and the headers of this later msg claiming to be from Barry
contain:

    Message-Id: <1075514668.1075.375.camel@anthem>
                         ^^

QED.  Nevertheless, I agree that the message I attributed to Barry
*claimed* to have been sent by Fred.  I'm afraid you'll have to work
harder than that to make a fool of me, guys.
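[The pkgutil trick Barry and Phillip mention above can be made concrete. The sketch below is not from the thread: it builds two throwaway directory trees with made-up names (tree_a, tree_b) and uses the documented pkgutil.extend_path idiom, two lines in each __init__.py, so that my.package.sub1 and my.package.sub2 import even though sub1 and sub2 live under different parent directories.]

```python
import os
import sys
import tempfile

# Hypothetical layout: tree_a/my/package/sub1 and tree_b/my/package/sub2.
root = tempfile.mkdtemp()
init_code = (
    "from pkgutil import extend_path\n"
    "__path__ = extend_path(__path__, __name__)\n"
)

for tree, sub in (("tree_a", "sub1"), ("tree_b", "sub2")):
    pkg_dir = os.path.join(root, tree, "my", "package")
    os.makedirs(os.path.join(pkg_dir, sub))
    # Both my/__init__.py and my/package/__init__.py call extend_path, so
    # each package's __path__ picks up portions from every matching tree.
    for d in (os.path.join(root, tree, "my"), pkg_dir):
        with open(os.path.join(d, "__init__.py"), "w") as f:
            f.write(init_code)
    with open(os.path.join(pkg_dir, sub, "__init__.py"), "w") as f:
        f.write("WHO = %r\n" % sub)

# Put both trees on sys.path; extend_path merges their "my/package" dirs.
sys.path[:0] = [os.path.join(root, "tree_a"), os.path.join(root, "tree_b")]

import my.package.sub1
import my.package.sub2
print(my.package.sub1.WHO, my.package.sub2.WHO)  # prints: sub1 sub2
```

[The same idiom is what a real multi-directory package would ship: only the two extend_path lines go in the package's __init__.py files; the rest here just fabricates a demo layout.]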
From aahz at pythoncraft.com  Fri Jan 30 23:41:52 2004
From: aahz at pythoncraft.com (Aahz)
Date: Fri Jan 30 23:41:58 2004
Subject: [Python-Dev] Proposed: change to posting rules for python-dev
In-Reply-To: 
References: <1075514668.1075.375.camel@anthem>
Message-ID: <20040131044151.GA21152@panix.com>

On Fri, Jan 30, 2004, Tim Peters wrote:
>
> Tim attributes something to Barry:
>
>> [Barry]
>>> Yes, it was.  I don't remember why it was changed; we were probably
>>> accused of exclusivity.  That's how these things often happen.
>
> Someone also claiming to be "Barry" retorts:
>
> [Barry]
>> That wern't me.  I think it was Fred.
>
> If you scan the headers of "Fred"'s message, you'll find 13 in there.
> That's Barry's secret number, so Barry must have sent it.  Fred's secret
> number is 68, and the headers of this later msg claiming to be from Barry
> contain:
>
>     Message-Id: <1075514668.1075.375.camel@anthem>
>                          ^^
>
> QED.  Nevertheless, I agree that the message I attributed to Barry
> *claimed* to have been sent by Fred.  I'm afraid you'll have to work
> harder than that to make a fool of me, guys.

No, they won't.  This is python-dev, not psf-members; we don't care who's
posting here (as long experience with bots proves), only whether they make
sense.  You're not making sense, so your bot needs a tuneup.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"The joy of coding Python should be in seeing short, concise, readable
classes that express a lot of action in a small amount of clear code --
not in reams of trivial code that bores the reader to death."  --GvR

From python at rcn.com  Sat Jan 31 00:14:53 2004
From: python at rcn.com (Raymond Hettinger)
Date: Sat Jan 31 00:16:08 2004
Subject: [Python-Dev] Proposed: change to posting rules for python-dev
In-Reply-To: 
Message-ID: <000001c3e7b9$29040080$1822a044@oemcomputer>

[Tim]
> If you scan the headers of "Fred"'s message, you'll find 13 in there.
> That's Barry's secret number, so Barry must have sent it.  Fred's secret
> number is 68, and the headers of this later msg claiming to be from Barry
> contain ...

Hey, why does everybody else get a secret number and all I get are
occasional splats after my signature block?

Raymond

From tjreedy at udel.edu  Sat Jan 31 12:19:46 2004
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat Jan 31 12:19:54 2004
Subject: [Python-Dev] Re: Proposed: change to posting rules for python-dev
References: <877jz90wx9.fsf@hydra.localdomain>
	<1075512752.1075.350.camel@anthem>
Message-ID: 

"Barry Warsaw" wrote in message news:1075512752.1075.350.camel@anthem...
> On Fri, 2004-01-30 at 19:44, Kurt B. Kaiser wrote:
>
> > Moderating the lists is taking a fair amount of admin time right now.
> > At least during storms, it seems reasonable to bounce non-subscribers
> > automatically.  The bounce will tell them what to do to get posted.
>
> That's really the biggest motive for my suggestion.  OTOH, python-dev at
> the moment doesn't have a ton of held messages, because the list was
> actually misconfigured to accept all non-member posts until about 4
> hours ago.  What's there now is all unwanted (spam, virus bounces, etc).
>
> Ideally, Mailman would have a challenge-response system for non-member
> posters, but I don't want to get into that here.
>
> Right now, Jeremy and I are the only admins and neither of us has much
> time to be diligent in clearing the queue.  So maybe let's turn this
> around: if I can get 3 volunteers to be python-dev moderators, we'll
> leave the current policy of holding non-member postings for approval.
> Otherwise, we'll auto reject them.
>
> Email me directly if you're interested.

If this means something like visiting an admin page, say once a day,
clicking on obvious junk, and then delete (and similar but 'admit' for
obvious good stuff), yes.  I would be happy to leave borderline cases
(should this be clp instead) alone for 'higher' authority.

Can't spurious virus bounces be filtered?  They all seem so similar.
Or are there occasional valid bounce messages to be dealt with?

Terry