From bob at redivi.com Sun Oct 1 00:21:50 2006 From: bob at redivi.com (Bob Ippolito) Date: Sat, 30 Sep 2006 15:21:50 -0700 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: <20060929081402.GB19781@craig-wood.com> <451DC113.4040002@canterbury.ac.nz> <6a36e7290609291815i115b41b3o4ab6d196f404557f@mail.gmail.com> <451E2F32.9070405@v.loewis.de> <451E31ED.7030905@gmail.com> Message-ID: <6a36e7290609301521g44b56e59iecc3b0c448cd91c3@mail.gmail.com> On 9/30/06, Terry Reedy wrote: > > "Nick Coghlan" wrote in message > news:451E31ED.7030905 at gmail.com... > >I suspect the problem would typically stem from floating point values that > >are > >read in from a human-readable file rather than being the result of a > >'calculation' as such: > > For such situations, one could create a translation dict for both common > float values and for non-numeric missing value indicators. For instance, > flotran = {'*': None, '1.0':1.0, '2.0':2.0, '4.0':4.0} > The details, of course, depend on the specific case. But of course you have to know that common float values are never cached and that it may cause you problems. Some users may expect them to be because common strings and integers are cached. -bob From rasky at develer.com Sun Oct 1 00:19:22 2006 From: rasky at develer.com (Giovanni Bajo) Date: Sun, 1 Oct 2006 00:19:22 +0200 Subject: [Python-Dev] PEP 355 status References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm><2mk63lfu6j.fsf@starship.python.net> Message-ID: <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> Guido van Rossum wrote: > OK. Pronouncement: PEP 355 is dead. The authors (or the PEP editor) > can update the PEP. > > I'm looking forward to a new PEP. It would be terrific if you gave us some clue about what is wrong in PEP355, so that the next guy does not waste his time. 
For instance, I find PEP355 incredibly good for my own path manipulation (much cleaner and more concise than the awful os.path+os+shutil+stat mix), and I have trouble understanding what is *so* wrong with it. You said "it's an amalgam of unrelated functionality", but you didn't say what exactly is "unrelated" for you. Giovanni Bajo From skip at pobox.com Sun Oct 1 00:37:49 2006 From: skip at pobox.com (skip at pobox.com) Date: Sat, 30 Sep 2006 17:37:49 -0500 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: <20060929081402.GB19781@craig-wood.com> Message-ID: <17694.61885.527128.686743@montanaro.dyndns.org> 

    Steve> By these statistics I think the answer to the original question
    Steve> is clearly "no" in the general case.

As someone else (Guido?) pointed out, the literal case isn't all that interesting. I modified floatobject.c to track a few interesting floating point values:

    static unsigned int nfloats[5] = {
        0, /* -1.0 */
        0, /* 0.0 */
        0, /* +1.0 */
        0, /* everything else */
        0, /* whole numbers from -10.0 ... 10.0 */
    };

    PyObject *
    PyFloat_FromDouble(double fval)
    {
        register PyFloatObject *op;
        if (free_list == NULL) {
            if ((free_list = fill_free_list()) == NULL)
                return NULL;
        }

        if (fval == 0.0) nfloats[1]++;
        else if (fval == 1.0) nfloats[2]++;
        else if (fval == -1.0) nfloats[0]++;
        else nfloats[3]++;

        if (fval >= -10.0 && fval <= 10.0 && (int)fval == fval) {
            nfloats[4]++;
        }

        /* Inline PyObject_New */
        op = free_list;
        free_list = (PyFloatObject *)op->ob_type;
        PyObject_INIT(op, &PyFloat_Type);
        op->ob_fval = fval;
        return (PyObject *) op;
    }

    static void
    _count_float_allocations(void)
    {
        fprintf(stderr, "-1.0: %d\n", nfloats[0]);
        fprintf(stderr, " 0.0: %d\n", nfloats[1]);
        fprintf(stderr, "+1.0: %d\n", nfloats[2]);
        fprintf(stderr, "rest: %d\n", nfloats[3]);
        fprintf(stderr, "whole numbers -10.0 to 10.0: %d\n", nfloats[4]);
    }

then called atexit(_count_float_allocations) in _PyFloat_Init and ran "make test". The output was: ...
    ./python.exe -E -tt ../Lib/test/regrtest.py -l
    ...
    -1.0: 29048
     0.0: 524241
    +1.0: 91561
    rest: 1749807
    whole numbers -10.0 to 10.0: 1151442

So for a largely non-floating point "application", a fair number of floats are allocated, a bit over 25% of them are -1.0, 0.0 or +1.0, and nearly 50% of them are whole numbers between -10.0 and 10.0, inclusive. Seems like it at least deserves a serious look. It would be nice to have the numeric crowd contribute to this subject as well. Skip From bob at redivi.com Sun Oct 1 00:52:32 2006 From: bob at redivi.com (Bob Ippolito) Date: Sat, 30 Sep 2006 15:52:32 -0700 Subject: [Python-Dev] Tix not included in 2.5 for Windows In-Reply-To: References: Message-ID: <6a36e7290609301552s45435ce7l7a841d9673f59101@mail.gmail.com> On 9/30/06, Scott David Daniels wrote: > Christos Georgiou wrote: > > Does anyone know why this happens? I can't find any information pointing to > > this being deliberate. > > > > I just upgraded to 2.5 on Windows (after making sure I can build extensions > > with the freeware VC++ Toolkit 2003) and some of my programs stopped > > operating. I saw in a French forum that someone else had the same problem, > > and what they did was to copy the relevant files from a 2.4.3 installation. > > I did the same, and it seems it works, with only a console message appearing > > as soon as a root window is created: > > Also note: the Os/X universal seems to include a Tix runtime for the > non-Intel processor, but not for the Intel processor. This > makes me think there is a build problem. Are you sure about that? What file are you referring to specifically? 
-bob From Scott.Daniels at Acm.Org Sun Oct 1 01:28:24 2006 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Sat, 30 Sep 2006 16:28:24 -0700 Subject: [Python-Dev] Tix not included in 2.5 for Windows In-Reply-To: <6a36e7290609301552s45435ce7l7a841d9673f59101@mail.gmail.com> References: <6a36e7290609301552s45435ce7l7a841d9673f59101@mail.gmail.com> Message-ID: Bob Ippolito wrote: > On 9/30/06, Scott David Daniels wrote: >> Christos Georgiou wrote: >>> Does anyone know why this happens? I can't find any information pointing to >>> this being deliberate. >> Also note: the Os/X universal seems to include a Tix runtime for the >> non-Intel processor, but not for the Intel processor. This >> makes me think there is a build problem. > > Are you sure about that? What file are you referring to specifically? OK, from the 2.5 universal: (hand-typed, I e-mail from another machine)

=========== Using Idle ===========

    >>> import Tix
    >>> Tix.Tk()
    Traceback (most recent call last):
      File "(pyshell#8)", line 1, in (module)
        Tix.Tk()
      File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-tk/Tix.py", line 210 in __init__
        self.tk.eval('package require Tix')
    TclError: no suitable image found. Did find:
    /Library/Tcl/Tix8.4/libTix8.4.dylib: mach-o, but wrong architecture.

=========== From the command line ===========

    >>> import Tix
    >>> Tix.Tk()
    Traceback (most recent call last):
      File "<stdin>", line 1, in (module)
      File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-tk/Tix.py", line 210 in __init__
        self.tk.eval('package require Tix')
    _tkinter.TclError: no suitable image found. Did find:
    /Library/Tcl/Tix8.4/libTix8.4.dylib: mach-o, but wrong architecture.
-- Scott David Daniels Scott.Daniels at Acm.Org From bob at redivi.com Sun Oct 1 01:33:22 2006 From: bob at redivi.com (Bob Ippolito) Date: Sat, 30 Sep 2006 16:33:22 -0700 Subject: [Python-Dev] Tix not included in 2.5 for Windows In-Reply-To: References: <6a36e7290609301552s45435ce7l7a841d9673f59101@mail.gmail.com> Message-ID: <6a36e7290609301633h319cc7b7l8cbf796838a9af63@mail.gmail.com> On 9/30/06, Scott David Daniels wrote: > Bob Ippolito wrote: > > On 9/30/06, Scott David Daniels wrote: > >> Christos Georgiou wrote: > >>> Does anyone know why this happens? I can't find any information pointing to > >>> this being deliberate. > >> Also note: the Os/X universal seems to include a Tix runtime for the > >> non-Intel processor, but not for the Intel processor. This > >> makes me think there is a build problem. > > > > Are you sure about that? What file are you referring to specifically? > > OK, from the 2.5 universal: (hand-typed, I e-mail from another machine) > > > =========== Using Idle =========== > >>> import Tix > >>> Tix.Tk() > > Traceback (most recent call last): > File "(pyshell#8)", line 1, in (module) > Tix.Tk() > File "/Library/Frameworks/Python.framework/Versions/2.5/ > lib/python2.5/lib-tk/Tix.py", line 210 in __init__ > self.tk.eval('package require Tix') > TclError: no suitable image found. Did find: > /Library/Tcl/Tix8.4/libTix8.4.dylib: mach-o, but wrong architecture. > > =========== From the command line =========== > > >>> import Tix > >>> Tix.Tk() > > Traceback (most recent call last): > File "", line 1, in (module) > File "/Library/Frameworks/Python.framework/Versions/2.5/ > lib/python2.5/lib-tk/Tix.py", line 210 in __init__ > self.tk.eval('package require Tix') > _tkinter.TclError: no suitable image found. Did find: > /Library/Tcl/Tix8.4/libTix8.4.dylib: mach-o, but wrong architecture. Those files are not distributed with Python. -bob From kbk at shore.net Sun Oct 1 04:11:45 2006 From: kbk at shore.net (Kurt B. 
Kaiser) Date: Sat, 30 Sep 2006 22:11:45 -0400 (EDT) Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200610010211.k912BjNN001090@bayview.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 422 open ( +2) / 3415 closed ( +5) / 3837 total ( +7) Bugs : 933 open (+18) / 6212 closed (+26) / 7145 total (+44) RFE : 237 open ( +2) / 239 closed ( +1) / 476 total ( +3) New / Reopened Patches ______________________ platform.py support for IronPython (2006-09-23) http://python.org/sf/1563842 opened by Anthony Baxter pybench support for IronPython (2006-09-23) http://python.org/sf/1563844 opened by Anthony Baxter Py_signal_pipe (2006-09-24) http://python.org/sf/1564547 opened by Gustavo J. A. M. Carneiro tarfile depends on undocumented behaviour (2006-09-25) http://python.org/sf/1564981 opened by Seo Sanghyeon use LSB version information to detect a platform (2006-09-25) http://python.org/sf/1565037 opened by Matthias Klose doc changes for SMTP_SSL (2006-09-28) http://python.org/sf/1567274 opened by Monty Taylor super() and instancemethod() shouldn't accept keyword args (2006-09-29) CLOSED http://python.org/sf/1567691 opened by ?iga Seilnacht Patches Closed ______________ Python 2.5 fails with -Wl,--as-needed in LDFLAGS (2006-09-21) http://python.org/sf/1562825 closed by masterdriverz super() and instancemethod() shouldn't accept keyword args (2006-09-29) http://python.org/sf/1567691 closed by gbrandl Enable SSL for smtplib (2005-09-05) http://python.org/sf/1282340 closed by gbrandl pyclbr reports different module for Class and Function (2006-09-18) http://python.org/sf/1560617 closed by gbrandl datetime's strftime limits strings to 127 chars (2006-09-12) http://python.org/sf/1557390 closed by gbrandl New / Reopened Bugs ___________________ Quitter object masked (2006-05-01) http://python.org/sf/1479785 reopened by kbk ,msi fails for AMD Turion 64 mobile (2006-09-21) CLOSED http://python.org/sf/1563185 opened by Andy Harrington temporary file(s) 
(2006-09-22) CLOSED http://python.org/sf/1563236 opened by Grzegorz Makarewicz http//... test file (2006-09-22) CLOSED http://python.org/sf/1563238 opened by Grzegorz Makarewicz python_d python (2006-09-22) http://python.org/sf/1563243 opened by Grzegorz Makarewicz IDLE doesn't load - apparently without firewall problems (2006-09-22) http://python.org/sf/1563630 opened by dani struct.unpack doens't support buffer protocol objects (2006-09-23) http://python.org/sf/1563759 reopened by loewis struct.unpack doens't support buffer protocol objects (2006-09-23) http://python.org/sf/1563759 opened by Adal Chiriliuc Build of Python 2.5 on AIX 5.3 with GCC Fails (2006-09-22) http://python.org/sf/1563807 opened by Daniel Clark Typo in whatsnew/pep-342.html (2006-09-23) CLOSED http://python.org/sf/1563963 opened by Xavier Bassery IDLE invokes completion even when running code (2006-09-23) http://python.org/sf/1563981 opened by Martin v. L?wis 2.6 changes stomp on 2.5 docs (2006-09-23) http://python.org/sf/1564039 opened by ggpauly Fails to install on Fedora Core 5 (2006-09-20) CLOSED http://python.org/sf/1562171 reopened by mnsummerfield BaseCookie does not support "$Port" (2006-09-24) http://python.org/sf/1564508 opened by Anders Aagaard Unicode comparison change in 2.4 vs. 
2.5 (2006-09-24) CLOSED http://python.org/sf/1564763 opened by Joe Wreschnig update Lib/plat-linux2/IN.py (2006-09-25) http://python.org/sf/1565071 opened by Matthias Klose Misbehaviour in zipfile (2006-09-25) CLOSED http://python.org/sf/1565087 opened by Richard Philips make plistlib.py available in every install (2006-09-25) http://python.org/sf/1565129 opened by Matthias Klose os.stat() subsecond file mode time is incorrect on Windows (2006-09-25) http://python.org/sf/1565150 opened by Mike Glassford Repair or Change installation error (2006-09-26) http://python.org/sf/1565509 opened by Greg Hazel does not raise SystemError on too many nested blocks (2006-09-26) http://python.org/sf/1565514 opened by Greg Hazel gc allowing tracebacks to eat up memory (2006-09-26) http://python.org/sf/1565525 opened by Greg Hazel webbrowser on gnome runs wrong browser (2006-09-26) CLOSED http://python.org/sf/1565661 opened by kangabroo 'all' documentation missing online (2006-09-26) http://python.org/sf/1565797 opened by Alan sets missing from standard types list in ref (2006-09-26) http://python.org/sf/1565919 opened by Georg Brandl pyexpat produces fals parsing results in CharacterDataHandle (2006-09-26) CLOSED http://python.org/sf/1565967 opened by Michael Gebetsroither RE (regular expression) matching stuck in loop (2006-09-27) http://python.org/sf/1566086 opened by Fabien Devaux T_ULONG -> double rounding in PyMember_GetOne() (2006-09-27) http://python.org/sf/1566140 opened by Piet Delport Logging problem on Windows XP (2006-09-27) http://python.org/sf/1566280 opened by Pavel Krupets Bad behaviour in .obuf* (2006-09-27) http://python.org/sf/1566331 opened by Sam Dennis test_posixpath failure (2006-09-27) CLOSED http://python.org/sf/1566602 opened by WallsRSolid Idle 2.1 - Calltips Hotkey dies not work (2006-09-27) http://python.org/sf/1566611 opened by fladd Library Reference Section 5.1.8.1 is wrong. 
(2006-09-27) CLOSED http://python.org/sf/1566663 opened by Chris Connett site-packages isn't created before install_egg_info (2006-09-27) http://python.org/sf/1566719 opened by James Oakley urllib doesn't raise IOError correctly with new IOError (2006-09-28) CLOSED http://python.org/sf/1566800 opened by Arthibus Gissehel unchecked metaclass mro (2006-09-28) http://python.org/sf/1567234 opened by ganges master logging.RotatingFileHandler has no "infinite" backupCount (2006-09-28) http://python.org/sf/1567331 opened by Skip Montanaro False sentence about formatted print in tutorial section 7.1 (2006-09-28) CLOSED http://python.org/sf/1567375 opened by David Benbennick tabs missing in idle options configure (2006-09-28) http://python.org/sf/1567450 opened by jrgutierrez GetFileAttributesExA and Win95 (2006-09-29) http://python.org/sf/1567666 opened by giomach missing _typesmodule.c,Visual Studio 2005 pythoncore.vcproj (2006-09-29) http://python.org/sf/1567910 opened by everbruin http://docs.python.org/tut/node10.html typo (2006-09-29) CLOSED http://python.org/sf/1567976 opened by Simon Morgan GUI scripts always return to an interpreter (2006-09-29) http://python.org/sf/1568075 reopened by jejackson GUI scripts always return to an interpreter (2006-09-29) http://python.org/sf/1568075 opened by jjackson Encoding bug (2006-09-30) CLOSED http://python.org/sf/1568120 opened by ?er FADIL USTA Tix is not included in 2.5 for Windows (2006-09-30) http://python.org/sf/1568240 opened by Christos Georgiou init_types (2006-09-30) http://python.org/sf/1568243 opened by Bosko Vukov broken info files generation (2006-09-30) http://python.org/sf/1568429 opened by Arkadiusz Miskiewicz Bugs Closed ___________ ,msi fails for AMD Turion 64 mobile (2006-09-22) http://python.org/sf/1563185 closed by loewis temporary file(s) (2006-09-22) http://python.org/sf/1563236 closed by gbrandl http//... 
test file (2006-09-22) http://python.org/sf/1563238 closed by gbrandl Parser crash (2006-09-12) http://python.org/sf/1557232 closed by gbrandl struct.unpack doens't support buffer protocol objects (2006-09-23) http://python.org/sf/1563759 closed by adalx Typo in whatsnew/pep-342.html (2006-09-23) http://python.org/sf/1563963 closed by nnorwitz Fails to install on Fedora Core 5 (2006-09-20) http://python.org/sf/1562171 closed by mnsummerfield Unicode comparison change in 2.4 vs. 2.5 (2006-09-25) http://python.org/sf/1564763 closed by lemburg python 2.5 fails to build with --as-needed (2006-09-18) http://python.org/sf/1560984 closed by gbrandl Misbehaviour in zipfile (2006-09-25) http://python.org/sf/1565087 closed by gbrandl webbrowser on gnome runs wrong browser (2006-09-26) http://python.org/sf/1565661 closed by gbrandl pyexpat produces fals parsing results in CharacterDataHandle (2006-09-26) http://python.org/sf/1565967 closed by loewis test_posixpath failure (2006-09-27) http://python.org/sf/1566602 closed by gbrandl Library Reference Section 5.1.8.1 is wrong. 
(2006-09-27) http://python.org/sf/1566663 closed by gbrandl urllib doesn't raise IOError correctly with new IOError (2006-09-28) http://python.org/sf/1566800 closed by gbrandl False sentence about formatted print in tutorial section 7.1 (2006-09-28) http://python.org/sf/1567375 closed by gbrandl http://docs.python.org/tut/node10.html typo (2006-09-30) http://python.org/sf/1567976 closed by quiver Encoding bug (2006-09-30) http://python.org/sf/1568120 closed by gbrandl locale.format gives wrong exception on some erroneous input (2006-01-23) http://python.org/sf/1412580 closed by gbrandl cgi.FormContentDict constructor should support parse options (2006-03-24) http://python.org/sf/1457823 closed by gbrandl inspect.getargspec() is wrong for def foo((x)): (2006-03-27) http://python.org/sf/1459159 closed by gbrandl Calls from VBScript clobber passed args (2005-03-03) http://python.org/sf/1156179 closed by gbrandl datetime's strftime limits strings to 127 chars (2006-09-12) http://python.org/sf/1556784 closed by gbrandl unicode('foo', '.utf99') does not raise LookupError (2006-03-09) http://python.org/sf/1446043 closed by gbrandl struct.unpack problem with @, =, < specifiers (2006-05-08) http://python.org/sf/1483963 closed by gbrandl Incomplete info in 7.18.1 ZipFile Objects (2006-08-24) http://python.org/sf/1545836 closed by gbrandl PyString_FromString() clarification (2006-08-24) http://python.org/sf/1546052 closed by gbrandl New / Reopened RFE __________________ Better order in file type descriptions (2006-09-27) http://python.org/sf/1566260 opened by Daniele Varrazzo poplib.py list interface (2006-09-29) http://python.org/sf/1567948 opened by Hasan Diwan From ncoghlan at gmail.com Sun Oct 1 05:56:53 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 01 Oct 2006 13:56:53 +1000 Subject: [Python-Dev] PEP 355 status In-Reply-To: <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> References: 
<20060930045258.1717.223590987.divmod.quotient.63544@ohm><2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> Message-ID: <451F3C85.9050100@gmail.com> Giovanni Bajo wrote: > Guido van Rossum wrote: > >> OK. Pronouncement: PEP 355 is dead. The authors (or the PEP editor) >> can update the PEP. >> >> I'm looking forward to a new PEP. > > It would be terrific if you gave us some clue about what is wrong in PEP355, so > that the next guy does not waste his time. For instance, I find PEP355 > incredibly good for my own path manipulation (much cleaner and concise than the > awful os.path+os+shutil+stat mix), and I have trouble understanding what is > *so* wrong with it. > > You said "it's an amalgam of unrelated functionality", but you didn't say what > exactly is "unrelated" for you. Things the PEP 355 path object lumps together:

 - string manipulation operations
 - abstract path manipulation operations (work for non-existent filesystems)
 - read-only traversal of a concrete filesystem (dir, stat, glob, etc)
 - addition & removal of files/directories/links within a concrete filesystem

Dumping all of these into a single class is certainly practical from a utility point of view, but it's about as far away from beautiful as you can get, which creates problems from a learnability point of view, and from a capability-based security point of view. PEP 355 itself splits the methods up into 11 distinct categories when listing the interface. At the very least, I would want to split the interface into separate abstract and concrete interfaces. The abstract object wouldn't care whether or not the path actually existed on the current filesystem (and hence could be relied on to never raise IOError), whereas the concrete object would include the many operations that might need to touch the real IO device. 
(the PEP has already made a step in the right direction here by removing the methods that accessed a file's contents, leaving that job to the file object where it belongs). There's a case to be made for the abstract object inheriting from str or unicode for compatibility with existing code, but an alternative would be to enhance the standard library to better support the use of non-basestring objects to describe filesystem paths. A PEP should at least look into what would have to change at the Python API level and the C API level to go that route rather than the inheritance route. For the concrete interface, the behaviour is very dependent on whether the path refers to a file, directory or symlink on the current filesystem. For an OO filesystem interface, does it really make sense to leave them all lumped into the one class with a bunch of isdir() and islink() style methods? Or does it make more sense to have a method on the abstract object that will return the appropriate kind of filesystem info object? If the latter, then how would you deal with the issue of state coherency (i.e. it was a file when you last touched it on the filesystem, but someone else has since changed it to a link)? (that last question actually lends strong support to the idea of a *single* concrete interface that dynamically responds to changes in the underlying filesystem). Another key difference between the two is that the abstract objects would be hashable and serialisable, as their state is immutable and independent of the filesystem. For the concrete objects, the only immutable part of their state is the path name - the rest would reflect the state of the filesystem at the current point in time. Cheers, Nick. 
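[Editor's note: the abstract/concrete split described above can be sketched in a few lines. This is an illustrative sketch only — the class and method names are invented for the example and are not the PEP 355 API.]

```python
import os.path


class AbstractPath(object):
    """Pure path arithmetic: never touches the filesystem, so none of
    these methods can raise IOError."""

    def __init__(self, *parts):
        self._path = os.path.join(*parts) if parts else '.'

    def join(self, *parts):
        # Path manipulation always returns a new immutable object.
        return AbstractPath(self._path, *parts)

    def parent(self):
        return AbstractPath(os.path.dirname(self._path) or '.')

    def name(self):
        return os.path.basename(self._path)

    # Immutable and filesystem-independent, hence hashable/serialisable.
    def __eq__(self, other):
        return isinstance(other, AbstractPath) and self._path == other._path

    def __hash__(self):
        return hash(self._path)

    def __str__(self):
        return self._path


class ConcretePath(AbstractPath):
    """Adds the operations that may touch the real IO device."""

    def exists(self):
        return os.path.exists(self._path)

    def is_dir(self):
        # Queried dynamically on each call, so it tracks filesystem
        # changes rather than caching a possibly stale answer.
        return os.path.isdir(self._path)
```

Only ConcretePath can observe filesystem state, and it re-queries that state on every call — the "*single* concrete interface that dynamically responds to changes" option discussed above.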
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From ncoghlan at gmail.com Sun Oct 1 06:18:11 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 01 Oct 2006 14:18:11 +1000 Subject: [Python-Dev] PEP 351 - do while In-Reply-To: <001301c6e3b4$8de5bda0$1d2c440a@spain.capgemini.com> References: <001301c6e3b4$8de5bda0$1d2c440a@spain.capgemini.com> Message-ID: <451F4183.5050907@gmail.com> Hans Polak wrote:

> Hi,
>
> Just an opinion, but many uses of the 'while true loop' are instances of
> a 'do loop'. I appreciate the language layout question, so I'll give you
> an alternative:
>
> do:
>     <setup code>
>     <loop body>
> while <condition>

I believe you meant to write PEP 315 in the subject line :)

To fully account for loop else clauses, this suggestion would probably need to be modified to look something like this:

Basic while loop:

    while <condition>:
        <loop body>
    else:
        <loop completion code>

Using break to avoid code duplication:

    while True:
        <setup code>
        if not <condition>:
            break
        <loop body>

Current version of PEP 315:

    do:
        <setup code>
    while <condition>:
        <loop body>
    else:
        <loop completion code>

This suggestion:

    do:
        <setup code>
        while <condition>
        <loop body>
    else:
        <loop completion code>

I personally like that style, and if the compiler can dig through a function looking for yield statements to identify generators, it should be able to dig through a do-loop looking for the termination condition. As I recall, the main objection to this style was that it could hide the loop termination condition, but that isn't actually mentioned in the PEP (and in the typical do-while case, the loop condition will still be clearly visible at the end of the loop body). Regards, Nick. 
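[Editor's note: the break-based idiom Nick describes is how the loop-and-a-half is spelled in current Python. A runnable illustration — the chunk-splitting task is our own example, not from the thread:]

```python
def read_chunks(data, size):
    """Split `data` into successive slices of length `size` using the
    loop-and-a-half idiom: the setup step (computing the next chunk)
    must run before the termination test can be made."""
    chunks = []
    pos = 0
    while True:
        chunk = data[pos:pos + size]  # <setup code>
        if not chunk:                 # termination test in mid-loop
            break
        chunks.append(chunk)          # <loop body>
        pos += size
    return chunks


# read_chunks('abcdefg', 3) -> ['abc', 'def', 'g']
```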
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From glyph at divmod.com Sun Oct 1 08:09:02 2006 From: glyph at divmod.com (glyph at divmod.com) Date: Sun, 1 Oct 2006 02:09:02 -0400 Subject: [Python-Dev] PEP 355 status In-Reply-To: <451F3C85.9050100@gmail.com> Message-ID: <20061001060902.1717.565172190.divmod.quotient.64146@ohm> On Sun, 01 Oct 2006 13:56:53 +1000, Nick Coghlan wrote: >Things the PEP 355 path object lumps together: > - string manipulation operations > - abstract path manipulation operations (work for non-existent filesystems) > - read-only traversal of a concrete filesystem (dir, stat, glob, etc) > - addition & removal of files/directories/links within a concrete filesystem > >Dumping all of these into a single class is certainly practical from a utility >point of view, but it's about as far away from beautiful as you can get, which >creates problems from a learnability point of view, and from a >capability-based security point of view. PEP 355 itself splits the methods up >into 11 distinct categories when listing the interface. > >At the very least, I would want to split the interface into separate abstract >and concrete interfaces. The abstract object wouldn't care whether or not the >path actually existed on the current filesystem (and hence could be relied on >to never raise IOError), whereas the concrete object would include the many >operations that might need to touch the real IO device. (the PEP has already >made a step in the right direction here by removing the methods that accessed >a file's contents, leaving that job to the file object where it belongs). > >There's a case to be made for the abstract object inheriting from str or >unicode for compatiblity with existing code, I think that compatibility can be achieved by having a "pathname" string attribute or similar to convert to a string when appropriate. 
It's not like datetime inherits from str to facilitate formatting or anything like that. >but an alternative would be to >enhance the standard library to better support the use of non-basestring >objects to describe filesystem paths. A PEP should at least look into what >would have to change at the Python API level and the C API level to go that >route rather than the inheritance route. In C, this is going to be really difficult. Existing C APIs want to use C functions to deal with pathnames, and many libraries are not going to support arbitrary VFS I/O operations. For some libraries, like GNOME or KDE, you'd have to use the appropriate VFS object for their platform. >For the concrete interface, the behaviour is very dependent on whether the >path refers to a file, directory or symlink on the current filesystem. For an >OO filesystem interface, does it really make sense to leave them all lumped >into the one class with a bunch of isdir() and islink() style methods? Or does >it make more sense to have a method on the abstract object that will return >the appropriate kind of filesystem info object? I don't think returning different types of objects makes sense. This sort of typing is inherently prone to race conditions. If you get a "DirectoryPath" object in Python, and then the underlying filesystem changes so that the name that used to be a directory is now a file (or a device, or UNIX socket, or whatever), how do you change the underlying type? >If the latter, then how would >you deal with the issue of state coherency (i.e. it was a file when you last >touched it on the filesystem, but someone else has since changed it to a >link)? (that last question actually lends strong support to the idea of a >*single* concrete interface that dynamically responds to changes in the >underlying filesystem). 
In non-filesystem cases, for example the "zip path" case, there are inherent failure modes that you can't really do anything about (what if the zip file is removed while you're in the middle of manipulating it?) but there are actual applications which depend on the precise atomic semantics and error conditions associated with moving, renaming, and deleting directories and files, at least on POSIX systems. The way Twisted does this is that FilePath objects explicitly cache the results of "stat" and then have an explicit "restat" method for resynchronizing with the current state of the filesystem. None of their methods for *manipulating* the filesystem look at this state, since it is almost guaranteed to be out of date :). >Another key difference between the two is that the abstract objects would be >hashable and serialisable, as their state is immutable and independent of the >filesystem. For the concrete objects, the only immutable part of their state >is the path name - the rest would reflect the state of the filesystem at the >current point in time. It doesn't really make sense to separate these to me; whenever you're serializing or hashing that information, the "mutable" parts should just be discarded. From ronaldoussoren at mac.com Sun Oct 1 10:13:19 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Sun, 1 Oct 2006 10:13:19 +0200 Subject: [Python-Dev] Tix not included in 2.5 for Windows In-Reply-To: References: Message-ID: <097B109A-8969-4DBC-800B-513E18C82A9B@mac.com> On Sep 30, 2006, at 11:13 PM, Scott David Daniels wrote: > Christos Georgiou wrote: >> Does anyone know why this happens? I can't find any information >> pointing to >> this being deliberate. >> >> I just upgraded to 2.5 on Windows (after making sure I can build >> extensions >> with the freeware VC++ Toolkit 2003) and some of my programs stopped
I saw in a French forum that someone else had the same >> problem, >> and what they did was to copy the relevant files from a 2.4.3 >> installation. >> I did the same, and it seems it works, with only a console message >> appearing >> as soon as a root window is created: > > Also note: the Os/X universal seems to include a Tix runtime for the > non-Intel processor, but not for the Intel processor. > This > makes me think there is a build problem. The OSX universal binaries don't include Tcl/Tk at all but link to the system version of the Tcl/Tk frameworks. Ronald -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3562 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061001/b9bbae3c/attachment.bin From ronaldoussoren at mac.com Sun Oct 1 10:54:48 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Sun, 1 Oct 2006 10:54:48 +0200 Subject: [Python-Dev] HAVE_UINTPTR_T test in configure.in Message-ID: Hi, Someone reported on the pythonmac list that HAVE_UINTPTR_T wasn't defined in pyconfig.h while it should have been defined. I'm looking into this and am now wondering whether the configure snipped below is correct: AC_MSG_CHECKING(for uintptr_t support) have_uintptr_t=no AC_TRY_COMPILE([], [uintptr_t x; x = (uintptr_t)0;], [ AC_DEFINE(HAVE_UINTPTR_T, 1, [Define this if you have the type uintptr_t.]) have_uintptr_t=yes ]) AC_MSG_RESULT($have_uintptr_t) if test "$have_uintptr_t" = yes ; then AC_CHECK_SIZEOF(uintptr_t, 4) fi This seems to check for uintptr_t as a builtin type. Isn't one supposed to include to get this type? Chaning the AC_TRY_COMPILE line to the line below fixes the issue for me, but I've only tested on OSX and don't know if this is the right fix for all supported platforms. AC_TRY_COMPILE([#include ], [uintptr_t x; x = (uintptr_t) 0;], [ Ronald -------------- next part -------------- A non-text attachment was scrubbed... 
Name: smime.p7s Type: application/pkcs7-signature Size: 3562 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061001/e8397f2d/attachment-0001.bin From nick at craig-wood.com Sun Oct 1 11:38:46 2006 From: nick at craig-wood.com (Nick Craig-Wood) Date: Sun, 1 Oct 2006 10:38:46 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: Message-ID: <20061001093846.GA20938@craig-wood.com> On Fri, Sep 29, 2006 at 12:03:03PM -0700, Guido van Rossum wrote: > I see some confusion in this thread. > > If a *LITERAL* 0.0 (or any other float literal) is used, you only get > one object, no matter how many times it is used. For some reason that doesn't happen in the interpreter which has been confusing the issue slightly...

    $ python2.5
    Python 2.5c1 (r25c1:51305, Aug 19 2006, 18:23:29)
    [GCC 4.1.2 20060814 (prerelease) (Debian 4.1.1-11)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> a=0.0
    >>> b=0.0
    >>> id(a), id(b)
    (134737756, 134737772)
    >>>

    $ python2.5 -c 'a=0.0; b=0.0; print id(a),id(b)'
    134737796 134737796

> But if the result of a *COMPUTATION* returns 0.0, you get a new object > for each such result. If you have 70 MB worth of zeros, that's clearly > computation results, not literals. In my application I'm receiving all the zeros from a server over TCP as ASCII and these are being float()ed in python. 
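[Editor's note: since CPython never caches computed floats, the translation-dict idea from earlier in the thread can be packaged as a tiny interning helper for exactly this parse-from-the-wire case. The helper name and shape are our own sketch, not anything in the standard library:]

```python
_float_cache = {}

def intern_float(text):
    """Parse `text` as a float, returning a shared object for strings
    seen before, so millions of parsed '0.0' values share one object.

    Keying on the original string (rather than on the float value)
    keeps '0.0' and '-0.0' distinct and sidesteps NaN's self-inequality.
    """
    try:
        return _float_cache[text]
    except KeyError:
        return _float_cache.setdefault(text, float(text))
```

Each plain float('0.0') call allocates a fresh object; the cache trades a dict lookup for that allocation, which pays off when a few values dominate the input.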
-- Nick Craig-Wood -- http://www.craig-wood.com/nick From nick at craig-wood.com Sun Oct 1 11:43:38 2006 From: nick at craig-wood.com (Nick Craig-Wood) Date: Sun, 1 Oct 2006 10:43:38 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <6a36e7290609301521g44b56e59iecc3b0c448cd91c3@mail.gmail.com> References: <20060929081402.GB19781@craig-wood.com> <451DC113.4040002@canterbury.ac.nz> <6a36e7290609291815i115b41b3o4ab6d196f404557f@mail.gmail.com> <451E2F32.9070405@v.loewis.de> <451E31ED.7030905@gmail.com> <6a36e7290609301521g44b56e59iecc3b0c448cd91c3@mail.gmail.com> Message-ID: <20061001094338.GB20938@craig-wood.com> On Sat, Sep 30, 2006 at 03:21:50PM -0700, Bob Ippolito wrote: > On 9/30/06, Terry Reedy wrote: > > "Nick Coghlan" wrote in message news:451E31ED.7030905 at gmail.com... > > > I suspect the problem would typically stem from floating point > > > values that are read in from a human-readable file rather than > > > being the result of a 'calculation' as such: Over a TCP socket in ASCII format for my application > > For such situations, one could create a translation dict for both common > > float values and for non-numeric missing value indicators. For instance, > > flotran = {'*': None, '1.0':1.0, '2.0':2.0, '4.0':4.0} > > The details, of course, depend on the specific case. > > But of course you have to know that common float values are never > cached and that it may cause you problems. Some users may expect them > to be because common strings and integers are cached. I have to say I was surprised to find out how many copies of 0.0 there were in my code and I guess I was subconsciously expecting the immutable 0.0s to be cached even though I know consciously I've never seen anything but int and str mentioned in the docs. 
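That expectation is easy to check; the following shows what CPython actually does here (an implementation detail, not a language guarantee):

```python
# Small ints are cached by CPython, so equal values share one object:
a, b = int("7"), int("7")
print(a is b)          # True on CPython (small-int cache)

# Floats are not cached: every float() call builds a new object,
# even for a value as common as 0.0:
x, y = float("0.0"), float("0.0")
print(x is y)          # False: two distinct live objects
```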
-- Nick Craig-Wood -- http://www.craig-wood.com/nick

From rrr at ronadam.com Sun Oct 1 12:20:07 2006 From: rrr at ronadam.com (Ron Adam) Date: Sun, 01 Oct 2006 05:20:07 -0500 Subject: [Python-Dev] PEP 351 - do while In-Reply-To: <451F4183.5050907@gmail.com> References: <001301c6e3b4$8de5bda0$1d2c440a@spain.capgemini.com> <451F4183.5050907@gmail.com> Message-ID: <451F9657.3010808@ronadam.com>

Nick Coghlan wrote:
> Hans Polak wrote:
>> Hi,
>>
>> Just an opinion, but many uses of the 'while true loop' are instances of
>> a 'do loop'. I appreciate the language layout question, so I'll give you
>> an alternative:
>>
>> do:
>>     <loop body>
>> while <condition>

(I don't think this has been suggested yet.)

    while <entry_condition>, <exit_condition>:

This would be a do-loop.

    while 1, <exit_condition>:

In situations where you want to enter a loop on one condition and exit on a second condition:

    if value1:
        value2 = True
        while value2:
            ...

Would be ...

    while value1, value2:

I've used that pattern on more than a few occasions.

A single condition while would be the same as...

    while <condition>, <condition>:    # same entry and exit condition

So do just as we do now...

    while <condition>:    # same entry and exit condition

> As I recall, the main objection to this style was that it could hide the loop
> termination condition, but that isn't actually mentioned in the PEP (and in
> the typical do-while case, the loop condition will still be clearly visible at
> the end of the loop body).

Putting both the entry and exit conditions at the top is easier to read. The end of the first loop is also the beginning of all the following loops, so having the exit_condition at the top doesn't really put anything out of order. If the exit_condition is not evaluated until the top of the second loop, the names it uses do not need to be predefined, they can just be assigned in the loop.
Ron From murman at gmail.com Sun Oct 1 16:14:14 2006 From: murman at gmail.com (Michael Urman) Date: Sun, 1 Oct 2006 09:14:14 -0500 Subject: [Python-Dev] PEP 351 - do while In-Reply-To: <451F9657.3010808@ronadam.com> References: <001301c6e3b4$8de5bda0$1d2c440a@spain.capgemini.com> <451F4183.5050907@gmail.com> <451F9657.3010808@ronadam.com> Message-ID: On 10/1/06, Ron Adam wrote: > (I don't think this has been suggested yet.) > > while , : > [snip] > Putting both the entry and exit conditions at the top is easier to read. I agree in principle, but I thought the proposed syntax already has meaning today (as it turns out, parentheses are required to make a tuple in a while condition, at least in 2.4 and 2.5). To help stave off similar confusion I'd rather see a pseudo-keyword added. However my first candidate "until" seems to apply a negation to the exit condition. while True until False: # run once? run forever? while True until True: # run forever? run once? It's still very different from any syntactical syntax I can think of in python. I'm not sure I like the idea. Michael -- Michael Urman http://www.tortall.net/mu/blog From ark at acm.org Sun Oct 1 18:58:41 2006 From: ark at acm.org (Andrew Koenig) Date: Sun, 1 Oct 2006 12:58:41 -0400 Subject: [Python-Dev] PEP 351 - do while In-Reply-To: <451F9657.3010808@ronadam.com> Message-ID: <002501c6e57a$dbf69920$6402a8c0@arkdesktop> > (I don't think this has been suggested yet.) > > while , : > This usage makes me uneasy, not the least because I don't understand why the comma isn't creating a tuple. That is, why whould while x, y: be any different from while (x, y): ? My other concern is that is evaluated out of sequence. 
From tjreedy at udel.edu Sun Oct 1 19:54:31 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 1 Oct 2006 13:54:31 -0400 Subject: [Python-Dev] Caching float(0.0) References: <20061001093846.GA20938@craig-wood.com> Message-ID:

"Nick Craig-Wood" wrote in message news:20061001093846.GA20938 at craig-wood.com...
> On Fri, Sep 29, 2006 at 12:03:03PM -0700, Guido van Rossum wrote:
>> I see some confusion in this thread.
>>
>> If a *LITERAL* 0.0 (or any other float literal) is used, you only get
>> one object, no matter how many times it is used.
>
> For some reason that doesn't happen in the interpreter which has been
> confusing the issue slightly...
>
> $ python2.5
>>>> a=0.0
>>>> b=0.0
>>>> id(a), id(b)
> (134737756, 134737772)

Guido said *a* literal (emphasis shifted), reused as in a loop or function recalled, while you used *a* literal, then *another* literal, without reuse. Try a=b=0.0 instead.

Terry Jan Reedy

From pje at telecommunity.com Sun Oct 1 19:55:06 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun, 01 Oct 2006 13:55:06 -0400 Subject: [Python-Dev] PEP 351 - do while In-Reply-To: <002501c6e57a$dbf69920$6402a8c0@arkdesktop> References: <451F9657.3010808@ronadam.com> Message-ID: <5.1.1.6.0.20061001135107.02f49e68@sparrow.telecommunity.com>

At 12:58 PM 10/1/2006 -0400, Andrew Koenig wrote:
> > (I don't think this has been suggested yet.)
> >
> >     while <entry_condition>, <exit_condition>:
> >
>This usage makes me uneasy, not the least because I don't understand why the
>comma isn't creating a tuple. That is, why would
>
>     while x, y:
>
>be any different from
>
>     while (x, y):
>
>?
>
>My other concern is that <exit_condition> is evaluated out of sequence.

This pattern:

    while entry_cond:
        ...
    and while not exit_cond:
        ...

has been suggested before, and I believe that at least one of the times it was suggested, it had some support from Guido. Essentially, the "and while not exit" is equivalent to an "if exit: break" that's more visible due to not being indented.
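For illustration, the shape of that pattern spelled with today's syntax; the function and its names are made up for the example:

```python
def drain(queue, limit):
    """Take items while the queue is non-empty, stopping early at `limit`."""
    taken = []
    while queue:                  # entry condition, checked on each pass
        taken.append(queue.pop(0))
        if len(taken) >= limit:   # the "and while not exit_cond" part,
            break                 # written as an explicit break today
    return taken

print(drain([1, 2, 3, 4], 2))     # [1, 2]
print(drain([9], 5))              # [9]
```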
I'm not sure I like it, myself, but out of all the things that get suggested for this issue, I think it's the best. The fact that it's still not very good despite being the best, is probably the reason we don't have it yet. :) From exarkun at divmod.com Sun Oct 1 20:01:51 2006 From: exarkun at divmod.com (Jean-Paul Calderone) Date: Sun, 1 Oct 2006 14:01:51 -0400 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: Message-ID: <20061001180151.1717.1491936593.divmod.quotient.64438@ohm> On Sun, 1 Oct 2006 13:54:31 -0400, Terry Reedy wrote: > >"Nick Craig-Wood" wrote in message >news:20061001093846.GA20938 at craig-wood.com... >> On Fri, Sep 29, 2006 at 12:03:03PM -0700, Guido van Rossum wrote: >>> I see some confusion in this thread. >>> >>> If a *LITERAL* 0.0 (or any other float literal) is used, you only get >>> one object, no matter how many times it is used. >> >> For some reason that doesn't happen in the interpreter which has been >> confusing the issue slightly... >> >> $ python2.5 >>>>> a=0.0 >>>>> b=0.0 >>>>> id(a), id(b) >> (134737756, 134737772) > >Guido said *a* literal (emphasis shifted), reused as in a loop or function >recalled, while you used *a* literal, then *another* literal, without >reuse. Try a=b=0.0 instead. Actually this just has to do with, um, "compilation units", for lack of a better term: exarkun at kunai:~$ python Python 2.4.3 (#2, Apr 27 2006, 14:43:58) [GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> a = 0.0 >>> b = 0.0 >>> print a is b False >>> ^D exarkun at kunai:~$ cat > test.py a = 0.0 b = 0.0 print a is b ^D exarkun at kunai:~$ python test.py True exarkun at kunai:~$ cat > test_a.py a = 0.0 ^D exarkun at kunai:~$ cat > test_b.py b = 0.0 ^D exarkun at kunai:~$ cat > test.py from test_a import a from test_b import b print a is b ^D exarkun at kunai:~$ python test.py False exarkun at kunai:~$ python Python 2.4.3 (#2, Apr 27 2006, 14:43:58) [GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> a = 0.0; b = 0.0 >>> print a is b True >>> exarkun at kunai:~$ Each line in an interactive session is compiled separately, like modules are compiled separately. With the current implementation, literals in a single compilation unit have a chance to be "cached" like this. Literals in different compilation units, even for the same value, don't. Jean-Paul From ronaldoussoren at mac.com Sun Oct 1 20:11:12 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Sun, 1 Oct 2006 20:11:12 +0200 Subject: [Python-Dev] HAVE_UINTPTR_T test in configure.in In-Reply-To: References: Message-ID: <27707AD9-69FE-41A1-A158-B08B0B791FFE@mac.com> On Oct 1, 2006, at 10:54 AM, Ronald Oussoren wrote: > Hi, > > Someone reported on the pythonmac list that HAVE_UINTPTR_T wasn't > defined in pyconfig.h while it should have been defined. I'm > looking into this and am now wondering whether the configure > snipped below is correct: > > AC_MSG_CHECKING(for uintptr_t support) > have_uintptr_t=no > AC_TRY_COMPILE([], [uintptr_t x; x = (uintptr_t)0;], [ > AC_DEFINE(HAVE_UINTPTR_T, 1, [Define this if you have the type > uintptr_t.]) > have_uintptr_t=yes > ]) > AC_MSG_RESULT($have_uintptr_t) > if test "$have_uintptr_t" = yes ; then > AC_CHECK_SIZEOF(uintptr_t, 4) > fi > > This seems to check for uintptr_t as a builtin type. Isn't one > supposed to include to get this type? 
> > Changing the AC_TRY_COMPILE line to the line below fixes the issue
> > for me, but I've only tested on OSX and don't know if this is the
> > right fix for all supported platforms.
> >
> > AC_TRY_COMPILE([#include <stdint.h>], [uintptr_t x; x = (uintptr_t) 0;], [

The same problem exists on Linux, and is fixed by the same change.

BTW. Python 2.4 suffers from the same problem and I've filed a bugreport for this (http://www.python.org/sf/1568842).

Ronald
-------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3562 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061001/800bb0d0/attachment.bin

From ark at acm.org Sun Oct 1 20:44:55 2006 From: ark at acm.org (Andrew Koenig) Date: Sun, 1 Oct 2006 14:44:55 -0400 Subject: [Python-Dev] PEP 351 - do while In-Reply-To: <5.1.1.6.0.20061001135107.02f49e68@sparrow.telecommunity.com> Message-ID: <002e01c6e589$b324ee20$6402a8c0@arkdesktop>

> This pattern:
>
>     while entry_cond:
>         ...
>     and while not exit_cond:
>         ...
>
> has been suggested before, and I believe that at least one of the times it
> was suggested, it had some support from Guido. Essentially, the "and
> while not exit" is equivalent to an "if exit: break" that's more visible
> due to not being indented.

I like this suggestion. In fact it is possible that at one time I suggested something similar.

It reminds me of something that Dijkstra suggested in his 1976 book "A Discipline of Programming." His idea looked somewhat like this:

    do condition 1 -> action 1
       ...
    [] condition n -> action n
    od

Here, the [] should be thought of as a delimiter; it was typeset as a tall narrow rectangle. The semantics are as follows: If all of the conditions are false, the statement does nothing. Otherwise, the implementation picks one of the true conditions, executes the corresponding action, and does it all again.
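A toy Python rendering of that semantics; the `do_od` helper and its names are invented for illustration, and the gcd example uses the subtraction form of Euclid's algorithm, which terminates when x == y (a mod-based form can reach a `mod 0` step):

```python
import random

def do_od(*guarded_actions):
    """Dijkstra's do...od: while any guard holds, run one true guard's action."""
    while True:
        enabled = [action for guard, action in guarded_actions if guard()]
        if not enabled:
            break                    # every guard false: the loop terminates
        random.choice(enabled)()     # arbitrary choice among the true guards

# Euclid's gcd in guarded-command style; state kept in a dict so the
# action lambdas can rebind it:
state = {"x": 12, "y": 18}
do_od(
    (lambda: state["x"] > state["y"],
     lambda: state.update(x=state["x"] - state["y"])),
    (lambda: state["y"] > state["x"],
     lambda: state.update(y=state["y"] - state["x"])),
)
print(state)    # {'x': 6, 'y': 6} -- both equal gcd(12, 18)
```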
There is no guarantee about which action is executed if more than one of the conditions is true. The general idea, then, is that each action should falsify its corresponding condition while bring the loop closer to termination; when all of the conditions are false, the loop is done. For example, he might write Euclid's algorithm this way: do x < y -> y := y mod x [] y < x -> x := x mod y od If we were to adopt "while ... and while" in Python, then Dijkstra's construct could be rendered this way: while x < y: y %= x or while y < x: x %= y I'm not suggesting this seriously as I don't have enough realistic use cases. Still, it's interesting to see that someone else has grappled with a similar problem. From rrr at ronadam.com Sun Oct 1 21:08:45 2006 From: rrr at ronadam.com (Ron Adam) Date: Sun, 01 Oct 2006 14:08:45 -0500 Subject: [Python-Dev] PEP 351 - do while In-Reply-To: References: <001301c6e3b4$8de5bda0$1d2c440a@spain.capgemini.com> <451F4183.5050907@gmail.com> <451F9657.3010808@ronadam.com> Message-ID: <4520123D.90303@ronadam.com> Michael Urman wrote: > On 10/1/06, Ron Adam wrote: >> (I don't think this has been suggested yet.) >> >> while , : >> > > [snip] > >> Putting both the entry and exit conditions at the top is easier to read. > > I agree in principle, but I thought the proposed syntax already has > meaning today (as it turns out, parentheses are required to make a > tuple in a while condition, at least in 2.4 and 2.5). To help stave > off similar confusion I'd rather see a pseudo-keyword added. However > my first candidate "until" seems to apply a negation to the exit > condition. > > while True until False: # run once? run forever? > while True until True: # run forever? run once? > > It's still very different from any syntactical syntax I can think of > in python. I'm not sure I like the idea. > > Michael I thought the comma might be a sticking point. 
My first thought was to have a series of conditions evaluated on loops with the last condition repeated. while loop1_cond, loop2_cond, loop3_cond, ..., rest_condition: But I couldn't think of good uses past the first two that are obvious so I trimmed it down to just enter_condition and exit_condition which keeps it simple. But from this example you can see they are all really just top of the loop tests done in sequence. A do loop is just a matter of having the first one evaluate as True. The current while condition is an entry condition the first time it's evaluated and an exit condition on the rest. So by splitting it in two, we can specify an enter and exit test more explicitly. There's a certain consistency I like about this also. Is it just getting around or finding a nice alternative to the comma that is the biggest problem with this? Maybe just using "then" would work? while cond1 then cond2: Cheers, Ron From guido at python.org Sun Oct 1 22:35:56 2006 From: guido at python.org (Guido van Rossum) Date: Sun, 1 Oct 2006 13:35:56 -0700 Subject: [Python-Dev] PEP 355 status In-Reply-To: <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> Message-ID: On 9/30/06, Giovanni Bajo wrote: > It would be terrific if you gave us some clue about what is wrong in PEP355, so > that the next guy does not waste his time. For instance, I find PEP355 > incredibly good for my own path manipulation (much cleaner and concise than the > awful os.path+os+shutil+stat mix), and I have trouble understanding what is > *so* wrong with it. > > You said "it's an amalgam of unrelated functionality", but you didn't say what > exactly is "unrelated" for you. Sorry, no time. But others in this thread clearly agreed with me, so they can guide you. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From arigo at tunes.org Sun Oct 1 23:09:24 2006 From: arigo at tunes.org (Armin Rigo) Date: Sun, 1 Oct 2006 23:09:24 +0200 Subject: [Python-Dev] difficulty of implementing phase 2 of PEP 302 in Python source In-Reply-To: References: Message-ID: <20061001210923.GA31682@code0.codespeak.net> Hi Brett, On Wed, Sep 27, 2006 at 02:11:30PM -0700, Brett Cannon wrote: > is so bad that it is worth trying to re-implement the import semantics in > pure Python or if in the name of time to just work with the C code. In the name of time, sanity and usefulness, rewriting the expected semantics in Python would be a major good idea IMHO. I can cite many projects that have reimplemented half of the semantics in Python (runpy.py, the 'py' lib, PyPy...), but none that completed them. Having such a complete implementation available in the first place would be helpful. A bientot, Armin From Jack.Jansen at cwi.nl Sun Oct 1 23:04:53 2006 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Sun, 1 Oct 2006 23:04:53 +0200 Subject: [Python-Dev] OT: How many other people got this spam? References: <0F6EC883$0A010F152D3B$9C41388@E457FDF720CE414> Message-ID: <5B8AC307-658E-4ED1-BBBD-DE56DFEB3357@cwi.nl> I was wondering: how many other people who maintain websites (well: "maintain" might be a bit of a misnomer in my case:-) related to Python have also got this spam? Begin forwarded message: > From: "Snake Tracks" > Date: October 1, 2006 21:21:45 GMT+02:00 > To: Cwi > Subject: Special Invitation for cwi.nl from Snake Tracks > > Fellow Website Owner/Operator; > > As of September 29th, 2006 we will be launching what is soon to be the > worlds largest snake enthusiast website. The website contains > valuable > information for all those interested in snakes including care sheets, > species information and identification, breeding information, and an > extensive list of snake specific forums. 
> > We welcome you to visit our website and join our community of snake > enthusiasts worldwide. Currently we are browsing through Google and > other major search engines looking for websites we feel would make > good > link partners. I have personally come across your site and think that > exchanging links could benefit both of our businesses. By linking > to us > you will receive a reciprocal link and be showcased in front of all > our > visitors. > > If you are interested in this partnership please add one of the > following text links or banners to a high traffic area on your > website: > > 1) Snake Tracks - The Worlds Largest Snake Enthusiast Website. Visit > our site for care sheets, species information, field herping > information, breeding, captive care, and our extensive list of snake > enthusiast forums. > > 2) Snake Tracks Forums - Visit the Worlds Largest Collection of Snake > Enthusiast forums including our field herping, captive care, habitat > design, and regional forums. > > 3) Snake Care Sheets - Visit the Worlds Largest Snake Enthusiast > Website. Forums, Care Sheets, Field Herping, Species information and > more. > > You may also visit our link page to choose from several banner images > and text links. Once you have linked to our website, fill out the > form > and we will add your site to our directory. > > http://www.snaketracks.com/linktous.html > > I look forward to hearing from you in regards to this email. Please > allow up to 24 hours for a response as we are currently receiving > extremely large amounts of email. > > Sincerely; > Blair Russell - Snaketracks.com > -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From greg.ewing at canterbury.ac.nz Mon Oct 2 03:03:44 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 02 Oct 2006 14:03:44 +1300 Subject: [Python-Dev] OT: How many other people got this spam? 
In-Reply-To: <5B8AC307-658E-4ED1-BBBD-DE56DFEB3357@cwi.nl> References: <0F6EC883$0A010F152D3B$9C41388@E457FDF720CE414> <5B8AC307-658E-4ED1-BBBD-DE56DFEB3357@cwi.nl> Message-ID: <45206570.9020802@canterbury.ac.nz> Jack Jansen wrote: > I was wondering: how many other people who maintain websites (well: > "maintain" might be a bit of a misnomer in my case:-) related to > Python have also got this spam? I got it. I was rather amused that they claim to have been "looking for sites that would make good link partners" when obviously no human eye of theirs has actually seen my site. Addressing me as "Canterbury" in the To: line wasn't a good sign either. :-) I'm tempted to take them up on the offer and see whether they actually make a link to my site from theirs... -- Greg From fredrik at pythonware.com Mon Oct 2 07:58:15 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 02 Oct 2006 07:58:15 +0200 Subject: [Python-Dev] OT: How many other people got this spam? In-Reply-To: <5B8AC307-658E-4ED1-BBBD-DE56DFEB3357@cwi.nl> References: <0F6EC883$0A010F152D3B$9C41388@E457FDF720CE414> <5B8AC307-658E-4ED1-BBBD-DE56DFEB3357@cwi.nl> Message-ID: Jack Jansen wrote: > I was wondering: how many other people who maintain websites (well: > "maintain" might be a bit of a misnomer in my case:-) related to > Python have also got this spam? probably everyone. I've gotten two copies, this far. From nick at craig-wood.com Mon Oct 2 09:54:35 2006 From: nick at craig-wood.com (Nick Craig-Wood) Date: Mon, 2 Oct 2006 08:54:35 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <20061001180151.1717.1491936593.divmod.quotient.64438@ohm> References: <20061001180151.1717.1491936593.divmod.quotient.64438@ohm> Message-ID: <20061002075434.GA18278@craig-wood.com> On Sun, Oct 01, 2006 at 02:01:51PM -0400, Jean-Paul Calderone wrote: > Each line in an interactive session is compiled separately, like modules > are compiled separately. 
With the current implementation, literals in a > single compilation unit have a chance to be "cached" like this. Literals > in different compilation units, even for the same value, don't. That makes sense - thanks for the explanation! -- Nick Craig-Wood -- http://www.craig-wood.com/nick From jason.orendorff at gmail.com Mon Oct 2 12:28:28 2006 From: jason.orendorff at gmail.com (Jason Orendorff) Date: Mon, 2 Oct 2006 06:28:28 -0400 Subject: [Python-Dev] PEP 355 status In-Reply-To: <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> Message-ID: On 9/30/06, Giovanni Bajo wrote: > Guido van Rossum wrote: > > OK. Pronouncement: PEP 355 is dead. [...] > > It would be terrific if you gave us some clue about what is > wrong in PEP355, [...] Here are my guesses. I believe Guido rejected this PEP for a lot of reasons. By the way, what I'm about to do is known as "channeling Guido (badly)" and I'm pretty sure it annoys him. Sorry, Guido. Please don't treat the following as authoritative; I have never met Guido and obviously I cannot speak for him. - I don't think Guido ever saw much benefit from "path objects". That is, the Motivation was not compelling. I think the main motivation is to eliminate some clutter and add a handful of useful methods to the stdlib, so it's easy to see how this could be the case. - Guido just flat-out didn't like the looks of the PEP. Too much weirdness. (path.py contains more weirdness, including some stuff Guido particularly disliked, and I think it's fair to say that PEP355 suffered somewhat by association.) - Any proposal to add a Second Way To Do It has to meet a very high standard. PEP355 was too big to be considered an incremental change. Yet it didn't even attempt to fix all the perceived problems with the existing APIs. A more thorough job would have had a better chance. 
- Nobody liked the API design--too many methods.

- Now we're hearing rumors of better ideas out there, which comes as a relief.

I suspect any one of these could have scuttled the proposal.

-j

From ncoghlan at gmail.com Mon Oct 2 12:48:05 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 02 Oct 2006 20:48:05 +1000 Subject: [Python-Dev] PEP 315 - do while In-Reply-To: <000601c6e5ed$d99ad290$1d2c440a@spain.capgemini.com> References: <000601c6e5ed$d99ad290$1d2c440a@spain.capgemini.com> Message-ID: <4520EE65.50507@gmail.com>

Hans Polak wrote:
> Hi Nick,
>
> Yep, PEP 315. Sorry about that.
>
> Now, about your suggestion
>
>     do:
>         <setup code>
>     while <condition>:
>         <loop body>
>     else:
>         <loop else>
>
> This is pythonic, but not logical. The 'do' will execute at least once, so
> the else clause is not needed, nor is the <loop else>. The <loop body>
> should go before the while terminator.

This objection is based on a misunderstanding of what the else clause is for in a Python loop. The else clause is only executed if the loop terminated naturally (the exit condition became false) rather than being explicitly terminated using a break statement.

This behaviour is most commonly useful when using a for loop to search through an iterable (breaking when the object is found, and using the else clause to handle the 'not found' case), but it is also defined for while loops.

Regards,
Nick.
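The search idiom described above, written out; the function and names are invented for the example:

```python
def classify(items, target):
    """Search `items` for `target`, using the loop's else clause."""
    for i, item in enumerate(items):
        if item == target:
            result = ("found", i)
            break                    # explicit termination: else is skipped
    else:                            # runs only if the loop ended naturally
        result = ("not found", None)
    return result

print(classify(["a", "b", "c"], "b"))   # ('found', 1)
print(classify(["a", "b", "c"], "z"))   # ('not found', None)
```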
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From martin at v.loewis.de Mon Oct 2 15:43:48 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 02 Oct 2006 15:43:48 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <451E31ED.7030905@gmail.com> References: <20060929081402.GB19781@craig-wood.com> <451DC113.4040002@canterbury.ac.nz> <6a36e7290609291815i115b41b3o4ab6d196f404557f@mail.gmail.com> <451E2F32.9070405@v.loewis.de> <451E31ED.7030905@gmail.com> Message-ID: <45211794.7020503@v.loewis.de> Nick Coghlan schrieb: >> Right. Although I do wonder what kind of software people write to run >> into this problem. As Guido points out, the numbers must be the result >> from some computation, or created by an extension module by different >> means. If people have many *simultaneous* copies of 0.0, I would expect >> there is something else really wrong with the data structures or >> algorithms they use. > > I suspect the problem would typically stem from floating point values > that are read in from a human-readable file rather than being the result > of a 'calculation' as such: That's how you can end up with 100 different copies of 0.0. But apparently, people are creating millions of them, and keep them in memory simultaneously. Unless the text file *only* consists of floating point numbers, I would expect they have bigger problems than that. Regards, Martin From martin at v.loewis.de Mon Oct 2 15:49:50 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 02 Oct 2006 15:49:50 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <129CEF95A523704B9D46959C922A28000451FED3@nemesis.central.ccp.cc> References: <129CEF95A523704B9D46959C922A28000451FED3@nemesis.central.ccp.cc> Message-ID: <452118FE.6040704@v.loewis.de> Kristj?n V. 
Jónsson schrieb:
> Well, a lot of extension code, like ours, uses PyFloat_FromDouble(foo);
> This can be from vectors and stuff.

Hmm. If you get a lot of 0.0 values from vectors and stuff, I would expect that memory usage is already high.

In any case, a module that creates a lot of copies of 0.0 that way could do its own caching, right?

> Very often these are values from a database. Integral float values
> are very common in such cases and it didn't occur to me that they
> weren't being reused, at least for small values.

Sure - but why are people keeping them in memory all the time? Also, isn't it a mis-design of the database if you have many float values in it that represent natural numbers? Shouldn't you use a more appropriate data type, then?

> Also, a lot of arithmetic involving floats is expected to end in
> integers, like computing some index from a float value. Integers get
> promoted to floats when touched by them, as you know.

Again, sounds like a programming error to me.

> Anyway, I now precreate integral values from -10 to 10 with great
> effect. The cost is minimal, the benefit great.

In an extension module, the knowledge about the application domain is larger, so it may be reasonable to do the caching there. I would still expect that in the typical application where this is an issue, there is some kind of larger design bug.

Regards,
Martin

From kristjan at ccpgames.com Mon Oct 2 16:19:18 2006 From: kristjan at ccpgames.com (Kristján V. Jónsson) Date: Mon, 2 Oct 2006 14:19:18 -0000 Subject: [Python-Dev] Caching float(0.0) Message-ID: <129CEF95A523704B9D46959C922A280002FE99A7@nemesis.central.ccp.cc>

Well, Skip made the argument when analyzing the test suite: "So for a largely non-floating point "application", a fair number of floats are allocated, a bit over 25% of them are -1.0, 0.0 or +1.0, and nearly 50% of them are whole numbers between -10.0 and 10.0, inclusive.
" In C, there is no need to treat 0.0 any different from any other value, since they are literals. You will find axis aligned unit vectors to be very common in any 3D app. I can't say exactly where all those integral floats are coming from. I could investigate further, but it seems to me that they are simply quite common in real-world applications. Experience shows that 0.0 is _very_ common, even, and the test suite test skip made should make this abundantly clear. I can't see how this situation is any different from the re-use of low ints. There is no fundamental law that says that ints below 100 are more common than other, yet experience shows that this is so, and so they are reused. Rather than to view this as a programming error, why not simply accept that this is a recurring pattern and adjust python to be more efficient when faced by it? Surely a lot of karma lies that way? Cheers, Kristj?n > -----Original Message----- > From: "Martin v. L?wis" [mailto:martin at v.loewis.de] > Sent: 2. okt?ber 2006 13:50 > To: Kristj?n V. J?nsson > Cc: Bob Ippolito; python-dev at python.org > Subject: Re: [Python-Dev] Caching float(0.0) > > Kristj?n V. J?nsson schrieb: > > Well, a lot of extension code, like ours use > PyFloat_FromDouble(foo); > > This can be from vectors and stuff. > > Hmm. If you get a lot of 0.0 values from vectors and stuff, I > would expect that memory usage is already high. > > In any case, a module that creates a lot of copies of 0.0 > that way could do its own caching, right? > > > Very often these are values from a database. Integral float values > > are very common in such case and id didn't occur to me that they > > weren't being reused, at least for small values. > > Sure - but why are keeping people them in memory all the time? > Also, isn't it a mis-design of the database if you have many > float values in it that represent natural numbers? Shouldn't > you use a more appropriate data type, then? 
> > > Also, a lot of arithmetic involving floats is expected to end in > > integers, like computing some index from a float value. > Integers get > > promoted to floats when touched by them, as you know. > > Again, sounds like a programming error to me. > > > Anyway, I now precreate integral values from -10 to 10 with great > > effect. The cost is minimal, the benefit great. > > In an extension module, the knowledge about the application > domain is larger, so it may be reasonable to do the caching > there. I would still expect that in the typical application > where this is an issue, there is some kind of larger design bug. > > Regards, > Martin > From martin at v.loewis.de Mon Oct 2 16:37:09 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 02 Oct 2006 16:37:09 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <129CEF95A523704B9D46959C922A280002FE99A7@nemesis.central.ccp.cc> References: <129CEF95A523704B9D46959C922A280002FE99A7@nemesis.central.ccp.cc> Message-ID: <45212415.5000104@v.loewis.de> Kristj?n V. J?nsson schrieb: > I can't see how this situation is any different from the re-use of > low ints. There is no fundamental law that says that ints below 100 > are more common than other, yet experience shows that this is so, > and so they are reused. There are two important differences: 1. it is possible to determine whether the value is "special" in constant time, and also fetch the singleton value in constant time for ints; the same isn't possible for floats. 2. it may be that there is a loss of precision in reusing an existing value (although I'm not certain that this could really happen). For example, could it be that two values compare successful in ==, yet are different values? I know this can't happen for integers, so I feel much more comfortable with that cache. 
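Martin's precision worry is well founded for exactly one pair of values: 0.0 and -0.0 compare equal yet behave differently, so a cache keyed by == would merge them. A quick check:

```python
import math

# 0.0 and -0.0 are equal under ==, so a value-keyed cache would
# hand back one object for both:
print(0.0 == -0.0)                     # True

# ...but they are observably different values:
print(math.copysign(1.0, 0.0))         # 1.0
print(math.copysign(1.0, -0.0))        # -1.0
print(math.atan2(0.0, -1.0))           # pi
print(math.atan2(-0.0, -1.0))          # -pi
```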
> Rather than view this as a programming error, why not simply > accept that this is a recurring pattern and adjust python to be more > efficient when faced by it? Surely a lot of karma lies that way? I'm worried about the penalty that this causes in terms of run-time cost. Also, how do you choose what values to cache? Regards, Martin From kristjan at ccpgames.com Mon Oct 2 17:08:08 2006 From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_V=2E_J=F3nsson?=) Date: Mon, 2 Oct 2006 15:08:08 -0000 Subject: [Python-Dev] Caching float(0.0) Message-ID: <129CEF95A523704B9D46959C922A280002FE99A9@nemesis.central.ccp.cc> I see, you are thinking of the general fractional case. My point was that whole numbers seem to pop up often and to reuse those is easy. I did a test of tracking actual floating point numbers and the majority of heavy usage comes from integral values. It would indeed be strange if some fractional number were heavily used but it can be argued that integral ones are "special" in many ways. Anyway, Skip noted that 50% of all floats are whole numbers between -10 and 10 inclusive, and this is the code that I employ in our python build today: PyObject * PyFloat_FromDouble(double fval) { register PyFloatObject *op; int ival; if (free_list == NULL) { if ((free_list = fill_free_list()) == NULL) return NULL; /* CCP addition, cache common values */ if (!f_reuse[0]) { int i; for(i = 0; i<21; i++) f_reuse[i] = PyFloat_FromDouble((double)(i-10)); } } /* CCP addition, check for recycling */ ival = (int)fval; if ((double)ival == fval && ival>=-10 && ival <= 10) { ival+=10; if (f_reuse[ival]) { Py_INCREF(f_reuse[ival]); return f_reuse[ival]; } } ... Cheers, Kristján > -----Original Message----- > From: "Martin v. Löwis" [mailto:martin at v.loewis.de] > Sent: 2. október 2006 14:37 > To: Kristján V. Jónsson > Cc: Bob Ippolito; python-dev at python.org > Subject: Re: [Python-Dev] Caching float(0.0) > > Kristján V.
Jónsson schrieb: > > I can't see how this situation is any different from the > re-use of low > > ints. There is no fundamental law that says that ints > below 100 are > > more common than others, yet experience shows that this is > so, and so > > they are reused. > > There are two important differences: > 1. it is possible to determine whether the value is "special" in > constant time, and also fetch the singleton value in constant > time for ints; the same isn't possible for floats. > 2. it may be that there is a loss of precision in reusing an existing > value (although I'm not certain that this could really happen). > For example, could it be that two values compare successful in > ==, yet are different values? I know this can't happen for > integers, so I feel much more comfortable with that cache. > > > Rather than view this as a programming error, why not > simply accept > > that this is a recurring pattern and adjust python to be more > > efficient when faced by it? Surely a lot of karma lies that way? > > I'm worried about the penalty that this causes in terms of > run-time cost. Also, how do you choose what values to cache? > > Regards, > Martin > From mwh at python.net Mon Oct 2 17:22:14 2006 From: mwh at python.net (Michael Hudson) Date: Mon, 02 Oct 2006 16:22:14 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <45212415.5000104@v.loewis.de> ( =?iso-8859-1?q?Martin_v._L=F6wis's_message_of?= "Mon, 02 Oct 2006 16:37:09 +0200") References: <129CEF95A523704B9D46959C922A280002FE99A7@nemesis.central.ccp.cc> <45212415.5000104@v.loewis.de> Message-ID: <2m4pumg2u1.fsf@starship.python.net> "Martin v. Löwis" writes: > Kristján V. Jónsson schrieb: >> I can't see how this situation is any different from the re-use of >> low ints. There is no fundamental law that says that ints below 100 >> are more common than others, yet experience shows that this is so, >> and so they are reused. > > There are two important differences: > 1.
it is possible to determine whether the value is "special" in > constant time, and also fetch the singleton value in constant > time for ints; the same isn't possible for floats. I don't think you mean "constant time" here do you? I think most of the code posted so far has been constant time, at least in terms of instruction count, though some might indeed be fairly slow on some processors -- conversion from double to integer on the PowerPC involves a trip off to memory for example. Even so, everything should be fairly efficient compared to allocation, even with PyMalloc. > 2. it may be that there is a loss of precision in reusing an existing > value (although I'm not certain that this could really happen). > For example, could it be that two values compare successful in > ==, yet are different values? I know this can't happen for > integers, so I feel much more comfortable with that cache. I think the only case is that the two zeros compare equal, which is unfortunate given that it's the most compelling value to cache... I don't know a reliable and fast way to distinguish +0.0 and -0.0. Cheers, mwh -- The bottom tier is what a certain class of wanker would call "business objects" ... -- Greg Ward, 9 Dec 1999 From martin at v.loewis.de Mon Oct 2 17:34:39 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 02 Oct 2006 17:34:39 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <2m4pumg2u1.fsf@starship.python.net> References: <129CEF95A523704B9D46959C922A280002FE99A7@nemesis.central.ccp.cc> <45212415.5000104@v.loewis.de> <2m4pumg2u1.fsf@starship.python.net> Message-ID: <4521318F.6080002@v.loewis.de> Michael Hudson schrieb: >> 1. it is possible to determine whether the value is "special" in >> constant time, and also fetch the singleton value in constant >> time for ints; the same isn't possible for floats. > > I don't think you mean "constant time" here do you? 
Right; I really wondered whether the code was dependent or independent of the number of special-case numbers. > I think most of > the code posted so far has been constant time, at least in terms of > instruction count, though some might indeed be fairly slow on some > processors -- conversion from double to integer on the PowerPC > involves a trip off to memory for example. Kristján's code testing only for integers in a range would be of that kind. Code that tests for a list of literals determined at compile time typically needs time "linear" with the number of special-cased constants (of course, as there is a fixed number of constants, this is O(1)). >> 2. it may be that there is a loss of precision in reusing an existing >> value (although I'm not certain that this could really happen). >> For example, could it be that two values compare successful in >> ==, yet are different values? I know this can't happen for >> integers, so I feel much more comfortable with that cache. > > I think the only case is that the two zeros compare equal, which is > unfortunate given that it's the most compelling value to cache... Thanks for pointing that out. I can believe this is the only case in IEEE-754; I also wonder whether alternative implementations could cause problems (although I don't really worry too much about VMS). Regards, Martin From aahz at pythoncraft.com Mon Oct 2 18:51:30 2006 From: aahz at pythoncraft.com (Aahz) Date: Mon, 2 Oct 2006 09:51:30 -0700 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <4521318F.6080002@v.loewis.de> References: <129CEF95A523704B9D46959C922A280002FE99A7@nemesis.central.ccp.cc> <45212415.5000104@v.loewis.de> <2m4pumg2u1.fsf@starship.python.net> <4521318F.6080002@v.loewis.de> Message-ID: <20061002165130.GA1166@panix.com> On Mon, Oct 02, 2006, "Martin v.
L?wis" wrote: > Michael Hudson schrieb: >> >> I think most of >> the code posted so far has been constant time, at least in terms of >> instruction count, though some might indeed be fairly slow on some >> processors -- conversion from double to integer on the PowerPC >> involves a trip off to memory for example. > > Kristian's code testing only for integers in a range would be of > that kind. Code that tests for a list of literals determined > at compile time typically needs time "linear" with the number of > special-cased constants (of course, as that there is a fixed > number of constants, this is O(1)). What if we do this work only on float()? -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "LL YR VWL R BLNG T S" -- www.nancybuttons.com From jcarlson at uci.edu Mon Oct 2 19:33:03 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Mon, 02 Oct 2006 10:33:03 -0700 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <2m4pumg2u1.fsf@starship.python.net> References: <45212415.5000104@v.loewis.de> <2m4pumg2u1.fsf@starship.python.net> Message-ID: <20061002101308.08FC.JCARLSON@uci.edu> Michael Hudson wrote: > "Martin v. L?wis" writes: > > Kristj?n V. J?nsson schrieb: > >> I can't see how this situation is any different from the re-use of > >> low ints. There is no fundamental law that says that ints below 100 > >> are more common than other, yet experience shows that this is so, > >> and so they are reused. > > > > There are two important differences: > > 1. it is possible to determine whether the value is "special" in > > constant time, and also fetch the singleton value in constant > > time for ints; the same isn't possible for floats. > > I don't think you mean "constant time" here do you? 
I think most of > the code posted so far has been constant time, at least in terms of > instruction count, though some might indeed be fairly slow on some > processors -- conversion from double to integer on the PowerPC > involves a trip off to memory for example. Even so, everything should > be fairly efficient compared to allocation, even with PyMalloc. > > > 2. it may be that there is a loss of precision in reusing an existing > > value (although I'm not certain that this could really happen). > > For example, could it be that two values compare successful in > > ==, yet are different values? I know this can't happen for > > integers, so I feel much more comfortable with that cache. > > I think the only case is that the two zeros compare equal, which is > unfortunate given that it's the most compelling value to cache... > > I don't know a reliable and fast way to distinguish +0.0 and -0.0. The same way one could handle the lookups quickly; cast the pointer to a uint64 and dereference it. For all non-extended floats (I don't know the proper terminology, but their >64 bit precision is stored on the processor, not in memory), this will disambiguate *which* value it is. It may cause problems with NaNs and infinities, but we aren't caching them, so we don't care. The result of all this is that we can do the following on Intel x86 platforms (replace with hex if desired)...
switch (*(uint64 *)&fval) { case 13845191154443747328ULL: case 13844628204490326016ULL: case 13844065254536904704ULL: case 13842939354630062080ULL: case 13841813454723219456ULL: case 13840687554816376832ULL: case 13839561654909534208ULL: case 13837309855095848960ULL: case 13835058055282163712ULL: case 13830554455654793216ULL: case 0ULL: case 4607182418800017408ULL: case 4611686018427387904ULL: case 4613937818241073152ULL: case 4616189618054758400ULL: case 4617315517961601024ULL: case 4618441417868443648ULL: case 4619567317775286272ULL: case 4620693217682128896ULL: case 4621256167635550208ULL: /* lookup in the table */ default: break; } Each platform would need a new block depending on their endianness mixing of float/uint64 (if any), as well as depending on their double representations (as long as it conforms to IEEE-754 fp doubles for these 21 values, they don't need a new one). - Josiah From tim.hochberg at ieee.org Mon Oct 2 19:43:51 2006 From: tim.hochberg at ieee.org (Tim Hochberg) Date: Mon, 02 Oct 2006 10:43:51 -0700 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <17694.61885.527128.686743@montanaro.dyndns.org> References: <20060929081402.GB19781@craig-wood.com> <17694.61885.527128.686743@montanaro.dyndns.org> Message-ID: skip at pobox.com wrote: > Steve> By these statistics I think the answer to the original question > Steve> is clearly "no" in the general case. > > As someone else (Guido?) pointed out, the literal case isn't all that > interesting. I modified floatobject.c to track a few interesting > floating point values: > [...code...] > > So for a largely non-floating point "application", a fair number of floats > are allocated, a bit over 25% of them are -1.0, 0.0 or +1.0, and nearly 50% > of them are whole numbers between -10.0 and 10.0, inclusive. > > Seems like it at least deserves a serious look. It would be nice to have > the numeric crowd contribute to this subject as well.
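Josiah's switch above can be cross-checked from Python: the case constants are exactly the little-endian IEEE-754 bit patterns of small whole doubles. A verification sketch (the `double_bits` helper is invented here, not part of any patch); note that the twenty constants quoted actually run from -10.0 up to 9.0, so the pattern for 10.0 (4621819117588971520) is absent from the list:

```python
import struct

def double_bits(x):
    """Bit pattern of a C double, as an unsigned 64-bit integer."""
    return struct.unpack('<Q', struct.pack('<d', x))[0]

# +0.0 and -0.0 compare equal but have distinct bit patterns --
# the distinction mwh was after earlier in the thread.
assert 0.0 == -0.0
assert double_bits(0.0) != double_bits(-0.0)
assert double_bits(-0.0) == 0x8000000000000000

# The switch's constants really are small whole numbers:
assert double_bits(1.0) == 4607182418800017408    # case 4607182418800017408ULL
assert double_bits(-10.0) == 13845191154443747328  # first negative case
assert double_bits(10.0) == 4621819117588971520    # not among the quoted cases
```

Comparing bit patterns rather than values is what lets a cache keep +0.0 without silently aliasing -0.0 onto it.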
As a representative of the numeric crowd, I'll say that I've never noticed this to be a problem. I suspect that it's a non issue since we generally store our numbers in arrays, not big piles of Python floats, so there's no opportunity for identical floats to pile up. -tim From brett at python.org Mon Oct 2 22:01:28 2006 From: brett at python.org (Brett Cannon) Date: Mon, 2 Oct 2006 13:01:28 -0700 Subject: [Python-Dev] Created branch for PEP 302 phase 2 work (in C) Message-ID: In the interest of time I have decided to go ahead and do the PEP 302 phase 2 work in C. I fully expect to tackle rewriting import in Python in my spare time after I finish this work since I will be much more familiar with how the whole import machinery works and it sounds like a fun challenge. The branch for the work is in pep302_phase2 . Any help would be appreciated in this work. I plan on keeping a BRANCH_PLANS file that outlines the what/why/how of the whole thing. -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061002/97f7a7b8/attachment.htm From pje at telecommunity.com Mon Oct 2 22:59:43 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 02 Oct 2006 16:59:43 -0400 Subject: [Python-Dev] Created branch for PEP 302 phase 2 work (in C) In-Reply-To: Message-ID: <5.1.1.6.0.20061002164622.028067d8@sparrow.telecommunity.com> At 01:01 PM 10/2/2006 -0700, Brett Cannon wrote: >In the interest of time I have decided to go ahead and do the PEP 302 >phase 2 work in C. Just FYI, it's not possible (so far as I know) to implement phase 2 while maintaining backward compatibility with existing 2.x code. So this work shouldn't go back to the 2.x trunk without discussion of those issues. Essentially, I abandoned trying to do the phase 2 work for Python 2.5 because there's too much code in the field that depends on the current order of when special/built-in imports are processed vs. 
when PEP 302 imports are processed. Thus, instead of adding new PEP 302 APIs (like get_loader) to 'imp', I added them to 'pkgutil'. There are, I believe, some notes in that module's source regarding what the ordering issues are w/meta_path vs. the way import works now. That having been said, we could possibly have a transition for 2.6, but everybody who's written any PEP 302 emulation code (outside of pkgutil itself) would have to adapt their code somewhat. I'm surprised, however, that you think working on this in C is going to be *less* time than it would take to simply replace __import__ with a Python function that reimplements PEP 302... especially since pkgutil contains a whole lot of the code you'd need, e.g.: def __import__(...): ... loader = pkgutil.find_loader(fullname) if loader is not None: module = loader.load_module(fullname) ... And much of the rest of the above can probably be filled out by swiping code from ihooks, imputil, or other Python __import__ implementations. From p.f.moore at gmail.com Tue Oct 3 00:27:07 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 2 Oct 2006 23:27:07 +0100 Subject: [Python-Dev] Created branch for PEP 302 phase 2 work (in C) In-Reply-To: <5.1.1.6.0.20061002164622.028067d8@sparrow.telecommunity.com> References: <5.1.1.6.0.20061002164622.028067d8@sparrow.telecommunity.com> Message-ID: <79990c6b0610021527s1e822f8dj26fbe429cf2c686c@mail.gmail.com> On 10/2/06, Phillip J. Eby wrote: > Just FYI, it's not possible (so far as I know) to implement phase 2 while > maintaining backward compatibility with existing 2.x code. So this work > shouldn't go back to the 2.x trunk without discussion of those issues. While that's a fair point, we need to be clear what compatibility issues there are. The built in import mechanisms aren't well documented, so it's not a black-and-white situation. An unqualified statement "there are issues" isn't much help on its own... 
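Phillip's `pkgutil.find_loader`/`load_module` sketch above relies on the PEP 302 finder/loader protocol. A toy pair, exercised by hand rather than installed on sys.meta_path (the module name `virtual_demo` and the `answer` attribute are invented for illustration):

```python
import sys
import types

class NullFinder:
    """Toy PEP 302 meta_path-style hook that 'finds' one virtual module."""
    def find_module(self, fullname, path=None):
        # A finder returns a loader (here, itself) or None.
        return self if fullname == "virtual_demo" else None

    def load_module(self, fullname):
        # PEP 302 protocol: reuse an existing sys.modules entry if present.
        mod = sys.modules.setdefault(fullname, types.ModuleType(fullname))
        mod.__loader__ = self
        mod.answer = 42
        return mod

finder = NullFinder()
loader = finder.find_module("virtual_demo")
assert loader is not None
mod = loader.load_module("virtual_demo")
assert mod.answer == 42
assert sys.modules["virtual_demo"] is mod
```

Phase 2 would hook such objects in via sys.meta_path so they are consulted before (or instead of) the built-in machinery, which is exactly where the ordering questions discussed here come from.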
> Essentially, I abandoned trying to do the phase 2 work for Python 2.5 > because there's too much code in the field that depends on the current > order of when special/built-in imports are processed vs. when PEP 302 > imports are processed. Can you say what that code is, and who we should be talking to to understand their issues? If not, how do we find such code? Presumably, you've got a lot of feedback through your work on setuptools/eggs - do you have a record of who might participate in a discussion? > Thus, instead of adding new PEP 302 APIs (like > get_loader) to 'imp', I added them to 'pkgutil'. How does that help? Where the code goes doesn't seem likely to make much difference... > There are, I believe, > some notes in that module's source regarding what the ordering issues are > w/meta_path vs. the way import works now. The only notes I could see in pkgutil.py refer to special locations like the Windows registry, and refer to the fact that they will be searched after path entries, not before (for reasons I couldn't quite follow, but that's likely because I only read the comments fairly quickly). But if the whole mechanism is moved to sys.meta_path (which is what Phase 2 is about) surely it's possible to choose the ordering just by the order the importers go on sys.meta_path? > That having been said, we could possibly have a transition for 2.6, but > everybody who's written any PEP 302 emulation code (outside of pkgutil > itself) would have to adapt their code somewhat. I don't really see how we're going to address that other than by implementing it, and waiting for people with issues to speak up. Highlighting the changes early is good, as it avoids a mid-beta rush of people "suddenly" finding issues, but I doubt we'll do much better than that. > I'm surprised, however, that you think working on this in C is going to be > *less* time than it would take to simply replace __import__ with a Python > function that reimplements PEP 302... That I do agree with. 
There's a bootstrapping issue (you can't import the Python module that does all this without using a C-coded import mechanism) but that should be resolvable. This is why I asked for input from people on which would take less time. Almost all the answers I got were that the C code was delicate but that it was workable. Several people said they wished for a Python implementation, but hardly anyone said flat-out, "don't waste your time, the Python version will be faster to do". As for the bootstrapping, I am sure it is resolvable as well. There are several ways to go about it that are all tractable. -Brett -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20061002/2f47cbd4/attachment.html From tdelaney at avaya.com Tue Oct 3 01:47:03 2006 From: tdelaney at avaya.com (Delaney, Timothy (Tim)) Date: Tue, 3 Oct 2006 09:47:03 +1000 Subject: [Python-Dev] Caching float(0.0) Message-ID: <2773CAC687FD5F4689F526998C7E4E5FF1EA85@au3010avexu1.global.avaya.com> skip at pobox.com wrote: > Steve> By these statistics I think the answer to the original > question Steve> is clearly "no" in the general case. > > As someone else (Guido?) pointed out, the literal case isn't all that > interesting. I modified floatobject.c to track a few interesting > floating point values: > > static unsigned int nfloats[5] = { > 0, /* -1.0 */ > 0, /* 0.0 */ > 0, /* +1.0 */ > 0, /* everything else */ > 0, /* whole numbers from -10.0 ... 10.0 */ > }; > > PyObject * > PyFloat_FromDouble(double fval) > { > register PyFloatObject *op; > if (free_list == NULL) { > if ((free_list = fill_free_list()) == NULL) > return NULL; > } > > if (fval == 0.0) nfloats[1]++; > else if (fval == 1.0) nfloats[2]++; > else if (fval == -1.0) nfloats[0]++; > else nfloats[3]++; > > if (fval >= -10.0 && fval <= 10.0 && (int)fval == fval) { > nfloats[4]++; > } This doesn't actually give us a very useful indication of potential memory savings. What I think would be more useful is tracking the maximum simultaneous count of each value i.e. what the maximum refcount would have been if they were shared. Tim Delaney From pje at telecommunity.com Tue Oct 3 01:52:09 2006 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Mon, 02 Oct 2006 19:52:09 -0400 Subject: [Python-Dev] Created branch for PEP 302 phase 2 work (in C) In-Reply-To: References: <79990c6b0610021527s1e822f8dj26fbe429cf2c686c@mail.gmail.com> <5.1.1.6.0.20061002164622.028067d8@sparrow.telecommunity.com> <79990c6b0610021527s1e822f8dj26fbe429cf2c686c@mail.gmail.com> Message-ID: <5.1.1.6.0.20061002194406.03c28ac0@sparrow.telecommunity.com> At 03:48 PM 10/2/2006 -0700, Brett Cannon wrote: >On 10/2/06, Paul Moore <p.f.moore at gmail.com> >wrote: >>On 10/2/06, Phillip J. Eby >><pje at telecommunity.com> wrote: >>[SNIP] >> > I'm surprised, however, that you think working on this in C is going to be >> > *less* time than it would take to simply replace __import__ with a Python >> > function that reimplements PEP 302... >> >>That I do agree with. There's a bootstrapping issue (you can't import >>the Python module that does all this without using a C-coded import >>mechanism) but that should be resolvable. > >This is why I asked for input from people on which would take less >time. Almost all the answers I got were that the C code was delicate >but that it was workable. Several people said they wished for a Python >implementation, but hardly anyone said flat-out, "don't waste your time, >the Python version will be faster to do". > >As for the bootstrapping, I am sure it is resolvable as well. There are >several ways to go about it that are all tractable. When I implemented the PEP 302 fix for the import speedups, I basically prototyped it using Python code that got loaded prior to 'site.py'. Once I had the Python version solid, I converted it to a C type via straightforward code transcription.
That's pretty much the route I would follow for this too, although of course "freezing" the Python version into C code is also an option, since there's not much performance benefit to be had from a C translation, except for two parts of __import__: the part that checks sys.modules to shortcut the process, and the part that runs after the target module has been loaded or found. Aside from this "fast path" part of __import__, any additional interpretation overhead will probably be dwarfed by I/O considerations. From brett at python.org Tue Oct 3 01:52:46 2006 From: brett at python.org (Brett Cannon) Date: Mon, 2 Oct 2006 16:52:46 -0700 Subject: [Python-Dev] PSF Infrastructure Committee's recommendation for a new issue tracker Message-ID: On behalf of the PSF Infrastructure committee, I am happy to report that we have reached a recommendation for a new issue tracker for Python! But first, I want to extend our thanks to all who stepped forward to provide the committee with a test installation of an issue tracker to use as a basis of our evaluations. Having several trackers to compare may have made this more time-consuming, but it helped to realize what people did and did not like about the various issue trackers and solidify what we thought python-dev would want. Thank you! The Infrastructure committee (Andrew Kuchling, Thomas Wouters, Barry Warsaw, Martin v. Loewis, and myself; Richard Jones excused himself from the discussion because of personal bias) met and discussed the four trackers being considered to replace SourceForge: Launchpad, JIRA, Roundup, and Trac. After evaluating the trackers on several points (issue creation, querying, etc.), we reached a tie between JIRA and Roundup in terms of pure tracker features. For JIRA, members found it to be a very powerful, polished issue tracker. But some found it to be a little more complex than they would like in an issue tracker. Roundup, on the other hand, had the exact opposite points. 
While not as polished as JIRA, it is the simpler tracker which some committee members preferred. As for Trac and Launchpad, both had fundamental issues that led to them not being chosen in the end. Most of the considerations had to do with customization or UI problems. With JIRA and Roundup being considered equal overall in terms of the trackers themselves, there is the tie-breaking issue of hosting. Atlassian, the company that created JIRA, has offered us free hosting of a JIRA installation. This cannot be overlooked as keeping an issue tracker running is not easy and requires supervision at various hours of the day to make sure possible downtime is minimized. There is also always the issue of upgrading, etc., that comes with any major software installation. Details on the hosting are pasted in at the end of this email as provided by Jonathan Nolen of Atlassian. He has also been cc:ed on this email so as to allow him to answer any questions directly. In order for Roundup to be considered equivalent in terms of an overall tracker package there needs to be a sufficient number of volunteer admins (roughly 6 - 10 people) who can help set up and maintain the Roundup installation. If enough people can be gathered, then Roundup will become the recommendation of the committee based on the fact that the trackers are roughly equal but that Roundup is implemented in Python and is FLOSS. If not enough support can be gathered, the committee's recommendation of going with JIRA will stand. If people want Roundup to be considered the tracker we go with by volunteering to be an admin, please email infrastructure at python.org and state your time commitment, the timezone you would be working from, and your level of Roundup knowledge. Please email the committee by October 16. If enough people step forward we will notify python-dev that Roundup should be considered the recommendation of the committee and graciously turn down Atlassian's offer.
-Brett Cannon Chairman, PSF Infrastructure Committee ----------------------------------------------------------- [email from Jonathan, unedited, with details about hosting] Hosting is with http://contegix.com. They host all of our servers, as well as those of Cenqua, Codehaus, Jive (I think), and a bunch of other folks in the Java community. They have engineers online 24x7x365. I've contacted them at all hours of the night and weekend and never failed to get a response within 5 minutes, though they guarantee 30 minutes. The engineers I've worked with have been universally top-notch. They've been able to help with every kind of question I've thrown at them. It's hard to describe how great they are, but it's like having a full-time sysadmin on staff who knows everything about your systems, who never goes to sleep, and who always seems chipper at the very thought of making any change you might ask. Ideally, we'd set it up so that the appropriate members of the Python team could contact Contegix directly for any requests you may have. You'll also have direct access yourself if you need to do any work on your own. As far as the export, they will set it up any way you like. The two obvious ways that come to mind are copying the XML backup or a database dump each night (or whatever frequency you specify). Either option would allow you to fully restore a JIRA instance to the point of the backup with full history. They will pro-actively keep your apps up to date as well. They usually know as soon as we release new versions and will contact you to arrange upgrades almost immediately. They also perform things like OS upgrades and patches on a regular basis without having to be prompted. Contegix will set up monitoring on your server(s) to watch things like disk-space, memory, CPU and networking usage. If any of those resources starts to get maxed out, they'll let us know and offer advice on how to fix it.
Right now, we have the Python stuff and the Mailman stuff on one server. There should be enough capacity for both, but if your usage grows to the point where we need more hardware, we can arrange that (within reason). If you ever needed to make your own arrangements with Contegix, their rates are reasonable, and you can either buy or lease hardware as you choose. I'm also sure that they would be flexible for an active, popular, open-source project such as Python. When Barry and I spoke, he told me that you had four or five other servers scattered around the world running things like SVN, mail and web. If you would ever be interested in consolidating those services with Contegix, it is likely that we could help you out with those as well. SVN would be a particular benefit, as the Fisheye Plugin for JIRA is really useful, and will perform better over the local network. It can still be used from your current host, it'll just be a little slower to get new information. I should also mention that Atlassian will soon be introducing two new products: Crowd, a user-management/single sign-on solution and Bamboo, a build server. If you guys are interested in trying either of those, you're welcome to them. I can imagine both might be useful to a project like Python. I'm happy to help out, and we continue to be very interested in seeing the project happen. If there's anything further we can do, don't hesitate to ask. Cheers, Jonathan P.S. Here is Contegix's material about their service: Data Center Contegix's data center is located in the Bandwidth Exchange Building on Walnut Avenue, which is the premier, carrier building for this region. Security is very important to our clients and us. As a result, access beyond the lobby requires a code access for the elevators. Once someone reaches our floor, all of our perimeter doors require both a card key access combined with a matching biometric palm scan to access our facility.
Once someone has been admitted to our suite, they are then required to log in and IDs checked against our Access Log for customers. Once authenticated, a Customer Badge will be issued. Visitors are only allowed escorted access to the data center and NOC on an as needed basis. In addition to all exterior doors being controlled access, all internal doors leading to the data center also require an additional card scan for access. Within the data center, all customer equipment is located in locked cabinets or cages. In addition to restricted access, the facility is monitored with digital cameras 24x7 recording all movement within the data center. Technical Support Our facility is staffed with Tier 3 Support Engineers 24x7x365. Our engineers are available to assist you with any needs you may have at any time. Because the highest level of support available is key to both of our businesses, Contegix engineers focus upon keeping your application and data available at all times. Therefore, we guarantee all support requests will be responded to within thirty minutes and the average response is three minutes. In addition, through our custom monitoring system, we are capable of actively monitoring almost anything you would like monitored. Many of our engineers are Dell certified technicians. In addition, we maintain ample stock of spare parts for Dell servers including hard drives, memory, etc. Rest assured, every precaution and measure is taken to ensure your equipment will be up and running should a hardware failure occur. Contegix Network Contegix offers one of the strongest networks available in the industry. Our network infrastructure is fully meshed, running redundant Juniper routers and Foundry BigIron core switches. We have five Tier 1 providers including Sprint, Level (3), MCI, XO and WilTel running BGP4. 
Because our data center is located in the Bandwidth Exchange Building, which is the Internet hub for this region, all of our connections to our providers in our managed network are "On Net," meaning we connect directly to the Internet, avoiding local loops and local connections. Our core switching infrastructure provides the ability to deliver load balancing without a significant investment in equipment. When your needs grow, Contegix will be able to deliver. Our network is enhanced by our Intelligent Routing Solution and DDoS Mitigation/Protection system, which drastically improves the quality of our network performance and reliability. One of the benefits of our redundant, intelligently routed network is our 100% Network Uptime guarantee, delivered in writing. Power Infrastructure Our power infrastructure was built with redundancy in mind. All power supplied to the data center is clean and constant, coming from the redundant UPSes (AC) or battery plants (DC). The PowerWare UPS systems run in a redundant configuration to maximize reliability. The UPS and battery plants are constantly charged by our dual grid connection to Ameren. There is an Automatic Transfer Switch between the two grids. In addition, if power is interrupted to the UPS/battery plants, another Automatic Transfer Switch automatically starts the diesel generator farm. All of this occurs instantaneously, without human intervention, to eliminate potential mistakes or errors and maximize performance. Environmental Controls and Protection Our Environmental Systems run in a redundant configuration. Each Environmental Control Unit/CRAC has a redundant "twin" on stand-by to take over in the event of a failure or service-affecting health issue. These units maintain constant temperature (72°F) and humidity (45%) in the data center. Contegix has configured our data center with hot and cold aisles for maximum cooling performance. Fire Detection / Suppression is configured with three independent systems. 
The first two monitor for temperature and smoke. The third system is a VESDA system that inspects air samples with a laser to detect any potential fire hazards prior to an actual fire event. Our sprinkler system is dry pipe / pre-action, which means the sprinkler lines are filled with compressed air, not water. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061002/362491f0/attachment-0001.htm From tjreedy at udel.edu Tue Oct 3 02:05:28 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 2 Oct 2006 20:05:28 -0400 Subject: [Python-Dev] Caching float(0.0) References: <129CEF95A523704B9D46959C922A280002FE99A9@nemesis.central.ccp.cc> Message-ID: "Kristján V. Jónsson" wrote in message news:129CEF95A523704B9D46959C922A280002FE99A9 at nemesis.central.ccp.cc... >Anyway, Skip noted that 50% of all floats are whole numbers between -10 >and 10 inclusive, Please, no. He said something like this about *non-floating-point applications* (evidence unspecified, that I remember). But such applications, by definition, usually don't have enough floats for caching (or conversion time) to matter too much. For true floating point measurements (of temperature, for instance), 'integral' measurements (which are an artifact of the scale used (degrees F versus C versus K)) should generally be no more common than other realized measurements. Thirty years ago, a major stat package written in Fortran (BMDP) required that all data be stored as (Fortran 4-byte) floats for analysis. So a column of yes/no or male/female data would be stored as 0.0/1.0 or perhaps 1.0/2.0. That skewed the distribution of floats. But Python and, I hope, Python apps, are more modern than that. >and this is the code that I employ in our python build today: [snip] For the analysis of typical floating point data, this is all pointless and a complete waste of time. 
After a billion conversions or so, I expect the extra time might add up to something noticeable. > From: "Martin v. Löwis" [mailto:martin at v.loewis.de] >> I'm worried about the penalty that this causes in terms of >> run-time cost. Me too. >> Also, how do you choose what values to cache? At one time (don't know about today), it was mandatory in some Fortran circles to name the small float constants used in a particular program with the equivalent of C #defines. In Python, zero = 0.0, half = 0.5, one = 1.0, twopi = 6.28..., eee = 2.7..., phi = .618..., etc. (Note that naming is not restricted to integral or otherwise 'nice' values.) The purpose then was to allow easy conversion from float to double to extended double. And in some cases, it also made the code clearer. With Python, the same procedure would guarantee only one copy (caching) of the same floats for constructed data structures. Float caching strikes me as a good subject for cookbook recipes, but not, without real data and a willingness to slightly screw some users, for the default core code. Terry Jan Reedy From amk at amk.ca Tue Oct 3 02:21:10 2006 From: amk at amk.ca (A.M. Kuchling) Date: Mon, 2 Oct 2006 20:21:10 -0400 Subject: [Python-Dev] Created branch for PEP 302 phase 2 work (in C) In-Reply-To: <79990c6b0610021527s1e822f8dj26fbe429cf2c686c@mail.gmail.com> References: <5.1.1.6.0.20061002164622.028067d8@sparrow.telecommunity.com> <79990c6b0610021527s1e822f8dj26fbe429cf2c686c@mail.gmail.com> Message-ID: <20061003002110.GA20505@rogue.amk.ca> On Mon, Oct 02, 2006 at 11:27:07PM +0100, Paul Moore wrote: > Yes, I'm quite surprised at how much has appeared in pkgutil. The > "what's new" entry is very terse, and the module documentation itself > hasn't been updated to mention the new stuff. These two things are related, of course; I couldn't figure out which bits of pkgutil.py are intended to be publicly used and which weren't. 
There's an __all__ in the module, but some things such as read_code() don't look like they're intended for external use. --amk From skip at pobox.com Tue Oct 3 02:50:44 2006 From: skip at pobox.com (skip at pobox.com) Date: Mon, 2 Oct 2006 19:50:44 -0500 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5FF1EA85@au3010avexu1.global.avaya.com> References: <2773CAC687FD5F4689F526998C7E4E5FF1EA85@au3010avexu1.global.avaya.com> Message-ID: <17697.46052.714229.687538@montanaro.dyndns.org> Tim> This doesn't actually give us a very useful indication of potential Tim> memory savings. What I think would be more useful is tracking the Tim> maximum simultaneous count of each value i.e. what the maximum Tim> refcount would have been if they were shared. Most definitely. I just posted what I came up with in about two minutes. I'll add some code to track the high water mark as well and report back. Skip From skip at pobox.com Tue Oct 3 02:53:34 2006 From: skip at pobox.com (skip at pobox.com) Date: Mon, 2 Oct 2006 19:53:34 -0500 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: <129CEF95A523704B9D46959C922A280002FE99A9@nemesis.central.ccp.cc> Message-ID: <17697.46222.513799.299606@montanaro.dyndns.org> Terry> "Kristj?n V. J?nsson" wrote: >> Anyway, Skip noted that 50% of all floats are whole numbers between >> -10 and 10 inclusive, Terry> Please, no. He said something like this about Terry> *non-floating-point applications* (evidence unspecified, that I Terry> remember). But such applications, by definition, usually don't Terry> have enough floats for caching (or conversion time) to matter too Terry> much. Correct. The non-floating-point application I chose was the one that was most immediately available, "make test". Note that I have no proof that regrtest.py isn't terribly floating point intensive. I just sort of guessed that it was. 
Skip From pje at telecommunity.com Tue Oct 3 03:04:31 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 02 Oct 2006 21:04:31 -0400 Subject: [Python-Dev] Created branch for PEP 302 phase 2 work (in C) In-Reply-To: <20061003002110.GA20505@rogue.amk.ca> References: <79990c6b0610021527s1e822f8dj26fbe429cf2c686c@mail.gmail.com> <5.1.1.6.0.20061002164622.028067d8@sparrow.telecommunity.com> <79990c6b0610021527s1e822f8dj26fbe429cf2c686c@mail.gmail.com> Message-ID: <5.1.1.6.0.20061002205715.03c331f0@sparrow.telecommunity.com> At 08:21 PM 10/2/2006 -0400, A.M. Kuchling wrote: >On Mon, Oct 02, 2006 at 11:27:07PM +0100, Paul Moore wrote: > > Yes, I'm quite surprised at how much has appeared in pkgutil. The > > "what's new" entry is very terse, and the module documentation itself > > hasn't been updated to mention the new stuff. > >These two things are related, of course; I couldn't figure out which >bits of pkgutil.py are intended to be publicly used and which weren't. >There's an __all__ in the module, but some things such as read_code() >don't look like they're intended for external use. The __all__ listing is correct; I intended to expose read_code() for the benefit of other importer implementations and Python utilities. Over the years, I've found myself writing the equivalent of read_code() several times, so it seemed to me to make sense to expose it as a utility function, since it already needed to be there for the ImpLoader class to work. In general, the idea behind the additions to pkgutil was to make life easier for people doing import-related operations, by being a Python reference implementation of commonly-reinvented parts of the import process. 
The '-m' machinery in 2.5 had a bunch of this stuff in it, and so did setuptools, so I yanked the code from both and refactored it to allow reuse by both, then fleshed it out to support all the optional PEP 302 loader protocols, and additional protocols needed to support tools like pydoc being able to run against arbitrary importers (esp. zip files). From skip at pobox.com Tue Oct 3 03:25:12 2006 From: skip at pobox.com (skip at pobox.com) Date: Mon, 2 Oct 2006 20:25:12 -0500 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <17697.46052.714229.687538@montanaro.dyndns.org> References: <2773CAC687FD5F4689F526998C7E4E5FF1EA85@au3010avexu1.global.avaya.com> <17697.46052.714229.687538@montanaro.dyndns.org> Message-ID: <17697.48120.767852.672495@montanaro.dyndns.org> skip> Most definitely. I just posted what I came up with in about two skip> minutes. I'll add some code to track the high water mark as well skip> and report back. Using the smallest change I could get away with, I came up with these allocation figures (same as before): -1.0: 29048 0.0: 524340 +1.0: 91560 rest: 1753479 whole numbers -10.0 to 10.0: 1151543 and these max ref counts: -1.0: 16 0.0: 136 +1.0: 161 rest: 1 whole numbers -10.0 to 10.0: 161 When I have a couple more minutes I'll just implement a cache for whole numbers between -10.0 and 10.0 and test that whole range of values right. Skip From nick at craig-wood.com Tue Oct 3 10:14:41 2006 From: nick at craig-wood.com (Nick Craig-Wood) Date: Tue, 3 Oct 2006 09:14:41 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5FF1EA85@au3010avexu1.global.avaya.com> References: <2773CAC687FD5F4689F526998C7E4E5FF1EA85@au3010avexu1.global.avaya.com> Message-ID: <20061003081441.GA12283@craig-wood.com> On Tue, Oct 03, 2006 at 09:47:03AM +1000, Delaney, Timothy (Tim) wrote: > This doesn't actually give us a very useful indication of potential > memory savings. 
What I think would be more useful is tracking the > maximum simultaneous count of each value i.e. what the maximum refcount > would have been if they were shared. It isn't just memory savings we are playing for. Even if 0.0 is allocated and de-allocated 10,000 times in a row, there would be no memory savings by caching its value. However there would be a) less allocator overhead - allocating objects is relatively expensive, b) better caching of the value, and c) less cache thrashing. I think you'll find that even in the no-memory-saving case a few cycles spent on comparison with 0.0 (or maybe a few other values) will speed up programs. -- Nick Craig-Wood -- http://www.craig-wood.com/nick From nick at craig-wood.com Tue Oct 3 10:17:26 2006 From: nick at craig-wood.com (Nick Craig-Wood) Date: Tue, 3 Oct 2006 09:17:26 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <17697.46222.513799.299606@montanaro.dyndns.org> References: <129CEF95A523704B9D46959C922A280002FE99A9@nemesis.central.ccp.cc> <17697.46222.513799.299606@montanaro.dyndns.org> Message-ID: <20061003081726.GB12283@craig-wood.com> On Mon, Oct 02, 2006 at 07:53:34PM -0500, skip at pobox.com wrote: > Terry> "Kristján V. Jónsson" wrote: > >> Anyway, Skip noted that 50% of all floats are whole numbers between > >> -10 and 10 inclusive, > > Terry> Please, no. He said something like this about > Terry> *non-floating-point applications* (evidence unspecified, that I > Terry> remember). But such applications, by definition, usually don't > Terry> have enough floats for caching (or conversion time) to matter too > Terry> much. > > Correct. The non-floating-point application I chose was the one that was > most immediately available, "make test". Note that I have no proof that > regrtest.py isn't terribly floating point intensive. I just sort of guessed > that it was. For my application caching 0.0 is by far the most important. 0.0 has ~200,000 references - the next highest reference count is only about ~200. 
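[Editor's note: the caching being discussed can be sketched at the Python level. This is illustrative only - the real change would be a C-level check inside PyFloat_FromDouble, and the [-10, 10] range is just the one under discussion in this thread; the function and cache names here are made up.]

```python
import math

# Shared objects for the small integral values discussed in the thread.
_float_cache = {v: float(v) for v in range(-10, 11)}

def float_from_double(value):
    """Return a float equal to value, reusing a shared object when possible."""
    if value == 0.0 and math.copysign(1.0, value) < 0.0:
        # Keep -0.0 out of the cache: it compares equal to 0.0 but is a
        # distinct value (the signed-zero objection raised elsewhere in
        # this thread).
        return value
    cached = _float_cache.get(value)
    return cached if cached is not None else value
```

Every hit on the cache returns the same object, the way CPython already reuses small ints and short strings; anything outside the cached range passes through untouched.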
-- Nick Craig-Wood -- http://www.craig-wood.com/nick From fredrik at pythonware.com Tue Oct 3 10:32:07 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 03 Oct 2006 10:32:07 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: <129CEF95A523704B9D46959C922A280002FE99A9@nemesis.central.ccp.cc> Message-ID: Terry Reedy wrote: > For true floating point measurements (of temperature, for instance), > 'integral' measurements (which are an artifact of the scale used (degrees F > versus C versus K)) should generally be no more common than other realized > measurements. a real-life sensor is of course where the 121.216 in my original post to this thread came from. (note that most real-life sensors involve A/D conversion at some point, which means that they provide a limited number of discrete values. but only the code dealing with the source data will be able to make any meaningful assumptions about those values.) I still think it might make sense to special-case float("0.0") (padding, default values, etc) inside PyFloat_FromDouble, and possibly also float("1.0") (scale factors, unit vectors, normalized max values, etc) but everything else is just generalizing from random observations. adding a few notes to the C API documentation won't hurt either, I suppose. (e.g. "note that each call to PyFloat_FromDouble may create a new floating point object; if you're converting data from some internal format to Python floats, it's often more efficient to map directly to preallocated shared PyFloat objects, instead of mapping first to float or double and then calling PyFloat_FromDouble on that value"). From nmm1 at cus.cam.ac.uk Tue Oct 3 11:12:04 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Tue, 03 Oct 2006 10:12:04 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: Your message of "Mon, 02 Oct 2006 20:05:28 EDT." 
Message-ID: "Terry Reedy" wrote: > > For true floating point measurements (of temperature, for instance), > 'integral' measurements (which are an artifact of the scale used (degrees F > versus C versus K)) should generally be no more common than other realized > measurements. Not quite, but close enough. A lot of algorithms use a conversion to integer, or some of the values are actually counts (e.g. in statistics), which makes them a bit more likely. Not enough to get excited about, in general. > Thirty years ago, a major stat package written in Fortran (BMDP) required > that all data be stored as (Fortran 4-byte) floats for analysis. So a > column of yes/no or male/female data would be stored as 0.0/1.0 or perhaps > 1.0/2.0. That skewed the distribution of floats. But Python and, I hope, > Python apps, are more modern than that. And SPSS and Genstat and others - now even Excel .... > Float caching strikes me as a good subject for cookbook recipes, but not, > without real data and a willingness to slightly screw some users, for the > default core code. Yes. It is trivial (if tedious) to add analysis code - the problem is finding suitable representative applications. That was always my difficulty when I was analysing this sort of thing - and still is when I need to do it! > Nick Craig-Wood wrote: > > For my application caching 0.0 is by far the most important. 0.0 has > ~200,000 references - the next highest reference count is only about ~200. Yes. All the experience I have ever seen over the past 4 decades confirms that is the normal case, with the exception of floating-point representations that have a missing value indicator. Even in IEEE 754, infinities and NaN are rare unless the application is up the spout. There are claims that a lot of important ones have a lot of NaNs and use them as missing values but, despite repeated requests, none of the people claiming that have ever provided an example. 
There are some pretty solid grounds for believing that those claims are not based in fact, but are polemic. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From ncoghlan at gmail.com Tue Oct 3 11:34:03 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 03 Oct 2006 19:34:03 +1000 Subject: [Python-Dev] PEP 315 - do while In-Reply-To: <001d01c6e617$0e980d40$1d2c440a@spain.capgemini.com> References: <001d01c6e617$0e980d40$1d2c440a@spain.capgemini.com> Message-ID: <45222E8B.30304@gmail.com> Hans Polak wrote:
> Ok, I see your point. Really, I've read more about Python than worked with
> it, so I'm out of my league here.
>
> Can I combine your suggestion with mine and come up with the following:
>
> do:
>     <setup code>
>     <loop body>
> while <condition>
> else:
>     <loop completion code>

In my example, the 3 sections (<setup code>, <loop body> and <loop completion code>) are all optional. A basic do-while loop would look like this:

do:
    <setup code>
while <condition>

(That is, <setup code> is still repeated each time around the loop - it's called that because it is run before the loop <condition> is evaluated) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From fuzzyman at voidspace.org.uk Tue Oct 3 12:00:18 2006 From: fuzzyman at voidspace.org.uk (Fuzzyman) Date: Tue, 03 Oct 2006 11:00:18 +0100 Subject: [Python-Dev] PEP 315 - do while In-Reply-To: <45222E8B.30304@gmail.com> References: <001d01c6e617$0e980d40$1d2c440a@spain.capgemini.com> <45222E8B.30304@gmail.com> Message-ID: <452234B2.8040103@voidspace.org.uk> Nick Coghlan wrote:
>Hans Polak wrote:
>
>>Ok, I see your point. Really, I've read more about Python than worked with
>>it, so I'm out of my league here.
>> >>Can I combine your suggestion with mine and come up with the following:
>>
>> do:
>>     <setup code>
>>     <loop body>
>> while <condition>
>> else:
>>     <loop completion code>
>
>In my example, the 3 sections (<setup code>, <loop body> and <loop completion code>) are all optional. A basic do-while loop would look like this:
>
> do:
>     <setup code>
> while <condition>
>
>(That is, <setup code> is still repeated each time around the loop - it's
>called that because it is run before the loop <condition> is evaluated)

+1 This looks good. The current idiom works fine, but looks unnatural:

while True:
    <loop body>
    if <condition>:
        break

Would a 'while' outside of a 'do' block (but without the colon) then be a syntax error? 'do:' would just be syntactic sugar for 'while True:' I guess. Michael Foord http://www.voidspace.org.uk >Cheers, >Nick. From kristjan at ccpgames.com Tue Oct 3 12:15:26 2006 From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_V=2E_J=F3nsson?=) Date: Tue, 3 Oct 2006 10:15:26 -0000 Subject: [Python-Dev] Caching float(0.0) Message-ID: <129CEF95A523704B9D46959C922A280002FE99B1@nemesis.central.ccp.cc> But that is precisely the point. A non-floating point application tends to use floating point values in a predictable way, with a lot of integral values floating around and lots of zeroes. As this constitutes the majority of Python applications (okay, daring assumption here) it seems to warrant some consideration. In one of my first messages on the subject I promised to report refcounts of -1.0, 0.0 and 1.0 for the EVE server. I didn't, but instead gave you the frequency of the values reported. Well, now I can provide you with refcounts for the [-10, 10] range plus the total float count, of a server that has just started up:

-10,0     589
 -9,0      56
 -8,0      65
 -7,0      63
 -6,0     243
 -5,0     731
 -4,0     550
 -3,0     246
 -2,0     246
 -1,0    1096
  0,0  195446
  1,0   79382
  2,0    9650
  3,0    6224
  4,0    5223
  5,0   14766
  6,0    2616
  7,0    1303
  8,0    3307
  9,0    1447
 10,0    8102

total: 331351

The total count of floating point numbers allocated at this point is 985794. Without the reuse, they would be 1317145, so this is a saving of 25%, and of 5Mb. 
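[Editor's note: those figures check out arithmetically. A quick sketch - the 16 bytes per float object is my assumption for a 2.x-era CPython float (refcount + type pointer + 8-byte double), not a figure from the thread:]

```python
total_with_reuse = 985794       # floats actually allocated with the cache
total_without_reuse = 1317145   # what it would have been with no reuse

reused = total_without_reuse - total_with_reuse
saving_pct = 100.0 * reused / total_without_reuse

# 16 bytes per float object is an assumption, not a measured figure
saved_mb = reused * 16 / (1024.0 * 1024.0)

print(reused)             # 331351 - matches the "total" row above
print(round(saving_pct))  # 25
```

With that object size, the reused allocations come to roughly 5.1 MB, in line with the quoted "5Mb".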
Kristján > -----Original Message----- > From: python-dev-bounces+kristjan=ccpgames.com at python.org > [mailto:python-dev-bounces+kristjan=ccpgames.com at python.org] > On Behalf Of skip at pobox.com > Sent: 3. október 2006 00:54 > To: Terry Reedy > Cc: python-dev at python.org > Subject: Re: [Python-Dev] Caching float(0.0) > > > Terry> "Kristján V. Jónsson" wrote: > >> Anyway, Skip noted that 50% of all floats are whole > numbers between > >> -10 and 10 inclusive, > > Terry> Please, no. He said something like this about > Terry> *non-floating-point applications* (evidence > unspecified, that I > Terry> remember). But such applications, by definition, > usually don't > Terry> have enough floats for caching (or conversion > time) to matter too > Terry> much. > > Correct. The non-floating-point application I chose was the > one that was most immediately available, "make test". Note > that I have no proof that regrtest.py isn't terribly floating > point intensive. I just sort of guessed > that it was. > > Skip From nmm1 at cus.cam.ac.uk Tue Oct 3 12:32:05 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Tue, 03 Oct 2006 11:32:05 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: Your message of "Tue, 03 Oct 2006 10:15:26 -0000." <129CEF95A523704B9D46959C922A280002FE99B1@nemesis.central.ccp.cc> Message-ID: =?iso-8859-1?Q?Kristj=E1n_V=2E_J=F3nsson?= wrote: > > The total count of floating point numbers allocated at this point is 985794. > Without the reuse, they would be 1317145, so this is a saving of 25%, and > of 5Mb. And, if you optimised just 0.0, you would get 60% of that saving at a small fraction of the cost and considerably greater generality. It isn't clear whether the effort justifies doing more. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. 
Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From skip at pobox.com Tue Oct 3 13:21:08 2006 From: skip at pobox.com (skip at pobox.com) Date: Tue, 3 Oct 2006 06:21:08 -0500 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: <129CEF95A523704B9D46959C922A280002FE99B1@nemesis.central.ccp.cc> Message-ID: <17698.18340.82069.83941@montanaro.dyndns.org> >> The total count of floating point numbers allocated at this point is >> 985794. Without the reuse, they would be 1317145, so this is a >> saving of 25%, and of 5Mb. Nick> And, if you optimised just 0.0, you would get 60% of that saving Nick> at a small fraction of the cost and considerably greater Nick> generality. It isn't clear whether the effort justifies doing Nick> more. Doesn't that presume that optimizing just 0.0 could be done easily? Suppose 0.0 is generated all over the place in EVE? Skip From nmm1 at cus.cam.ac.uk Tue Oct 3 13:38:35 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Tue, 03 Oct 2006 12:38:35 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: Your message of "Tue, 03 Oct 2006 06:21:08 CDT." <17698.18340.82069.83941@montanaro.dyndns.org> Message-ID: skip at pobox.com wrote: > > Doesn't that presume that optimizing just 0.0 could be done easily? Suppose > 0.0 is generated all over the place in EVE? Yes, and it isn't, respectively! The changes in floatobject.c would be trivial (if tedious), and my recollection of my scan is that floating values are not generated elsewhere. It would be equally easy to add a general caching algorithm, but that would be a LOT slower than a simple floating-point comparison. The problem (in Python) isn't hooking the checks into place, though it could be if Python were implemented differently. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. 
Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From martin at v.loewis.de Tue Oct 3 14:25:38 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 03 Oct 2006 14:25:38 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <20061003081441.GA12283@craig-wood.com> References: <2773CAC687FD5F4689F526998C7E4E5FF1EA85@au3010avexu1.global.avaya.com> <20061003081441.GA12283@craig-wood.com> Message-ID: <452256C2.60206@v.loewis.de> Nick Craig-Wood schrieb: > Even if 0.0 is allocated and de-allocated 10,000 times in a row, there > would be no memory savings by caching its value. > > However there would be > a) less allocator overhead - allocation objects is relatively expensive > b) better caching of the value > c) less cache thrashing > > I think you'll find that even in the no memory saving case a few > cycles spent on comparison with 0.0 (or maybe a few other values) will > speed up programs. Can you demonstrate that speedup? It is quite difficult to anticipate the performance impact of a change, in particular if there is no change in computational complexity. Various effects tend to balance out each other. Regards, Martin From martin at v.loewis.de Tue Oct 3 14:30:35 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 03 Oct 2006 14:30:35 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: Message-ID: <452257EB.6070601@v.loewis.de> Nick Maclaren schrieb: >> The total count of floating point numbers allocated at this point is 985794. >> Without the reuse, they would be 1317145, so this is a saving of 25%, and >> of 5Mb. > > And, if you optimised just 0.0, you would get 60% of that saving at > a small fraction of the cost and considerably greater generality. As Michael Hudson observed, this is difficult to implement, though: You can't distinguish between -0.0 and +0.0 easily, yet you should. 
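[Editor's note: the difficulty Martin mentions is easy to demonstrate. To ==, hash() and bool() the two zeros are identical, so any equality-keyed cache would silently fold -0.0 into +0.0; the sign only shows up through functions such as math.copysign or math.atan2:]

```python
import math

pos, neg = 0.0, -0.0

# Indistinguishable to the operations a cache would naturally use...
print(pos == neg)                   # True
print(hash(pos) == hash(neg) == 0)  # True
print(bool(pos) or bool(neg))       # False

# ...but the sign bit is really there and observable
print(math.copysign(1.0, pos))      # 1.0
print(math.copysign(1.0, neg))      # -1.0
print(math.atan2(0.0, neg))         # 3.141592653589793 (pi, not 0.0)
```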
Regards, Martin From fredrik at pythonware.com Tue Oct 3 14:56:54 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 3 Oct 2006 14:56:54 +0200 Subject: [Python-Dev] what's really new in python 2.5 ? Message-ID: just noticed that the first google hit for "what's new in python 2.5": http://docs.python.org/dev/whatsnew/whatsnew25.html points to a document that's a weird mix between that actual document, and a placeholder for "what's new in python 2.6". From nmm1 at cus.cam.ac.uk Tue Oct 3 15:11:27 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Tue, 03 Oct 2006 14:11:27 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: Your message of "Tue, 03 Oct 2006 14:30:35 +0200." <452257EB.6070601@v.loewis.de> Message-ID: =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= wrote: > > >> The total count of floating point numbers allocated at this point is 985794. > >> Without the reuse, they would be 1317145, so this is a saving of 25%, and > >> of 5Mb. > > > > And, if you optimised just 0.0, you would get 60% of that saving at > > a small fraction of the cost and considerably greater generality. > > As Michael Hudson observed, this is difficult to implement, though: > You can't distinguish between -0.0 and +0.0 easily, yet you should. That was the point of a previous posting of mine in this thread :-( You shouldn't, despite what IEEE 754 says, at least if you are allowing for either portability or numeric validation. There are a huge number of good reasons why IEEE 754 signed zeroes fit extremely badly into any normal programming language and are seriously incompatible with numeric validation, but Python adds more. Is there any other type where there are two values that are required to be different, but where both the hash is required to be zero and both are required to evaluate to False in truth value context? Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. 
Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From amk at amk.ca Tue Oct 3 15:40:51 2006 From: amk at amk.ca (A.M. Kuchling) Date: Tue, 3 Oct 2006 09:40:51 -0400 Subject: [Python-Dev] 2.4.4 fixes Message-ID: <20061003134051.GA21154@rogue.amk.ca> I've gone through the 'backport candidate' bugs listed on and applied most of them. Some I didn't apply because I don't understand them well enough to determine if they're correct for 2.4: * r47061 (recursionerror fix) * r46602 (tokenizer.c bug; patch doesn't apply cleanly) * r46589 (let dicts propagate eq errors; dictresize bug -- this led to a big long 2.5 discussion, so I won't backport. Maybe someone can extract just the dictresize bugfix.) * r39044 (A C threading API bug) There are also some other bugs listed on the wiki page that involve metaclasses; I'm not going to touch them. subprocess.py received a number of bugfixes in 2.5, but also some API additions. Can someone please look at these and apply the fixes? The wiki page now lists all the revisions stemming from valgrind and Klocwork errors. There are a lot of them; more volunteers will be necessary if they're all to get looked at and possibly backported. --amk From ncoghlan at gmail.com Tue Oct 3 15:51:22 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 03 Oct 2006 23:51:22 +1000 Subject: [Python-Dev] PEP 315 - do while In-Reply-To: <452234B2.8040103@voidspace.org.uk> References: <001d01c6e617$0e980d40$1d2c440a@spain.capgemini.com> <45222E8B.30304@gmail.com> <452234B2.8040103@voidspace.org.uk> Message-ID: <45226ADA.9080306@gmail.com> Fuzzyman wrote:
> Nick Coghlan wrote:
>> In my example, the 3 sections (<setup code>, <loop body> and <loop completion
>> code>) are all optional. A basic do-while loop would look like this:
>>
>> do:
>>     <setup code>
>> while <condition>
>>
>> (That is, <setup code> is still repeated each time around the loop - it's
>> called that because it is run before the loop <condition> is evaluated)
>
> +1
>
> This looks good.
I'm pretty sure it was proposed by someone else a long time ago - I was surprised to find it wasn't mentioned in PEP 315. That said, Guido's observation on PEP 315 from earlier this year holds for me too: "I kind of like it but it doesn't strike me as super important" [1]

> The current idiom works fine, but looks unnatural:
>
> while True:
>     <loop body>
>     if <condition>:
>         break

There's the rationale for the PEP in a whole 5 lines counting whitespace ;)

> Would a 'while' outside of a 'do' block (but without the colon) then be
> a syntax error?
>
> 'do:' would just be syntactic sugar for 'while True:' I guess.

That's the slight issue I still have with the idea - you could end up with multiple ways of spelling some of the basic loop forms, such as these 3 flavours of infinite loop:

do:
    pass  # Is there an implicit 'while True' at the end of the loop body?

do:
    while True

while True:
    pass

The other issue I have is that I'm not yet 100% certain it's implementable with Python's parser and grammar. I *think* changing the definition of the while statement from:

while_stmt ::= "while" expression ":" suite ["else" ":" suite]

to:

while_stmt ::= "while" expression [":" suite ["else" ":" suite]]

and adding a new AST node and a new type of compiler frame block "DO_LOOP" would do the trick (the compilation of a while statement without a trailing colon would then check that it was in a DO_LOOP block and raise an error if not). Cheers, Nick. 
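[Editor's note: for reference, the closest spelling of the basic do-while shape in today's Python. The body and condition here are placeholders; this sketch drains a list just to make the loop observable:]

```python
items = [3, 1, 4, 1, 5]
consumed = []

# do: <loop body> while <condition> -- the body runs at least once,
# with the test at the bottom of the loop instead of the top.
while True:
    consumed.append(items.pop())
    if not items:  # the negated <condition>
        break

print(consumed)  # [5, 1, 4, 1, 3]
```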
[1] http://mail.python.org/pipermail/python-dev/2006-February/060711.html -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From martin at v.loewis.de Tue Oct 3 16:10:31 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 03 Oct 2006 16:10:31 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: Message-ID: <45226F57.108@v.loewis.de> Nick Maclaren schrieb: > That was the point of a previous posting of mine in this thread :-( > > You shouldn't, despite what IEEE 754 says, at least if you are > allowing for either portability or numeric validation. > > There are a huge number of good reasons why IEEE 754 signed zeroes > fit extremely badly into any normal programming language and are > seriously incompatible with numeric validation, but Python adds more. > Is there any other type where there are two values that are required > to be different, but where both the hash is required to be zero and > both are required to evaluate to False in truth value context? Ah, you are proposing a semantic change, then: -0.0 will become unrepresentable, right? Regards, Martin From fdrake at acm.org Tue Oct 3 16:18:50 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Tue, 3 Oct 2006 10:18:50 -0400 Subject: [Python-Dev] what's really new in python 2.5 ? In-Reply-To: References: Message-ID: <200610031018.50930.fdrake@acm.org> On Tuesday 03 October 2006 08:56, Fredrik Lundh wrote: > just noticed that the first google hit for "what's new in python 2.5": > > http://docs.python.org/dev/whatsnew/whatsnew25.html > > points to a document that's a weird mix between that actual document, and > a placeholder for "what's new in python 2.6". I suspect Google (and all other search engines) should be warded off from docs.python.org/dev/. -Fred -- Fred L. Drake, Jr. 
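The do-while shape debated in the PEP 315 thread above can already be expressed in today's Python; here is a minimal sketch of the `while True` idiom the proposed syntax would replace (the `digits` example is illustrative, not from the thread):

```python
def digits(n):
    """Collect the decimal digits of n, least significant first.

    Even n == 0 must produce one digit, so the body has to run once
    before the exit test -- the classic do-while shape that PEP 315
    targets.  Today it is spelled with `while True` plus a guarded
    `break`.
    """
    n = abs(n)
    out = []
    while True:              # "do:"
        out.append(n % 10)
        n //= 10
        if n == 0:           # "while n != 0" in the proposed spelling
            break
    return out

print(digits(1203))  # [3, 0, 2, 1]
```

The `break` line is exactly the part the proposed `while <condition>` (or `until <condition>`) clause would absorb.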
From fuzzyman at voidspace.org.uk Tue Oct 3 16:28:31 2006 From: fuzzyman at voidspace.org.uk (Fuzzyman) Date: Tue, 03 Oct 2006 15:28:31 +0100 Subject: [Python-Dev] PEP 315 - do while In-Reply-To: <45226ADA.9080306@gmail.com> References: <001d01c6e617$0e980d40$1d2c440a@spain.capgemini.com> <45222E8B.30304@gmail.com> <452234B2.8040103@voidspace.org.uk> <45226ADA.9080306@gmail.com> Message-ID: <4522738F.80303@voidspace.org.uk> Nick Coghlan wrote: > [snip..]
>
>> The current idiom works fine, but looks unnatural:
>>
>> while True:
>>     if <condition>:
>>         break
>
> There's the rationale for the PEP in a whole 5 lines counting whitespace ;)
>
>> Would a 'while' outside of a 'do' block (but without the colon) then be a syntax error?
>>
>> 'do:' would just be syntactic sugar for 'while True:' I guess.
>
> That's the slight issue I still have with the idea - you could end up with multiple ways of spelling some of the basic loop forms, such as these 3 flavours of infinite loop:
>
> do:
>     pass  # Is there an implicit 'while True' at the end of the loop body?
>
> do:
>     while True
>
> while True:
>     pass

Following the current idiom, isn't it more natural to repeat the loop 'until' a condition is met? If we introduced two new keywords, it would avoid ambiguity in the use of 'while'.

do:
    <loop body>
    until <condition>

A do loop could require an 'until', meaning 'do' is not *just* a replacement for an infinite loop. (Assuming the parser can be coerced into co-operation.) It is obviously still a new construct in terms of Python syntax (not requiring a colon after '<condition>'.) I'm sure this has been suggested, but wonder if it has already been ruled out. An 'else' block could then retain its current meaning (execute if the loop is not terminated early by an explicit break.) Michael Foord http://www.voidspace.org.uk
In-Reply-To: References: Message-ID: <4522736F.9040101@gmail.com> Fredrik Lundh wrote: > just noticed that the first google hit for "what's new in python 2.5": > > http://docs.python.org/dev/whatsnew/whatsnew25.html > > points to a document that's a weird mix between that actual document, and > a placeholder for "what's new in python 2.6". D'oh. It's going to take a while for the stable docs to catch up to that one given the large number of external links to that page using that title :( Since the URL for the actual Python 2.6 What's New finishes with whatsnew26.html, perhaps this URL could be updated to redirect users to the stable version instead? Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From amk at amk.ca Tue Oct 3 16:30:15 2006 From: amk at amk.ca (A.M. Kuchling) Date: Tue, 3 Oct 2006 10:30:15 -0400 Subject: [Python-Dev] what's really new in python 2.5 ? In-Reply-To: References: Message-ID: <20061003143015.GA25511@localhost.localdomain> On Tue, Oct 03, 2006 at 02:56:54PM +0200, Fredrik Lundh wrote: > just noticed that the first google hit for "what's new in python 2.5": > > http://docs.python.org/dev/whatsnew/whatsnew25.html > > points to a document that's a weird mix between that actual document, and > a placeholder for "what's new in python 2.6". Thanks for pointing this out! I've added a redirect from /whatsnew25.html to the correct location, but am puzzled by the 2.6 document; it has section names like 'pep-308.html', which are set by a \label{pep-308} directive in the LaTeX, but no such \label exists in the 2.6 document. Neal, could you please delete all the temp files in whatever directory is used to build the documentation? I wonder if there's a *.aux file or something that still has labels from the 2.5 document. It might be easiest to just delete the whatsnew/ directory and then do an 'svn up' to get it back. 
--amk From amk at amk.ca Tue Oct 3 16:35:43 2006 From: amk at amk.ca (A.M. Kuchling) Date: Tue, 3 Oct 2006 10:35:43 -0400 Subject: [Python-Dev] 2.4.4 fixes In-Reply-To: <20061003134051.GA21154@rogue.amk.ca> References: <20061003134051.GA21154@rogue.amk.ca> Message-ID: <20061003143543.GB25511@localhost.localdomain> On Tue, Oct 03, 2006 at 09:40:51AM -0400, A.M. Kuchling wrote: > The wiki page now lists all the revisions stemming from valgrind and > Klocwork errors. There are a lot of them; more volunteers will be > necessary if they're all to get looked at and possibly backported. I've now looked at the Valgrind errors; most of them were already in 2.4 or don't matter (ctypes, sqlite3 fixes). One revision remains, changing the size of strings allocated in the confstr() wrapper in posixmodule.c. The patch doesn't apply cleanly -- can someone please look at this? --amk From fdrake at acm.org Tue Oct 3 16:39:52 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Tue, 3 Oct 2006 10:39:52 -0400 Subject: [Python-Dev] what's really new in python 2.5 ? In-Reply-To: <20061003143015.GA25511@localhost.localdomain> References: <20061003143015.GA25511@localhost.localdomain> Message-ID: <200610031039.52434.fdrake@acm.org> On Tuesday 03 October 2006 10:30, A.M. Kuchling wrote: > Neal, could you please delete all the temp files in whatever directory > is used to build the documentation? I wonder if there's a *.aux file > or something that still has labels from the 2.5 document. It might be > easiest to just delete the whatsnew/ directory and then do an 'svn up' > to get it back. I would guess this has everything to do with how the updated docs are deployed and little or nothing about the cleanliness of the working area. The mkhowto script should be cleaning out the old HTML before generating the new. I'm guessing the deployment simply unpacks the new on top of the old; the old should be removed first. For the /dev/ area, I don't think redirects are warranted. 
I'd rather see the crawlers just not bother with that, since those are more likely decoys than usable end-user docs. -Fred -- Fred L. Drake, Jr. From nmm1 at cus.cam.ac.uk Tue Oct 3 17:12:19 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Tue, 03 Oct 2006 16:12:19 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: Your message of "Tue, 03 Oct 2006 16:10:31 +0200." <45226F57.108@v.loewis.de> Message-ID: =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= wrote: > > Ah, you are proposing a semantic change, then: -0.0 will become > unrepresentable, right? Well, it is and it isn't. Python currently supports only some of IEEE 754, and that is more by accident than design - because that is exactly what C90 implementations do! There is code in floatobject.c that assumes IEEE 754, but Python does NOT attempt to support it in toto (it is not clear if it could), not least because it uses C90. And, as far as I know, none of that is in the specification, because Python is at least in theory portable to systems that use other arithmetics and there is no current way to distinguish -0.0 from 0.0 except by comparing their representations! And even THAT depends entirely on whether the C library distinguishes the cases, as far as I can see. So distinguishing -0.0 from 0.0 isn't really in Python's current semantics at all. And, for reasons that we could go into, I assert that it should not be - which is NOT the same as not supporting branch cuts in cmath. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. 
Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From martin at v.loewis.de Tue Oct 3 17:41:05 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 03 Oct 2006 17:41:05 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: Message-ID: <45228491.9010103@v.loewis.de> Nick Maclaren schrieb: > So distinguishing -0.0 from 0.0 isn't really in Python's current > semantics at all. And, for reasons that we could go into, I assert > that it should not be - which is NOT the same as not supporting > branch cuts in cmath. Are you talking about "Python the language specification" or "Python the implementation" here? It is not a change to the language specification, as this aspect of the behavior (as you point out) is unspecified. However, it is certainly a change to the observable behavior of the Python implementation, and no amount of arguing can change that. Regards, Martin P.S. For that matter, *any* kind of changes to the singleton nature of certain immutable values is a change in semantics. It's just that dropping -0.0 is an *additional* change (on top of the change that "1.0-1.0 is 0.0" would change from False to True). From nicko at nicko.org Tue Oct 3 17:45:16 2006 From: nicko at nicko.org (Nicko van Someren) Date: Tue, 3 Oct 2006 16:45:16 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <45226F57.108@v.loewis.de> References: <45226F57.108@v.loewis.de> Message-ID: <79858904-DDE8-46D8-AA28-EE9B285A6463@nicko.org> On 3 Oct 2006, at 15:10, Martin v. L?wis wrote: > Nick Maclaren schrieb: >> That was the point of a previous posting of mine in this thread :-( >> >> You shouldn't, despite what IEEE 754 says, at least if you are >> allowing for either portability or numeric validation. >> >> There are a huge number of good reasons why IEEE 754 signed zeroes >> fit extremely badly into any normal programming language and are >> seriously incompatible with numeric validation, but Python adds more. 
>> Is there any other type where there are two values that are required >> to be different, but where both the hash is required to be zero and >> both are required to evaluate to False in truth value context? > > Ah, you are proposing a semantic change, then: -0.0 will become > unrepresentable, right? It's only a semantic change on platforms that "happen to" use IEEE 754 float representations, or some other representation that exposes the sign of zero. The Python docs have for many years stated with regard to the float type: "All bets on their precision are off unless you happen to know the machine you are working with." and that "You are at the mercy of the underlying machine architecture...". Not all floating point representations support sign of zero, though in the modern world it's true that the vast majority do. It would be instructive to understand how much, if any, python code would break if we lost -0.0. I do not believe that there is any reliable way for python code to tell the difference between all of the different types of IEEE 754 zeros, and in the special case of -0.0 the best test I can come up with is repr(n)[0]=='-'. Is there a compelling case, to do with compatibility or otherwise, for exposing the sign of a zero? It seems like a numerical anomaly to me. Nicko From aahz at pythoncraft.com Tue Oct 3 18:59:04 2006 From: aahz at pythoncraft.com (Aahz) Date: Tue, 3 Oct 2006 09:59:04 -0700 Subject: [Python-Dev] PSF Infrastructure Committee's recommendation for a new issue tracker In-Reply-To: References: Message-ID: <20061003165903.GB12427@panix.com> If nothing else, Brett deserves a hearty round of applause for this work: Three cheers for Brett!
-- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "LL YR VWL R BLNG T S" -- www.nancybuttons.com From p.f.moore at gmail.com Tue Oct 3 19:04:39 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 3 Oct 2006 18:04:39 +0100 Subject: [Python-Dev] PSF Infrastructure Committee's recommendation for a new issue tracker In-Reply-To: <20061003165903.GB12427@panix.com> References: <20061003165903.GB12427@panix.com> Message-ID: <79990c6b0610031004t536cdb15h4d21526afc22f675@mail.gmail.com> On 10/3/06, Aahz wrote: > If nothing else, Brett deserves a hearty round of applause for this work: > > Three cheers for Brett! Definitely. Paul From foom at fuhm.net Tue Oct 3 18:47:02 2006 From: foom at fuhm.net (James Y Knight) Date: Tue, 3 Oct 2006 12:47:02 -0400 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <452257EB.6070601@v.loewis.de> References: <452257EB.6070601@v.loewis.de> Message-ID: <3161E092-940E-4364-910E-6D2973ECC48E@fuhm.net> On Oct 3, 2006, at 8:30 AM, Martin v. L?wis wrote: > As Michael Hudson observed, this is difficult to implement, though: > You can't distinguish between -0.0 and +0.0 easily, yet you should. Of course you can. It's absolutely trivial. The only part that's even *the least bit* sketchy in this is assuming that a double is 64 bits. Practically speaking, that is true on all architectures I know of, and if it's not guaranteed, it could easily be a 'configure' time check. 
#include <stdio.h>
#include <stdint.h>

typedef union {
    double d;
    uint64_t i;
} rawdouble;

int isposzero(double a) {
    rawdouble zero;
    zero.d = 0.0;
    rawdouble aa;
    aa.d = a;
    return aa.i == zero.i;
}

int main(void) {
    if (sizeof(double) != sizeof(uint64_t))
        return 1;

    printf("%d\n", isposzero(0.0));
    printf("%d\n", isposzero(-0.0));
    return 0;
}

James From martin at v.loewis.de Tue Oct 3 19:27:05 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 03 Oct 2006 19:27:05 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <79858904-DDE8-46D8-AA28-EE9B285A6463@nicko.org> References: <45226F57.108@v.loewis.de> <79858904-DDE8-46D8-AA28-EE9B285A6463@nicko.org> Message-ID: <45229D69.4020407@v.loewis.de> Nicko van Someren schrieb: > It's only a semantic change on platforms that "happen to" use IEEE > 754 float representations, or some other representation that exposes > the sign of zero. Right. Later, you admit that this is the vast majority of modern machines. > It would be instructive to understand how much, if any, python code > would break if we lost -0.0. I do not believe that there is any > reliable way for python code to tell the difference between all of > the different types of IEEE 754 zeros and in the special case of -0.0 > the best test I can come up with is repr(n)[0]=='-'. Is there a > compelling case, to do with compatibility or otherwise, for exposing > the sign of a zero? It seems like a numerical anomaly to me. I think it is reasonable to admit that
a) this change is a change in semantics for the majority of the machines
b) it is likely that this change won't affect a significant number of applications (I'm pretty sure someone will notice, though; someone always notices).
Regards, Martin From skip at pobox.com Tue Oct 3 19:37:49 2006 From: skip at pobox.com (skip at pobox.com) Date: Tue, 3 Oct 2006 12:37:49 -0500 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <45228491.9010103@v.loewis.de> References: <45228491.9010103@v.loewis.de> Message-ID: <17698.40941.317868.702398@montanaro.dyndns.org> Martin> However, it is certainly a change to the observable behavior of Martin> the Python implementation, and no amount of arguing can change Martin> that. If C90 doesn't distinguish -0.0 and +0.0, how can Python? Can you give a simple example where the difference between the two is apparent to the Python programmer? Skip From skip at pobox.com Tue Oct 3 19:40:59 2006 From: skip at pobox.com (skip at pobox.com) Date: Tue, 3 Oct 2006 12:40:59 -0500 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <45229D69.4020407@v.loewis.de> References: <45226F57.108@v.loewis.de> <79858904-DDE8-46D8-AA28-EE9B285A6463@nicko.org> <45229D69.4020407@v.loewis.de> Message-ID: <17698.41131.396198.330141@montanaro.dyndns.org> Martin> b) it is likely that this change won't affect a significant Martin> number of applications (I'm pretty sure someone will notice, Martin> though; someone always notices). +1 QOTF. Skip From Scott.Daniels at Acm.Org Tue Oct 3 19:45:50 2006 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Tue, 03 Oct 2006 10:45:50 -0700 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <3161E092-940E-4364-910E-6D2973ECC48E@fuhm.net> References: <452257EB.6070601@v.loewis.de> <3161E092-940E-4364-910E-6D2973ECC48E@fuhm.net> Message-ID: James Y Knight wrote: > On Oct 3, 2006, at 8:30 AM, Martin v. L?wis wrote: >> As Michael Hudson observed, this is difficult to implement, though: >> You can't distinguish between -0.0 and +0.0 easily, yet you should. > > Of course you can. It's absolutely trivial. The only part that's even > *the least bit* sketchy in this is assuming that a double is 64 bits. 
> Practically speaking, that is true on all architectures I know of,
> and if it's not guaranteed, it could easily be a 'configure' time check.
>
> typedef union {
>     double d;
>     uint64_t i;
> } rawdouble;
>
> int isposzero(double a) {
>     rawdouble zero;
>     zero.d = 0.0;
>     rawdouble aa;
>     aa.d = a;
>     return aa.i == zero.i;
> }
>
> int main() {
>     if (sizeof(double) != sizeof(uint64_t))
>         return 1;
>
>     printf("%d\n", isposzero(0.0));
>     printf("%d\n", isposzero(-0.0));
> }

And you should be able to cache the single positive zero with something vaguely like:

PyObject *
PyFloat_FromDouble(double fval)
{
    ...
    if (fval == 0.0 && raw_match(&fval, cached)) {
        Py_INCREF(cached);
        return cached;
    }
    ...

-- -- Scott David Daniels Scott.Daniels at Acm.Org From martin at v.loewis.de Tue Oct 3 19:55:43 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 03 Oct 2006 19:55:43 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <17698.40941.317868.702398@montanaro.dyndns.org> References: <45228491.9010103@v.loewis.de> <17698.40941.317868.702398@montanaro.dyndns.org> Message-ID: <4522A41F.6090704@v.loewis.de> skip at pobox.com schrieb: > If C90 doesn't distinguish -0.0 and +0.0, how can Python? Can you give a > simple example where the difference between the two is apparent to the > Python programmer?

Sure:

py> x=-0.0
py> y=0.0
py> x,y
(-0.0, 0.0)
py> hash(x),hash(y)
(0, 0)
py> x==y
True
py> str(x)==str(y)
False
py> str(x),str(y)
('-0.0', '0.0')
py> float(str(x)),float(str(y))
(-0.0, 0.0)

Imagine an application that reads floats from a text file, manipulates some of them, and then writes back the complete list of floats. Further assume that somehow, -0.0 got into the file. Currently, the sign "round-trips"; under the proposed change, it would stop doing so. Of course, there likely wouldn't be any "real" change in value, as the sign of 0 is likely of no significance to the application.
Regards, Martin From amk at amk.ca Tue Oct 3 20:08:48 2006 From: amk at amk.ca (A.M. Kuchling) Date: Tue, 3 Oct 2006 14:08:48 -0400 Subject: [Python-Dev] what's really new in python 2.5 ? In-Reply-To: <200610031039.52434.fdrake@acm.org> References: <20061003143015.GA25511@localhost.localdomain> <200610031039.52434.fdrake@acm.org> Message-ID: <20061003180848.GB31361@localhost.localdomain> On Tue, Oct 03, 2006 at 10:39:52AM -0400, Fred L. Drake, Jr. wrote: > and little or nothing about the cleanliness of the working area. The mkhowto > script should be cleaning out the old HTML before generating the new. I'm > guessing the deployment simply unpacks the new on top of the old; the old > should be removed first. That doesn't explain it, though; the contents of whatsnew26.html contain references to pep-308.html. It's not simply a matter of new files being untarred on top of old. I've added a robots.txt to keep crawlers out of /dev/. --amk From fdrake at acm.org Tue Oct 3 20:19:27 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Tue, 3 Oct 2006 14:19:27 -0400 Subject: [Python-Dev] what's really new in python 2.5 ? In-Reply-To: <20061003180848.GB31361@localhost.localdomain> References: <200610031039.52434.fdrake@acm.org> <20061003180848.GB31361@localhost.localdomain> Message-ID: <200610031419.28281.fdrake@acm.org> On Tuesday 03 October 2006 14:08, A.M. Kuchling wrote: > That doesn't explain it, though; the contents of whatsnew26.html > contain references to pep-308.html. It's not simply a matter of new > files being untarred on top of old. Ah; I missed that the new HTML file was referring to an old heading. That does sound like a .aux file got left around. I don't know what the build process is for the material in docs.python.org/dev/; I think the right thing would be to start each build with a fresh checkout/export. -Fred -- Fred L. Drake, Jr. 
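Martin's round-trip scenario above can be exercised directly in Python; a sketch (it uses `math.copysign` to read the sign bit, a function that only arrived later, in Python 2.6):

```python
import math

def rewrite(text):
    # Parse a float from a text field and format it back, as the
    # file-rewriting application Martin describes would do.
    return str(float(text))

# The sign of zero currently survives the parse/format round trip...
assert rewrite('-0.0') == '-0.0'
assert rewrite('0.0') == '0.0'

# ...even though the two zeros compare equal and hash identically:
assert float('-0.0') == float('0.0')
assert hash(float('-0.0')) == hash(float('0.0'))
assert math.copysign(1.0, float('-0.0')) == -1.0  # sign bit is still set
```

Caching every zero as a single shared +0.0 object would silently break the first assertion, which is exactly the behavior change under discussion.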
From nmm1 at cus.cam.ac.uk Tue Oct 3 20:26:29 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Tue, 03 Oct 2006 19:26:29 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: Your message of "Tue, 03 Oct 2006 19:55:43 +0200." <4522A41F.6090704@v.loewis.de> Message-ID: =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= wrote: > > py> x=-0.0 > py> y=0.0 > py> x,y Nobody is denying that SOME C90 implementations distinguish them, but it is no part of the standard - indeed, a C90 implementation is permitted to use ANY criterion for deciding when to display -0.0 and 0.0. C99 is ambiguous to the point of internal inconsistency, except when __STDC_IEC_559__ is set to 1, though the intent is clear. And my reading of Python's code is that it relies on C's handling of such values. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From foom at fuhm.net Tue Oct 3 21:13:01 2006 From: foom at fuhm.net (James Y Knight) Date: Tue, 3 Oct 2006 15:13:01 -0400 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: Message-ID: <0D01F7BC-1EC4-42DB-8D70-31E767E98257@fuhm.net> On Oct 3, 2006, at 2:26 PM, Nick Maclaren wrote: > =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= wrote: >> >> py> x=-0.0 >> py> y=0.0 >> py> x,y > > Nobody is denying that SOME C90 implementations distinguish them, > but it is no part of the standard - indeed, a C90 implementation is > permitted to use ANY criterion for deciding when to display -0.0 and > 0.0. C99 is ambiguous to the point of internal inconsistency, except > when __STDC_IEC_559__ is set to 1, though the intent is clear. > > And my reading of Python's code is that it relies on C's handling > of such values. This is a really poor argument. Python should be moving *towards* proper '754 fp support, not away from it. 
On the platforms that are most important, the C implementations distinguish positive and negative 0. That the current python implementation may be defective when the underlying C implementation is defective doesn't excuse a change to intentionally break python on the common platforms. IEEE 754 is so widely implemented that IMO it would make sense to make Python's floating point specify it, and simply declare floating point operations on non-IEEE 754 machines as "use at own risk, may not conform to python language standard". (or if someone wants to use a software fp library for such machines, that's fine too). James From martin at v.loewis.de Tue Oct 3 21:37:53 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 03 Oct 2006 21:37:53 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: Message-ID: <4522BC11.4090901@v.loewis.de> Nick Maclaren schrieb: >> py> x=-0.0 >> py> y=0.0 >> py> x,y > > Nobody is denying that SOME C90 implementations distinguish them, > but it is no part of the standard - indeed, a C90 implementation is > permitted to use ANY criterion for deciding when to display -0.0 and > 0.0. C99 is ambiguous to the point of internal inconsistency, except > when __STDC_IEC_559__ is set to 1, though the intent is clear. > > And my reading of Python's code is that it relies on C's handling > of such values. So what is your conclusion? That applications will not break? People don't care that their code may break on a different platform, if they aren't using these platforms. They care if it breaks on their platform just because they use a new Python version. (Of course, they sometimes also complain that Python behaves differently on different platforms, and cannot really accept the explanation that the language didn't guarantee the same behavior on all systems. This explanation doesn't help them: they still need to modify the application). 
Regards, Martin From rrr at ronadam.com Tue Oct 3 21:34:59 2006 From: rrr at ronadam.com (Ron Adam) Date: Tue, 03 Oct 2006 14:34:59 -0500 Subject: [Python-Dev] PEP 315 - do while In-Reply-To: <45226ADA.9080306@gmail.com> References: <001d01c6e617$0e980d40$1d2c440a@spain.capgemini.com> <45222E8B.30304@gmail.com> <452234B2.8040103@voidspace.org.uk> <45226ADA.9080306@gmail.com> Message-ID: <4522BB63.800@ronadam.com> Nick Coghlan wrote: > Fuzzyman wrote: >> Nick Coghlan wrote: >>> In my example, the 3 sections (<setup code>, <condition> and <loop body>) are all optional. A basic do-while loop would look like this:
>>>
>>> do:
>>>     <setup code>
>>>     while <condition>
>>>
>>> (That is, <setup code> is still repeated each time around the loop - it's called that because it is run before the loop <condition> is evaluated)
>>
>> +1
>>
>> This looks good.
>
> I'm pretty sure it was proposed by someone else a long time ago - I was > surprised to find it wasn't mentioned in PEP 315.
>
> That said, Guido's observation on PEP 315 from earlier this year holds for me too:
>
> "I kind of like it but it doesn't strike me as super important" [1]

I looked through a few files in the library for different while usage patterns and there really weren't as many while loops that would fit this pattern as I expected. There are many more while loops with one or more exit conditions in the middle, as things in the loop are calculated or received. So it might be smart to find out just how many places in the library it would make a difference. Ron From greg at electricrain.com Tue Oct 3 21:47:06 2006 From: greg at electricrain.com (Gregory P. Smith) Date: Tue, 3 Oct 2006 12:47:06 -0700 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <45229D69.4020407@v.loewis.de> References: <45226F57.108@v.loewis.de> <79858904-DDE8-46D8-AA28-EE9B285A6463@nicko.org> <45229D69.4020407@v.loewis.de> Message-ID: <20061003194706.GE7484@zot.electricrain.com> > > It would be instructive to understand how much, if any, python code > > would break if we lost -0.0.
I do not believe that there is any > > reliable way for python code to tell the difference between all of > > the different types of IEEE 754 zeros and in the special case of -0.0 > > the best test I can come up with is repr(n)[0]=='-'. Is there a > > compelling case, to do with compatibility or otherwise, for exposing > > the sign of a zero? It seems like a numerical anomaly to me. > > I think it is reasonable to admit that > a) this change is a change in semantics for the majority of the > machines > b) it is likely that this change won't affect a significant number > of applications (I'm pretty sure someone will notice, though; > someone always notices). If you're really going to bother doing this rather than just adding a note in the docs about testing for and reusing the most common float values to save memory when instantiating them from external input: Just do a binary comparison of the float with predefined + and - 0.0 float values or any other special values that you wish to catch rather than a floating point comparison. -g From alastair at alastairs-place.net Wed Oct 4 01:40:26 2006 From: alastair at alastairs-place.net (Alastair Houghton) Date: Wed, 4 Oct 2006 00:40:26 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <3161E092-940E-4364-910E-6D2973ECC48E@fuhm.net> References: <452257EB.6070601@v.loewis.de> <3161E092-940E-4364-910E-6D2973ECC48E@fuhm.net> Message-ID: On 3 Oct 2006, at 17:47, James Y Knight wrote: > On Oct 3, 2006, at 8:30 AM, Martin v. L?wis wrote: >> As Michael Hudson observed, this is difficult to implement, though: >> You can't distinguish between -0.0 and +0.0 easily, yet you should. > > Of course you can. It's absolutely trivial. The only part that's even > *the least bit* sketchy in this is assuming that a double is 64 bits. > Practically speaking, that is true on all architectures I know of, How about doing 1.0 / x, where x is the number you want to test?
On systems with sane semantics, it should result in an infinity, the sign of which should depend on the sign of the zero. While I'm sure there are any number of places where it will break, on those platforms it seems to me that you're unlikely to care about the difference between +0.0 and -0.0 anyway, since it's hard to otherwise distinguish them. e.g.

double value_to_test;
...
if (value_to_test == 0.0) {
    double my_inf = 1.0 / value_to_test;
    if (my_inf < 0.0) {
        /* We have a -ve zero */
    } else if (my_inf > 0.0) {
        /* We have a +ve zero */
    } else {
        /* This platform might not support infinities (though we might get
           a signal or something rather than getting here in that case...) */
    }
}

(I should add that presently I've only tried it on a PowerPC, because it's late and that's what's in front of me. It seems to work OK here.) Kind regards, Alastair -- http://alastairs-place.net From jcarlson at uci.edu Wed Oct 4 03:38:43 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue, 03 Oct 2006 18:38:43 -0700 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: <3161E092-940E-4364-910E-6D2973ECC48E@fuhm.net> Message-ID: <20061003180809.092A.JCARLSON@uci.edu> Alastair Houghton wrote: > On 3 Oct 2006, at 17:47, James Y Knight wrote: > > > On Oct 3, 2006, at 8:30 AM, Martin v. L?wis wrote: > >> As Michael Hudson observed, this is difficult to implement, though: > >> You can't distinguish between -0.0 and +0.0 easily, yet you should. > > > > Of course you can. It's absolutely trivial. The only part that's even > > *the least bit* sketchy in this is assuming that a double is 64 bits. > > Practically speaking, that is true on all architectures I know of, > > How about doing 1.0 / x, where x is the number you want to test? On > systems with sane semantics, it should result in an infinity, the > sign of which should depend on the sign of the zero.
While I'm sure > there are any number of places where it will break, on those > platforms it seems to me that you're unlikely to care about the > difference between +0.0 and -0.0 anyway, since it's hard to otherwise > distinguish them.

There is, of course, the option of examining their representations in memory (I described the general technique in another posting on this thread). From what I understand of IEEE 754 FP doubles, -0.0 and +0.0 have different representations, and if we look at the underlying representation (perhaps by a "*((uint64*)(&float_input))"), we can easily distinguish all values we want to cache... We can observe it directly, for example on x86:

>>> import struct
>>> struct.pack('d', -0.0)
'\x00\x00\x00\x00\x00\x00\x00\x80'
>>> struct.pack('d', 0.0)
'\x00\x00\x00\x00\x00\x00\x00\x00'
>>>

And as I stated before, we can switch on those values. Alternatively, if we can't switch on the 64 bit values directly...

uint32* p = (uint32*)(&double_input);
if (!p[0]) {           /* p[1] on big-endian platforms */
    switch (p[1]) {    /* p[0] on big-endian platforms */
        ...
    }
}

- Josiah From tonynelson at georgeanelson.com Wed Oct 4 02:28:44 2006 From: tonynelson at georgeanelson.com (Tony Nelson) Date: Tue, 3 Oct 2006 20:28:44 -0400 Subject: [Python-Dev] 2.4.4 fix: Socketmodule Ctl-C patch Message-ID: I've put a patch for 2.4.4 of the Socketmodule Ctl-C patch for 2.5, at the old closed bug . It passes "make EXTRAOPS-=unetwork test". Should I try to put this into the wiki at Python24Fixes? I haven't used the wiki before. -- ____________________________________________________________________ TonyN.:' The Great Writ ' is no more.
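Josiah's raw-representation test above translates to pure Python via the `struct` module; a sketch that assumes the platform's doubles are IEEE 754, as the thread generally does:

```python
import struct

def is_positive_zero(x):
    # Compare raw bit patterns rather than using ==, which cannot
    # tell +0.0 from -0.0 (the two compare equal as floats).
    return struct.pack('<d', x) == struct.pack('<d', 0.0)

assert is_positive_zero(0.0)
assert not is_positive_zero(-0.0)   # only the sign bit differs
assert not is_positive_zero(5.0)
```

This is the Python analogue of the C union trick: a cache keyed on the bit pattern treats -0.0 and +0.0 as distinct values even though they compare equal.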
From steve at holdenweb.com Wed Oct 4 05:58:01 2006 From: steve at holdenweb.com (Steve Holden) Date: Wed, 04 Oct 2006 04:58:01 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <20061003180809.092A.JCARLSON@uci.edu> References: <3161E092-940E-4364-910E-6D2973ECC48E@fuhm.net> <20061003180809.092A.JCARLSON@uci.edu> Message-ID: Josiah Carlson wrote: [yet more on this topic] If the brainpower already expended on this issue were proportional to its significance then we'd be reading about it on CNN news. This thread has disappeared down a rat-hole, never to re-emerge with anything of significant benefit to users. C'mon, guys, implement a patch or leave it alone :-) regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From guido at python.org Wed Oct 4 06:06:54 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 3 Oct 2006 21:06:54 -0700 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: <3161E092-940E-4364-910E-6D2973ECC48E@fuhm.net> <20061003180809.092A.JCARLSON@uci.edu> Message-ID: On 10/3/06, Steve Holden wrote: > If the brainpower already expended on this issue were proportional to > its significance then we'd be reading about it on CNN news. > > This thread has disappeared down a rat-hole, never to re-emerge with > anything of significant benefit to users. C'mon, guys, implement a patch > or leave it alone :-) Hear, hear. My proposal: only cache positive 0.0. My prediction: biggest bang for the buck, nobody's code will break. On platforms that don't distinguish between +/- 0.0, of course this would cache all zeros. On platforms that do distinguish them, -0.0 is left alone, which is just fine. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From nnorwitz at gmail.com Wed Oct 4 06:32:43 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Tue, 3 Oct 2006 21:32:43 -0700 Subject: [Python-Dev] what's really new in python 2.5 ? In-Reply-To: <200610031419.28281.fdrake@acm.org> References: <200610031039.52434.fdrake@acm.org> <20061003180848.GB31361@localhost.localdomain> <200610031419.28281.fdrake@acm.org> Message-ID: On 10/3/06, Fred L. Drake, Jr. wrote: > On Tuesday 03 October 2006 14:08, A.M. Kuchling wrote: > > That doesn't explain it, though; the contents of whatsnew26.html > > contain references to pep-308.html. It's not simply a matter of new > > files being untarred on top of old. > > Ah; I missed that the new HTML file was referring to an old heading. That > does sound like a .aux file got left around. > > I don't know what the build process is for the material in > docs.python.org/dev/; I think the right thing would be to start each build > with a fresh checkout/export. I probably did not do that to begin with. I did rm -rf Doc && svn up Doc && cd Doc && make. Let me know if there's anything else I should do. I did this for both the 2.5 and 2.6 versions. Let me know if you see anything screwed up after an hour or so. The new versions should be up by then. n From tim.peters at gmail.com Wed Oct 4 06:42:04 2006 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 4 Oct 2006 00:42:04 -0400 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <17698.40941.317868.702398@montanaro.dyndns.org> References: <45228491.9010103@v.loewis.de> <17698.40941.317868.702398@montanaro.dyndns.org> Message-ID: <1f7befae0610032142n2a52e7c3r1ff258b4d04e4adf@mail.gmail.com> [skip at pobox.com] > If C90 doesn't distinguish -0.0 and +0.0, how can Python? With liberal applications of piss & vinegar ;-) > Can you give a simple example where the difference between the two is apparent > to the Python programmer? 
Perhaps surprisingly, many (well, comparatively many, compared to none ....) people have noticed that the platform atan2 cares a lot:

>>> from math import atan2 as a
>>> z = 0.0  # positive zero
>>> m = -z   # minus zero
>>> a(z, z)  # the result here is actually +0.0
0.0
>>> a(z, m)
3.1415926535897931
>>> a(m, z)  # the result here is actually -0.0
0.0
>>> a(m, m)
-3.1415926535897931

It works like that "even on Windows", and these are the results C99's 754-happy appendix mandates for atan2 applied to signed zeroes. I've even seen a /complaint/ on c.l.py that atan2 doesn't do the same when z = 0.0 is replaced by z = 0. That is, at least one person thought it was "a bug" that integer zeroes didn't deliver the same behaviors. Do people actually rely on this? I know I don't, but given that more than just 2 people have remarked on it and seem to like it, I expect that changing this would break /some/ code out there. BTW, on /some/ platforms all those examples trigger EDOM from the platform libm instead -- which is also fine by C99, for implementations ignoring C99's optional 754-happy appendix.
A bit more detail, because it's necessary to understand that even minimally. Python's grammar doesn't have negative numeric literals; e.g., according to the grammar, -1 and -1.1 are applications of the unary minus operator to the positive numeric literals 1 and 1.1. And for years Python generated code accordingly: LOAD_CONST followed by the unary minus opcode. Someone (Fred, I think) introduced a front-end optimization to collapse that to plain LOAD_CONST, doing the negation at compile time. The code object contains a vector of compile-time constants, and the optimized code initially didn't distinguish between +0.0 and -0.0. As a result, if the first float 0.0 in a code block "looked positive", /all/ float zeroes in the code block were in effect treated as positive; and similarly if the first float zero was -0.0, all float zeroes were in effect treated as negative. That did break code. IIRC, it was fixed by special-casing the snot out of "-0.0", leaving that single case as a LOAD_CONST followed by UNARY_NEGATIVE. From fdrake at acm.org Wed Oct 4 06:56:56 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed, 4 Oct 2006 00:56:56 -0400 Subject: [Python-Dev] what's really new in python 2.5 ? In-Reply-To: References: <200610031419.28281.fdrake@acm.org> Message-ID: <200610040056.56632.fdrake@acm.org> On Wednesday 04 October 2006 00:32, Neal Norwitz wrote: > I probably did not do that to begin with. I did rm -rf Doc && svn up > Doc && cd Doc && make. Let me know if there's anything else I should > do. I did this for both the 2.5 and 2.6 versions. That certainly sounds like it should be sufficient. The doc build should never write anywhere but within the Doc/ tree; it doesn't even use the tempfile module to pick up any other temporary scratch space. -Fred -- Fred L. Drake, Jr. From fdrake at acm.org Wed Oct 4 07:01:06 2006 From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 4 Oct 2006 01:01:06 -0400 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <1f7befae0610032153t25bd0503u27628436ce3b794f@mail.gmail.com> References: <1f7befae0610032142n2a52e7c3r1ff258b4d04e4adf@mail.gmail.com> <1f7befae0610032153t25bd0503u27628436ce3b794f@mail.gmail.com> Message-ID: <200610040101.06109.fdrake@acm.org> On Wednesday 04 October 2006 00:53, Tim Peters wrote: > Someone (Fred, I think) introduced a front-end optimization to > collapse that to plain LOAD_CONST, doing the negation at compile time. I did the original change to make negative integers use just LOAD_CONST, but I don't think I changed what was generated for float literals. That could be my memory going bad, though. The code changed several times as people with more numeric-fu than myself fixed all sorts of border cases. I've tried really hard to stay away from the code generator since then. :-) -Fred -- Fred L. Drake, Jr. From nnorwitz at gmail.com Wed Oct 4 07:12:50 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Tue, 3 Oct 2006 22:12:50 -0700 Subject: [Python-Dev] Created branch for PEP 302 phase 2 work (in C) In-Reply-To: References: <5.1.1.6.0.20061002164622.028067d8@sparrow.telecommunity.com> <79990c6b0610021527s1e822f8dj26fbe429cf2c686c@mail.gmail.com> Message-ID: On 10/2/06, Brett Cannon wrote: > > This is why I asked for input from people on which would take less time. > Almost all the answers I got were that the C code was delicate but that > it was workable. Several people said they wished for a Python > implementation, but hardly anyone said flat-out, "don't waste your time, the > Python version will be faster to do". I didn't respond mostly because I pushed this direction to begin with. That and I'm lazy. :-) There is a lot of string manipulation and some list manipulation that is a royal pain in C and trivial in Python. Caching will be much easier to experiment with in Python too. The Python version will be much smaller.
It will take far less time to code it in Python and recode in C, than to try to get it right in C the first time. If the code is fast enough, there's no reason to rewrite in C. It will probably be easier to subclass a Python based version than a C based version. > As for the bootstrapping, I am sure it is resolvable as well. There are > several ways to go about it that are all tractable. Right, I had bootstrapping issues with implementing xrange in Python, but it was pretty easy to resolve in the end. You might even want to use part of that patch (from pythonrun.c?). There was some re-org to make bootstrapping easier/possible (I don't remember exactly right now). n From tim.peters at gmail.com Wed Oct 4 07:29:58 2006 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 4 Oct 2006 01:29:58 -0400 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <200610040101.06109.fdrake@acm.org> References: <1f7befae0610032142n2a52e7c3r1ff258b4d04e4adf@mail.gmail.com> <1f7befae0610032153t25bd0503u27628436ce3b794f@mail.gmail.com> <200610040101.06109.fdrake@acm.org> Message-ID: <1f7befae0610032229i7af5c76eg6bd0c459fe7338de@mail.gmail.com> [Tim] >> Someone (Fred, I think) introduced a front-end optimization to >> collapse that to plain LOAD_CONST, doing the negation at compile time. > I did the original change to make negative integers use just LOAD_CONST, but I > don't think I changed what was generated for float literals. That could be > my memory going bad, though. It is ;-) Here under Python 2.2.3:

>>> from dis import dis
>>> def f(): return 0.0 + -0.0 + 1.0 + -1.0
...
>>> dis(f)
  0 SET_LINENO               1
  3 SET_LINENO               1
  6 LOAD_CONST               1 (0.0)
  9 LOAD_CONST               1 (0.0)
 12 UNARY_NEGATIVE
 13 BINARY_ADD
 14 LOAD_CONST               2 (1.0)
 17 BINARY_ADD
 18 LOAD_CONST               3 (-1.0)
 21 BINARY_ADD
 22 RETURN_VALUE
 23 LOAD_CONST               0 (None)
 26 RETURN_VALUE

Note there that "0.0", "1.0", and "-1.0" were all treated as literals, but that "-0.0" still triggered a UNARY_NEGATIVE opcode. That was after "the fix".
You don't remember this as well as I do since I probably had to fix it, /and/ I ate enormous quantities of chopped, pressed, smoked, preservative-laden bag o' ham at the time. You really need to do both to remember floating-point trivia. Indeed, since I gave up my bag o' ham habit, I hardly ever jump into threads about fp trivia anymore. Mostly it's because I'm too weak from not eating anything, though -- how about lunch tomorrow? > The code changed several times as people with more numeric-fu than myself > fixed all sorts of border cases. I've tried really hard to stay away from > the code generator since then. :-) Successfully, too! It's admirable. From jcarlson at uci.edu Wed Oct 4 07:35:43 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue, 03 Oct 2006 22:35:43 -0700 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: <20061003180809.092A.JCARLSON@uci.edu> Message-ID: <20061003222551.0933.JCARLSON@uci.edu> Steve Holden wrote: > Josiah Carlson wrote: > [yet more on this topic] > > If the brainpower already expended on this issue were proportional to > its significance then we'd be reading about it on CNN news. Goodness, I wasn't aware that pointer manipulation took that much brainpower. I presume you mean what others have spent time thinking about with regards to this topic. > This thread has disappeared down a rat-hole, never to re-emerge with > anything of significant benefit to users. C'mon, guys, implement a patch > or leave it alone :-) Heh. So be it. The following is untested (I lack a build system for the Python trunk). It adds a new global cache for floats, a new 'fill the global cache' function, and an updated PyFloat_FromDouble() function. All in all, it took about 10 minutes to generate, and understands the difference between fp +0.0 and -0.0 (assuming sane IEEE 754 fp double behavior on non-x86 platforms).
- Josiah

/* This should go into floatobject.c */

static PyObject **cached_list = NULL;

static PyObject **
fill_cached_list(void)
{
    PyObject **p;
    int i;
    /* Sentinel: disables the cache lookup below while the cache is
       being filled by the PyFloat_FromDouble calls in this loop. */
    cached_list = (PyObject **) 1;
    p = (PyObject **) PyMem_MALLOC(sizeof(PyObject *) * 22);
    if (p == NULL) {
        cached_list = NULL;
        return (PyObject **) PyErr_NoMemory();
    }
    for (i = 0; i <= 10; i++) {
        p[i] = PyFloat_FromDouble((double) i);        /* +0.0 ... +10.0 */
        p[21-i] = PyFloat_FromDouble(-((double) i));  /* -0.0 ... -10.0 */
    }
    cached_list = NULL;
    return p;
}

PyObject *
PyFloat_FromDouble(double fval)
{
    register PyFloatObject *op;
    register int *p = (int *)(&fval);
    if (free_list == NULL) {
        if ((free_list = fill_free_list()) == NULL)
            return NULL;
    }
#ifdef LITTLE_ENDIAN
    if (!p[0])
#else
    if (!p[1])
#endif
    {
        if (cached_list == NULL) {
            if ((cached_list = fill_cached_list()) == NULL)
                return NULL;
        }
        if (cached_list != (PyObject **) 1 && cached_list != NULL) {
#ifdef LITTLE_ENDIAN
            switch (p[1])
#else
            switch (p[0])
#endif
            {
            case 0:           Py_INCREF(cached_list[0]);  return cached_list[0];
            case 1072693248:  Py_INCREF(cached_list[1]);  return cached_list[1];
            case 1073741824:  Py_INCREF(cached_list[2]);  return cached_list[2];
            case 1074266112:  Py_INCREF(cached_list[3]);  return cached_list[3];
            case 1074790400:  Py_INCREF(cached_list[4]);  return cached_list[4];
            case 1075052544:  Py_INCREF(cached_list[5]);  return cached_list[5];
            case 1075314688:  Py_INCREF(cached_list[6]);  return cached_list[6];
            case 1075576832:  Py_INCREF(cached_list[7]);  return cached_list[7];
            case 1075838976:  Py_INCREF(cached_list[8]);  return cached_list[8];
            case 1075970048:  Py_INCREF(cached_list[9]);  return cached_list[9];
            case 1076101120:  Py_INCREF(cached_list[10]); return cached_list[10];
            case -1071382528: Py_INCREF(cached_list[11]); return cached_list[11];
            case -1071513600: Py_INCREF(cached_list[12]); return cached_list[12];
            case -1071644672: Py_INCREF(cached_list[13]); return cached_list[13];
            case -1071906816: Py_INCREF(cached_list[14]); return cached_list[14];
            case -1072168960: Py_INCREF(cached_list[15]); return cached_list[15];
            case -1072431104: Py_INCREF(cached_list[16]); return cached_list[16];
            case -1072693248: Py_INCREF(cached_list[17]); return cached_list[17];
            case -1073217536: Py_INCREF(cached_list[18]); return cached_list[18];
            case -1073741824: Py_INCREF(cached_list[19]); return cached_list[19];
            case -1074790400: Py_INCREF(cached_list[20]); return cached_list[20];
            case -2147483648: Py_INCREF(cached_list[21]); return cached_list[21];
            default:          break;
            }
        }
    }
    /* Inline PyObject_New */
    op = free_list;
    free_list = (PyFloatObject *)op->ob_type;
    PyObject_INIT(op, &PyFloat_Type);
    op->ob_fval = fval;
    return (PyObject *) op;
}

From martin at v.loewis.de Wed Oct 4 07:34:51 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 04 Oct 2006 07:34:51 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: <452257EB.6070601@v.loewis.de> <3161E092-940E-4364-910E-6D2973ECC48E@fuhm.net> Message-ID: <452347FB.3050804@v.loewis.de> Alastair Houghton schrieb: > On 3 Oct 2006, at 17:47, James Y Knight wrote: > >> On Oct 3, 2006, at 8:30 AM, Martin v. Löwis wrote: >>> As Michael Hudson observed, this is difficult to implement, though: >>> You can't distinguish between -0.0 and +0.0 easily, yet you should. >> >> Of course you can. It's absolutely trivial. The only part that's even >> *the least bit* sketchy in this is assuming that a double is 64 bits. >> Practically speaking, that is true on all architectures I know of, > > How about doing 1.0 / x, where x is the number you want to test? This is a bad idea. It may cause a trap, leading to program termination.
Regards, Martin From nnorwitz at gmail.com Wed Oct 4 08:16:12 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Tue, 3 Oct 2006 23:16:12 -0700 Subject: [Python-Dev] [Python-checkins] r51862 - python/branches/release25-maint/Tools/msi/msi.py In-Reply-To: <450711CE.8040201@v.loewis.de> References: <20060912091628.9CBFA1E400C@bag.python.org> <200609122116.01922.anthony@interlink.com.au> <450711CE.8040201@v.loewis.de> Message-ID: On 9/12/06, "Martin v. Löwis" wrote: > > If you wonder how this all happened: Neal added sgml_input.html after > c1, but didn't edit msi.py to make it included on Windows. I found out > after running the test suite on the installed version, edited msi.py, > and rebuilt the installer. Is there an easy way to fix this sort of problem so it doesn't happen in the future (other than revoke my checkin privileges :-) ? There are already so many things to remember for changes. If we can automate finding these sorts of problems (installation, fixing something for one platform, but not another, etc), the submitter can fix these things with a little prodding from the buildbots. Or is this too minor to worry about? It would also be great if we could automate complaint emails about missing NEWS entries, doc, and tests so I wouldn't have to do it. :-) Unless anyone has better ideas how to improve Python. n From martin at v.loewis.de Wed Oct 4 08:40:10 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 04 Oct 2006 08:40:10 +0200 Subject: [Python-Dev] [Python-checkins] r51862 - python/branches/release25-maint/Tools/msi/msi.py In-Reply-To: References: <20060912091628.9CBFA1E400C@bag.python.org> <200609122116.01922.anthony@interlink.com.au> <450711CE.8040201@v.loewis.de> Message-ID: <4523574A.2010702@v.loewis.de> Neal Norwitz schrieb: > Is there an easy way to fix this sort of problem so it doesn't happen in > the future (other than revoke my checkin privileges :-) ? Sure: Don't make changes after a release candidate.
That files are missing can only be detected by actually producing the installer and testing whether it works; the closer the release, the less testing recent changes get. It might be possible to improve msi.py to better guess what files are test files, but I'd rather package too little than too much. One thing it *should* do is to report files that it skipped - but that really just helps me, since you have to run msi.py to see these messages. > There are already so many things to remember for changes. If we can > automate finding these sorts of problems (installation, fixing > something for one platform, but not another, etc), the submitter can > fix these things with a little prodding from the buildbots. Or is > this too minor to worry about? This specific instance is not to worry about. I noticed before making the release, and fixed it; me changing the branch while it is frozen is not even a policy violation. It's unfortunate that you can't recreate the installer from the tag that had been made, but it's just a release candidate, so that's a really minor issue. > It would also be great if we could automate complaint emails about > missing NEWS entries, doc, and tests so I wouldn't have to do it. :-) > Unless anyone has better ideas how to improve Python. I don't think this can be automated in a reasonable way. People apparently have different views on what is good policy and what is overkill; in a free software project, you can only have so much policy enforcement. If there is a wide consensus on some issue, committers will pick up the consensus; if they don't, it typically means they disagree. 
Regards, Martin From alastair at alastairs-place.net Wed Oct 4 10:00:19 2006 From: alastair at alastairs-place.net (Alastair Houghton) Date: Wed, 4 Oct 2006 09:00:19 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <452347FB.3050804@v.loewis.de> References: <452257EB.6070601@v.loewis.de> <3161E092-940E-4364-910E-6D2973ECC48E@fuhm.net> <452347FB.3050804@v.loewis.de> Message-ID: On 4 Oct 2006, at 06:34, Martin v. Löwis wrote: > Alastair Houghton schrieb: >> On 3 Oct 2006, at 17:47, James Y Knight wrote: >> >>> On Oct 3, 2006, at 8:30 AM, Martin v. Löwis wrote: >>>> As Michael Hudson observed, this is difficult to implement, though: >>>> You can't distinguish between -0.0 and +0.0 easily, yet you should. >>> >>> Of course you can. It's absolutely trivial. The only part that's >>> even >>> *the least bit* sketchy in this is assuming that a double is 64 >>> bits. >>> Practically speaking, that is true on all architectures I know of, >> >> How about doing 1.0 / x, where x is the number you want to test? > > This is a bad idea. It may cause a trap, leading to program > termination. AFAIK few systems have floating point traps enabled by default (in fact, isn't that what IEEE 754 specifies?), because they often aren't very useful. And in the specific case of the Python interpreter, why would you ever want them turned on? Surely in order to get consistent floating point semantics, they need to be *off* and Python needs to handle any exceptional cases itself; even if they're on, by your argument Python must do that to avoid being terminated. (Not to mention the problem that floating point traps are typically delivered by a signal, the problems with which were discussed extensively in a recent thread on this list.) And it does have two advantages over the other methods proposed: 1. You don't have to write the value to memory; this test will work entirely in the machine's floating point registers. 2. It doesn't rely on the machine using IEEE floating point.
(Of course, neither does the binary comparison method, but it still involves a trip to memory, and assumes that the machine doesn't have multiple representations for +0.0 or -0.0.) Even if you're saying that there's a significant chance of a trap (which I don't believe, not on common platforms anyway), the configure script could test to see if this will happen and fall back to one of the other approaches, or see if it can't turn them off using the C99 APIs. (I think I'd agree with you that handling SIGFPE is undesirable, which is perhaps what you were driving at.) Anyway, it's only an idea, and I thought I'd point it out as nobody else had yet. If 0.0 is going to be cached, then I certainly think -0.0 and +0.0 should be two separate values if they exist on a given machine. I'm less concerned about exactly how that comes about. Kind regards, Alastair. -- http://alastairs-place.net From alastair at alastairs-place.net Wed Oct 4 10:05:56 2006 From: alastair at alastairs-place.net (Alastair Houghton) Date: Wed, 4 Oct 2006 09:05:56 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <20061003180809.092A.JCARLSON@uci.edu> References: <3161E092-940E-4364-910E-6D2973ECC48E@fuhm.net> <20061003180809.092A.JCARLSON@uci.edu> Message-ID: <6BB57169-781E-4397-9A5D-297308DBF3D9@alastairs-place.net> On 4 Oct 2006, at 02:38, Josiah Carlson wrote: > Alastair Houghton wrote: > > There is, of course, the option of examining their representations in > memory (I described the general technique in another posting on this > thread). From what I understand of IEEE 754 FP doubles, -0.0 and +0.0 > have different representations, and if we look at the underlying > representation (perhaps by a "*((uint64*)(&float_input))"), we can > easily distinguish all values we want to cache... Yes, though a trip via memory isn't necessarily cheap, and you're also assuming that the machine doesn't use an FP representation with multiple +0s or -0s.
Perhaps they should be different anyway though, I suppose. > And as I stated before, we can switch on those values. Alternatively, > if we can't switch on the 64 bit values directly...
>
> uint32* p = (uint32*)(&double_input);
> if (!p[0]) {        /* p[1] on big-endian platforms */
>     switch (p[1]) { /* p[0] on big-endian platforms */
>     ...
>     }
> }

That's worse, IMHO, because it assumes more about the representation. If you're going to look directly at the binary, I think all you can reasonably do is a straight binary comparison. I don't think you should poke at the bits without first knowing that the platform uses IEEE floating point. The reason I suggested 1.0/x is that it's one of the few ways (maybe the only way?) to distinguish -0.0 and +0.0 using arithmetic, which is what people that care about the difference between the two are going to care about. Kind regards, Alastair. -- http://alastairs-place.net From Hans.Polak at capgemini.com Mon Oct 2 08:41:53 2006 From: Hans.Polak at capgemini.com (Hans Polak) Date: Mon, 2 Oct 2006 08:41:53 +0200 Subject: [Python-Dev] PEP 315 - do while In-Reply-To: <451F4183.5050907@gmail.com> Message-ID: <000601c6e5ed$d99ad290$1d2c440a@spain.capgemini.com> Hi Nick, Yep, PEP 315. Sorry about that. Now, about your suggestion

do:
    <setup code>
    while <condition>
else:
    <loop completion code>

This is Pythonic, but not logical. The 'do' will execute at least once, so the else clause is not needed, nor is the <setup code>. The <loop body> should go before the while terminator. I'm bound to reiterate my proposal:

do:
    <loop body>
    while <condition>

Example (if you know there will be at least one val).

source.open()
do:
    val = source.read(1)
    process(val)
    while val != lastitem
source.close()

The C syntax is:

do {
    block of code
} while (condition is satisfied);

The VB syntax is:

do
    block
loop while <condition>

Cheers & thanks for your reply, Hans Polak.
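[Editor's note: Hans's example above can be written with today's Python, at the cost of the `while True`/`break` idiom the proposal is trying to avoid. This is a sketch; `FakeSource` is a hypothetical stand-in for the file-like `source` object in the example.]

```python
class FakeSource:
    """Minimal stand-in for the 'source' object in the example above."""
    def __init__(self, data):
        self.data = list(data)
    def read(self, n):
        # Return the next item; the example guarantees at least one val.
        return self.data.pop(0)

source = FakeSource(['a', 'b', 'c', '$'])
lastitem = '$'
processed = []

while True:                  # do:
    val = source.read(1)     #     val = source.read(1)
    processed.append(val)    #     process(val)
    if val == lastitem:      #     while val != lastitem
        break

print(processed)  # ['a', 'b', 'c', '$']
```

The body runs at least once before the condition is tested, which is exactly the do-while guarantee the proposal spells out directly.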
-----Original Message----- From: Nick Coghlan [mailto:ncoghlan at gmail.com] Sent: Sunday, 01 October 2006 6:18 To: Hans Polak Cc: python-dev at python.org Subject: Re: [Python-Dev] PEP 351 - do while Hans Polak wrote: > Hi, > > Just an opinion, but many uses of the 'while true loop' are instances of > a 'do loop'. I appreciate the language layout question, so I'll give you > an alternative: > > do: >     <loop body> >     while <condition> I believe you meant to write PEP 315 in the subject line :) To fully account for loop else clauses, this suggestion would probably need to be modified to look something like this:

Basic while loop:

    while <condition>:
        <loop body>
    else:
        <loop completion code>

Using break to avoid code duplication:

    while True:
        <setup code>
        if not <condition>: break
        <loop body>

Current version of PEP 315:

    do:
        <setup code>
    while <condition>:
        <loop body>
    else:
        <loop completion code>

This suggestion:

    do:
        <setup code>
        while <condition>
        <repeated code>
    else:
        <loop completion code>

I personally like that style, and if the compiler can dig through a function looking for yield statements to identify generators, it should be able to dig through a do-loop looking for the termination condition. As I recall, the main objection to this style was that it could hide the loop termination condition, but that isn't actually mentioned in the PEP (and in the typical do-while case, the loop condition will still be clearly visible at the end of the loop body). Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org This message contains information that may be privileged or confidential and is the property of the Capgemini Group. It is intended only for the person to whom it is addressed. If you are not the intended recipient, you are not authorized to read, print, retain, copy, disseminate, distribute, or use this message or any part thereof. If you receive this message in error, please notify the sender immediately and delete all copies of this message.
From Hans.Polak at capgemini.com Mon Oct 2 13:36:52 2006 From: Hans.Polak at capgemini.com (Hans Polak) Date: Mon, 2 Oct 2006 13:36:52 +0200 Subject: [Python-Dev] PEP 315 - do while In-Reply-To: <4520EE65.50507@gmail.com> Message-ID: <001d01c6e617$0e980d40$1d2c440a@spain.capgemini.com> Ok, I see your point. Really, I've read more about Python than worked with it, so I'm out of my league here. Can I combine your suggestion with mine and come up with the following:

do:
    <loop body>
    while <condition>
else:
    <loop completion code>

Cheers, Hans. -----Original Message----- From: Nick Coghlan [mailto:ncoghlan at gmail.com] Sent: Monday, 02 October 2006 12:48 To: Hans Polak Cc: python-dev at python.org Subject: Re: [Python-Dev] PEP 315 - do while Hans Polak wrote: > Hi Nick, > > Yep, PEP 315. Sorry about that. > > Now, about your suggestion > > do: >     <setup code> >     while <condition> > else: >     <loop completion code> > > This is Pythonic, but not logical. The 'do' will execute at least once, so > the else clause is not needed, nor is the <setup code>. The <loop body> > should go before the while terminator. This objection is based on a misunderstanding of what the else clause is for in a Python loop. The else clause is only executed if the loop terminated naturally (the exit condition became false) rather than being explicitly terminated using a break statement. This behaviour is most commonly useful when using a for loop to search through an iterable (breaking when the object is found, and using the else clause to handle the 'not found' case), but it is also defined for while loops. Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org
From Hans.Polak at capgemini.com Tue Oct 3 12:14:45 2006 From: Hans.Polak at capgemini.com (Hans Polak) Date: Tue, 3 Oct 2006 12:14:45 +0200 Subject: [Python-Dev] PEP 315 - do while In-Reply-To: <452234B2.8040103@voidspace.org.uk> Message-ID: <002901c6e6d4$c05b9340$1d2c440a@spain.capgemini.com> Thanks for your reply Nick, and your support Michael. I'll leave the PEP talk to you guys :) Cheers, Hans -----Original Message----- From: Michael Foord [mailto:fuzzyman at gmail.com] On Behalf Of Fuzzyman Sent: Tuesday, 03 October 2006 12:00 To: Nick Coghlan Cc: Hans Polak; python-dev at python.org Subject: Re: [Python-Dev] PEP 315 - do while Nick Coghlan wrote: >Hans Polak wrote: >>Ok, I see your point. Really, I've read more about Python than worked with >>it, so I'm out of my league here. >> >>Can I combine your suggestion with mine and come up with the following: >> >> do: >>     <loop body> >>     while <condition> >> else: >>     <loop completion code> > >In my example, the 3 sections (<setup code>, <condition> and <repeated >code> are all optional. A basic do-while loop would look like this: > > do: >     <setup code> >     while <condition> > >(That is, <setup code> is still repeated each time around the loop - it's >called that because it is run before the loop condition is evaluated) +1 This looks good. The current idiom works fine, but looks unnatural :

while True:
    <loop body>
    if <condition>:
        break

Would a 'while' outside of a 'do' block (but without the colon) then be a syntax error ? 'do:' would just be syntactic sugar for 'while True:' I guess. Michael Foord http://www.voidspace.org.uk >Cheers, >Nick.
If you receive this message in error, please notify the sender immediately and delete all copies of this message. From Hans.Polak at capgemini.com Tue Oct 3 16:14:12 2006 From: Hans.Polak at capgemini.com (Hans Polak) Date: Tue, 3 Oct 2006 16:14:12 +0200 Subject: [Python-Dev] PEP 315 - do while In-Reply-To: <45226ADA.9080306@gmail.com> Message-ID: <003c01c6e6f6$336f6700$1d2c440a@spain.capgemini.com> I'm against infinite loops -something religious :), which explains the call for the do loop. The issue about the parser is over my head, but the thought had occurred to me. Now, it would not affect while loops inside do loops, wouldn't it? Cheers, Hans. -----Original Message----- From: Nick Coghlan [mailto:ncoghlan at gmail.com] Sent: martes, 03 de octubre de 2006 15:51 To: Fuzzyman Cc: Hans Polak; python-dev at python.org Subject: Re: [Python-Dev] PEP 315 - do while Fuzzyman wrote: > Nick Coghlan wrote: >> In my example, the 3 sections (, and > code> are all optional. A basic do-while loop would look like this: >> >> do: >> >> while >> >> (That is, is still repeated each time around the loop - it's >> called that because it is run before the loop evaluated condition is evaluated) >> >> > > +1 > > This looks good. I'm pretty sure it was proposed by someone else a long time ago - I was surprised to find it wasn't mentioned in PEP 315. That said, Guido's observation on PEP 315 from earlier this year holds for me too: "I kind of like it but it doesn't strike me as super important" [1] > The current idiom works fine, but looks unnatural : > > while True: > if : > break There's the rationale for the PEP in a whole 5 lines counting whitespace ;) > Would a 'while' outside of a 'do' block (but without the colon) then be > a syntax error ? > > 'do:' would just be syntactic sugar for 'while True:' I guess. 
That's the slight issue I still have with the idea - you could end up with multiple ways of spelling some of the basic loop forms, such as these 3 flavours of infinite loop: do: pass # Is there an implicit 'while True' at the end of the loop body? do: while True while True: pass The other issue I have is that I'm not yet 100% certain it's implementable with Python's parser and grammar. I *think* changing the definition of the while statement from: while_stmt ::= "while" expression ":" suite ["else" ":" suite] to while_stmt ::= "while" expression [":" suite ["else" ":" suite]] And adding a new AST node and a new type of compiler frame block "DO_LOOP" would do the trick (the compilation of a while statement without a trailing colon would then check that it was in a DO_LOOP block and raise an error if not). Cheers, Nick. [1] http://mail.python.org/pipermail/python-dev/2006-February/060711.html -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From Hans.Polak at capgemini.com Tue Oct 3 17:17:36 2006 From: Hans.Polak at capgemini.com (Hans Polak) Date: Tue, 3 Oct 2006 17:17:36 +0200 Subject: [Python-Dev] PEP 315 - do while In-Reply-To: <4522738F.80303@voidspace.org.uk> Message-ID: <003d01c6e6ff$0ec7ec70$1d2c440a@spain.capgemini.com> Please note that until <==> while not. do: until count > 10 do: while count <= 10 Cheers, Hans.
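A quick sketch of these equivalences in today's Python may help; the proposed do/until spellings appear only as comments, since they are of course not real syntax:

```python
# 'do: <loop body> until count > 10'  ==  'do: <loop body> while count <= 10':
# 'until <condition>' is just 'while not <condition>', and the body always
# runs at least once because the test comes after it.
count = 0
runs = []
while True:
    count += 1                 # <loop body>: executes before any test
    runs.append(count)
    if count > 10:             # 'until count > 10' == 'while not (count > 10)'
        break
print(len(runs))               # 11 -- the body ran once more after count hit 10

# Nick's point about 'else' on loops: it runs only when the loop was
# *not* left via 'break' (the classic for/else search idiom):
def find(seq, target):
    for i, item in enumerate(seq):
        if item == target:
            break
    else:
        return None            # loop exhausted without a break: not found
    return i

print(find([3, 5, 7], 5))      # 1
print(find([3, 5, 7], 9))      # None
```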
-----Original Message----- From: Michael Foord [mailto:fuzzyman at gmail.com] On Behalf Of Fuzzyman Sent: martes, 03 de octubre de 2006 16:29 To: Nick Coghlan Cc: Hans Polak; python-dev at python.org Subject: Re: [Python-Dev] PEP 315 - do while Nick Coghlan wrote: > [snip..] > >> The current idiom works fine, but looks unnatural : >> >> while True: >> if <condition>: >> break > > > There's the rationale for the PEP in a whole 5 lines counting > whitespace ;) > >> Would a 'while' outside of a 'do' block (but without the colon) then be >> a syntax error ? >> >> 'do:' would just be syntactic sugar for 'while True:' I guess. > > > That's the slight issue I still have with the idea - you could end up > with multiple ways of spelling some of the basic loop forms, such as > these 3 flavours of infinite loop: > > do: > pass # Is there an implicit 'while True' at the end of the loop > body? > > do: > while True > > while True: > pass > Following the current idiom, isn't it more natural to repeat the loop 'until' a condition is met? If we introduced two new keywords, it would avoid ambiguity in the use of 'while'. do: <loop body> until <condition> A do loop could require an 'until', meaning 'do' is not *just* a replacement for an infinite loop. (Assuming the parser can be coerced into co-operation.) It is obviously still a new construct in terms of Python syntax (not requiring a colon after '<condition>'.) I'm sure this has been suggested, but wonder if it has already been ruled out. An 'else' block could then retain its current meaning (execute if the loop is not terminated early by an explicit break.) Michael Foord http://www.voidspace.org.uk From nmm1 at cus.cam.ac.uk Wed Oct 4 13:26:46 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Wed, 04 Oct 2006 12:26:46 +0100 Subject: [Python-Dev] Caching float(0.0) Message-ID: Alastair Houghton wrote: > > AFAIK few systems have floating point traps enabled by default (in > fact, isn't that what IEEE 754 specifies?), because they often aren't > very useful. The first two statements are true; the last isn't. They are extremely useful, not least because they are the only practical way to locate numeric errors in most 3GL programs (including C, Fortran etc.) > And in the specific case of the Python interpreter, why > would you ever want them turned on? Surely in order to get > consistent floating point semantics, they need to be *off* and Python > needs to handle any exceptional cases itself; even if they're on, by > your argument Python must do that to avoid being terminated. Grrk. Why are you assuming that turning them off means that the result is what you expect? That isn't always so - sometimes it merely means that you get wrong answers but no indication of that. > or see if it can't turn them off using the C99 APIs. That is a REALLY bad idea. You have no idea how broken that is, and what the impact would be on Python. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From nmm1 at cus.cam.ac.uk Wed Oct 4 13:39:07 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Wed, 04 Oct 2006 12:39:07 +0100 Subject: [Python-Dev] Caching float(0.0) Message-ID: James Y Knight wrote: > > This is a really poor argument. Python should be moving *towards* > proper '754 fp support, not away from it. On the platforms that are > most important, the C implementations distinguish positive and > negative 0.
That the current python implementation may be defective > when the underlying C implementation is defective doesn't excuse a > change to intentionally break python on the common platforms. Perhaps you might like to think why only IBM POWERx (and NOT the Cell or most embedded POWERs) is the ONLY mainstream system to have implemented all of IEEE 754 in hardware after 22 years? Or why NO programming language has provided support in those 22 years, and only Java and C have even claimed to? See Kahan's "How Java's Floating-Point Hurts Everyone Everywhere", note that C99 is much WORSE, and then note that Java and C99 are the only languages that have even attempted to include IEEE 754. You have also misunderstood the issue. The fact that a C implementation doesn't support it does NOT mean that the implementation is defective; quite the contrary. The issue always has been that IEEE 754's basic model is incompatible with the basic models of all programming languages that I am familiar with (which is a lot). And the specific problems with C99 are in the STANDARD, not the IMPLEMENTATIONS. > IEEE 754 is so widely implemented that IMO it would make sense to > make Python's floating point specify it, and simply declare floating > point operations on non-IEEE 754 machines as "use at own risk, may > not conform to python language standard". (or if someone wants to use > a software fp library for such machines, that's fine too). Firstly, see the above. Secondly, Python would need MAJOR semantic changes to conform to IEEE 754R. Thirdly, what would you say to the people who want reliable error detection on floating-point of the form that Python currently provides? Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From nick at craig-wood.com Wed Oct 4 13:52:32 2006 From: nick at craig-wood.com (Nick Craig-Wood) Date: Wed, 4 Oct 2006 12:52:32 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <1f7befae0610032142n2a52e7c3r1ff258b4d04e4adf@mail.gmail.com> References: <45228491.9010103@v.loewis.de> <17698.40941.317868.702398@montanaro.dyndns.org> <1f7befae0610032142n2a52e7c3r1ff258b4d04e4adf@mail.gmail.com> Message-ID: <20061004115232.GA10725@craig-wood.com> On Wed, Oct 04, 2006 at 12:42:04AM -0400, Tim Peters wrote: > [skip at pobox.com] > > If C90 doesn't distinguish -0.0 and +0.0, how can Python? > > With liberal applications of piss & vinegar ;-) > > > Can you give a simple example where the difference between the two is apparent > > to the Python programmer? > > Perhaps surprisingly, many (well, comparatively many, compared to none > ....) people have noticed that the platform atan2 cares a lot: > > >>> from math import atan2 as a > >>> z = 0.0 # positive zero > >>> m = -z # minus zero > >>> a(z, z) # the result here is actually +0.0 > 0.0 > >>> a(z, m) > 3.1415926535897931 > >>> a(m, z) # the result here is actually -0.0 > 0.0 This actually returns -0.0 under linux... > >>> a(m, m) > -3.1415926535897931 > > It works like that "even on Windows", and these are the results C99's > 754-happy appendix mandates for atan2 applied to signed zeroes. I've > even seen a /complaint/ on c.l.py that atan2 doesn't do the same when > > z = 0.0 > > is replaced by > > z = 0 > > That is, at least one person thought it was "a bug" that integer > zeroes didn't deliver the same behaviors. > > Do people actually rely on this? I know I don't, but given that more > than just 2 people have remarked on it seeming to like it, I expect > that changing this would break /some/ code out there. Probably! It surely isn't a big problem though, is it?
instead of writing if (result == 0.0) return cached_float_0; we just write something like if (memcmp(&result, &static_zero, sizeof(double)) == 0) return cached_float_0; E.g. the below prints (gcc/linux):

The memcmp() way
1: 0 == 0.0
2: -0 != 0.0
The == way
3: 0 == 0.0
4: -0 == 0.0

#include <stdio.h>
#include <string.h>

int main(void)
{
    static double zero_value = 0.0;
    double result;
    printf("The memcmp() way\n");
    result = 0.0;
    if (memcmp(&result, &zero_value, sizeof(double)) == 0)
        printf("1: %g == 0.0\n", result);
    else
        printf("1: %g != 0.0\n", result);
    result = -0.0;
    if (memcmp(&result, &zero_value, sizeof(double)) == 0)
        printf("2: %g == 0.0\n", result);
    else
        printf("2: %g != 0.0\n", result);
    printf("The == way\n");
    result = 0.0;
    if (result == 0.0)
        printf("3: %g == 0.0\n", result);
    else
        printf("3: %g != 0.0\n", result);
    result = -0.0;
    if (result == 0.0)
        printf("4: %g == 0.0\n", result);
    else
        printf("4: %g != 0.0\n", result);
    return 0;
}

-- Nick Craig-Wood -- http://www.craig-wood.com/nick From kristjan at ccpgames.com Wed Oct 4 13:56:49 2006 From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_V=2E_J=F3nsson?=) Date: Wed, 4 Oct 2006 11:56:49 -0000 Subject: [Python-Dev] Caching float(0.0) Message-ID: <129CEF95A523704B9D46959C922A280002FE99BD@nemesis.central.ccp.cc> Hm, doesn't seem to be so for my regular python. Python 2.3.3 Stackless 3.0 040407 (#51, Apr 7 2004, 19:28:46) [MSC v.1200 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> x = -0.0 >>> y = 0.0 >>> x,y (0.0, 0.0) >>> maybe it is 2.3.3, or maybe it is stackless from back then. K > -----Original Message----- > From: python-dev-bounces+kristjan=ccpgames.com at python.org > [mailto:python-dev-bounces+kristjan=ccpgames.com at python.org] > On Behalf Of "Martin v. Löwis" > Sent: 3.
október 2006 17:56 > To: skip at pobox.com > Cc: Nick Maclaren; python-dev at python.org > Subject: Re: [Python-Dev] Caching float(0.0) > > skip at pobox.com schrieb: > > If C90 doesn't distinguish -0.0 and +0.0, how can Python? Can you > > give a simple example where the difference between the two > is apparent > > to the Python programmer? > > Sure: > > py> x=-0.0 > py> y=0.0 > py> x,y From nmm1 at cus.cam.ac.uk Wed Oct 4 14:12:06 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Wed, 04 Oct 2006 13:12:06 +0100 Subject: [Python-Dev] Caching float(0.0) Message-ID: On Wed, Oct 04, 2006 at 12:42:04AM -0400, Tim Peters wrote: > > > If C90 doesn't distinguish -0.0 and +0.0, how can Python? > > > Can you give a simple example where the difference between the two > > is apparent to the Python programmer? > > Perhaps surprisingly, many (well, comparatively many, compared to none > ....) people have noticed that the platform atan2 cares a lot: Once upon a time, floating-point was used as an approximation to mathematical real numbers, and anything which was mathematically undefined in real arithmetic was regarded as an error in floating-point. This allowed a reasonable amount of numeric validation, because the main remaining discrepancy was that floating-point has only limited precision and range. Most of the numerical experts that I know of still favour that approach, and it is the one standardised by the ISO LIA-1, LIA-2 and LIA-3 standards for floating-point arithmetic. atan2(0.0,0.0) should be an error. But C99 differs. While words do not fail me, they are inappropriate for this mailing list :-( Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From amk at amk.ca Wed Oct 4 14:28:45 2006 From: amk at amk.ca (A.M. Kuchling) Date: Wed, 4 Oct 2006 08:28:45 -0400 Subject: [Python-Dev] what's really new in python 2.5 ?
In-Reply-To: References: <200610031039.52434.fdrake@acm.org> <20061003180848.GB31361@localhost.localdomain> <200610031419.28281.fdrake@acm.org> Message-ID: <20061004122845.GA22146@rogue.amk.ca> On Tue, Oct 03, 2006 at 09:32:43PM -0700, Neal Norwitz wrote: > Let me know if you see anything screwed up after an hour or so. The > new versions should be up by then. Thanks! That seems to have cleared things up -- the section names are now node2.html, node3.html, ..., which is what I'd expect for the 2.6 document. --amk From guido at python.org Wed Oct 4 16:30:57 2006 From: guido at python.org (Guido van Rossum) Date: Wed, 4 Oct 2006 07:30:57 -0700 Subject: [Python-Dev] PEP 315 - do while In-Reply-To: <003d01c6e6ff$0ec7ec70$1d2c440a@spain.capgemini.com> References: <4522738F.80303@voidspace.org.uk> <003d01c6e6ff$0ec7ec70$1d2c440a@spain.capgemini.com> Message-ID: You are all wasting your time on this. It won't go in. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy at alum.mit.edu Wed Oct 4 17:44:08 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Wed, 4 Oct 2006 11:44:08 -0400 Subject: [Python-Dev] PEP 315 - do while In-Reply-To: References: <4522738F.80303@voidspace.org.uk> <003d01c6e6ff$0ec7ec70$1d2c440a@spain.capgemini.com> Message-ID: On 10/4/06, Guido van Rossum wrote: > You are all wasting your time on this. It won't go in. +1 from me. Should you mark PEP 315 as rejected? 
Jeremy > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jeremy%40alum.mit.edu > From rhettinger at ewtllc.com Wed Oct 4 18:07:52 2006 From: rhettinger at ewtllc.com (Raymond Hettinger) Date: Wed, 4 Oct 2006 09:07:52 -0700 Subject: [Python-Dev] PEP 315 - do while Message-ID: <34FE2A7A34BC3544BC3127D023DF3D12128737@EWTEXCH.office.bhtrader.com> I'll mark it as withdrawn. Raymond -----Original Message----- From: python-dev-bounces+rhettinger=ewtllc.com at python.org [mailto:python-dev-bounces+rhettinger=ewtllc.com at python.org] On Behalf Of Jeremy Hylton Sent: Wednesday, October 04, 2006 8:44 AM To: Guido van Rossum Cc: Hans Polak; python-dev at python.org Subject: Re: [Python-Dev] PEP 315 - do while On 10/4/06, Guido van Rossum wrote: > You are all wasting your time on this. It won't go in. +1 from me. Should you mark PEP 315 as rejected? 
Jeremy > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jeremy%40alum.mit.edu > _______________________________________________ Python-Dev mailing list Python-Dev at python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/rhettinger%40ewtllc.com From brett at python.org Wed Oct 4 21:13:18 2006 From: brett at python.org (Brett Cannon) Date: Wed, 4 Oct 2006 12:13:18 -0700 Subject: [Python-Dev] Created branch for PEP 302 phase 2 work (in C) In-Reply-To: References: <5.1.1.6.0.20061002164622.028067d8@sparrow.telecommunity.com> <79990c6b0610021527s1e822f8dj26fbe429cf2c686c@mail.gmail.com> Message-ID: On 10/3/06, Neal Norwitz wrote: > > On 10/2/06, Brett Cannon wrote: > > > > This is why I asked for input from people on which would take less time. > > Almost all the answers I got were that the C code was delicate but > that > > it was workable. Several people said they wished for a Python > > implementation, but hardly anyone said flat-out, "don't waste your time, > the > > Python version will be faster to do". > > I didn't respond mostly because I pushed this direction to begin with. > That and I'm lazy. :-) But couldn't you be lazy in a timely fashion? > There is a lot of string manipulation and some list manipulation that > is a royal pain in C and trivial in python. Caching will be much > easier to experiment with in Python too. The Python version will be > much smaller. It will take far less time to code it in Python and > recode in C, than to try to get it right in C the first time. If the > code is fast enough, there's no reason to rewrite in C. It will > probably be easier to subclass a Python based version than a C based > version.
> > As for the bootstrapping, I am sure it is resolvable as well. There are > > several ways to go about it that are all tractable. > > Right, I had bootstrapping problems with implementing xrange in Python, but it > was pretty easy to resolve in the end. You might even want to use > part of that patch (from pythonrun.c?). There was some re-org to make > bootstrapping easier/possible (I don't remember exactly right now). OK, OK, I get the hint. I will rewrite import in Python and just make it my research work and personal project. Probably will do the initial pure Python stuff in the sandbox to really isolate it and then move it over to the pep302_phase2 branch when C code has to be changed. -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061004/7d84fdb0/attachment.htm From martin at v.loewis.de Wed Oct 4 21:14:46 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 04 Oct 2006 21:14:46 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: <452257EB.6070601@v.loewis.de> <3161E092-940E-4364-910E-6D2973ECC48E@fuhm.net> <452347FB.3050804@v.loewis.de> Message-ID: <45240826.6040600@v.loewis.de> Alastair Houghton schrieb: > AFAIK few systems have floating point traps enabled by default (in fact, > isn't that what IEEE 754 specifies?), because they often aren't very > useful. And in the specific case of the Python interpreter, why would > you ever want them turned on? That reasoning is irrelevant. If it breaks a few systems, that already is some systems too many. Python should never crash; and we have no control over the floating point exception handling in any portable manner.
Regards, Martin From martin at v.loewis.de Wed Oct 4 21:29:19 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 04 Oct 2006 21:29:19 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <129CEF95A523704B9D46959C922A280002FE99BD@nemesis.central.ccp.cc> References: <129CEF95A523704B9D46959C922A280002FE99BD@nemesis.central.ccp.cc> Message-ID: <45240B8F.6070106@v.loewis.de> Kristján V. Jónsson schrieb: > Hm, doesn't seem to be so for my regular python. > > maybe it is 2.3.3, or maybe it is stackless from back then. It's because you are using Windows. The way -0.0 gets rendered depends on the platform. As Tim points out, try math.atan2(0.0, -0.0) vs math.atan2(0.0, 0.0). Regards, Martin From alastair at alastairs-place.net Wed Oct 4 22:14:49 2006 From: alastair at alastairs-place.net (Alastair Houghton) Date: Wed, 4 Oct 2006 21:14:49 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <45240826.6040600@v.loewis.de> References: <452257EB.6070601@v.loewis.de> <3161E092-940E-4364-910E-6D2973ECC48E@fuhm.net> <452347FB.3050804@v.loewis.de> <45240826.6040600@v.loewis.de> Message-ID: On Oct 4, 2006, at 8:14 PM, Martin v. Löwis wrote: > If it breaks a few systems, that already is some systems too many. > Python should never crash; and we have no control over the floating > point exception handling in any portable manner. You're quite right, though there is already plenty of platform dependent code in Python for just that purpose (see fpectlmodule.c, for instance). Anyway, all I originally wanted was to point out that using division was one possible way to tell the difference that didn't involve relying on the representation being IEEE compliant. It's true that there are problems with FP exceptions. Kind regards, Alastair.
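Putting Martin's suggestion in runnable form; the atan2 calls are the test he proposes, while the struct-based bit-pattern comparison is my own addition for illustration (it is essentially what a memcmp() of the two doubles would see):

```python
import math
import struct

z = 0.0
m = -z                              # minus zero

print(m == z)                       # True: '==' cannot tell the zeroes apart
# repr/str may hide the sign too, depending on platform -- hence the
# (0.0, 0.0) seen on Windows above.

# atan2 distinguishes them, per the C99 754-friendly behaviour:
print(math.atan2(z, z))             # 0.0
print(math.atan2(z, m))             # 3.141592653589793 (pi)

# So does the raw IEEE 754 bit pattern: only the sign bit differs.
print(struct.pack('<d', z) == struct.pack('<d', m))   # False
```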
-- http://alastairs-place.net From hasan.diwan at gmail.com Wed Oct 4 22:39:50 2006 From: hasan.diwan at gmail.com (Hasan Diwan) Date: Wed, 4 Oct 2006 13:39:50 -0700 Subject: [Python-Dev] Fwd: [ python-Feature Requests-1567948 ] poplib.py list interface In-Reply-To: References: Message-ID: <2cda2fc90610041339saf5b6b9sa0ea97e3cf13cb74@mail.gmail.com> I've made some changes to poplib.py, submitted them to Sourceforge, and emailed Piers regarding taking over maintenance of the module. I have his support to do so, along with Guido's. However, I would like to ask one of the more senior developers to review the change and commit it. Many thanks for your kind assistance! ---------- Forwarded message ---------- From: SourceForge.net Date: 04-Oct-2006 13:29 Subject: [ python-Feature Requests-1567948 ] poplib.py list interface To: noreply at sourceforge.net Feature Requests item #1567948, was opened at 2006-09-29 11:51 Message generated for change (Comment added) made by hdiwan650 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=355470&aid=1567948&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Python Library Group: Python 2.6 Status: Open Resolution: None Priority: 5 Submitted By: Hasan Diwan (hdiwan650) Assigned to: Nobody/Anonymous (nobody) Summary: poplib.py list interface Initial Comment: Adds a list-like interface to poplib.py, poplib_as_list. ---------------------------------------------------------------------- >Comment By: Hasan Diwan (hdiwan650) Date: 2006-10-04 13:29 Message: Logged In: YES user_id=1185570 I changed it a little bit, added my name at the top of the file as the maintainer. 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=355470&aid=1567948&group_id=5470 -- Cheers, Hasan Diwan -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061004/a59230c2/attachment.html From larry at hastings.org Wed Oct 4 20:08:16 2006 From: larry at hastings.org (Larry Hastings) Date: Wed, 04 Oct 2006 11:08:16 -0700 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom Message-ID: <4523F890.9060804@hastings.org> I've never liked the "".join([]) idiom for string concatenation; in my opinion it violates the principles "Beautiful is better than ugly." and "There should be one-- and preferably only one --obvious way to do it.". (And perhaps several others.) To that end I've submitted patch #1569040 to SourceForge: http://sourceforge.net/tracker/index.php?func=detail&aid=1569040&group_id=5470&atid=305470 This patch speeds up using + for string concatenation. It's been in discussion on c.l.p for about a week, here: http://groups.google.com/group/comp.lang.python/browse_frm/thread/b8a8f20bc3c81bcf I'm not a Python guru, and my initial benchmark had many mistakes. With help from the community correct benchmarks emerged: + for string concatenation is now roughly as fast as the usual "".join() idiom when appending. (It appears to be *much* faster for prepending.) The patched Python passes all the tests in regrtest.py for which I have source; I didn't install external packages such as bsddb and sqlite3. My approach was to add a "string concatenation" object; I have since learned this is also called a "rope". Internally, a PyStringConcatationObject is exactly like a PyStringObject but with a few extra members taking an additional thirty-six bytes of storage. 
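The rope idea Larry describes can be sketched in pure Python. This toy class is only an illustration of the concept (the real patch does the equivalent in C inside the string object itself, with quite different details):

```python
class Rope:
    """Toy lazy concatenation: '+' records the operands in O(1); the real
    string is built only when the value is first needed, then cached."""

    def __init__(self, *parts):
        self._parts = list(parts)    # strings and/or other Ropes
        self._value = None           # cached rendered value (like ob_sval)

    def __add__(self, other):
        return Rope(self, other)     # no copying happens here

    def render(self):
        if self._value is None:
            self._value = "".join(
                p.render() if isinstance(p, Rope) else p
                for p in self._parts)
            self._parts = None       # the tree can be dropped once rendered
        return self._value

    def __str__(self):
        return self.render()

s = Rope("spam") + "and" + "eggs"    # builds a small tree, copies nothing
print(str(s))                        # spamandeggs -- one join, performed once
```

Note that render() here recurses, so a very long chain of '+' would hit the recursion limit; a real implementation flattens iteratively, which is part of why doing this robustly in C is delicate.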
When you add two PyStringObjects together, string_concat() returns a PyStringConcatationObject which contains references to the two strings. Concatenating any mixture of PyStringObjects and PyStringConcatationObjects works similarly, though there are some internal optimizations. These changes are almost entirely contained within Objects/stringobject.c and Include/stringobject.h. There is one major externally-visible change in this patch: PyStringObject.ob_sval is no longer a char[1] array, but a char *. Happily, this only requires a recompile, because the CPython source is *marvelously* consistent about using the macro PyString_AS_STRING(). (One hopes extension authors are as consistent.) I only had to touch two other files (Python/ceval.c and Objects/codeobject.c) and those were one-line changes. There is one remaining place that still needs fixing: the self-described "hack" in Mac/Modules/MacOS.c. Fixing that is beyond my pay grade. I changed the representation of ob_sval for two reasons: first, it is initially NULL for a string concatenation object, and second, because it may point to separately-allocated memory. That's where the speedup came from--it doesn't render the string until someone asks for the string's value. It is telling to see my new implementation of PyString_AS_STRING, as follows (casts and extra parentheses removed for legibility): #define PyString_AS_STRING(x) ( x->ob_sval ? x->ob_sval : PyString_AsString(x) ) This adds a layer of indirection for the string and a branch, adding a tiny (but measurable) slowdown to the general case. Again, because the changes to PyStringObject are hidden by this macro, external users of these objects don't notice the difference. The patch is posted, and I have donned the thickest skin I have handy. I look forward to your feedback. Cheers, /larry/ From greg at electricrain.com Thu Oct 5 21:28:58 2006 From: greg at electricrain.com (Gregory P. 
Smith) Date: Thu, 5 Oct 2006 12:28:58 -0700 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: <4523F890.9060804@hastings.org> References: <4523F890.9060804@hastings.org> Message-ID: <20061005192858.GA9435@zot.electricrain.com> > I've never liked the "".join([]) idiom for string concatenation; in my > opinion it violates the principles "Beautiful is better than ugly." and > "There should be one-- and preferably only one --obvious way to do it.". > (And perhaps several others.) To that end I've submitted patch #1569040 > to SourceForge: > > http://sourceforge.net/tracker/index.php?func=detail&aid=1569040&group_id=5470&atid=305470 > This patch speeds up using + for string concatenation. yay! i'm glad to see this. i hate the "".join syntax. i still write that as string.join() because thats at least readable). it also fixes the python idiom for fast string concatenation as intended; anyone whos ever written code that builds a large string value by pushing substrings into a list only to call join later should agree. mystr = "prefix" while bla: #... mystr += moredata is much nicer to read than mystr = "prefix" strParts = [mystr] while bla: #... strParts.append(moredata) mystr = "".join(strParts) have you run any generic benchmarks such as pystone to get a better idea of what the net effect on "typical" python code is? From jcarlson at uci.edu Thu Oct 5 22:05:09 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Thu, 05 Oct 2006 13:05:09 -0700 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: <20061005192858.GA9435@zot.electricrain.com> References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> Message-ID: <20061005130119.0951.JCARLSON@uci.edu> "Gregory P. 
Smith" wrote: > > > I've never liked the "".join([]) idiom for string concatenation; in my > > opinion it violates the principles "Beautiful is better than ugly." and > > "There should be one-- and preferably only one --obvious way to do it.". > > (And perhaps several others.) To that end I've submitted patch #1569040 > > to SourceForge: > > > > http://sourceforge.net/tracker/index.php?func=detail&aid=1569040&group_id=5470&atid=305470 > > This patch speeds up using + for string concatenation. > > yay! i'm glad to see this. i hate the "".join syntax. i still write > that as string.join() because thats at least readable). it also fixes > the python idiom for fast string concatenation as intended; anyone > whos ever written code that builds a large string value by pushing > substrings into a list only to call join later should agree. > > mystr = "prefix" > while bla: > #... > mystr += moredata Regardless of "nicer to read", I would just point out that Guido has stated that Python will not have strings implemented as trees. Also, Python 3.x will have a data type called 'bytes', which will be the default return of file.read() (when files are opened as binary), which uses an over-allocation strategy like lists to get relatively fast concatenation (on the order of lst1 += lst2). - Josiah From nicko at nicko.org Thu Oct 5 22:29:28 2006 From: nicko at nicko.org (Nicko van Someren) Date: Thu, 5 Oct 2006 21:29:28 +0100 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: <20061005192858.GA9435@zot.electricrain.com> References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> Message-ID: On 5 Oct 2006, at 20:28, Gregory P. Smith wrote: >> I've never liked the "".join([]) idiom for string concatenation; >> in my >> opinion it violates the principles "Beautiful is better than >> ugly." and >> "There should be one-- and preferably only one --obvious way to do >> it.". 
>> (And perhaps several others.) To that end I've submitted patch >> #1569040 >> to SourceForge: >> >> http://sourceforge.net/tracker/index.php? >> func=detail&aid=1569040&group_id=5470&atid=305470 >> This patch speeds up using + for string concatenation. > > yay! i'm glad to see this. i hate the "".join syntax. Hear, hear. Being able to write what you mean and have the language get decent performance nonetheless seems to me to be a "good thing". > have you run any generic benchmarks such as pystone to get a better > idea of what the net effect on "typical" python code is? Yeah, "real world" performance testing is always important with anything that uses lazy evaluation. If you get to control if and when the computation actually happens you have even more scope than usual for getting the benchmark answer you want to see! Cheers, Nicko From larry at hastings.org Thu Oct 5 23:23:08 2006 From: larry at hastings.org (Larry Hastings) Date: Thu, 05 Oct 2006 14:23:08 -0700 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> Message-ID: <452577BC.9010506@hastings.org> Gregory P. Smith wrote: > have you run any generic benchmarks such as pystone to get a better > idea of what the net effect on "typical" python code is? I hadn't, but I'm happy to. On my machine (a fire-breathing Athlon 64 x2 4400+), best of three runs: Python 2.5 release: Pystone(1.1) time for 50000 passes = 1.01757 This machine benchmarks at 49136.8 pystones/second Python 2.5 concat: Pystone(1.1) time for 50000 passes = 0.963191 This machine benchmarks at 51910.8 pystones/second I'm surprised by this; I had expected it to be slightly *slower*, not the other way 'round. I'm not sure why this is. A cursory glance at pystone.py doesn't reveal any string concatenation using +, so I doubt it's benefiting from my speedup.
And I didn't change the optimization flags when I compiled Python, so that should be the same. Josiah Carlson wrote: > Regardless of "nicer to read", I would just point out that Guido has > stated that Python will not have strings implemented as trees. > I suspect it was more a caution that Python wouldn't *permanently* store strings as "ropes". In my patch, the rope only exists until someone asks for the string's value, at which point the tree is rendered and dereferenced. From that point on the object is exactly like a normal PyStringObject to the external viewer. But you and I are, as I believe the saying goes, "channeling Guido (badly)". Perhaps some adult supervision will intervene soon and make its opinions known. For what it's worth, I've realized two things I want to change about my patch: * I left in a couple of /* lch */ comments I used during development as markers to find my own code. Whoops; I'll strip those out. * I realized that, because of struct packing, all PyStringObjects are currently wasting an average of two bytes apiece. (As in, that's something Python 2.5 does, not something added by my code.) I'll change my patch so strings are allocated more precisely. If my string concatenation patch is declined, I'll be sure to submit this patch separately. I'll try to submit an updated patch today. Cheers, /larry/ From steve at holdenweb.com Fri Oct 6 08:35:19 2006 From: steve at holdenweb.com (Steve Holden) Date: Fri, 06 Oct 2006 07:35:19 +0100 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: <20061005192858.GA9435@zot.electricrain.com> References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> Message-ID: Gregory P. Smith wrote: >>I've never liked the "".join([]) idiom for string concatenation; in my >>opinion it violates the principles "Beautiful is better than ugly." and >>"There should be one-- and preferably only one --obvious way to do it.". 
>>(And perhaps several others.) To that end I've submitted patch #1569040 >>to SourceForge: >> >>http://sourceforge.net/tracker/index.php?func=detail&aid=1569040&group_id=5470&atid=305470 >>This patch speeds up using + for string concatenation. > > > yay! i'm glad to see this. i hate the "".join syntax. i still write > that as string.join() [...] instance.method(*args) <==> type.method(instance, *args) You can nowadays spell this as str.join("", lst) - no need to import a whole module! regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From fredrik at pythonware.com Fri Oct 6 08:38:24 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 06 Oct 2006 08:38:24 +0200 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> Message-ID: Steve Holden wrote: > instance.method(*args) <==> type.method(instance, *args) > > You can nowadays spell this as str.join("", lst) - no need to import a > whole module! 
except that str.join isn't polymorphic: >>> str.join(u",", ["1", "2", "3"]) Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: descriptor 'join' requires a 'str' object but received a 'unicode' >>> string.join(["1", "2", "3"], u",") u'1,2,3' From skip at pobox.com Fri Oct 6 12:45:10 2006 From: skip at pobox.com (skip at pobox.com) Date: Fri, 6 Oct 2006 05:45:10 -0500 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: <20061005192858.GA9435@zot.electricrain.com> References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> Message-ID: <17702.13238.684094.6289@montanaro.dyndns.org> Greg> have you run any generic benchmarks such as pystone to get a Greg> better idea of what the net effect on "typical" python code is? MAL's pybench would probably be better for this presuming it does some addition with string operands. Skip From fredrik at pythonware.com Fri Oct 6 12:54:13 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 06 Oct 2006 12:54:13 +0200 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: <17702.13238.684094.6289@montanaro.dyndns.org> References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> <17702.13238.684094.6289@montanaro.dyndns.org> Message-ID: skip at pobox.com wrote: > Greg> have you run any generic benchmarks such as pystone to get a > Greg> better idea of what the net effect on "typical" python code is? > > MAL's pybench would probably be better for this presuming it does some > addition with string operands. or stringbench.
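[The two spellings being debated throughout this thread can be timed directly with `timeit`. Below is a minimal sketch in modern Python 3 syntax rather than the 2.5-era code under discussion; absolute numbers vary by machine and interpreter version.]

```python
import timeit

def concat_plus(n):
    # Build a string with repeated +=; historically this was O(n**2)
    # copying, though CPython can often resize the string in place
    # when it holds the only reference to it.
    s = ""
    for _ in range(n):
        s += "spam "
    return s

def concat_join(n):
    # The "".join idiom: collect the parts in a list, join once at the end.
    parts = []
    for _ in range(n):
        parts.append("spam ")
    return "".join(parts)

# Both spellings build the same string.
assert concat_plus(100) == concat_join(100)

t_plus = timeit.timeit(lambda: concat_plus(1000), number=100)
t_join = timeit.timeit(lambda: concat_join(1000), number=100)
print("+=   : %.4f s" % t_plus)
print("join : %.4f s" % t_join)
```

[On builds where the in-place `+=` optimization applies, the loop is often competitive with the join idiom; the join idiom remains the portable choice since the optimization is an implementation detail, not a language guarantee.]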
From rrr at ronadam.com Fri Oct 6 13:37:09 2006 From: rrr at ronadam.com (Ron Adam) Date: Fri, 06 Oct 2006 06:37:09 -0500 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: <20061005192858.GA9435@zot.electricrain.com> References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> Message-ID: <45263FE5.3070604@ronadam.com> Gregory P. Smith wrote: >> I've never liked the "".join([]) idiom for string concatenation; in my >> opinion it violates the principles "Beautiful is better than ugly." and >> "There should be one-- and preferably only one --obvious way to do it.". >> (And perhaps several others.) To that end I've submitted patch #1569040 >> to SourceForge: >> >> http://sourceforge.net/tracker/index.php?func=detail&aid=1569040&group_id=5470&atid=305470 >> This patch speeds up using + for string concatenation. > > yay! i'm glad to see this. i hate the "".join syntax. i still write > that as string.join() because thats at least readable). it also fixes > the python idiom for fast string concatenation as intended; anyone > whos ever written code that builds a large string value by pushing > substrings into a list only to call join later should agree. Well I always like things to run faster, but I disagree that this idiom is broken. I like using lists to store substrings and I think it's just a matter of changing your frame of reference in how you think about them. For example it doesn't bother me to have a numeric type with many digits, and to have lists of many, many digit numbers, and work with those. Working with lists of many character strings is not that different. I've even come to the conclusion (just my opinion) that mutable lists of strings probably would work better than a long mutable string of characters in most situations. What I've found is there seems to be an optimum string length depending on what you are doing.
Too long (hundreds or thousands of characters) and repeating some string operations (not just concatenations) can be slow (relative to short strings), and using many short (single character) strings would use more memory than is needed. So a list of medium length strings is actually a very nice compromise. I'm not sure what the optimal string length is, but lines of about 80 columns seem to work very well for most things. I think what may be missing is a larger set of higher level string functions that will work with lists of strings directly. Then lists of strings can be thought of as a mutable string type by its use, and then working with substrings in lists and using ''.join() will not seem as out of place. So maybe instead of splitting, modifying, then joining, (and again, etc ...), just pass the whole list around and have operations that work directly on the list of strings and return a list of strings as the result. Pretty much what the Patch does under the covers, but it only works with concatenation. Having more functions that work with lists of strings directly will reduce the need for concatenation as well. Some operations that could work well with whole lists of strings of lines may be indent_lines, dedent_lines, prepend_lines, wrap_lines, and of course join_lines as in '\n'.join(L), the inverse of s.splitlines(), and there are also readlines() and writelines(). Also possibly find_line or find_in_lines(). These really shouldn't seem any more out of place than numeric operations that work with lists such as sum, max, and min. So to me... "".join(L) as a string operation that works on a list of strings seems perfectly natural.
:-) Cheers, Ron From fredrik at pythonware.com Fri Oct 6 13:55:09 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 06 Oct 2006 13:55:09 +0200 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: <45263FE5.3070604@ronadam.com> References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> <45263FE5.3070604@ronadam.com> Message-ID: Ron Adam wrote: > I think what may be missing is a larger set of higher level string functions > that will work with lists of strings directly. Then lists of strings can be > thought of as a mutable string type by its use, and then working with substrings > in lists and using ''.join() will not seem as out of place. as important is the observation that you don't necessarily have to join string lists; if the data ends up being sent over a wire or written to disk, you might as well skip the join step, and work directly from the list. (it's no accident that ET has grown "tostringlist" and "fromstringlist" functions, for example ;-) From amk at amk.ca Fri Oct 6 15:40:58 2006 From: amk at amk.ca (A.M. Kuchling) Date: Fri, 6 Oct 2006 09:40:58 -0400 Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS? Message-ID: <20061006134058.GA16266@localhost.localdomain> I was looking at the logs for classobject.c and noticed this commit that adds Py_TPFLAGS_HAVE_WEAKREFS to the instance type. Should it be backported to 2.4? (It looks to me like it should, but I don't know anything about weakref implementation and want to get approval from someone who knows.) --amk r39038 | rhettinger | 2005-06-19 04:42:20 -0400 (Sun, 19 Jun 2005) | 2 lines Insert missing flag. 
------------------------------------------------------------------------ Index: classobject.c =================================================================== --- classobject.c (revision 39037) +++ classobject.c (revision 39038) @@ -2486,7 +2486,7 @@ (getattrofunc)instancemethod_getattro, /* tp_getattro */ PyObject_GenericSetAttr, /* tp_setattro */ 0, /* tp_as_buffer */ - Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,/* tp_flags */ + Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC | Py_TPFLAGS_HAVE_WEAKREFS, /* tp_flags */ instancemethod_doc, /* tp_doc */ (traverseproc)instancemethod_traverse, /* tp_traverse */ 0, /* tp_clear */ svn merge -r 39037:39038 svn+ssh://pythondev at svn.python.org/python/trunk From rhettinger at ewtllc.com Fri Oct 6 17:48:15 2006 From: rhettinger at ewtllc.com (Raymond Hettinger) Date: Fri, 6 Oct 2006 08:48:15 -0700 Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS? Message-ID: <34FE2A7A34BC3544BC3127D023DF3D1212873F@EWTEXCH.office.bhtrader.com> No need to backport. Py_TPFLAGS_DEFAULT implies Py_TPFLAGS_HAVE_WEAKREFS. The change was for clarity -- most things that have the weakref slots filled-in will also make the flag explicit -- that makes it easier on the brain when verifying code that checks the weakref flag. Raymond -----Original Message----- From: python-dev-bounces+rhettinger=ewtllc.com at python.org [mailto:python-dev-bounces+rhettinger=ewtllc.com at python.org] On Behalf Of A.M. Kuchling Sent: Friday, October 06, 2006 6:41 AM To: python-dev at python.org Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS? I was looking at the logs for classobject.c and noticed this commit that adds Py_TPFLAGS_HAVE_WEAKREFS to the instance type. Should it be backported to 2.4? (It looks to me like it should, but I don't know anything about weakref implementation and want to get approval from someone who knows.) --amk r39038 | rhettinger | 2005-06-19 04:42:20 -0400 (Sun, 19 Jun 2005) | 2 lines Insert missing flag. 
------------------------------------------------------------------------ Index: classobject.c =================================================================== --- classobject.c (revision 39037) +++ classobject.c (revision 39038) @@ -2486,7 +2486,7 @@ (getattrofunc)instancemethod_getattro, /* tp_getattro */ PyObject_GenericSetAttr, /* tp_setattro */ 0, /* tp_as_buffer */ - Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,/* tp_flags */ + Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC | Py_TPFLAGS_HAVE_WEAKREFS, /* tp_flags */ instancemethod_doc, /* tp_doc */ (traverseproc)instancemethod_traverse, /* tp_traverse */ 0, /* tp_clear */ svn merge -r 39037:39038 svn+ssh://pythondev at svn.python.org/python/trunk _______________________________________________ Python-Dev mailing list Python-Dev at python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/rhettinger%40ewtllc.com From jcarlson at uci.edu Fri Oct 6 18:03:29 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri, 06 Oct 2006 09:03:29 -0700 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: References: <45263FE5.3070604@ronadam.com> Message-ID: <20061006085913.0963.JCARLSON@uci.edu> Fredrik Lundh wrote: > > Ron Adam wrote: > > > I think what may be missing is a larger set of higher level string functions > > that will work with lists of strings directly. Then lists of strings can be > > thought of as a mutable string type by its use, and then working with substrings > > in lists and using ''.join() will not seem as out of place. > > as important is the observation that you don't necessarily have to join > string lists; if the data ends up being sent over a wire or written to > disk, you might as well skip the join step, and work directly from the list.
> > (it's no accident that ET has grown "tostringlist" and "fromstringlist" > functions, for example ;-) I've personally added a line-based abstraction with indent/dedent handling, etc., for the editor I use, which helps make macros and underlying editor functionality easier to write. - Josiah From amk at amk.ca Fri Oct 6 18:34:56 2006 From: amk at amk.ca (A.M. Kuchling) Date: Fri, 6 Oct 2006 12:34:56 -0400 Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS? In-Reply-To: <34FE2A7A34BC3544BC3127D023DF3D1212873F@EWTEXCH.office.bhtrader.com> References: <34FE2A7A34BC3544BC3127D023DF3D1212873F@EWTEXCH.office.bhtrader.com> Message-ID: <20061006163456.GA24036@rogue.amk.ca> On Fri, Oct 06, 2006 at 08:48:15AM -0700, Raymond Hettinger wrote: > The change was for clarity -- most things that have the weakref slots > filled-in will also make the flag explicit -- that makes it easier on > the brain when verifying code that checks the weakref flag. OK; I won't backport this. Thanks! --amk From bob at redivi.com Fri Oct 6 19:41:16 2006 From: bob at redivi.com (Bob Ippolito) Date: Fri, 6 Oct 2006 10:41:16 -0700 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> <45263FE5.3070604@ronadam.com> Message-ID: <6a36e7290610061041j6e501552kcb4f525668c96f55@mail.gmail.com> On 10/6/06, Fredrik Lundh wrote: > Ron Adam wrote: > > > I think what may be missing is a larger set of higher level string functions > > that will work with lists of strings directly. Then lists of strings can be > > thought of as a mutable string type by its use, and then working with substrings > > in lists and using ''.join() will not seem as out of place. 
> > as important is the observation that you don't necessarily have to join > string lists; if the data ends up being sent over a wire or written to > disk, you might as well skip the join step, and work directly from the list. > > (it's no accident that ET has grown "tostringlist" and "fromstringlist" > functions, for example ;-) The just make lists paradigm is used by Erlang too, it's called "iolist" there (it's not a type, just a convention). The lists can be nested though, so concatenating chunks of data for IO is always a constant time operation even if the chunks are already iolists. -bob From rrr at ronadam.com Fri Oct 6 21:53:01 2006 From: rrr at ronadam.com (Ron Adam) Date: Fri, 06 Oct 2006 14:53:01 -0500 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: <20061006085913.0963.JCARLSON@uci.edu> References: <45263FE5.3070604@ronadam.com> <20061006085913.0963.JCARLSON@uci.edu> Message-ID: <4526B41D.50805@ronadam.com> Josiah Carlson wrote: > Fredrik Lundh wrote: >> Ron Adam wrote: >> >>> I think what may be missing is a larger set of higher level string functions >>> that will work with lists of strings directly. Then lists of strings can be >>> thought of as a mutable string type by its use, and then working with substrings >>> in lists and using ''.join() will not seem as out of place. >> as important is the observation that you don't necessarily have to join >> string lists; if the data ends up being sent over a wire or written to >> disk, you might as well skip the join step, and work directly from the list. >> >> (it's no accident that ET has grown "tostringlist" and "fromstringlist" >> functions, for example ;-) > > I've personally added a line-based abstraction with indent/dedent > handling, etc., for the editor I use, which helps make macros and > underlying editor functionality easier to write. > > > - Josiah I've done the same thing just last week. 
I've started to collect them into a module called stringtools, but I see no reason why they can't reside in the string module. I think this may be just a case of collecting these types of routines together in one place so they can be reused easily because they already are scattered around Python's library in some form or another. Another tool I found tucked away is the console pager that is used in pydoc. I think it could easily be a separate module itself. And it benefits from the line-based abstraction as well. Cheers, Ron From nicko at nicko.org Sat Oct 7 04:21:08 2006 From: nicko at nicko.org (Nicko van Someren) Date: Sat, 7 Oct 2006 03:21:08 +0100 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: <45263FE5.3070604@ronadam.com> References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> <45263FE5.3070604@ronadam.com> Message-ID: On 6 Oct 2006, at 12:37, Ron Adam wrote: >>> I've never liked the "".join([]) idiom for string concatenation; >>> in my >>> opinion it violates the principles "Beautiful is better than >>> ugly." and >>> "There should be one-- and preferably only one --obvious way to >>> do it.". ... > Well I always like things to run faster, but I disagree that this > idiom is broken. > > I like using lists to store sub strings and I think it's just a > matter of > changing your frame of reference in how you think about them. I think that you've hit on exactly the reason why this patch is a good idea. You happen to like to store strings in lists, and in many situations this is a fine thing to do, but if one is forced to change one's frame of reference in order to get decent performance then as well as violating the maxims Larry originally cited you're also hitting both "readability counts" and "Correctness and clarity before speed."
The "".join(L) idiom is not "broken" in the sense that, to the fluent Python programmer, it does convey the intent as well as the action. That said, there are plenty of places that you'll see it not being used because it fails to convey the intent. It's pretty rare to see someone write: for k,v in d.items(): print " has value: ".join([k,v]) but, despite the utility of the % operator on strings it's pretty common to see: print k + " has value: " + v This patch _seems_ to be able to provide better performance for this sort of usage and provide a major speed-up for some other common usage forms without causing the programmer to resort making their code more complicated. The cost seems to be a small memory hit on the size of a string object, a tiny increase in code size and some well isolated, under-the-hood complexity. It's not like having this patch is going to force anyone to change the way they write their code. As far as I can tell it simply offers better performance if you choose to express your code in some common ways. If it speeds up pystone by 5.5% with such minimal down side I'm hard pressed to see a reason not to use it. Cheers, Nicko From kbk at shore.net Sat Oct 7 06:18:50 2006 From: kbk at shore.net (Kurt B. 
Kaiser) Date: Sat, 7 Oct 2006 00:18:50 -0400 (EDT) Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200610070418.k974IoNN008046@bayview.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 428 open ( +6) / 3417 closed ( +2) / 3845 total ( +8) Bugs : 939 open ( +6) / 6229 closed (+17) / 7168 total (+23) RFE : 240 open ( +3) / 239 closed ( +0) / 479 total ( +3) New / Reopened Patches ______________________ Speed up using + for string concatenation (2006-10-02) http://python.org/sf/1569040 opened by Larry Hastings Speed-up in array_repeat() (2006-10-02) http://python.org/sf/1569291 opened by Lars Skovlund Fix building the source within exec_prefix (2006-10-03) http://python.org/sf/1569798 opened by Matthias Klose distutils - python 2.5 vc8 - non working setup (2006-10-03) CLOSED http://python.org/sf/1570119 opened by Grzegorz Makarewicz Fix for compilation errors in the 2.4 branch (2006-10-03) CLOSED http://python.org/sf/1570253 opened by ?iga Seilnacht qtsupport.py mistake leads to bad _Qt module (2006-10-04) http://python.org/sf/1570672 opened by Jeff Senn Generate numeric/space/linebreak from Unicode database. 
(2006-10-05) http://python.org/sf/1571184 opened by Anders Chrigstr?m make trace.py --ignore-dir work (2006-10-05) http://python.org/sf/1571379 opened by Clinton Roy Patches Closed ______________ distutils - python 2.5 vc8 - non working setup (2006-10-03) http://python.org/sf/1570119 closed by loewis Fix for compilation errors in the 2.4 branch (2006-10-03) http://python.org/sf/1570253 closed by loewis New / Reopened Bugs ___________________ Test for uintptr_t seems to be incorrect (2006-10-01) CLOSED http://python.org/sf/1568842 opened by Ronald Oussoren http redirect does not pass 'post' data (2006-10-02) CLOSED http://python.org/sf/1568897 opened by hans_moleman 'all' documentation missing online (2006-09-26) CLOSED http://python.org/sf/1565797 reopened by aisaac0 Using .next() on file open in write mode writes junk to file (2006-10-01) http://python.org/sf/1569057 opened by andrei kulakov External codecs no longer usable (2006-10-02) CLOSED http://python.org/sf/1569084 opened by Ivan Vilata i Balaguer sys.settrace cause curried parms to show up as attributes (2006-10-02) http://python.org/sf/1569356 opened by applebucks sys.settrace cause curried parms to show up as attributes (2006-10-02) CLOSED http://python.org/sf/1569374 opened by applebucks PGIRelease linkage fails on pgodb80.dll (2006-10-02) http://python.org/sf/1569517 opened by Coatimundi Backward incompatibility in logging.py (2006-10-02) CLOSED http://python.org/sf/1569622 opened by Mike Klaas datetime.datetime subtraction bug (2006-10-02) CLOSED http://python.org/sf/1569623 opened by David Fugate mailbox.Maildir.get_folder() loses factory information (2006-10-03) http://python.org/sf/1569790 opened by Matthias Klose distutils don't respect standard env variables (2006-10-03) CLOSED http://python.org/sf/1569886 opened by Lukas Lalinsky 2.5 incorrectly permits break inside try statement (2006-10-04) CLOSED http://python.org/sf/1569998 opened by Nick Coghlan redirected cookies (2006-10-04) 
http://python.org/sf/1570255 opened by hans_moleman Launcher reset to factory button provides bad command-line (2006-10-03) http://python.org/sf/1570284 opened by jjackson 2.4 & 2.5 can't create win installer on linux (2006-10-04) http://python.org/sf/1570417 opened by Richard Jones _ssl module can't be built on windows (2006-10-05) CLOSED http://python.org/sf/1571023 opened by ?iga Seilnacht simple moves freeze IDLE (2006-10-04) http://python.org/sf/1571112 opened by Douglas W. Goodall Some numeric characters are still not recognized (2006-10-05) http://python.org/sf/1571170 opened by Anders Chrigstr?m round() producing -0.0 (2006-10-05) CLOSED http://python.org/sf/1571620 opened by Ron Frye Building using Sleepycat db 4.5.20 is broken (2006-10-05) http://python.org/sf/1571754 opened by Robert Scheck email module does not complay with RFC 2046: CRLF issue (2006-10-05) http://python.org/sf/1571841 opened by Andy Leszczynski .eml attachments in email (2006-10-06) http://python.org/sf/1572084 opened by rainwolf8472 parser stack overflow (2006-10-06) http://python.org/sf/1572320 opened by j?rgen urner csv "dialect = 'excel-tab'" to use excel_tab (2006-10-06) http://python.org/sf/1572471 opened by Dan Goldner Bugs Closed ___________ Test for uintptr_t seems to be incorrect (2006-10-01) http://python.org/sf/1568842 closed by loewis http redirect does not pass 'post' data (2006-10-01) http://python.org/sf/1568897 closed by loewis Spurious Tabnanny error (2006-09-21) http://python.org/sf/1562716 closed by kbk Spurious Tab/space error (2006-09-21) http://python.org/sf/1562719 closed by kbk plistlib should be moved out of plat-mac (2003-07-29) http://python.org/sf/779460 closed by gbrandl Pythonw doesn't get rebuilt if version number changes (2006-09-05) http://python.org/sf/1552935 closed by sf-robot 'all' documentation missing online (2006-09-26) http://python.org/sf/1565797 closed by loewis External codecs no longer usable (2006-10-02) http://python.org/sf/1569084 closed 
by lemburg sys.settrace cause curried parms to show up as attributes (2006-10-02) http://python.org/sf/1569374 closed by gbrandl Backward incompatibility in logging.py (2006-10-02) http://python.org/sf/1569622 closed by vsajip datetime.datetime subtraction bug (2006-10-02) http://python.org/sf/1569623 closed by tim_one Output of KlocWork on Python2.4.3 sources (2006-05-09) http://python.org/sf/1484556 closed by loewis possible bug in mystrtol.c with recent gcc (2006-07-13) http://python.org/sf/1521947 closed by arigo gcc trunk (4.2) exposes a signed integer overflows (2006-08-24) http://python.org/sf/1545668 closed by arigo distutils don't respect standard env variables (2006-10-03) http://python.org/sf/1569886 closed by loewis 2.5 incorrectly permits break inside try statement (2006-10-03) http://python.org/sf/1569998 closed by jhylton _ssl module can't be built on windows (2006-10-05) http://python.org/sf/1571023 closed by loewis round() producing -0.0 (2006-10-05) http://python.org/sf/1571620 closed by rhettinger New style classes and __hash__ (2002-12-30) http://python.org/sf/660098 closed by gvanrossum New / Reopened RFE __________________ Improvements to socket module exceptions (2006-10-06) http://python.org/sf/1571878 opened by GaryD help(x) for for keywords too (2006-10-06) http://python.org/sf/1572210 opened by Jim Jewett From rrr at ronadam.com Sat Oct 7 06:23:00 2006 From: rrr at ronadam.com (Ron Adam) Date: Fri, 06 Oct 2006 23:23:00 -0500 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> <45263FE5.3070604@ronadam.com> Message-ID: <45272BA4.2020208@ronadam.com> Nicko van Someren wrote: > On 6 Oct 2006, at 12:37, Ron Adam wrote: > >>>> I've never liked the "".join([]) idiom for string concatenation; in my >>>> opinion it violates the principles "Beautiful is better than ugly." 
and >>>> "There should be one-- and preferably only one --obvious way to do >>>> it.". > ... >> Well I always like things to run faster, but I disagree that this >> idiom is broken. >> >> I like using lists to store sub strings and I think it's just a matter of >> changing your frame of reference in how you think about them. > > I think that you've hit on exactly the reason why this patch is a good > idea. You happen to like to store strings in lists, and in many > situations this is a fine thing to do, but if one is forced to change > ones frame of reference in order to get decent performance then as well > as violating the maxims Larry originally cited you're also hitting both > "readability counts" and "Correctness and clarity before speed." The statement ".. if one is forced to change .." is a bit overstated I think. The situation is more a matter of increasing awareness so the frame of reference comes to mind more naturally and doesn't seem forced. And the suggestion of how to do that is by adding additional functions and methods that can use lists-of-strings instead of having to join or concatenate them first. Added examples and documentation can also do that as well. The two ideas are non-competing. They are related because they realize their benefits by reducing redundant underlying operations in a similar way. Cheers, Ron From jcarlson at uci.edu Sat Oct 7 09:51:23 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Sat, 07 Oct 2006 00:51:23 -0700 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: References: <45263FE5.3070604@ronadam.com> Message-ID: <20061007004620.0979.JCARLSON@uci.edu> Nicko van Someren wrote: > It's not like having this patch is going to force anyone to change > the way they write their code. As far as I can tell it simply offers > better performance if you choose to express your code in some common > ways. 
If it speeds up pystone by 5.5% with such minimal down side > I'm hard pressed to see a reason not to use it. This has to wait until Python 2.6 (which is anywhere from 14-24 months away, according to history); including it would destroy binary compatibility with modules compiled for 2.5, never mind that it is a nontrivial feature addition. I also think that the original author (or one of this patch's supporters) should write a PEP outlining the Python 2.5 and earlier drawbacks, what changes this implementation brings, its improvements, and any potential drawbacks. - Josiah From fredrik at pythonware.com Sat Oct 7 10:17:23 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat, 07 Oct 2006 10:17:23 +0200 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> <45263FE5.3070604@ronadam.com> Message-ID: Nicko van Someren wrote: > If it speeds up pystone by 5.5% with such minimal down side > I'm hard pressed to see a reason not to use it. can you tell me where exactly "pystone" does string concatenations? From skip at pobox.com Sat Oct 7 13:53:43 2006 From: skip at pobox.com (skip at pobox.com) Date: Sat, 7 Oct 2006 06:53:43 -0500 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> <45263FE5.3070604@ronadam.com> Message-ID: <17703.38215.196524.424167@montanaro.dyndns.org> Fredrik> Nicko van Someren wrote: >> If it speeds up pystone by 5.5% with such minimal down side I'm hard >> pressed to see a reason not to use it. Fredrik> can you tell me where exactly "pystone" does string Fredrik> concatenations? I wondered about that as well.
While I'm not prepared to assert without a doubt that pystone does no simpleminded string concatenation, a couple minutes scanning the pystone source didn't turn up any. If the pystone speedup isn't an artifact, the absence of string concatenation in pystone suggests it's happening somewhere in the interpreter. I applied the patch, ran the interpreter under gdb with a breakpoint set in string_concat where the PyStringConcatenationObject is created, then ran pystone. The first hit was in site.py -> distutils/util.py -> string.py All told, there were only 22 hits, none for very long strings, so that doesn't explain the performance improvement. BTW, on my Mac (OSX 10.4.8) max() is not defined. I had to add a macro definition to string_concat. Skip From g.brandl at gmx.net Sat Oct 7 14:01:50 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 07 Oct 2006 14:01:50 +0200 Subject: [Python-Dev] Security Advisory for unicode repr() bug? Message-ID: [ Bug http://python.org/sf/1541585 ] This seems to be handled like a security issue by linux distributors, it's also a news item on security related pages. Should a security advisory be written and official patches be provided? Georg From skip at pobox.com Sat Oct 7 14:16:37 2006 From: skip at pobox.com (skip at pobox.com) Date: Sat, 7 Oct 2006 07:16:37 -0500 Subject: [Python-Dev] Security Advisory for unicode repr() bug? In-Reply-To: References: Message-ID: <17703.39589.473518.217002@montanaro.dyndns.org> Georg> [ Bug http://python.org/sf/1541585 ] Georg> This seems to be handled like a security issue by linux Georg> distributors, it's also a news item on security related pages. Georg> Should a security advisory be written and official patches be Georg> provided? I asked about this a few weeks ago. I got no direct response. Secunia sent mail to webmaster and the SF project admins asking about how this could be exploited. (Isn't figuring that stuff out their job?)
This was corrected before 2.5 was released and the 2.4 source has (I think) already been patched, with 2.4.4 right around the corner. The bulk of the Python installations in the field are probably running on Windows (most of them provided by HP/Compaq), and it seems the Linux vendors are all over it. I don't know if Apple has picked up on it (or if the version they currently distribute is affected - 2.3.5 built Oct 5 2005). Would you provide a patch of some sort for Windows or just refer people to corrected installers? Given the apparently miserable results trying to get Windows users to install security fixes manually, I doubt a new 2.4.3 Windows installer would get much exercise. Skip From g.brandl at gmx.net Sat Oct 7 14:27:09 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 07 Oct 2006 14:27:09 +0200 Subject: [Python-Dev] Security Advisory for unicode repr() bug? In-Reply-To: <17703.39589.473518.217002@montanaro.dyndns.org> References: <17703.39589.473518.217002@montanaro.dyndns.org> Message-ID: <45279D1D.10102@gmx.net> skip at pobox.com wrote: > Georg> [ Bug http://python.org/sf/1541585 ] > > Georg> This seems to be handled like a security issue by linux > Georg> distributors, it's also a news item on security related pages. > > Georg> Should a security advisory be written and official patches be > Georg> provided? > > I asked about this a few weeks ago. I got no direct response. Secunia sent > mail to webmaster and the SF project admins asking about how this could be > exploited. (Isn't figuring that stuff out their job?) Perhaps, judging from the name :) > This was corrected before 2.5 was released and the 2.4 source has (I think) > already been patched, with 2.4.4 right around the corner. The bulk of the > Python installations in the field are probably running on Windows (most of > them provided by HP/Compaq), and it seems the Linux vendors are all over it. 
> I don't know if Apple has picked up on it (or if the version they currently > distribute is affected - 2.3.5 built Oct 5 2005). Would you provide a patch > of some sort for Windows or just refer people to corrected installers? > Given the apparently miserable results trying to get Windows users to > install security fixes manually, I doubt a new 2.4.3 Windows installer would > get much exercise. Even if the patch / corrected installer is used by only 1% of all installations, reacting quickly and providing it in the first place is going to make a much better impression than saying "well, nobody is going to apply it and the next release is due in a few weeks". [CC'ing security at python.org] Georg From mal at egenix.com Sat Oct 7 16:36:00 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Sat, 07 Oct 2006 16:36:00 +0200 Subject: [Python-Dev] Security Advisory for unicode repr() bug? In-Reply-To: <45279D1D.10102@gmx.net> References: <17703.39589.473518.217002@montanaro.dyndns.org> <45279D1D.10102@gmx.net> Message-ID: <4527BB50.3040503@egenix.com> Georg Brandl wrote: > skip at pobox.com wrote: >> Georg> [ Bug http://python.org/sf/1541585 ] >> >> Georg> This seems to be handled like a security issue by linux >> Georg> distributors, it's also a news item on security related pages. >> >> Georg> Should a security advisory be written and official patches be >> Georg> provided? >> >> I asked about this a few weeks ago. I got no direct response. Secunia sent >> mail to webmaster and the SF project admins asking about how this could be >> exploited. (Isn't figuring that stuff out their job?) > > Perhaps, judging from the name :) > >> This was corrected before 2.5 was released and the 2.4 source has (I think) >> already been patched, with 2.4.4 right around the corner. The bulk of the >> Python installations in the field are probably running on Windows (most of >> them provided by HP/Compaq), and it seems the Linux vendors are all over it. 
>> I don't know if Apple has picked up on it (or if the version they currently >> distribute is affected - 2.3.5 built Oct 5 2005). Would you provide a patch >> of some sort for Windows or just refer people to corrected installers? >> Given the apparently miserable results trying to get Windows users to >> install security fixes manually, I doubt a new 2.4.3 Windows installer would >> get much exercise. > > Even if the patch / corrected installer is used by only 1% of all installations, > reacting quickly and providing it in the first place is going to make a much > better impression than saying "well, nobody is going to apply it and the next > release is due in a few weeks". Note that the bug refers to a UCS4 Python build. Most Linux distros ship UCS4 builds nowadays, so they care. The Windows builds are UCS2 (except maybe the ones for Win64 - don't know) which doesn't seem to be affected. +1 on publishing the patch for 2.4. It's always better to react quickly in such cases, even if it just gives users a fuzzy warm feeling of being cared for :-) Whether such patches get installed or not is not really a question to ask, since it's not within our responsibility. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 07 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From nnorwitz at gmail.com Sat Oct 7 22:33:44 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Sat, 7 Oct 2006 13:33:44 -0700 Subject: [Python-Dev] Security Advisory for unicode repr() bug? 
In-Reply-To: <17703.39589.473518.217002@montanaro.dyndns.org> References: <17703.39589.473518.217002@montanaro.dyndns.org> Message-ID: On 10/7/06, skip at pobox.com wrote: > > Georg> [ Bug http://python.org/sf/1541585 ] > > Georg> This seems to be handled like a security issue by linux > Georg> distributors, it's also a news item on security related pages. > > Georg> Should a security advisory be written and official patches be > Georg> provided? > > I asked about this a few weeks ago. I got no direct response. Secunia sent > mail to webmaster and the SF project admins asking about how this could be > exploited. (Isn't figuring that stuff out their job?) FWIW, I responded to the original mail from Secunia with what little I know about the problem. Everyone on the original mail was copied. However, I got ~30 bounces for all the Source Forge addresses due to some issue between SF and Google mail. n From talin at acm.org Sun Oct 8 00:10:58 2006 From: talin at acm.org (Talin) Date: Sat, 07 Oct 2006 15:10:58 -0700 Subject: [Python-Dev] Python Doc problems In-Reply-To: <17693.6670.189595.646482@montanaro.dyndns.org> References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org> <20060928095951.08BF.JCARLSON@uci.edu> <17692.28063.224114.905464@uwakimon.sk.tsukuba.ac.jp> <20060929121035.GA4884@localhost.localdomain> <17693.6670.189595.646482@montanaro.dyndns.org> Message-ID: <452825F2.4000803@acm.org> skip at pobox.com wrote: > Andrew> In such autogenerated documentation, you wind up with a list of > Andrew> every single class and function, and both trivial and important > Andrew> classes are given exactly the same emphasis. > > I find this true where I work as well. Doxygen is used as a documentation > generation tool for our C++ class libraries. Too many people use that as a > crutch to often avoid writing documentation altogether. 
It's worse in many > ways than tools like epydoc, because you don't need to write any docstrings > (or specially formatted comments) to generate reams and reams of virtual > paper. This sort of documentation is all but useless for a Python > programmer like myself. I don't really need to know the five syntactic > constructor variants. I need to know how to use the classes which have been > exposed to me. As someone who has submitted patches to Doxygen (and actually had them accepted), I have to say that I agree as well. At my work, it used to be standard practice for each project to have a web site of "documentation" that was generated by Doxygen. Part of the reason for my patches (which added support for parsing of C# doctags) was in support of this effort. However, I gradually realized that there's no actual use-case for Doxygen-generated docs in our environment. Think about the work cycle of a typical C++ programmer. Generally when you need to look up something in the docs for a module, you either need specific information on the type of a variable or params of a function, or you need "overview" docs that explain the general theory of the module. Bear in mind also that the typical C++ programmer is working inside of an IDE or other smart editor. Most such editors have a simple one-keystroke method of navigating from a symbol to its definition. In other words, it is *far* easier for a programmer to jump directly to the actual declaration in a header file - and its accompanying documentation comments - than it is to switch over to a web browser, navigate to the documentation site, type in the name of the symbol, hit search...why would I *ever* use HTML reference documentation when I can just look at the source, which is much easier to get to? Especially since the source often tells me much more than the docs would. 
The only reason for generated reference docs is when you are working on a module where you don't have the source code - which, even in a proprietary environment, is something to be avoided whenever possible. (The source may not be 'open', but that doesn't mean that *you* can't have access to it.) If you have the source - and a good indexing system in your IDE - there's really no need for Doxygen. Of course, the web-based docs are useful when you need an overview - but Doxygen doesn't give you that. As a result, I have been trying to get people to stop using Doxygen as a "crutch" as you say - in other words, if a team has the responsibility to write docs for their code, they can't just run Doxygen over the source and call it done. (Too bad there's no way to automatically generate the overview! :) While I am in rant mode (sorry), I also want to mention that most Documentation markup systems also have a source readability cost - i.e having embedded tags like @param make the original source less readable; and given what I said above about the source being the primary reference doc, it doesn't make sense to clutter up the code with funny @#$ characters. If I was going to use any markup system in the future, the first thing I would insist is that the markup be "invisible" - in other words, the markup should look just like normal comments, and the markup scanner should be smart enough to pick out the structure without needing a lot of hand-holding. For example: /* Plot a point at position x, y. 'x' - The x-coordinate. 'y' - The y-coordinate. */ void Plot( int x, int y ); The scanner should note that: 'x' and 'y' are in single-quotes, so they probably refer to code identifiers. The scanner can see that they are both parameters to the function, so there's no need to tell it that 'x' is an @param. In other words, the programmer should never have to type anything that can be deduced from looking at the code itself. 
And the reader shouldn't have to read a bunch of redundant information which they can easily see for themselves. > I guess this is a long-winded way of saying, "me too". > > Skip ditto. -- Talin From nicko at nicko.org Sun Oct 8 01:01:11 2006 From: nicko at nicko.org (Nicko van Someren) Date: Sun, 8 Oct 2006 00:01:11 +0100 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> <45263FE5.3070604@ronadam.com> Message-ID: On 7 Oct 2006, at 09:17, Fredrik Lundh wrote: > Nicko van Someren wrote: > >> If it speeds up pystone by 5.5% with such minimal down side >> I'm hard pressed to see a reason not to use it. > > can you tell me where exactly "pystone" does string concatenations? No, not without more in depth examination, but it is a pretty common operation in all sorts of cases including inside the interpreter. Larry's message in reply to Gregory Smith's request for a pystone score showed a 5.5% improvement and as yet I have no reason to doubt it. If the patch provides a measurable performance improvement for code that merely happens to use strings as opposed to being explicitly heavy on string addition then all the better. It's clear that this needs to be more carefully measured before it goes in (which is why that quote above starts "If"). As I've mentioned before in this thread, getting good performance measures on code that does lazy evaluation is often tricky. pystone is a good place to start but I'm sure that there are use cases that it does not cover. As for counting up the downsides, Josiah Carlson rightly points out that it breaks binary compatibility for modules, so the change can not be taken lightly and clearly it will have to wait for a major release. Still, if the benefits outweigh the costs it seems worth doing. 
Cheers, Nicko From fredrik at pythonware.com Sun Oct 8 08:38:31 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sun, 08 Oct 2006 08:38:31 +0200 Subject: [Python-Dev] Python Doc problems In-Reply-To: <452825F2.4000803@acm.org> References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org> <20060928095951.08BF.JCARLSON@uci.edu> <17692.28063.224114.905464@uwakimon.sk.tsukuba.ac.jp> <20060929121035.GA4884@localhost.localdomain> <17693.6670.189595.646482@montanaro.dyndns.org> <452825F2.4000803@acm.org> Message-ID: Talin wrote: > /* > Plot a point at position x, y. > 'x' - The x-coordinate. > 'y' - The y-coordinate. > */ > void Plot( int x, int y ); > > The scanner should note that: 'x' and 'y' are in single-quotes, so they > probably refer to code identifiers. or maybe they're string literals? > The scanner can see that they are > both parameters to the function, so there's no need to tell it that 'x' > is an @param. PythonDoc provides multiple parameter markers, so you can distinguish between positional parameters and keyword arguments. > In other words, the programmer should never have to type anything that > can be deduced from looking at the code itself. And the reader shouldn't > have to read a bunch of redundant information which they can easily see > for themselves. that's exactly why you need parameter markers in today's Python: Python's function definition syntax doesn't allow the programmer to fully communicate the intent behind the design. (what's this post doing on python-dev, btw? should this discussion take place on the doc-sig?) From alastair at alastairs-place.net Sun Oct 8 14:20:59 2006 From: alastair at alastairs-place.net (Alastair Houghton) Date: Sun, 8 Oct 2006 13:20:59 +0100 Subject: [Python-Dev] Security Advisory for unicode repr() bug? In-Reply-To: <4527BB50.3040503@egenix.com> References: <17703.39589.473518.217002@montanaro.dyndns.org> <45279D1D.10102@gmx.net> <4527BB50.3040503@egenix.com> Message-ID: On Oct 7, 2006, at 3:36 PM, M.-A. 
Lemburg wrote: > Georg Brandl wrote: >> skip at pobox.com wrote: >>> I don't know if Apple has picked up on it (or if the version they >>> currently >>> distribute is affected - 2.3.5 built Oct 5 2005). > Note that the bug refers to a UCS4 Python build. Most Linux > distros ship UCS4 builds nowadays, so they care. The Windows > builds are UCS2 (except maybe the ones for Win64 - don't know) > which doesn't seem to be affected. AFAIK the version Apple ship is a UCS2 build, therefore not affected. Kind regards, Alastair. -- http://alastairs-place.net From skip at pobox.com Sun Oct 8 18:07:16 2006 From: skip at pobox.com (skip at pobox.com) Date: Sun, 8 Oct 2006 11:07:16 -0500 Subject: [Python-Dev] Can't check in on release25-maint branch Message-ID: <17705.8756.16093.615833@montanaro.dyndns.org> (I sent a note to pydotorg yesterday but got no response. Trying here.) I checked in a change to Doc/lib/libcsv.tex on the trunk yesterday, then tried backporting it to the release25-maint branch but failed due to permission problems. Thinking it might be lock contention, I waited a few minutes and tried a couple more times. Same result. I just tried again: subversion/libsvn_client/commit.c:832: (apr_err=13) svn: Commit failed (details follow): subversion/libsvn_ra_dav/util.c:368: (apr_err=13) svn: Can't create directory '/data/repos/projects/db/transactions/52226-1.txn': Permission denied subversion/clients/cmdline/util.c:380: (apr_err=13) svn: Your commit message was left in a temporary file: subversion/clients/cmdline/util.c:380: (apr_err=13) svn: '/Users/skip/src/python-svn/release25-maint/Doc/lib/svn-commit.4.tmp' Here's my svn status output: Path: . 
URL: http://svn.python.org/projects/python/branches/release25-maint Repository UUID: 6015fed2-1504-0410-9fe1-9d1591cc4771 Revision: 52226 Node Kind: directory Schedule: normal Last Changed Author: hyeshik.chang Last Changed Rev: 52225 Last Changed Date: 2006-10-08 09:01:45 -0500 (Sun, 08 Oct 2006) Properties Last Updated: 2006-08-17 11:05:19 -0500 (Thu, 17 Aug 2006) I believe I've got the right thing checked out. Can someone look into this? Thanks, Skip From gerrit at nl.linux.org Fri Oct 6 14:35:21 2006 From: gerrit at nl.linux.org (Gerrit Holl) Date: Fri, 6 Oct 2006 14:35:21 +0200 Subject: [Python-Dev] what's really new in python 2.5 ? In-Reply-To: <20061003180848.GB31361@localhost.localdomain> References: <20061003143015.GA25511@localhost.localdomain> <200610031039.52434.fdrake@acm.org> <20061003180848.GB31361@localhost.localdomain> Message-ID: <20061006123521.GA30474@topjaklont.student.utwente.nl> On 2006-10-03 20:10:14 +0200, A.M. Kuchling wrote: > I've added a robots.txt to keep crawlers out of /dev/. Isn't there a lot of useful, search-engine worthy stuff in /dev? I search for peps with google, and I suppose the 'explanation' section, as well as the developer faq and subversion instructions, are good pages that deserve to be in the google index. Should /dev really be Disallow:'ed entirely in robots.txt? kind regards, Gerrit Holl. From okuda1 at llnl.gov Fri Oct 6 17:06:12 2006 From: okuda1 at llnl.gov (Chuzo Okuda) Date: Fri, 06 Oct 2006 08:06:12 -0700 Subject: [Python-Dev] PSF Infrastructure Committee's recommendation for anew issue tracker Message-ID: <452670E4.3050201@llnl.gov> I am willing to volunteer. I emailed previously, but it bounced back. Hope this time it reaches you. 
Chuzo From okuda1 at llnl.gov Sat Oct 7 00:58:26 2006 From: okuda1 at llnl.gov (Chuzo Okuda) Date: Fri, 06 Oct 2006 15:58:26 -0700 Subject: [Python-Dev] PSF Infrastructure Committee's recommendation for anew issue tracker Message-ID: <4526DF92.5080800@llnl.gov> I received the bounced email as follow. How do I become a member? Thank you Chuzo Your mail to 'Python-Dev' with the subject [Python-Dev] PSF Infrastructure Committee's recommendation for anew issue tracker Is being held until the list moderator can review it for approval. The reason it is being held: Post by non-member to a members-only list From bsittler at gmail.com Sun Oct 8 01:57:04 2006 From: bsittler at gmail.com (Benjamin C. Wiley Sittler) Date: Sat, 07 Oct 2006 16:57:04 -0700 Subject: [Python-Dev] Security Advisory for unicode repr() bug? Message-ID: <1160265424.5695.37.camel@localhost.localdomain> (i'm not on python-dev, so i dunno whether this will make it through...) basically, this bug does not affect the vast majority (mac and windows users with UTF-16 "narrow" unicode Python builds) because the unpatched code allocates sufficient memory in this case. only the minority treating this as a serious vulnerability (linux users with UTF-32 "wide" unicode Python builds, possibly some other Unix-like operating systems too) are affected by the buffer overrun. as for secunia, they need to do their own homework ;) i found this bug and wrote the patch that's been applied by the linux distros, so i thought i should clear up a couple of apparent misconceptions. please pardon me if i'm writing stuff you already know... the bug concerns allocation in repr() for unicode objects. 
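[An aside for readers of the archive: the narrow/wide build distinction that the explanation below turns on can be checked from Python itself via sys.maxunicode. A minimal sketch, written for a modern Python 3 so it runs today — note that since Python 3.3 (PEP 393) the distinction is gone and every build reports the full code point range:]

```python
import sys

# sys.maxunicode distinguishes the two build flavors discussed here:
# 65535 (0xFFFF) on UTF-16 "narrow" builds, 1114111 (0x10FFFF) on
# UCS-4 "wide" builds. Since Python 3.3 (PEP 393) all builds report
# 0x10FFFF, so on a current interpreter this always prints "wide".
if sys.maxunicode == 0xFFFF:
    flavor = "narrow (UTF-16; non-BMP chars stored as surrogate pairs)"
else:
    flavor = "wide (one code unit per code point)"

print("this interpreter is a %s build" % flavor)

# On a narrow build a non-BMP character occupied two string elements
# (a surrogate pair); on a wide build it occupies one.
print(len("\U00010000"))
```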
previously repr() always allocated 6 bytes in the output buffer per input unicode string element; this is enough for the six-byte "\uffff" notation and on UTF-16 python builds enough for the ten-byte "\U0010ffff" notation, since on UTF-16 python builds the input unicode string contains a surrogate pair (two consecutive elements) to represent unicode characters requiring this longer notation, meaning five bytes per element. however on UTF-32 builds ten bytes per unicode string element are needed, and this is what the patch accomplishes. the previous (incorrect) algorithm extended the buffer by 100 bytes in some cases when encountering such a character, however this fixed-size heuristic extension fails when the string contains many subsequent characters in the six-byte "\uffff" form, as demonstrated by this test which will fail in an unpatched non-debug wide python build: python2.4 -c 'assert(repr(u"\U00010000" * 39 + u"\uffff" * 4096)) == (repr(u"\U00010000" * 39 + u"\uffff" * 4096))' yes, a sufficiently motivated person could probably discover enough about the memory layout of a process to use this for data or code injection, but the more usual (and sometimes accidental) consequence is a crash. more background: python comes in two flavors, UTF-16 ("narrow") and UTF-32 ("wide"), depending on how the unicode chars are represented. This is generally configured to match the C library's wchar_t. UTF-16: Windows (at least 32-bit builds), Mac OS X (at least 32-bit builds), probably others too -- this uses a 16-bit variable-length encoding for Unicode characters: 1 16-bit word for U+0000 ... U+FFFF (identity mapped to 0x0000 ... 0xffff resp., a.k.a. the "UCS-2" range or Basic Multilingual Plane) and 2 16-bit words for U+00010000 ... U+0010FFFF (mapped as "surrogate pairs" to 0xd800; 0xdc00 ... 0xdbff; 0xdfff resp., corresponding to planes 1 through 16.) UTF-32/UCS-4: Linux, possibly others?
-- this uses 1 32-bit word per unicode character: 1 word for all codepoints allowed by Python U +0000 ... U+0010FFFF (identity mapped to 0x00000000L ... 0x0010ffffL resp.) > On 10/7/06, skip[at]pobox.com wrote: > > > > Georg> [ Bug http://python.org/sf/1541585 ] > > > > Georg> This seems to be handled like a security issue by linux > > Georg> distributors, it's also a news item on security related > pages. > > > > Georg> Should a security advisory be written and official patches > be > > Georg> provided? > > > > I asked about this a few weeks ago. I got no direct response. > Secunia sent > > mail to webmaster and the SF project admins asking about how this > could be > > exploited. (Isn't figuring that stuff out their job?) > > FWIW, I responded to the original mail from Secunia with what little > I > know about the problem. Everyone on the original mail was copied. > However, I got ~30 bounces for all the Source Forge addresses due to > some issue between SF and Google mail. > > n From g.brandl at gmx.net Sun Oct 8 18:16:43 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 08 Oct 2006 18:16:43 +0200 Subject: [Python-Dev] Can't check in on release25-maint branch In-Reply-To: <17705.8756.16093.615833@montanaro.dyndns.org> References: <17705.8756.16093.615833@montanaro.dyndns.org> Message-ID: skip at pobox.com wrote: > (I sent a note to pydotorg yesterday but got no response. Trying here.) > > I checked in a change to Doc/lib/libcsv.tex on the trunk yesterday, then > tried backporting it to the release25-maint branch but failed due to > permission problems. Thinking it might be lock contention, I waited a few > minutes and tried a couple more times. Same result. 
I just tried again: > > subversion/libsvn_client/commit.c:832: (apr_err=13) > svn: Commit failed (details follow): > subversion/libsvn_ra_dav/util.c:368: (apr_err=13) > svn: Can't create directory '/data/repos/projects/db/transactions/52226-1.txn': Permission denied > subversion/clients/cmdline/util.c:380: (apr_err=13) > svn: Your commit message was left in a temporary file: > subversion/clients/cmdline/util.c:380: (apr_err=13) > svn: '/Users/skip/src/python-svn/release25-maint/Doc/lib/svn-commit.4.tmp' > > Here's my svn status output: > > Path: . > URL: http://svn.python.org/projects/python/branches/release25-maint > Repository UUID: 6015fed2-1504-0410-9fe1-9d1591cc4771 > Revision: 52226 > Node Kind: directory > Schedule: normal > Last Changed Author: hyeshik.chang > Last Changed Rev: 52225 > Last Changed Date: 2006-10-08 09:01:45 -0500 (Sun, 08 Oct 2006) > Properties Last Updated: 2006-08-17 11:05:19 -0500 (Thu, 17 Aug 2006) > > I believe I've got the right thing checked out. It looks like you checked out from http://..., IIRC that's read-only. svn+ssh://pythondev at svn.python.org/python/... might work better. Georg From g.brandl at gmx.net Sun Oct 8 18:27:54 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 08 Oct 2006 18:27:54 +0200 Subject: [Python-Dev] what's really new in python 2.5 ? In-Reply-To: <20061006123521.GA30474@topjaklont.student.utwente.nl> References: <20061003143015.GA25511@localhost.localdomain> <200610031039.52434.fdrake@acm.org> <20061003180848.GB31361@localhost.localdomain> <20061006123521.GA30474@topjaklont.student.utwente.nl> Message-ID: Gerrit Holl wrote: > On 2006-10-03 20:10:14 +0200, A.M. Kuchling wrote: >> I've added a robots.txt to keep crawlers out of /dev/. > > Isn't there a lot of useful, search-engine worthy stuff in /dev? > I search for peps with google, and I suppose the 'explanation' section, > as well as the developer faq and subversion instructions, are good pages > that deserve to be in the google index. 
Should /dev really be > Disallow:'ed entirely in robots.txt? I think that refers to docs.python.org/dev. Georg From tim.peters at gmail.com Sun Oct 8 19:01:55 2006 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 8 Oct 2006 13:01:55 -0400 Subject: [Python-Dev] Can't check in on release25-maint branch In-Reply-To: References: <17705.8756.16093.615833@montanaro.dyndns.org> Message-ID: <1f7befae0610081001h7896d648n426dad287130cb44@mail.gmail.com> [Skip] > I checked in a change to Doc/lib/libcsv.tex on the trunk yesterday, then > tried backporting it to the release25-maint branch but failed due to > permission problems. Thinking it might be lock contention, I waited a few > minutes and tried a couple more times. Same result. I just tried again: ... > Here's my svn status output: > > Path: . > URL: http://svn.python.org/projects/python/branches/release25-maint As Georg said, looks like you did a read-only checkout. It /may/ (can't recall for sure, but think so) get you unstuck to do: svn switch --relocate \ http://svn.python.org/projects/python/branches/release25-maint \ svn+ssh://svn.python.org/python/branches/release25-maint from your checkout directory. If that works, it will go fast; if not, start over with an svn+ssh checkout. From fdrake at acm.org Sun Oct 8 19:34:08 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Sun, 8 Oct 2006 13:34:08 -0400 Subject: [Python-Dev] what's really new in python 2.5 ? In-Reply-To: <20061006123521.GA30474@topjaklont.student.utwente.nl> References: <20061003180848.GB31361@localhost.localdomain> <20061006123521.GA30474@topjaklont.student.utwente.nl> Message-ID: <200610081334.09333.fdrake@acm.org> On Friday 06 October 2006 08:35, Gerrit Holl wrote: > Isn't there a lot of useful, search-engine worthy stuff in /dev? > I search for peps with google, and I suppose the 'explanation' section, > as well as the developer faq and subversion instructions, are good pages > that deserve to be in the google index. 
Should /dev really be > Disallow:'ed entirely in robots.txt? As Georg noted, we've been discussing docs.python.org/dev/, which contains nightly builds of the documentation on a couple of branches. The material at www.python.org/dev/ is generally interesting, as you note, and remains open to crawlers. -Fred -- Fred L. Drake, Jr. From aahz at pythoncraft.com Sun Oct 8 19:48:10 2006 From: aahz at pythoncraft.com (Aahz) Date: Sun, 8 Oct 2006 10:48:10 -0700 Subject: [Python-Dev] PSF Infrastructure Committee's recommendation for anew issue tracker In-Reply-To: <4526DF92.5080800@llnl.gov> References: <4526DF92.5080800@llnl.gov> Message-ID: <20061008174810.GA16606@panix.com> On Fri, Oct 06, 2006, Chuzo Okuda wrote: > > I received the bounced email as follow. How do I become a member? Subscribe to the list. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "If you don't know what your program is supposed to do, you'd better not start writing it." --Dijkstra From skip at pobox.com Sun Oct 8 19:53:01 2006 From: skip at pobox.com (skip at pobox.com) Date: Sun, 8 Oct 2006 12:53:01 -0500 Subject: [Python-Dev] Can't check in on release25-maint branch In-Reply-To: <1f7befae0610081001h7896d648n426dad287130cb44@mail.gmail.com> References: <17705.8756.16093.615833@montanaro.dyndns.org> <1f7befae0610081001h7896d648n426dad287130cb44@mail.gmail.com> Message-ID: <17705.15101.350439.510383@montanaro.dyndns.org> Tim> As Georg said, looks like you did a read-only checkout. Thanks Georg & Tim. That was indeed the problem. I don't know why I've had such a hard time wrapping my head around Subversion. 
Skip From tim.peters at gmail.com Sun Oct 8 20:07:18 2006 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 8 Oct 2006 14:07:18 -0400 Subject: [Python-Dev] Can't check in on release25-maint branch In-Reply-To: <17705.15101.350439.510383@montanaro.dyndns.org> References: <17705.8756.16093.615833@montanaro.dyndns.org> <1f7befae0610081001h7896d648n426dad287130cb44@mail.gmail.com> <17705.15101.350439.510383@montanaro.dyndns.org> Message-ID: <1f7befae0610081107i56dfb731u529d699e257d409f@mail.gmail.com> [Skip] > Thanks Georg & Tim. That was indeed the problem. I don't know why I've had > such a hard time wrapping my head around Subversion. I have a theory about that: it's software <0.5 wink>. If it's any consolation, at the NFS sprint earlier this year, I totally blanked out on how to do a merge using SVN, despite that I've merged hundreds of times when working on ZODB's seemingly infinite collection of active branches. Luckily, I was only trying to help someone else do a merge at the time, so it frustrated them more than me ;-) From fredrik at pythonware.com Sun Oct 8 20:16:02 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sun, 08 Oct 2006 20:16:02 +0200 Subject: [Python-Dev] PSF Infrastructure Committee's recommendation for anew issue tracker In-Reply-To: <4526DF92.5080800@llnl.gov> References: <4526DF92.5080800@llnl.gov> Message-ID: Chuzo Okuda wrote: > I received the bounced email as follow. How do I become a member? the moderator has approved your message, and it has reached the right persons. I'm sure they'll get back to you soon. From brett at python.org Sun Oct 8 22:11:55 2006 From: brett at python.org (Brett Cannon) Date: Sun, 8 Oct 2006 13:11:55 -0700 Subject: [Python-Dev] PSF Infrastructure Committee's recommendation for anew issue tracker In-Reply-To: <452670E4.3050201@llnl.gov> References: <452670E4.3050201@llnl.gov> Message-ID: The email didn't bounce; it was just held for moderator approval (and it made it through). 
Just sit tight and we will be getting back to all of the volunteers in the near future (probably next week, no later than after this upcoming week). -Brett On 10/6/06, Chuzo Okuda wrote: > > I am willing to volunteer. I emailed previously, but it bounced back. > Hope this time it reaches you. > Chuzo > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061008/26213996/attachment.htm From larry at hastings.org Mon Oct 9 07:47:58 2006 From: larry at hastings.org (Larry Hastings) Date: Sun, 08 Oct 2006 22:47:58 -0700 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> <17702.13238.684094.6289@montanaro.dyndns.org> Message-ID: <4529E28E.3070800@hastings.org> Fredrik Lundh wrote: > skip at pobox.com wrote: > >> MAL's pybench would probably be better for this presuming it does some >> addition with string operands. >> > or stringbench. > I ran 'em, and they are strangely consistent with pystone. With concat, stringbench is ever-so-slightly faster overall. "172.82" vs "174.85" for the "ascii" column, I guess that's in seconds. I'm just happy it's not slower. (I only ran stringbench once; it seems to take *forever*). I ran pybench three times for each build. The slowest concat overall time was still 2.9% faster than the fastest release time. "ConcatStrings" is a big winner, at around 150% faster; since the test doesn't *do* anything with the concatenated values, it never renders the concatenation objects, so it does a lot less work. 
"CreateStringsWithConcat" is generally 18-19% faster, as expected. After that, the timings are all over the place, but some tests were consistently faster: "CompareInternedStrings" was 8-12% faster, "DictWithFloatKeys" was 9-11% faster, "SmallLists" was 8-15% faster, "CompareLongs" was 6-10% faster, and "PyMethodCalls" was 4-6% faster. (These are all comparing the "average run-time" results, though the "minimum run-time" results were similar.) I still couldn't tell you why my results are faster. I swear on my mother's eyes I didn't touch anything major involved in "DictWithFloatKeys", "SmallLists", or "CompareLongs". I didn't touch the compiler settings, so that shouldn't be it. I acknowledge not only that it could all be a mistake, but also that I don't know enough about it to speculate. The speedup mystery continues, *larry* From ironfroggy at gmail.com Mon Oct 9 09:24:28 2006 From: ironfroggy at gmail.com (Calvin Spealman) Date: Mon, 9 Oct 2006 03:24:28 -0400 Subject: [Python-Dev] if __debug__: except Exception, e: pdb.set_trace() Message-ID: <76fd5acf0610090024u1caa7868ka336f1456faee93e@mail.gmail.com> I know I can not do this, but what are the chances of changing the rules so that we can? Basically, since the if __debug__: lines are processed before runtime, would it be possible to allow them to be used to control the inclusion or omission of entire blocks (except, else, elif, etc.) with them being included as if they were at the same level as the 'if __debug__:' above them? I want to allow this: try: foo() if __debug__: except Exception, e: import pdb pdb.set_trace() So that when __debug__ is false, the except block doesn't even exist at all. -- Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://ironfroggy-code.blogspot.com/ From jcarlson at uci.edu Mon Oct 9 09:45:39 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Mon, 09 Oct 2006 00:45:39 -0700 Subject: [Python-Dev] if __debug__: except Exception, e: pdb.set_trace() In-Reply-To: <76fd5acf0610090024u1caa7868ka336f1456faee93e@mail.gmail.com> References: <76fd5acf0610090024u1caa7868ka336f1456faee93e@mail.gmail.com> Message-ID: <20061009003949.0982.JCARLSON@uci.edu> "Calvin Spealman" wrote: > > I know I can not do this, but what are the chances of changing the > rules so that we can? Basically, since the if __debug__: lines are > processed before runtime, would it be possible to allow them to be > used to control the inclusion or omission of entire blocks (except, > else, elif, etc.) with them being included as if they were at the same > level as the 'if __debug__:' above them? I would say very low. try/except/finally, if/elif/else, for/else, while/else, etc., pairings of statements historically have only been grouped together when they share indent levels. If one makes two statements that don't share indent levels paired in this way, then what is stopping us from doing the following monstrosity? if ...: ... if __debug__: elif ...: ... Remember, Special cases aren't special enough to break the rules. This would be a bad special case that doesn't generalize in a satisfactory manner. > I want to allow this: > > try: > foo() > if __debug__: > except Exception, e: > import pdb > pdb.set_trace() > > So that when __debug__ is false, the except block doesn't even exist at all. And if the except clause doesn't exist at all, then unless you are following it with the finally clause of a 2.5+ unified try/except/finally, it is a syntax error. Regardless, it would be easier to read to have the following... try: foo() except Exception, e: if __debug__: import pdb pdb.set_trace() else: raise - Josiah From mal at egenix.com Mon Oct 9 11:30:25 2006 From: mal at egenix.com (M.-A.
Lemburg) Date: Mon, 09 Oct 2006 11:30:25 +0200 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: <4529E28E.3070800@hastings.org> References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> <17702.13238.684094.6289@montanaro.dyndns.org> <4529E28E.3070800@hastings.org> Message-ID: <452A16B1.9070109@egenix.com> Larry Hastings wrote: > Fredrik Lundh wrote: >> skip at pobox.com wrote: >> >>> MAL's pybench would probably be better for this presuming it does some >>> addition with string operands. >>> >> or stringbench. >> > > I ran 'em, and they are strangely consistent with pystone. > > With concat, stringbench is ever-so-slightly faster overall. "172.82" > vs "174.85" for the "ascii" column, I guess that's in seconds. I'm just > happy it's not slower. (I only ran stringbench once; it seems to take > *forever*). > > I ran pybench three times for each build. The slowest concat overall > time was still 2.9% faster than the fastest release time. > "ConcatStrings" is a big winner, at around 150% faster; since the test > doesn't *do* anything with the concatenated values, it never renders the > concatenation objects, so it does a lot less work. > "CreateStringsWithConcat" is generally 18-19% faster, as expected. > After that, the timings are all over the place, but some tests were > consistently faster: "CompareInternedStrings" was 8-12% faster, > "DictWithFloatKeys" was 9-11% faster, "SmallLists" was 8-15% faster, > "CompareLongs" was 6-10% faster, and "PyMethodCalls" was 4-6% faster. > (These are all comparing the "average run-time" results, though the > "minimum run-time" results were similar.) When comparing results, please look at the minimum runtime. The average times are just given to indicate how much the mintime differs from the average of all runs. > I still couldn't tell you why my results are faster. 
I swear on my > mother's eyes I didn't touch anything major involved in > "DictWithFloatKeys", "SmallLists", or "CompareLongs". I didn't touch > the compiler settings, so that shouldn't be it. I acknowledge not only > that it could all be a mistake, and that I don't know enough about it to > speculate.// Depending on what you changed, it is possible that the layout of the code in memory better fits your CPU architecture. If however the speedups are not consistent across several runs of pybench, then it's likely that you have some background activity going on on the machine which causes a slowdown in the unmodified run you chose as basis for the comparison. Just to make sure: you are using pybench 2.0, right ? -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 09 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From kristjan at ccpgames.com Mon Oct 9 11:55:00 2006 From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_V=2E_J=F3nsson?=) Date: Mon, 9 Oct 2006 09:55:00 -0000 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom Message-ID: <129CEF95A523704B9D46959C922A280002FE99D7@nemesis.central.ccp.cc> This patch looks really nice to use here at CCP. Our code is full of string contcatenations so I will probably try to apply the patch soon and see what it gives us in a real life app. The floating point integer cache was also a big win. Soon, standard python won't be able to keep up with the patched versions out there :) Oh, and since I have fixed the pcbuild8 thingy in the 2.5 branch, why don't you give the PGO version a whirl too? 
Even the non-PGO dll, with link-time code generation, should be faster than your vanilla PCBuild one. Read the Readme.txt for details. Cheers, Kristj?n > -----Original Message----- > From: python-dev-bounces+kristjan=ccpgames.com at python.org > [mailto:python-dev-bounces+kristjan=ccpgames.com at python.org] > On Behalf Of M.-A. Lemburg > Sent: 9. okt?ber 2006 09:30 > To: Larry Hastings > Cc: python-dev at python.org > Subject: Re: [Python-Dev] PATCH submitted: Speed up + for > string concatenation, now as fast as "".join(x) idiom > > Larry Hastings wrote: > > Fredrik Lundh wrote: > >> skip at pobox.com wrote: > >> > >>> MAL's pybench would probably be better for this presuming it does > >>> some addition with string operands. > >>> > >> or stringbench. > >> > > > > I ran 'em, and they are strangely consistent with pystone. > > > > With concat, stringbench is ever-so-slightly faster > overall. "172.82" > > vs "174.85" for the "ascii" column, I guess that's in seconds. I'm > > just happy it's not slower. (I only ran stringbench once; > it seems to > > take *forever*). > > > > I ran pybench three times for each build. The slowest > concat overall > > time was still 2.9% faster than the fastest release time. > > "ConcatStrings" is a big winner, at around 150% faster; > since the test > > doesn't *do* anything with the concatenated values, it > never renders > > the concatenation objects, so it does a lot less work. > > "CreateStringsWithConcat" is generally 18-19% faster, as expected. > > After that, the timings are all over the place, but some tests were > > consistently faster: "CompareInternedStrings" was 8-12% faster, > > "DictWithFloatKeys" was 9-11% faster, "SmallLists" was > 8-15% faster, > > "CompareLongs" was 6-10% faster, and "PyMethodCalls" was > 4-6% faster. > > (These are all comparing the "average run-time" results, though the > > "minimum run-time" results were similar.) > > When comparing results, please look at the minimum runtime. 
> The average times are just given to indicate how much the > mintime differs from the average of all runs. > > > I still couldn't tell you why my results are faster. I swear on my > > mother's eyes I didn't touch anything major involved in > > "DictWithFloatKeys", "SmallLists", or "CompareLongs". I > didn't touch > > the compiler settings, so that shouldn't be it. I acknowledge not > > only that it could all be a mistake, and that I don't know enough > > about it to speculate.// > > Depending on what you changed, it is possible that the layout > of the code in memory better fits your CPU architecture. > > If however the speedups are not consistent across several > runs of pybench, then it's likely that you have some > background activity going on on the machine which causes a > slowdown in the unmodified run you chose as basis for the comparison. > > Just to make sure: you are using pybench 2.0, right ? > > -- > Marc-Andre Lemburg > eGenix.com > > Professional Python Services directly from the Source (#1, > Oct 09 2006) > >>> Python/Zope Consulting and Support ... > http://www.egenix.com/ > >>> mxODBC.Zope.Database.Adapter ... > http://zope.egenix.com/ > >>> mxODBC, mxDateTime, mxTextTools ... > http://python.egenix.com/ > ______________________________________________________________ > __________ > > ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for > free ! :::: > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/kristjan%40c cpgames.com > From kristjan at ccpgames.com Mon Oct 9 12:07:30 2006 From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_V=2E_J=F3nsson?=) Date: Mon, 9 Oct 2006 10:07:30 -0000 Subject: [Python-Dev] 2.5, 64 bit Message-ID: <129CEF95A523704B9D46959C922A280002FE99D8@nemesis.central.ccp.cc> the VisualStudio8 64 bit build of 2.5 doesn't compile clean. 
We have a number of warnings of truncation from 64 bit to 32: Often it is a question of doing an explicit cast, but sometimes we are using "int" for results from strlen and such. Is there any interest in fixing this up? Cheers, Kristján -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061009/ccded2cb/attachment.htm From g.brandl at gmx.net Mon Oct 9 12:27:31 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 09 Oct 2006 12:27:31 +0200 Subject: [Python-Dev] Security Advisory for unicode repr() bug? In-Reply-To: References: Message-ID: <452A2413.1060708@gmx.net> Georg Brandl wrote: > [ Bug http://python.org/sf/1541585 ] > > This seems to be handled like a security issue by linux distributors, > it's also a news item on security related pages. > > Should a security advisory be written and official patches be > provided? May I ask again whether this is handled by the PSRT at all? Georg From tlesher at gmail.com Mon Oct 9 16:52:44 2006 From: tlesher at gmail.com (Tim Lesher) Date: Mon, 9 Oct 2006 10:52:44 -0400 Subject: [Python-Dev] Iterating over marshal/pickle Message-ID: <9613db600610090752w1641d5a4o6881dd038befdd7@mail.gmail.com> Both marshal and pickle allow multiple objects to be serialized to the same file-like object. The pattern for deserializing an unknown number of serialized objects looks like this: objs = [] while True: try: objs.append(marshal.load(fobj)) # or objs.append(unpickler.load()) except EOFError: break This seems like a good use case for a generator: def some_name(fobj): while True: try: yield marshal.load(fobj) # or yield unpickler.load() except EOFError: raise StopIteration 1. Does this seem like a reasonable addition to the standard library? 2. Where should it go, and what should it be called?
From an end-user point of view, this "feels" right: import pickle u = pickle.Unpickler(open('picklefile')) for x in u: print x import marshal for x in marshal.unmarshalled(open('marshalfile')): print x But I'm not hung up on the actual names or the use of sequence semantics in the Unpickler case. Incidentally, I know that pickle is preferred over marshal, but some third-party tools (like the Perforce client) still use the marshal library for serialization, so I've included it in the discussion. -- Tim Lesher From fredrik at pythonware.com Mon Oct 9 17:28:24 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 09 Oct 2006 17:28:24 +0200 Subject: [Python-Dev] Iterating over marshal/pickle In-Reply-To: <9613db600610090752w1641d5a4o6881dd038befdd7@mail.gmail.com> References: <9613db600610090752w1641d5a4o6881dd038befdd7@mail.gmail.com> Message-ID: Tim Lesher wrote: > 1. Does this seem like a reasonable addition to the standard library? I cannot remember ever doing this, or seeing anyone except Perforce doing this, and it'll only save you a few lines of code every other year or so, so my answer is definitely no. (if you're serious about P4 integration, you probably don't want to use Python's marshal.load to deal with the P4 output either; the marshalling code has had a tendency to crash Python when it sees malformed or prematurely terminated output). > Incidentally, I know that pickle is preferred over marshal, but some > third-party tools (like the Perforce client) still use the marshal > library for serialization, so I've included it in the discussion Perforce is the only 3rd party component I'm aware of that uses a standard Python serialization format in this way. As the x windows people have observed, the only thing worse than generalizing from one example is generalizing from no examples at all..
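For reference, the generator pattern proposed earlier in the thread is only a few lines when written out in full. The sketch below is illustrative rather than a proposed stdlib API (the name iter_loads is invented here, and it uses modern bytes-based spellings); it relies on the fact that both pickle.load and marshal.load raise EOFError once the stream is exhausted:

```python
import io
import pickle

def iter_loads(fobj, load=pickle.load):
    """Yield successive objects from a stream containing multiple
    serialized objects; works with pickle.load or marshal.load."""
    while True:
        try:
            yield load(fobj)
        except EOFError:
            return  # end of stream ends the iteration

# Serialize a few objects back-to-back, then iterate over them.
buf = io.BytesIO()
for obj in (1, "two", [3.0]):
    pickle.dump(obj, buf)
buf.seek(0)
print(list(iter_loads(buf)))  # [1, 'two', [3.0]]
```

The same loop handles the marshal case by passing load=marshal.load, which covers the Perforce-style usage mentioned above.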
From barry at python.org Mon Oct 9 18:01:40 2006 From: barry at python.org (Barry Warsaw) Date: Mon, 9 Oct 2006 12:01:40 -0400 Subject: [Python-Dev] Iterating over marshal/pickle In-Reply-To: References: <9613db600610090752w1641d5a4o6881dd038befdd7@mail.gmail.com> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Oct 9, 2006, at 11:28 AM, Fredrik Lundh wrote: >> 1. Does this seem like a reasonable addition to the standard library? > > I cannot remember ever doing this, or seeing anyone except Perforce > doing this, and it'll only save you a few lines of code every other > year > or so, so my answer is definitely no. FWIW, Mailman uses pickle files with multiple pickled objects in them, to implement its queue files. We first dump the Message object, followed by a dictionary of metadata. OTOH, I know there's only two objects in the pickle, so I don't have to iterate over it; I just load the message and then load the dictionary. - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRSpyaHEjvBPtnXfVAQLqcgP/VqKqwZfReaQyRGP2DG61978CmbqLOvSY nsXP/AE88VvO+IHYajfNuJt/okmTIfHTl9Jcx77YzxZ9ErtpKWbmrX6zo7OkaZPv 5aYQ7zsYwJocL5u6nFqXAs+9zvUOXLvwhKFDc5K/rp4cb02QAYOgn5gpRirJNSAm ESMiMNRmdQ8= =3Ih4 -----END PGP SIGNATURE----- From faassen at infrae.com Mon Oct 9 18:57:41 2006 From: faassen at infrae.com (Martijn Faassen) Date: Mon, 09 Oct 2006 18:57:41 +0200 Subject: [Python-Dev] Security Advisory for unicode repr() bug? In-Reply-To: <452A2413.1060708@gmx.net> References: <452A2413.1060708@gmx.net> Message-ID: <452A7F85.30503@infrae.com> Georg Brandl wrote: > Georg Brandl wrote: >> [ Bug http://python.org/sf/1541585 ] >> >> This seems to be handled like a security issue by linux distributors, >> it's also a news item on security related pages. >> >> Should a security advisory be written and official patches be >> provided? > > May I ask again whether this is handled by the PSRT at all? 
I agree that having an official Python security advisory would be good to see. I was also assuming automatically that fixed versions of Python 2.4 and Python 2.3 would be released. It's a serious issue for web-facing Python applications that handle unicode strings. Regards, Martijn From martin at v.loewis.de Mon Oct 9 19:53:23 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 09 Oct 2006 19:53:23 +0200 Subject: [Python-Dev] 2.5, 64 bit In-Reply-To: <129CEF95A523704B9D46959C922A280002FE99D8@nemesis.central.ccp.cc> References: <129CEF95A523704B9D46959C922A280002FE99D8@nemesis.central.ccp.cc> Message-ID: <452A8C93.7080901@v.loewis.de> Kristj?n V. J?nsson schrieb: > the VisualStudio8 64 bit build of 2.5 doesn't compile clean. We have a > number of warnings of truncation from 64 bit to 32: > Often it is a question of doing an explicit cast, but sometimes we are > using "int" for results from strlen and such. > > Is there any interest in fixing this up? Yes; I had fixed many of them already for the Python 2.5 release (there were *way* more of these before I started). Notice that many of them are bogus. For example, if I do strlen on a buffer that is known to have MAXPATH bytes, the strlen result *can't* exceed an int. So care is necessary for each case: - if there is a conceivable case where it can overflow (i.e. if you could come up with a Python program that makes it overflow), fix the types appropriately - if it is certain through inspection that it can't overflow, add a cast (Py_SAFE_DOWNCAST, or, when it is really obvious, a plain cast), and a comment on why the cast is correct. Notice that Py_SAFE_DOWNCAST has an assertion in debug mode. Also notice that it evaluates it argument twice. - if it shouldn't overflow as long as extension modules play by the rules, it's your choice of either adding a runtime error, or just widening the representation. 
IIRC, the biggest chunk of "real" work left is SRE: this can realistically overflow when it operates on large strings. You have to really understand SRE before fixing it. For example, I believe that large strings might have impacts on compilation, too (e.g. if the regex itself is >2GiB, or some repetition count is >2**31). In these cases, it might be saner to guarantee an exception (and document the limitation) than to try expanding the SRE bytecode. Another set of remaining changes deals with limitations on byte code and reflection. For example, there is currently a limit on the number of local variables imposed by the Python bytecode. From this limit, it follows that certain casts are correct. One should document each limit first, and then refer to these limits when adding casts. Helping here would be definitely appreciated. Regards, Martin From mal at egenix.com Mon Oct 9 21:48:04 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 09 Oct 2006 21:48:04 +0200 Subject: [Python-Dev] Iterating over marshal/pickle In-Reply-To: References: <9613db600610090752w1641d5a4o6881dd038befdd7@mail.gmail.com> Message-ID: <452AA774.7040105@egenix.com> Fredrik Lundh wrote: > Tim Lesher wrote: > >> 1. Does this seem like a reasonable addition to the standard library? > > I cannot remember ever doing this, or seeing anyone except Perforce > doing this, and it'll only save you a few lines of code every other year > or so, so my answer is definitely no. > > (if you're serious about P4 integration, you probably don't want to use > Python's marshal.load to deal with the P4 output either; the marshalling > code has had a tendency to crash Python when it sees malformed or pre- > maturely terminated output). 
> >> Incidentally, I know that pickle is preferred over marshal, but some >> third-party tools (like the Perforce client) still use the marshal >> library for serialization, so I've included it in the discussion > > Perforce is the only 3rd party component I'm aware of that uses a > standard Python serialization format in this way. > > As the x windows people have observed, the only thing worse than > generalizing from one example is generalizing from no examples at > all.. FWIW, we've been and are using this quite a lot for dumping database content to a backup file. It's a simple and terse format, preserves full data precision and doesn't cause problems when moving between platforms. That makes two use cases and I'm sure there are more ;-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 09 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From tim.peters at gmail.com Mon Oct 9 22:44:38 2006 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 9 Oct 2006 16:44:38 -0400 Subject: [Python-Dev] 2.4 vs Windows vs bsddb Message-ID: <1f7befae0610091344j1639c21qc77b0ebf9c3df1b6@mail.gmail.com> I just noticed that the bsddb portion of Python fails to compile on the 2.4 Windows buildbots, but for some reason the buildbot machinery doesn't notice the failure: """ Compiling... _bsddb.c Linking... 
Creating library .\./_bsddb_d.lib and object .\./_bsddb_d.exp _bsddb.obj : warning LNK4217: locally defined symbol _malloc imported in function __db_associateCallback _bsddb.obj : warning LNK4217: locally defined symbol _free imported in function __DB_consume _bsddb.obj : warning LNK4217: locally defined symbol _fclose imported in function _DB_verify _bsddb.obj : warning LNK4217: locally defined symbol _fopen imported in function _DB_verify _bsddb.obj : warning LNK4217: locally defined symbol _strncpy imported in function _init_pybsddb _bsddb.obj : error LNK2019: unresolved external symbol __imp__strncat referenced in function _makeDBError _bsddb.obj : error LNK2019: unresolved external symbol __imp___assert referenced in function _makeDBError ./_bsddb_d.pyd : fatal error LNK1120: 2 unresolved externals ... _bsddb - 3 error(s), 5 warning(s) Build: 15 succeeded, 1 failed, 0 skipped """ The warnings there are old news, but no idea when the errors started. The test suite doesn't care that bsddb is missing either, just ending with: 1 skip unexpected on win32: test_bsddb Same kind of things when building from my 2.4 checkout. No clues. From docwhat+list.python-dev at gerf.org Mon Oct 9 22:44:39 2006 From: docwhat+list.python-dev at gerf.org (The Doctor What) Date: Mon, 09 Oct 2006 16:44:39 -0400 Subject: [Python-Dev] BUG (urllib2) Authentication request header is broken on long usernames and passwords Message-ID: <452AB4B7.4030103@gerf.org> I found a bug in urllib2's handling of basic HTTP authentication. urllib2 uses the base64.encodestring() method to encode the username:password. The problem is that base64.encodestring() adds newlines to wrap the encoded characters at the 76th column. 
This produces bogus request headers like this: ---------->8---------cut---------8<---------------- GET /some/url HTTP/1.1 Host: some.host Accept-Encoding: identity Authorization: Basic cmVhbGx5bG9uZ3VzZXJuYW1lOmFuZXZlbmxvbmdlcnBhc3N3b3JkdGhhdGdvZXNvbmFuZG9uYW5k b25hbmRvbmFuZG9u User-agent: some-agent ---------->8---------cut---------8<---------------- This can be worked around by forcing the base64.MAXBINSIZE to something huge, but really it should be something passed into base64.encodestring(). # echo example of it wrapping... # python -c 'import base64; print base64.encodestring("f"*58)' # echo example of forcing it not to wrap... # python -c 'import base64; base64.MAXBINSIZE=1000000; print base64.encodestring("f"*58)' Symptoms of this bug are receiving HTTP 400 responses from the remote server, spurious authentication errors, or various parts of the header "vanishing" (because of the double newline). Thanks! -- ** Ridiculous Quotes ** "I want to say this about my state: When Strom Thurmond ran for president, we voted for him. We're proud of it. And if the rest of the country had followed our lead, we wouldn't have had all these problems over all these years, either." -- Senate Minority Leader Trent Lott (R-MS), praising Strom Thurmond's segregationist presidential campaign [12/5/02] The Doctor What: Second Baseman http://docwhat.gerf.org/ docwhat *at* gerf *dot* org KF6VNC From aahz at pythoncraft.com Mon Oct 9 23:35:23 2006 From: aahz at pythoncraft.com (Aahz) Date: Mon, 9 Oct 2006 14:35:23 -0700 Subject: [Python-Dev] BUG (urllib2) Authentication request header is broken on long usernames and passwords In-Reply-To: <452AB4B7.4030103@gerf.org> References: <452AB4B7.4030103@gerf.org> Message-ID: <20061009213523.GA27418@panix.com> On Mon, Oct 09, 2006, The Doctor What wrote: > > I found a bug in urllib2's handling of basic HTTP authentication. Please submit your bug to SourceForge, then (optional) post the bug number back here.
See http://www.python.org/dev/faq/#bugs -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "If you don't know what your program is supposed to do, you'd better not start writing it." --Dijkstra From scott+python-dev at scottdial.com Mon Oct 9 23:44:51 2006 From: scott+python-dev at scottdial.com (Scott Dial) Date: Mon, 09 Oct 2006 17:44:51 -0400 Subject: [Python-Dev] BUG (urllib2) Authentication request header is broken on long usernames and passwords In-Reply-To: <452AB4B7.4030103@gerf.org> References: <452AB4B7.4030103@gerf.org> Message-ID: <452AC2D3.40200@scottdial.com> The Doctor What wrote: > The problem is that base64.encodestring() adds newlines to wrap the > encoded characters at the 76th column. > The encodestring is following RFC 1521, which specifies: The output stream (encoded bytes) must be represented in lines of no more than 76 characters each. All line breaks or other characters not found in Table 1 must be ignored by decoding software. In retrospect, perhaps "{de|en}codestring" was a poor name choice. urllib2 should be calling b64encode directly. I have submitted a patch to the tracker: [ 1574068 ] urllib2 - Fix line breaks in authorization headers.
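The wrapping behaviour at issue is easy to reproduce. The sketch below uses the modern spellings (the encodestring of this era survives as base64.encodebytes in Python 3); the 58-byte payload matches the earlier report, and is just long enough to push the encoded form past the 76-character MIME line limit:

```python
import base64

payload = b"f" * 58  # encodes to 80 base64 characters, past the 76-column limit

# The MIME-oriented encoder wraps at column 76 and appends a trailing
# newline, per RFC 1521 -- this is what corrupted the Authorization header.
wrapped = base64.encodebytes(payload)

# The plain encoder emits one unbroken line, which is what an HTTP
# header needs.
flat = base64.b64encode(payload)

print(b"\n" in wrapped)  # True
print(b"\n" in flat)     # False
```

Calling b64encode directly, as the tracker patch does, sidesteps the wrapping entirely while producing the same encoding.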
-- Scott Dial scott at scottdial.com scodial at cs.indiana.edu From martin at v.loewis.de Tue Oct 10 00:31:34 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 10 Oct 2006 00:31:34 +0200 Subject: [Python-Dev] 2.4 vs Windows vs bsddb In-Reply-To: <1f7befae0610091344j1639c21qc77b0ebf9c3df1b6@mail.gmail.com> References: <1f7befae0610091344j1639c21qc77b0ebf9c3df1b6@mail.gmail.com> Message-ID: <452ACDC6.90204@v.loewis.de> Tim Peters schrieb: > I just noticed that the bsddb portion of Python fails to compile on > the 2.4 Windows buildbots, but for some reason the buildbot machinery > doesn't notice the failure: It's been a while that a failure to build some extension modules doesn't cause the "compile" step to fail; this just happened with the _ssl.pyd module before. I'm not sure how build.bat communicates an error, or whether devenv.com fails in some way when some build step fails. Revision 43156 may contribute here, which adds additional commands into build.bat after devenv.com is invoked. Regards, Martin From tim.peters at gmail.com Tue Oct 10 01:26:41 2006 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 9 Oct 2006 19:26:41 -0400 Subject: [Python-Dev] 2.4 vs Windows vs bsddb In-Reply-To: <452ACDC6.90204@v.loewis.de> References: <1f7befae0610091344j1639c21qc77b0ebf9c3df1b6@mail.gmail.com> <452ACDC6.90204@v.loewis.de> Message-ID: <1f7befae0610091626l18ace335y98052cb9ee481843@mail.gmail.com> [Tim Peters] >> I just noticed that the bsddb portion of Python fails to compile on >> the 2.4 Windows buildbots, but for some reason the buildbot machinery >> doesn't notice the failure: [Martin v. L?wis] > It's been a while that a failure to build some extension modules doesn't > cause the "compile" step to fail; this just happened with the _ssl.pyd > module before. I'm guessing only on the release24-maint branch? > I'm not sure how build.bat communicates an error, or whether devenv.com > fails in some way when some build step fails. 
> > Revision 43156 may contribute here, which adds additional commands > into build.bat after devenv.com is invoked. More guessing: devenv gives a non-zero exit code when it fails, and a .bat script passes on the exit code of the last command it executes. True or false, after making changes based on those guesses, the 2.4 Windows buildbots now say they fail the compile step. It was my fault to begin with (boo! /bad/ Timmy), but should have been unique to the 24 branch (2.5 and trunk fetch Unicode test files all by themselves). From tim.peters at gmail.com Tue Oct 10 02:11:59 2006 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 9 Oct 2006 20:11:59 -0400 Subject: [Python-Dev] 2.4 vs Windows vs bsddb In-Reply-To: <1f7befae0610091344j1639c21qc77b0ebf9c3df1b6@mail.gmail.com> References: <1f7befae0610091344j1639c21qc77b0ebf9c3df1b6@mail.gmail.com> Message-ID: <1f7befae0610091711j5e03c527w7ee0119f7175cf78@mail.gmail.com> [Tim] > I just noticed that the bsddb portion of Python fails to compile on > the 2.4 Windows buildbots, but for some reason the buildbot machinery > doesn't notice the failure: But it does now. This is the revision that broke the Windows build: """ r52170 | andrew.kuchling | 2006-10-05 14:49:36 -0400 (Thu, 05 Oct 2006) | 12 lines [Backport r50783 | neal.norwitz. The bytes_left code is complicated, but looks correct on a casual inspection and hasn't been modified in the trunk. Does anyone want to review further?] Ensure we don't write beyond errText. I think I got this right, but it definitely could use some review to ensure I'm not off by one and there's no possible overflow/wrap-around of bytes_left. Reported by Klocwork #1. Fix a problem if there is a failure allocating self->db. Found with failmalloc. """ It introduces uses of assert() and strncat(), and the linker can't resolve them. 
I suppose that's because the Windows link step for the _bsddb subproject explicitly excludes msvcrt (in the release build) and msvcrtd (in the debug build), but I don't know why that's done. OTOH, we got a lot more errors (about duplicate code definitions) if the standard MS libraries aren't explicitly excluded, so that's no fix. From jjl at pobox.com Tue Oct 10 02:14:51 2006 From: jjl at pobox.com (John J Lee) Date: Tue, 10 Oct 2006 00:14:51 +0000 (UTC) Subject: [Python-Dev] BUG (urllib2) Authentication request header is broken on long usernames and passwords In-Reply-To: <452AC2D3.40200@scottdial.com> References: <452AB4B7.4030103@gerf.org> <452AC2D3.40200@scottdial.com> Message-ID: On Mon, 9 Oct 2006, Scott Dial wrote: [...] > In retrospect, perhaps "{de|en}codestring" was a poor name choice. > urllib2 should be calling b64encode directly. > > I have submitted a patch to the tracker: [ 1574068 ] urllib2 - Fix line > breaks in authorization headers. urllib should also be fixed in the same way (assuming your fix is correct), since urllib also uses base64.{de,en}codestring(). John From martin at v.loewis.de Tue Oct 10 08:31:20 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 10 Oct 2006 08:31:20 +0200 Subject: [Python-Dev] 2.4 vs Windows vs bsddb In-Reply-To: <1f7befae0610091626l18ace335y98052cb9ee481843@mail.gmail.com> References: <1f7befae0610091344j1639c21qc77b0ebf9c3df1b6@mail.gmail.com> <452ACDC6.90204@v.loewis.de> <1f7befae0610091626l18ace335y98052cb9ee481843@mail.gmail.com> Message-ID: <452B3E38.7020801@v.loewis.de> Tim Peters schrieb: > [Martin v. Löwis] >> It's been a while that a failure to build some extension modules doesn't >> cause the "compile" step to fail; this just happened with the _ssl.pyd >> module before. > > I'm guessing only on the release24-maint branch? Yes. I backported some change which broke the build (not so on my own installation for a strange reason), and the buildbot didn't complain, either. 
I was surprised to see a bug report on SF that it wouldn't build. > More guessing: devenv gives a non-zero exit code when it fails, and a > .bat script passes on the exit code of the last command it executes. That's my theory also. Thanks for fixing it, Martin From ncoghlan at gmail.com Tue Oct 10 11:32:43 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 10 Oct 2006 19:32:43 +1000 Subject: [Python-Dev] [Python-3000] Sky pie: a "var" keyword In-Reply-To: <452B66C8.6020703@gmail.com> References: <452A027B.8060009@cs.byu.edu> <452B66C8.6020703@gmail.com> Message-ID: <452B68BB.4070005@gmail.com> Nick Coghlan wrote: > Any proposal such as this also needs to addresses all of the *other* name > binding statements in Python: > > try/except > for loop > with statement > def statement > class statement I forgot the import statement (especially the * version) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From fredrik at pythonware.com Tue Oct 10 12:00:57 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 10 Oct 2006 12:00:57 +0200 Subject: [Python-Dev] [Python-3000] Sky pie: a "var" keyword References: <452A027B.8060009@cs.byu.edu> <452B66C8.6020703@gmail.com> <452B68BB.4070005@gmail.com> Message-ID: > I forgot the import statement (especially the * version) not only that, you also forgot what mailing list you were posting to... From ndbecker2 at gmail.com Tue Oct 10 13:53:16 2006 From: ndbecker2 at gmail.com (Neal Becker) Date: Tue, 10 Oct 2006 07:53:16 -0400 Subject: [Python-Dev] Proprietary code in python? 
Message-ID: http://www.google.com/codesearch?q=+doc:DxlBcBw4TXo+proprietary+confidential+show:DxlBcBw4TXo:BwgQSUaGDCc:1s0hP8rbIGE&sa=N&cd=1&ct=ri&cs_p=http://www.python.org/download/releases/binaries-1.3/python-IRIX-5.3-full.tar.gz&cs_f=lib/python/irix5/AWARE.py#a0 From ncoghlan at gmail.com Tue Oct 10 14:06:24 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 10 Oct 2006 22:06:24 +1000 Subject: [Python-Dev] Proprietary code in python? In-Reply-To: References: Message-ID: <452B8CC0.3030206@gmail.com> Neal Becker wrote: > http://www.google.com/codesearch?q=+doc:DxlBcBw4TXo+proprietary+confidential+show:DxlBcBw4TXo:BwgQSUaGDCc:1s0hP8rbIGE&sa=N&cd=1&ct=ri&cs_p=http://www.python.org/download/releases/binaries-1.3/python-IRIX-5.3-full.tar.gz&cs_f=lib/python/irix5/AWARE.py#a0 That file isn't there any more [1] The file appears to have been removed with the change in license for Python 2.0 (the last tag I can find containing that file is related 1.5.2). (Note that the linked version is Python 1.3) Cheers, Nick. [1] http://svn.python.org/view/python/trunk/Lib/plat-irix5/ -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From fredrik at pythonware.com Tue Oct 10 15:17:25 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 10 Oct 2006 15:17:25 +0200 Subject: [Python-Dev] Proprietary code in python? References: Message-ID: "Neal Becker" wrote: > http://www.google.com/codesearch?q=+doc:DxlBcBw4TXo+proprietary+confidential+show:DxlBcBw4TXo:BwgQSUaGDCc:1s0hP8rbIGE&sa=N&cd=1&ct=ri&cs_p=http://www.python.org/download/releases/binaries-1.3/python-IRIX-5.3-full.tar.gz&cs_f=lib/python/irix5/AWARE.py#a0 that's most likely code that's been automatically generated from corresponding header files in IRIX. in most jurisdictions, laws about corporate secrets doesn't apply to things that you've intentionally published. 
(their file is still copyrighted, but I'm not sure to what extent you can use copyright to protect a few integers). From r.m.oudkerk at googlemail.com Mon Oct 9 13:59:06 2006 From: r.m.oudkerk at googlemail.com (Richard Oudkerk) Date: Mon, 9 Oct 2006 12:59:06 +0100 Subject: [Python-Dev] Cloning threading.py using proccesses Message-ID: I am not sure how sensible the idea is, but I have had a first stab at writing a module processing.py which is a near clone of threading.py but uses processes and sockets for communication. (It is one way of avoiding the GIL.) I have tested it on unix and windows and it seem to work pretty well. (Getting round the lack of os.fork on windows is a bit awkward.) There is also another module dummy_processing.py which has the same api but is just a wrapper round threading.py. Queues, Locks, RLocks, Conditions, Semaphores and some other shared objects are implemented. People are welcome to try out the tests in test_processing.py contained in the zipfile. More information is included in the README file. As a quick example, the code . from processing import Process, Queue, ObjectManager . . def f(token): . q = proxy(token) . for i in range(10): . q.put(i*i) . q.put('STOP') . . if __name__ == '__main__': . manager = ObjectManager() . token = manager.new(Queue) . queue = proxy(token) . . t = Process(target=f, args=[token]) . t.start() . . result = None . while result != 'STOP': . result = queue.get() . print result . . t.join() is not very different from the normal threaded equivalent . from threading import Thread . from Queue import Queue . . def f(q): . for i in range(10): . q.put(i*i) . q.put('STOP') . . if __name__ == '__main__': . queue = Queue() . . t = Thread(target=f, args=[queue]) . t.start() . . result = None . while result != 'STOP': . result = queue.get() . print result . . t.join() Richard -------------- next part -------------- A non-text attachment was scrubbed... 
Name: processing.zip Type: application/zip Size: 16648 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061009/e707ab8d/attachment-0001.zip From docwhat+list.python-dev at gerf.org Tue Oct 10 16:18:39 2006 From: docwhat+list.python-dev at gerf.org (The Doctor What) Date: Tue, 10 Oct 2006 10:18:39 -0400 Subject: [Python-Dev] BUG (urllib2) Authentication request header is broken on long usernames and passwords In-Reply-To: <20061009213523.GA27418@panix.com> References: <452AB4B7.4030103@gerf.org> <20061009213523.GA27418@panix.com> Message-ID: <452BABBF.2050403@gerf.org> Aahz wrote: > On Mon, Oct 09, 2006, The Doctor What wrote: >> I found a bug in urllib2's handling of basic HTTP authentication. > > Please submit your bug to SourceForge, then (optional) post the bug > number back here. > > See http://www.python.org/dev/faq/#bugs Thank you! I couldn't find the bug system for python (never had to submit a bug before) and was looking all over the python.org site. I see someone else submitted the bug as 1574068. Ciao! -- I'd horsewhip you if I had a horse. -- Groucho Marx The Doctor What: Da Man http://docwhat.gerf.org/ docwhat *at* gerf *dot* org KF6VNC From steven.bethard at gmail.com Tue Oct 10 17:36:46 2006 From: steven.bethard at gmail.com (Steven Bethard) Date: Tue, 10 Oct 2006 09:36:46 -0600 Subject: [Python-Dev] DRAFT: python-dev summary for 2006-08-16 to 2006-08-31 Message-ID: Here's the draft summary for the second half of August. As always, comments and corrections are greatly appreciated! ============= Announcements ============= --------------------------- Python community buildbots --------------------------- Want to make sure your package works with the latest and greatest development and release versions of Python? Thanks to Grig Gheorghiu, you can add your test suite to the `Python community buildbots`_ and the results of these tests will show up on the `Python buildbot results page`_. .. 
_Python community buildbots: http://www.pybots.org/ .. _Python buildbot results page: http://www.python.org/dev/buildbot/ Contributing thread: - `link to community buildbot? `__ ========= Summaries ========= --------------------- Fast subclass testing --------------------- Neal Norwitz was playing around with a patch that would make subclass testing for certain builtin types faster by stealing some bits from tp_flags. Georg Brandl thought this could be useful for exception handling in Python 3000 when all exceptions must be subclasses of BaseException. Guido also liked the patch and suggested it be checked into the `Python 3000 branch`_. .. _Python 3000 branch: http://svn.python.org/view/python/branches/p3yk/ Contributing thread: - `Type of range object members `__ ----------------------------- gcc 4.2 and integer overflows ----------------------------- Jack Howarth pointed out that intobject.c was using the test ``x < 0 && x == -x`` to determine if the signed integer ``x`` was the most negative integer on the platform. However, the C standard says overflow is undefined, so despite this code actually working on pretty much all known hardware, `gcc 4.2 assumes that overflow won't happen`_ and so optimizes away the entire clause. David Hopwood and Tim Peters provided a patch that casts ``x`` to an unsigned long (the "unnecessary" ``0`` is to make the Microsoft compilers happy):: ``x < 0 && (unsigned long)x == 0-(unsigned long)x`` .. _gcc 4.2 assumes that overflow won't happen: http://bugs.python.org/1545668 Contributing thread: - `gcc 4.2 exposes signed integer overflows `__ -------------------------- Python and 64-bit machines -------------------------- Thomas Heller explained that the _ctypes extension module was still a fair ways from building on Win64 and had to be removed from the installer for that platform. There was some discussion about in general how "experimental" the Win64 build of Python was, but Martin v. 
Löwis explained that despite the compiler warnings, Python has been running mostly fine on Win64 since version 2.4. In fact, Python has been running in 64-bit machines since 1993 (when Tim Peters ported it to 64-bit Crays) though of course not with the support that Python 2.5 brought through the Py_ssize_t changes. Contributing thread: - `ctypes and win64 `__ ------------------------------------------ Guidelines for submitting bugs and patches ------------------------------------------ Brett Cannon put together a rewrite of the `bug and patch guidelines`_. The bug guidelines now include sections on how to: * Get a SourceForge account * Start a new bug * Specify the Python version * Specify special settings for your Python interpreter * Give sample code to reproduce bug * Submit! * Respond to requests from developers And the patch guidelines now include sections on how to: * Read the Developer Intro to understand the scope of your proposed change * Add the appropriate unit tests * Add the proper document changes * Make your code follow the style guidelines * Generate a patch * Create a tracker item on SourceForge * Reference the patch in proper bug reports * Wait for a developer to contact you At Chad Whitacre's suggestion, Brett also included a section on the 5-for-1 rule, where some python-devvers have agreed to review your one patch if you post reviews of five others. The updates had not been posted to python.org at the time of this summary. .. _bug and patch guidelines: http://www.python.org/dev/patches/ Contributing threads: - `draft for bug guidelines `__ - `draft of patch guidelines `__ --------------------------------- Corner cases for continue/finally --------------------------------- Dino Viehland pointed out an odd corner case with ``continue`` in a ``finally`` clause that was causing Python to crash:: for abc in range(10): try: pass finally: try: continue except: pass The bug was present at least all the way back to Python 2.3. 
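An aside for readers replaying this today: a bare ``continue`` directly inside ``finally`` was a compile-time SyntaxError in the interpreters of this era (which is why the crashing snippet above has to smuggle it inside a nested ``try``). CPython 3.8 later made it legal, with well-defined semantics that discard any in-flight exception. A sketch of the modern behaviour (the version notes are from memory, not from this thread):

```python
# On Python 3.8+ a bare 'continue' is legal inside 'finally'; it
# discards any active exception and moves to the next iteration.
seen = []
for abc in range(3):
    try:
        if abc == 1:
            raise ValueError("in-flight exception")
    finally:
        seen.append(abc)
        continue  # swallows the ValueError raised at abc == 1

print(seen)  # [0, 1, 2] -- the exception never propagates
```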
People tossed a few patches back and forth (and a few tests which broke various versions of the patches) before `Neal Norwitz posted a patch`_ that people seemed to like. .. _Neal Norwitz posted a patch: http://bugs.python.org/1542451 Contributing thread: - `2.4 & 2.5 beta 3 crash `__ --------------------------------------- PEP 343: decimal module context manager --------------------------------------- Raymond Hettinger pointed out that the updates to the decimal module to take advantage of the ``with``-statement differed dramatically from `PEP 343`_ and were misdocumented in a number of places. Nick Coghlan explained that the API was a result of the introduction and then later removal of the ``__context__`` method. After some discussion, Raymond convinced everyone to change the API from:: with decimal.getcontext().copy().get_manager() as ctx: ... to the simpler API originally introduced in `PEP 343`:: with decimal.localcontext() as ctx: ... As a result of the changes needed to fix this API, Anthony Baxter decided that another release candidate was necessary before Python 2.5 final could be released. .. _PEP 343: http://www.python.org/dev/peps/pep-0343/ Contributing thread: - `Py2.5 issue: decimal context manager misimplemented, misdesigned, and misdocumented `__ ---------------------------- Python 2.6 development goals ---------------------------- Guido suggested that since Python 3.0 is now being developed in parallel with the 2.X trunk, the major work for Python 2.6 should be in making the transition to Python 3.0 as smooth as possible. This meant: * Adding warnings (suppressed by default) for code incompatible with Python 3.0. * Making all Python 2.X library code as Python 3.0-compatible as possible. * Converting all unittests to unittest or doctest format. 
Brett Cannon suggested adding to this list: * Improving tests and classifying them better * Updating and improving the documentation In general, people seemed to think this was a pretty good approach, particularly as it would address some of the complaints about the speed of addition of new features to Python. The discussion then moved off to the `python-3000 list`_. .. _python-3000 list: http://mail.python.org/mailman/listinfo/python-3000 Contributing threads: - `What should the focus for 2.6 be? `__ - `[Python-3000] What should the focus for 2.6 be? `__ ----------------------- Python 2.5, VC8 and PGO ----------------------- Muguntharaj Subramanian asked about building Python 2.5 with the VC8 compiler. Christopher Baus had recently provided a few patches to get the VC8 build working better and Kristján V. Jónsson said that he's working on updating the PCBuild8 directory in the trunk in a number of ways, including better support for profile-guided optimization (PGO) builds. He said once he got everything working right, he'd backport to Python 2.5. Contributing threads: - `Failed building 2.5rc1 pythoncore on VC8 `__ - `patch to add socket module project to vc8 solution `__ - `Error while building 2.5rc1 pythoncore_pgo on VC8 `__ ---------------------------------------------- PEP 342: using return instead of GeneratorExit ---------------------------------------------- Igor Bukanov suggested that the GeneratorExit exception introduced by `PEP 342`_ could be eliminated by replacing it with the semantics of the ``return`` statement. This would allow code like the following, which under the GeneratorExit paradigm would execute the ``except`` clause, to only execute the ``finally`` clause:: def gen(): try: yield 0 except Exception: print "Unexpected exception!" finally: print "Finally" for i in gen(): print i break Phillip J. Eby and others liked the approach, but suggested that it was much too late in the release process to be making such a major language change. 
Guido was open to making a change like this, perhaps in Python 3.0, but wanted the new generator enhancements to have some time in the field to see what was really needed here. .. _PEP 342: http://www.python.org/dev/peps/pep-0342/ Contributing thread: - `GeneratorExit is unintuitive and uneccessary `__ ------------------------------------------ String formatting, __str__ and __unicode__ ------------------------------------------ John J Lee noticed that in Python 2.5, the ``%s`` format specifier calls ``__unicode__`` on objects if their ``__str__`` method returns a unicode object:: >>> class a(object): ... def __str__(self): ... print '__str__' ... return u'str' ... def __unicode__(self): ... print '__unicode__' ... return u'unicode' ... >>> '%s%s' % (a(), a()) __str__ __unicode__ __unicode__ u'unicodeunicode' Nick Coghlan explained that string formatting first tries to build and return a str object, but starts over if any of the objects to be formatted by the ``%s`` specifier are unicode. So if a ``__str__`` method is called during string formatting and it returns a unicode object, Python will decide that the string formatting operation needs to return a unicode object, and will therefore start over, calling the ``__unicode__`` methods. Nick promised to look into making the documentation for this a bit clearer. Contributing thread: - `String formatting / unicode 2.5 bug? `__ ------------------------- Optimizing global lookups ------------------------- K.S. Sreeram asked about replacing the current LOAD_GLOBAL dict lookup with an array indexing along the lines of what is done for local names. Brett Cannon explained that globals can be altered from the outside, e.g. ``import mod; mod.name = value``, and thus globals aren't necessarily known at compile time. Tim Peters pointed out that a number of PEPs have been written in this area of optimization, with `PEP 280`_ being a good place to start. 
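The distinction Brett and Tim are describing is visible directly in compiled code objects: a module-level name is fetched by ``LOAD_GLOBAL`` (a dict lookup every time), while a local lives in a fixed array slot fetched by ``LOAD_FAST``. A small illustrative sketch (the function names and the default-argument trick are invented here, not from the thread):

```python
g = 1.0

def use_global():
    # 'g' is resolved at run time through the module dict on each use
    total = 0.0
    for _ in range(1000):
        total += g
    return total

def use_local(g=g):
    # binding 'g' as a default argument turns it into a local,
    # accessed by array index instead of a dict lookup
    total = 0.0
    for _ in range(1000):
        total += g
    return total

print("g" in use_global.__code__.co_names)    # True  -> global reference
print("g" in use_local.__code__.co_varnames)  # True  -> local slot
print(use_global() == use_local() == 1000.0)  # same result either way
```

The default-argument idiom was a common hand optimization precisely because the compiler cannot assume globals stay unchanged, as Brett notes above.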
Most people were not opposed to the idea in general, but without an implementation to benchmark, there wasn't really much to discuss. .. _PEP 280: http://www.python.org/dev/peps/pep-0280/ Contributing thread: - `Can LOAD_GLOBAL be optimized to a simple array lookup? `__ --------------------- ElementTree and PEP 8 --------------------- Greg Ewing asked about changing the ElementTree names to be more `PEP 8`_ compliant. Being that Python was already in the release candidate stage for Python 2.5, this was not possible. Even had the issue been raised earlier, such a change would have been unlikely, as it would have discouraged people who needed some backward compatibility from using the version in the stdlib. .. _PEP 8: http://www.python.org/dev/peps/pep-0008/ Contributing thread: - `Doc suggestion for Elementtree (for 2.5? a bit late, I know...) `__ -------- rslice() -------- Nick Coghlan suggested that since reversing slices could be somewhat complicated, e.g. ``(stop - 1) % abs(step) : start - 1 : -step``, it would be helpful to introduce a ``rslice()`` builtin so that this could be written ``rslice(start, stop, step)``. Most people felt that this was unnecessary and didn't gain much over using ``reversed()`` on the sliced sequence. Contributing thread: - `Adding an rslice() builtin? `__ ---------------------------------- PEP 362: Function Signature Object ---------------------------------- Brett Cannon spent his time at the Google sprint working on `PEP 362`_, which introduces a signature object for functions to describe what arguments they take. He asked for some feedback on two points: * Should the signature object be an attribute on all functions or should it be requested through the inspect module? * Should the dict returned by ``Signature.bind()`` key by name or by a tuple of names for argument lists like ``def f((a, b)):``? He didn't get much of a response. .. 
_PEP 362: http://www.python.org/dev/peps/pep-0362/ Contributing threads: - `[Python-checkins] r51458 - peps/trunk/pep-0000.txt peps/trunk/pep-0362.txt `__ - `PEP 362 open issues `__ ---------------------------------------- Warn about mixing tabs and spaces in 2.6 ---------------------------------------- Thomas Wouters suggested making the ``-t`` flag the default in Python 2.6. This would make Python always issue warnings if users mixed tabs and spaces. People generally seemed in favor of the idea. Contributing thread: - `Making 'python -t' the default. `__ --------------------- xrange() and non-ints --------------------- Neal Norwitz was playing around with some patches that would allow ``xrange`` in Python 2.6 to accept longs or objects with an ``__index__`` method instead of just ints as it does now. He looked at two Python implementations, a Python-C hybrid implementation and a C implementation, and found that for his benchmark, the Python-C hybrid was as good as the C implementation. People suggested that the benchmark wasn't testing function call overhead well enough, and the pure C implementation was probably still the way to go. Contributing thread: - `xrange accepting non-ints `__ ================ Deferred Threads ================ - `Interest in a Python 2.3.6? `__ - `That library reference, yet again `__ ================== Previous Summaries ================== - `no remaining issues blocking 2.5 release `__ =============== Skipped Threads =============== - `IDLE patches - bugfix or not? 
`__ - `TRUNK FREEZE for 2.5c1, 00:00 UTC, Thursday 17th August `__ - `Weekly Python Patch/Bug Summary `__ - `Benchmarking the int allocator (Was: Type of range object members) `__ - `2.5: recently introduced sgmllib regexp bug hangs Python `__ - `[wwwsearch-general] 2.5: recently introduced sgmllib regexp bug hangs Python `__ - `recently introduced sgmllib regexp bughangs Python `__ - `RELEASED Python 2.5 (release candidate 1) `__ - `TRUNK IS UNFROZEN, available for 2.6 work if you are so inclined `__ - `[Python-checkins] TRUNK IS UNFROZEN, available for 2.6 work if you are so inclined `__ - `Fixing 2.5 windows buildbots `__ - `uuid tests failing on Windows `__ - `Sprints next week at Google `__ - `__del__ unexpectedly being called twice `__ - `How does this help? Re: [Python-checkins] r51366 - python/trunk/Lib/idlelib/NEWS.txt python/trunk/Lib/idlelib/idlever.py `__ - `One-line fix for urllib2 regression `__ - `os.spawnlp() missing on Windows in 2.4? `__ - `Questions on unittest behaviour `__ - `[Python-checkins] How does this help? Re: r51366 - python/trunk/Lib/idlelib/NEWS.txt python/trunk/Lib/idlelib/idlever.py `__ - `SSH Key Added `__ - `uuid module - byte order issue `__ - `A cast from Py_ssize_t to long `__ - `Python + Java Integration `__ - `[4suite] cDomlette deallocation bug? `__ - `[Python-checkins] r51525 - in python/trunk: Lib/test/test_float.py Objects/floatobject.c `__ - `for 2.5 issues `__ - `Need help with test_mutants.py `__ - `zip -> izip; is __length_hint__ required? 
`__ - `Removing anachronisms from logging module `__ - `distutils patch `__ - `32-bit and 64-bit python on Solaris `__ - `Small Py3k task: fix modulefinder.py `__ - `Windows build slave downtime `__ From jcarlson at uci.edu Tue Oct 10 17:58:47 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue, 10 Oct 2006 08:58:47 -0700 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: References: Message-ID: <20061010084306.09AE.JCARLSON@uci.edu> "Richard Oudkerk" wrote: > I am not sure how sensible the idea is, but I have had a first stab at > writing a module processing.py which is a near clone of threading.py > but uses processes and sockets for communication. (It is one way of > avoiding the GIL.) On non-windows platforms, you should check on unix domain sockets, I've found they can run a couple times faster than standard sockets on the local machine. And if you are using fork or a variant of subprocess to start processes on linux or Windows, you should consider using pipes, they can be competitive with sockets (though using a bunch on Windows can be a pain). > I have tested it on unix and windows and it seem to work pretty well. > (Getting round the lack of os.fork on windows is a bit awkward.) > There is also another module dummy_processing.py which has the same > api but is just a wrapper round threading.py. > > Queues, Locks, RLocks, Conditions, Semaphores and some other shared > objects are implemented. > > People are welcome to try out the tests in test_processing.py > contained in the zipfile. More information is included in the README > file. > > As a quick example, the code [snip] Looks interesting. Maybe it would become clearer with docs (I hope you've written some). Right now there is a difference, and it is basically that there are tokens and proxies, which could confuse some users. Presumably with this library you have created, you have also written a fast object encoder/decoder (like marshal or pickle). 
If it isn't any faster than cPickle or marshal, then users may bypass the module and opt for fork/etc. + XML-RPC; which works pretty well and gets them multi-machine calling, milti-language interoperability, and some other goodies, though it is a bit slow in terms of communication. - Josiah From fredrik at pythonware.com Tue Oct 10 18:03:32 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 10 Oct 2006 18:03:32 +0200 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: <20061010084306.09AE.JCARLSON@uci.edu> References: <20061010084306.09AE.JCARLSON@uci.edu> Message-ID: Josiah Carlson wrote: > Presumably with this library you have created, you have also written a > fast object encoder/decoder (like marshal or pickle). If it isn't any > faster than cPickle or marshal, then users may bypass the module and opt > for fork/etc. + XML-RPC XML-RPC isn't close to marshal and cPickle in performance, though, so that statement is a bit misleading. the really interesting thing here is a ready-made threading-style API, I think. reimplementing queues, locks, and semaphores can be a reasonable amount of work; might as well use an existing implementation. From anthony at interlink.com.au Tue Oct 10 18:48:51 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Wed, 11 Oct 2006 02:48:51 +1000 Subject: [Python-Dev] BRANCH FREEZE, release24-maint for 2.4.4c1. 00:00UTC, 11 October 2006 Message-ID: <200610110248.52613.anthony@interlink.com.au> The release24-maint branch is frozen for the 2.4.4c1 release from 00:00UTC on the 11th of October. That's about 7 hours from now. Anthony -- Anthony Baxter It's never too late to have a happy childhood. From greg at electricrain.com Tue Oct 10 21:11:39 2006 From: greg at electricrain.com (Gregory P. 
Smith) Date: Tue, 10 Oct 2006 12:11:39 -0700 Subject: [Python-Dev] 2.4 vs Windows vs bsddb In-Reply-To: <1f7befae0610091711j5e03c527w7ee0119f7175cf78@mail.gmail.com> References: <1f7befae0610091344j1639c21qc77b0ebf9c3df1b6@mail.gmail.com> <1f7befae0610091711j5e03c527w7ee0119f7175cf78@mail.gmail.com> Message-ID: <20061010191139.GA31184@zot.electricrain.com> On Mon, Oct 09, 2006 at 08:11:59PM -0400, Tim Peters wrote: > [Tim] > > I just noticed that the bsddb portion of Python fails to compile on > > the 2.4 Windows buildbots, but for some reason the buildbot machinery > > doesn't notice the failure: > > But it does now. This is the revision that broke the Windows build: > > """ > r52170 | andrew.kuchling | 2006-10-05 14:49:36 -0400 (Thu, 05 Oct > 2006) | 12 lines > > [Backport r50783 | neal.norwitz. The bytes_left code is complicated, > but looks correct on a casual inspection and hasn't been modified > in the trunk. Does anyone want to review further?] > > Ensure we don't write beyond errText. I think I got this right, but > it definitely could use some review to ensure I'm not off by one > and there's no possible overflow/wrap-around of bytes_left. > Reported by Klocwork #1. > > Fix a problem if there is a failure allocating self->db. > Found with failmalloc. > """ > > It introduces uses of assert() and strncat(), and the linker can't > resolve them. I suppose that's because the Windows link step for the > _bsddb subproject explicitly excludes msvcrt (in the release build) > and msvcrtd (in the debug build), but I don't know why that's done. > > OTOH, we got a lot more errors (about duplicate code definitions) if > the standard MS libraries aren't explicitly excluded, so that's no > fix. It seems bad form to C assert() within a python extension. crashing is bad. Just code it to not copy the string in that case. 
The exception type should convey enough info alone and if someone actually looks at the string description of the exception they're welcome to notice that its missing info and file a bug (it won't happen; the strings come from the BerkeleyDB or C library itself). As for the strncat instead of strcat that is good practice. The buffer is way more than large enough for any of the error messages defined in the berkeleydb common/db_err.c db_strerror() function but the C library could supply its own unreasonably long one in some unforseen circumstance. -greg From tim.peters at gmail.com Tue Oct 10 21:35:11 2006 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 10 Oct 2006 15:35:11 -0400 Subject: [Python-Dev] 2.4 vs Windows vs bsddb In-Reply-To: <20061010191139.GA31184@zot.electricrain.com> References: <1f7befae0610091344j1639c21qc77b0ebf9c3df1b6@mail.gmail.com> <1f7befae0610091711j5e03c527w7ee0119f7175cf78@mail.gmail.com> <20061010191139.GA31184@zot.electricrain.com> Message-ID: <1f7befae0610101235q407e563cxaf1acf6ef9a5d47e@mail.gmail.com> [Gregory P. Smith] > It seems bad form to C assert() within a python extension. crashing > is bad. Just code it to not copy the string in that case. The > exception type should convey enough info alone and if someone actually > looks at the string description of the exception they're welcome to > notice that its missing info and file a bug (it won't happen; the > strings come from the BerkeleyDB or C library itself). The proper use of C's assert() in Python (whether core or extension) is to strongly document a condition the author believes /must/ be true. It's a strong sanity-check on the programmer's beliefs about necessary invariants, things that must be true under all possible conditions. For example, it would always be wrong to assert that the result of calling malloc() with a non-zero argument is non-NULL; it would be correct (although trivially and unhelpfully so) to assert that the result is NULL or is not NULL. 
Given that, the assert() in question looks fine to me: if (_db_errmsg[0] && bytes_left < (sizeof(errTxt) - 4)) { bytes_left = sizeof(errTxt) - bytes_left - 4 - 1; assert(bytes_left >= 0); We can't get into the block unless bytes_left < sizeof(errTxt) - 4 is true. Subtracting bytes_left from both sides, then swapping LHS and RHS: sizeof(errTxt) - bytes_left - 4 > 0 which implies sizeof(errTxt) - bytes_left - 4 >= 1 Subtracting 1 from both sides: sizeof(errTxt) - bytes_left - 4 - 1 >= 0 And since the LHS of that is the new value of bytes_left, it must be true that bytes_left >= 0 Either that, or the original author (and me, just above) made an error in analyzing what must be true at this point. From bytes_left < sizeof(errTxt) - 4 it's not /instantly/ obvious that bytes_left >= 0 inside the block, so there's value in assert'ing that it's true. It's both documentation and an executable sanity check. In any case, assert() statements are thrown away in a release build, so can't be a cause of abnormal termination then. > As for the strncat instead of strcat that is good practice. The > buffer is way more than large enough for any of the error messages > defined in the berkeleydb common/db_err.c db_strerror() function but > the C library could supply its own unreasonably long one in some > unforseen circumstance. That's fine -- there "shouldn't have been" a problem with using any standard C function here. It was just the funky linker step on Windows on the 2.4 branch that was hosed. Martin figured out how to repair it, and there's no longer any problem here. In fact, even the been-there-forever linker warnings in 2.4 on Windows have gone away now. From arigo at tunes.org Tue Oct 10 23:10:37 2006 From: arigo at tunes.org (Armin Rigo) Date: Tue, 10 Oct 2006 23:10:37 +0200 Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS? 
In-Reply-To: <34FE2A7A34BC3544BC3127D023DF3D1212873F@EWTEXCH.office.bhtrader.com> References: <34FE2A7A34BC3544BC3127D023DF3D1212873F@EWTEXCH.office.bhtrader.com> Message-ID: <20061010211037.GA4271@code0.codespeak.net> Hi Raymond, On Fri, Oct 06, 2006 at 08:48:15AM -0700, Raymond Hettinger wrote: > No need to backport. Py_TPFLAGS_DEFAULT implies > Py_TPFLAGS_HAVE_WEAKREFS. > > > The change was for clarity -- most things that have the weakref slots > filled-in will also make the flag explicit -- that makes it easier on > the brain when verifying code that checks the weakref flag. I don't understand why you added this flag here; there are many other flags with a meaning very similar to Py_TPFLAGS_HAVE_WEAKREFS, which are also implied by Py_TPFLAGS_DEFAULT. Also, *all* other types in a CPython build use Py_TPFLAGS_DEFAULT as well, so have Py_TPFLAGS_HAVE_WEAKREFS set. Why would explicitly spelling just this flag, on just this type, help make the overall code clearer? It seems to only further confuse the matter -- the slightly obscure bit that requires some getting used to is that all these flags don't really mean "I have such and such feature" but just "I could have such and such feature, if the corresponding tp_xxx field were set". A bientot, Armin From jcarlson at uci.edu Tue Oct 10 23:49:50 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue, 10 Oct 2006 14:49:50 -0700 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: References: <20061010084306.09AE.JCARLSON@uci.edu> Message-ID: <20061010130901.09B1.JCARLSON@uci.edu> Fredrik Lundh wrote: > > Josiah Carlson wrote: > > > Presumably with this library you have created, you have also written a > > fast object encoder/decoder (like marshal or pickle). If it isn't any > > faster than cPickle or marshal, then users may bypass the module and opt > > for fork/etc. + XML-RPC > > XML-RPC isn't close to marshal and cPickle in performance, though, so > that statement is a bit misleading. 
You are correct, it is misleading, and relies on a few unstated assumptions. In my own personal delving into process splitting, RPC, etc., I usually end up with one of two cases; I need really fast call/return, or I need not slow call/return. The not slow call/return is (in my opinion) satisfactorily solved with XML-RPC. But I've personally not been satisfied with the speed of any remote 'fast call/return' packages, as they usually rely on cPickle or marshal, which are slow compared to even moderately fast 100mbit network connections. When we are talking about local connections, I have even seen cases where the cPickle/marshal calls can make it so that forking the process is faster than encoding the input to a called function. I've had an idea for a fast object encoder/decoder (with limited support for certain built-in Python objects), but I haven't gotten around to actually implementing it as of yet. > the really interesting thing here is a ready-made threading-style API, I > think. reimplementing queues, locks, and semaphores can be a reasonable > amount of work; might as well use an existing implementation. Really, it is a matter of asking what kind of API is desirable. Do we want to have threading plus other stuff be the style of API that we want to replicate? Do we want to have shared queue objects, or would an XML-RPC-esque remote.queue_put('queue_X', value) and remote.queue_get('queue_X', blocking=1) be better? - Josiah From rhettinger at ewtllc.com Tue Oct 10 23:47:26 2006 From: rhettinger at ewtllc.com (Raymond Hettinger) Date: Tue, 10 Oct 2006 14:47:26 -0700 Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS? Message-ID: <34FE2A7A34BC3544BC3127D023DF3D12128746@EWTEXCH.office.bhtrader.com> > The change was for clarity -- most things that have the weakref slots > filled-in will also make the flag explicit -- that makes it easier on > the brain when verifying code that checks the weakref flag.
> I don't understand why you added this flag here; Perhaps my other post wasn't clear. The change wasn't necessary, so if it bugs you, feel free to take it out. Essentially, it was a "note to self" so that I didn't have to keep looking up what was implied by Py_TPFLAGS_DEFAULT. > the slightly obscure bit that requires some getting used to is > that all these flags don't really mean "I have such and such > feature" but just "I could have such and such > feature, if the corresponding tp_xxx field were set". I would like to see that cleaned-up for Py3k. Ideally, the NULL or non_NULL status of a slot should serve as its flag. Raymond From barry at python.org Wed Oct 11 00:48:36 2006 From: barry at python.org (Barry Warsaw) Date: Tue, 10 Oct 2006 18:48:36 -0400 Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS? In-Reply-To: <34FE2A7A34BC3544BC3127D023DF3D12128746@EWTEXCH.office.bhtrader.com> References: <34FE2A7A34BC3544BC3127D023DF3D12128746@EWTEXCH.office.bhtrader.com> Message-ID: <5CBC9CDA-BBF9-4956-B0A3-7C7373C74EB4@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Oct 10, 2006, at 5:47 PM, Raymond Hettinger wrote: >> the slightly obscure bit that requires some getting used to is >> that all these flags don't really mean "I have such and such >> feature" but just "I could have such and such >> feature, if the corresponding tp_xxx field were set". > > I would like to see that cleaned-up for Py3k. Ideally, the NULL or > non_NULL status of a slot should serve as its flag. +1 TOOWTDI. 
- -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRSwjRHEjvBPtnXfVAQJ0sQQAllzbSdONhCBWc/Rt0PW6J5iANLcm99N4 MkkSEDZBo72SsviijRvTha1+1pvpzB6s4Rf7EOw/OKnQ+a3u37w3BB966ag8WIN1 RItKubCVS6kTpg53BBnIX7P0CGSFFY36pEQm4nNe3G6RH4F0FwmIdv0WyJhSnnDR KRT9PHI9QY8= =bH9r -----END PGP SIGNATURE----- From david.nospam.hopwood at blueyonder.co.uk Wed Oct 11 00:49:48 2006 From: david.nospam.hopwood at blueyonder.co.uk (David Hopwood) Date: Tue, 10 Oct 2006 23:49:48 +0100 Subject: [Python-Dev] 2.4 vs Windows vs bsddb In-Reply-To: <1f7befae0610101235q407e563cxaf1acf6ef9a5d47e@mail.gmail.com> References: <1f7befae0610091344j1639c21qc77b0ebf9c3df1b6@mail.gmail.com> <1f7befae0610091711j5e03c527w7ee0119f7175cf78@mail.gmail.com> <20061010191139.GA31184@zot.electricrain.com> <1f7befae0610101235q407e563cxaf1acf6ef9a5d47e@mail.gmail.com> Message-ID: <452C238C.6000708@blueyonder.co.uk> Tim Peters wrote: > Given that, the assert() in question looks fine to me: > > if (_db_errmsg[0] && bytes_left < (sizeof(errTxt) - 4)) { > bytes_left = sizeof(errTxt) - bytes_left - 4 - 1; > assert(bytes_left >= 0); > > We can't get into the block unless > > bytes_left < sizeof(errTxt) - 4 > > is true. Subtracting bytes_left from both sides, then swapping LHS and RHS: > > sizeof(errTxt) - bytes_left - 4 > 0 > > which implies > > sizeof(errTxt) - bytes_left - 4 >= 1 > > Subtracting 1 from both sides: > > sizeof(errTxt) - bytes_left - 4 - 1 >= 0 > > And since the LHS of that is the new value of bytes_left, it must be true that > > bytes_left >= 0 > > Either that, or the original author (and me, just above) made an error > in analyzing what must be true at this point. You omitted to state an assumption that sizeof(errTxt) >= 4, since size_t (and the constant 4) are unsigned. Also bytes_left must initially be nonnegative so that the subexpression 'sizeof(errTxt) - bytes_left' cannot overflow. 
-- David Hopwood From david.nospam.hopwood at blueyonder.co.uk Wed Oct 11 01:03:26 2006 From: david.nospam.hopwood at blueyonder.co.uk (David Hopwood) Date: Wed, 11 Oct 2006 00:03:26 +0100 Subject: [Python-Dev] 2.4 vs Windows vs bsddb [correction] In-Reply-To: <452C238C.6000708@blueyonder.co.uk> References: <1f7befae0610091344j1639c21qc77b0ebf9c3df1b6@mail.gmail.com> <1f7befae0610091711j5e03c527w7ee0119f7175cf78@mail.gmail.com> <20061010191139.GA31184@zot.electricrain.com> <1f7befae0610101235q407e563cxaf1acf6ef9a5d47e@mail.gmail.com> <452C238C.6000708@blueyonder.co.uk> Message-ID: <452C26BE.6090703@blueyonder.co.uk> I wrote: > You omitted to state an assumption that sizeof(errTxt) >= 4, since size_t > (and the constant 4) are unsigned. Sorry, the constant '4' is signed, but sizeof(errTxt) - 4 can nevertheless wrap around unless sizeof(errTxt) >= 4. -- David Hopwood From tim.peters at gmail.com Wed Oct 11 03:20:00 2006 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 10 Oct 2006 21:20:00 -0400 Subject: [Python-Dev] 2.4 vs Windows vs bsddb In-Reply-To: <452C238C.6000708@blueyonder.co.uk> References: <1f7befae0610091344j1639c21qc77b0ebf9c3df1b6@mail.gmail.com> <1f7befae0610091711j5e03c527w7ee0119f7175cf78@mail.gmail.com> <20061010191139.GA31184@zot.electricrain.com> <1f7befae0610101235q407e563cxaf1acf6ef9a5d47e@mail.gmail.com> <452C238C.6000708@blueyonder.co.uk> Message-ID: <1f7befae0610101820o16330e6frce1a33b39ac7b370@mail.gmail.com> [Tim] >> Given that, the assert() in question looks fine to me: >> ... |>> Either that, or the original author (and me, just above) made an error >> in analyzing what must be true at this point. | [David Hopwood] > You omitted to state an assumption that sizeof(errTxt) >= 4, since size_t > (and the constant 4) are unsigned. Also bytes_left must initially be nonnegative > so that the subexpression 'sizeof(errTxt) - bytes_left' cannot overflow. 
I don't care, but that's really the /point/: asserts are valuable precisely because any inference that's not utterly obvious at first glance at best stands a good chance of relying on hidden assumptions. assert() makes key assumptions and key inferences visible, and verifies them in a debug build of Python. From martin at v.loewis.de Wed Oct 11 06:15:20 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 11 Oct 2006 06:15:20 +0200 Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS? In-Reply-To: <34FE2A7A34BC3544BC3127D023DF3D12128746@EWTEXCH.office.bhtrader.com> References: <34FE2A7A34BC3544BC3127D023DF3D12128746@EWTEXCH.office.bhtrader.com> Message-ID: <452C6FD8.8070403@v.loewis.de> Raymond Hettinger schrieb: >> the slightly obscure bit that requires some getting used to is >> that all these flags don't really mean "I have such and such >> feature" but just "I could have such and such >> feature, if the corresponding tp_xxx field were set". > > I would like to see that cleaned-up for Py3k. Ideally, the NULL or > non_NULL status of a slot should serve as its flag. The flag indicates that the field is even present. If you have an extension module from an earlier Python release (in binary form), it won't *have* the field, so you can't test whether it's null. Accessing it will get to some other place in the data segment, and interpreting it as a function pointer will cause a crash. That's why the flags where initially introduced; presence of the flag indicates that the field was there at compile time. Of course, if everybody would always recompile all extension modules for a new Python feature release, those flags weren't necessary. Regards, Martin From mal at egenix.com Wed Oct 11 10:23:40 2006 From: mal at egenix.com (M.-A. 
Lemburg) Date: Wed, 11 Oct 2006 10:23:40 +0200 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: <20061010130901.09B1.JCARLSON@uci.edu> References: <20061010084306.09AE.JCARLSON@uci.edu> <20061010130901.09B1.JCARLSON@uci.edu> Message-ID: <452CAA0C.6030306@egenix.com> Josiah Carlson wrote: > Fredrik Lundh wrote: >> Josiah Carlson wrote: >> >>> Presumably with this library you have created, you have also written a >>> fast object encoder/decoder (like marshal or pickle). If it isn't any >>> faster than cPickle or marshal, then users may bypass the module and opt >>> for fork/etc. + XML-RPC >> XML-RPC isn't close to marshal and cPickle in performance, though, so >> that statement is a bit misleading. > > You are correct, it is misleading, and relies on a few unstated > assumptions. > > In my own personal delving into process splitting, RPC, etc., I usually > end up with one of two cases; I need really fast call/return, or I need > not slow call/return. The not slow call/return is (in my opinion) > satisfactorally solved with XML-RPC. But I've personally not been > satisfied with the speed of any remote 'fast call/return' packages, as > they usually rely on cPickle or marshal, which are slow compared to > even moderately fast 100mbit network connections. When we are talking > about local connections, I have even seen cases where the > cPickle/marshal calls can make it so that forking the process is faster > than encoding the input to a called function. This is hard to believe. I've been in that business for a few years and so far have not found an OS/hardware/network combination with the mentioned features. Usually the worst part in performance breakdown for RPC is network latency, ie. time to connect, waiting for the packets to come through, etc. and this parameter doesn't really depend on the OS or hardware you're running the application on, but is more a factor of which network hardware, architecture and structure is being used. 
It also depends a lot on what you send as arguments, of course, but I assume that you're not pickling a gazillion objects :-) > I've had an idea for a fast object encoder/decoder (with limited support > for certain built-in Python objects), but I haven't gotten around to > actually implementing it as of yet. Would be interesting to look at. BTW, did you know about http://sourceforge.net/projects/py-xmlrpc/ ? -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 11 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From fredrik at pythonware.com Wed Oct 11 12:35:23 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 11 Oct 2006 12:35:23 +0200 Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS? References: <34FE2A7A34BC3544BC3127D023DF3D12128746@EWTEXCH.office.bhtrader.com> <452C6FD8.8070403@v.loewis.de> Message-ID: Martin v. Löwis wrote: > Of course, if everybody would always recompile all extension modules > for a new Python feature release, those flags weren't necessary. a dynamic registration approach would be even better, with a single entry point used to register all methods and hooks your C extension has implemented, and code on the other side that builds a properly initialized type descriptor from that set, using fallback functions and error stubs where needed. e.g. the impossible-to-write-from-scratch NoddyType struct initialization in http://docs.python.org/ext/node24.html would collapse to static PyTypeObject NoddyType; ...
NoddyType = PyType_Setup("noddy.Noddy", sizeof(Noddy)); PyType_Register(NoddyType, PY_TP_DEALLOC, Noddy_dealloc); PyType_Register(NoddyType, PY_TP_DOC, "Noddy objects"); PyType_Register(NoddyType, PY_TP_TRAVERSE, Noddy_traverse); PyType_Register(NoddyType, PY_TP_CLEAR, Noddy_clear); PyType_Register(NoddyType, PY_TP_METHODS, Noddy_methods); PyType_Register(NoddyType, PY_TP_MEMBERS, Noddy_members); PyType_Register(NoddyType, PY_TP_INIT, Noddy_init); PyType_Register(NoddyType, PY_TP_NEW, Noddy_new); if (PyType_Ready(&NoddyType) < 0) return; (a preprocessor that generated this based on suitable "macro decorators" could be implemented in just over 8 lines of Python...) with this in place, we could simply remove all those silly NULL checks from the interpreter. From fredrik at pythonware.com Wed Oct 11 12:54:33 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 11 Oct 2006 12:54:33 +0200 Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS? References: <34FE2A7A34BC3544BC3127D023DF3D12128746@EWTEXCH.office.bhtrader.com><452C6FD8.8070403@v.loewis.de> Message-ID: I wrote: > PyType_Register(NoddyType, PY_TP_METHODS, Noddy_methods); methods and members could of course be registered too, so the implementation can choose how to store them (e.g. short lists for smaller method lists, dictionaries for others). From jcarlson at uci.edu Wed Oct 11 18:38:48 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 11 Oct 2006 09:38:48 -0700 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: <452CAA0C.6030306@egenix.com> References: <20061010130901.09B1.JCARLSON@uci.edu> <452CAA0C.6030306@egenix.com> Message-ID: <20061011090701.09CA.JCARLSON@uci.edu> "M.-A. Lemburg" wrote: > > Josiah Carlson wrote: > > Fredrik Lundh wrote: > >> Josiah Carlson wrote: > >> > >>> Presumably with this library you have created, you have also written a > >>> fast object encoder/decoder (like marshal or pickle).
If it isn't any > >>> faster than cPickle or marshal, then users may bypass the module and opt > >>> for fork/etc. + XML-RPC > >> XML-RPC isn't close to marshal and cPickle in performance, though, so > >> that statement is a bit misleading. > > > > You are correct, it is misleading, and relies on a few unstated > > assumptions. > > > > In my own personal delving into process splitting, RPC, etc., I usually > > end up with one of two cases; I need really fast call/return, or I need > > not slow call/return. The not slow call/return is (in my opinion) > > satisfactorally solved with XML-RPC. But I've personally not been > > satisfied with the speed of any remote 'fast call/return' packages, as > > they usually rely on cPickle or marshal, which are slow compared to > > even moderately fast 100mbit network connections. When we are talking > > about local connections, I have even seen cases where the > > cPickle/marshal calls can make it so that forking the process is faster > > than encoding the input to a called function. > > This is hard to believe. I've been in that business for a few > years and so far have not found an OS/hardware/network combination > with the mentioned features. > > Usually the worst part in performance breakdown for RPC is network > latency, ie. time to connect, waiting for the packets to come through, > etc. and this parameter doesn't really depend on the OS or hardware > you're running the application on, but is more a factor of which > network hardware, architecture and structure is being used. I agree, that is usually the case. But for pre-existing connections remote or local (whether via socket or unix domain socket), pickling slows things down significantly. What do I mean? Set up a daemon that reads and discards what is sent to it as fast as possible. Then start sending it plain strings (constructed via something like 32768*'\0'). Compare it to a somewhat equivalently sized pickle-as-you-go sender. 
Maybe I'm just not doing it right, but I always end up with a slowdown that makes me want to write my own fast encoder/decoder. > It also depends a lot on what you send as arguments, of course, > but I assume that you're not pickling a gazillion objects :-) According to tests on one of the few non-emulated linux machines I have my hands on, forking to a child process runs on the order of .0004-.00055 seconds. On that same machine, pickling... 128*['hello world', 18, {1:2}, 7.382] ...takes ~.0005 seconds. 512 somewhat mixed elements isn't a gazillion, though in my case, I believe it was originally a list of tuples or somesuch. > > I've had an idea for a fast object encoder/decoder (with limited support > > for certain built-in Python objects), but I haven't gotten around to > > actually implementing it as of yet. > > Would be interesting to look at. It would basically be something along the lines of cPickle, but would only support the basic types of: int, long, float, str, unicode, tuple, list, dictionary. > BTW, did you know about http://sourceforge.net/projects/py-xmlrpc/ ? I did not know about it. But it looks interesting. I'll have to compile it for my (ancient) 2.3 installation and see how it does. Thank you for the pointer. - Josiah From jcarlson at uci.edu Wed Oct 11 18:46:39 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 11 Oct 2006 09:46:39 -0700 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: References: <20061010130901.09B1.JCARLSON@uci.edu> Message-ID: <20061011084824.09C7.JCARLSON@uci.edu> "Richard Oudkerk" wrote: > On 10/10/06, Josiah Carlson wrote: > > > the really interesting thing here is a ready-made threading-style API, I > > > think. reimplementing queues, locks, and semaphores can be a reasonable > > > amount of work; might as well use an existing implementation. > > > > Really, it is a matter of asking what kind of API is desireable. 
Do we > > want to have threading plus other stuff be the style of API that we want > > to replicate? Do we want to have shared queue objects, or would an > > XML-RPC-esque remote.queue_put('queue_X', value) and > > remote.queue_get('queue_X', blocking=1) be better? > > Whatever the API is, I think it is useful if you can swap between > threads and processes just by changing the import line. That way you > can write applications without deciding upfront which to use. It would be convenient, yes, but the question isn't always 'threads or processes?' In my experience (not to say that it is more or better than anyone else's), when going multi-process, the expense on some platforms is significant enough to want to persist the process (this is counter to my previous forking statement, but it's all relative). And sometimes one *wants* multiple threads running in a single process handling multiple requests. There's a recipe hanging out in the Python cookbook that adds a threading mixin to the standard XML-RPC server in Python. For a set of processes (perhaps on different machines) that are cooperating and calling amongst each other, I've not seen a significantly better variant, especially when the remote procedure call can take a long time to complete. It does take a few tricks to make sure that sufficient connections are available from process A to process B when A calls B from multiple threads, but it's not bad.
- Josiah From fredrik at pythonware.com Wed Oct 11 18:41:52 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 11 Oct 2006 18:41:52 +0200 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: <20061011090701.09CA.JCARLSON@uci.edu> References: <20061010130901.09B1.JCARLSON@uci.edu> <452CAA0C.6030306@egenix.com> <20061011090701.09CA.JCARLSON@uci.edu> Message-ID: Josiah Carlson wrote: > It would basically be something along the lines of cPickle, but would > only support the basic types of: int, long, float, str, unicode, tuple, > list, dictionary. if you're aware of a way to do that faster than the current marshal implementation, maybe you could work on speeding up marshal instead? From skip at pobox.com Wed Oct 11 18:59:30 2006 From: skip at pobox.com (skip at pobox.com) Date: Wed, 11 Oct 2006 11:59:30 -0500 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: <20061011090701.09CA.JCARLSON@uci.edu> References: <20061010130901.09B1.JCARLSON@uci.edu> <452CAA0C.6030306@egenix.com> <20061011090701.09CA.JCARLSON@uci.edu> Message-ID: <17709.8946.379605.437664@montanaro.dyndns.org> Josiah> It would basically be something along the lines of cPickle, but Josiah> would only support the basic types of: int, long, float, str, Josiah> unicode, tuple, list, dictionary. Isn't that approximately marshal's territory? If you can write a faster encoder/decoder, it might well be possible to apply the speedup ideas to marshal. Skip From brett at python.org Wed Oct 11 20:01:50 2006 From: brett at python.org (Brett Cannon) Date: Wed, 11 Oct 2006 11:01:50 -0700 Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS? In-Reply-To: References: <34FE2A7A34BC3544BC3127D023DF3D12128746@EWTEXCH.office.bhtrader.com> <452C6FD8.8070403@v.loewis.de> Message-ID: On 10/11/06, Fredrik Lundh wrote: > > Martin v. 
Löwis wrote: > > Of course, if everybody would always recompile all extension modules > > for a new Python feature release, those flags weren't necessary. > > a dynamic registration approach would be even better, with a single entry > point > used to register all methods and hooks your C extension has implemented, > and > code on the other side that builds a properly initialized type descriptor > from that > set, using fallback functions and error stubs where needed. > > e.g. the impossible-to-write-from-scratch NoddyType struct initialization > in > > http://docs.python.org/ext/node24.html > > would collapse to > > static PyTypeObject NoddyType; > > ... > > NoddyType = PyType_Setup("noddy.Noddy", sizeof(Noddy)); > PyType_Register(NoddyType, PY_TP_DEALLOC, Noddy_dealloc); > PyType_Register(NoddyType, PY_TP_DOC, "Noddy objects"); > PyType_Register(NoddyType, PY_TP_TRAVERSE, Noddy_traverse); > PyType_Register(NoddyType, PY_TP_CLEAR, Noddy_clear); > PyType_Register(NoddyType, PY_TP_METHODS, Noddy_methods); > PyType_Register(NoddyType, PY_TP_MEMBERS, Noddy_members); > PyType_Register(NoddyType, PY_TP_INIT, Noddy_init); > PyType_Register(NoddyType, PY_TP_NEW, Noddy_new); > if (PyType_Ready(&NoddyType) < 0) > return; > > (a preprocessor that generated this based on suitable "macro decorators" > could > be implemented in just over 8 lines of Python...) > > with this in place, we could simply remove all those silly NULL checks > from the > interpreter. This also has the benefit of making it really easy to grep for the function used for the tp_init field since there is no guarantee someone will keep the traditional field comments in their file (I usually grep for PyTypeObject until I find the type I want). If we went with C99 this wouldn't be an issue, but since I don't think that is necessarily in the cards I am very happy to go with this solution.
It ends up feeling more like how Ruby does C extensions, and I have to admit I think they may have made it simpler than we have. And of course the change Raymond put in for checking the PyMethodDef slots can also easily allow people to name methods based on what the slots would be named had it been defined in Python (which we might want to do anyway with the C constants to make it more readable and less obtuse to new extension writers; e.g. change PY_TP_NEW to PY__NEW__). And lastly, this approach makes sure that the basic requirement of what a type must have defined can be enforced in the PyType_Setup() method /F is proposing. +1 from me. -Brett From jcarlson at uci.edu Wed Oct 11 20:34:01 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 11 Oct 2006 11:34:01 -0700 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: References: <20061011090701.09CA.JCARLSON@uci.edu> Message-ID: <20061011103733.09CD.JCARLSON@uci.edu> Fredrik Lundh wrote: > Josiah Carlson wrote: > > > It would basically be something along the lines of cPickle, but would > > only support the basic types of: int, long, float, str, unicode, tuple, > > list, dictionary. > > if you're aware of a way to do that faster than the current marshal > implementation, maybe you could work on speeding up marshal instead? The current implementation uses a fixed resize semantic (1024 bytes at a time) that makes large marshal operations slow. If we were to switch to a list resize-like or cStringIO semantic (overallocate by ~size>>3, or at least double, respectively), it would likely increase the speed for large resize operations. (see the w_more definition) This should make it significantly faster than cPickle in basically all cases.
w_object uses a giant if/else if block to handle all of the possible cases, both for identity checks against None, False, True, etc., as well as with the various Py*_Check(). This is necessary due to marshal supporting subclasses (the Py*_Check() calls) and the dynamic layout of memory during Python startup. The identity checks may be able to be replaced with a small array-like thing if we were to statically allocate them from a single array to guarantee that their addresses are a fixed distance apart... char base_objects[320]; PyObject* IDENTITY[8]; int cases[8]; /* 64 bytes per object is overkill, and we may want to allocate enough room for 15 objects, to make sure that IDENTITY[0] = NULL; */ p = 0 for obj_init in objs_to_init: init_object(base_objects+p, obj_init) x = ((base_objects+p)>>6)&7 IDENTITY[x] = (PyObject*)(base_objects+p) cases[x] = p//64 p += 64 Then we could use the following in w_object... x = (v>>6)&7 if v == IDENTITY[x] { switch (cases[x]) { case 0: /* should be null */ ... case 1: /* based on the order of objs_to_init */ } } The Py*_Check() stuff isn't so amenable to potential speedup, but in a custom no-subclasses only base types version, we could use a variant of the above mechanism to look directly at types, then use a second switch/case statement, which should be significantly faster than the if/else if tests that it currently uses. An identity check, then a fast type check, otherwise fail.
- Josiah From simonwittber at gmail.com Thu Oct 12 02:31:19 2006 From: simonwittber at gmail.com (Simon Wittber) Date: Thu, 12 Oct 2006 08:31:19 +0800 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: <20061011090701.09CA.JCARLSON@uci.edu> References: <20061010130901.09B1.JCARLSON@uci.edu> <452CAA0C.6030306@egenix.com> <20061011090701.09CA.JCARLSON@uci.edu> Message-ID: <4e4a11f80610111731n68275e4agd98abd0baf3ab54@mail.gmail.com> On 10/12/06, Josiah Carlson wrote: > > It would basically be something along the lines of cPickle, but would > only support the basic types of: int, long, float, str, unicode, tuple, > list, dictionary. > Great idea! Check this thread for past efforts: http://mail.python.org/pipermail/python-dev/2005-June/054313.html The 'gherkin' module discussed there now lives in the cheeseshop as part of the FibraNet package. http://cheeseshop.python.org/pypi/FibraNet I love benchmarks, especially when they come around for the second time. I wrote a silly script which compares dumps performance between different serialization modules for different simple objects using Python 2.4.3. All figures are 'dumps per second'. test: a tuple: ("a" * 1024,1.0,[1,2,3],{'1':2,'3':4}) gherkin: 10895.7762314 pickle: 6510.97245984 cPickle: 34218.5455317 marshal: 85562.2443672 xmlrpclib: 9468.0766772 test: a large string: 'a' * 10240 gherkin: 45955.4065455 pickle: 10209.0239868 cPickle: 13773.8138516 marshal: 24937.002069 xmlrpclib: Traceback test: a small string: 'a' * 128 gherkin: 73453.0960495 pickle: 28357.0210654 cPickle: 122997.592425 marshal: 202428.776201 xmlrpclib: Traceback test: a tuple of ints: tuple(range(64)) gherkin: 4522.06801154 pickle: 2273.12937965 cPickle: 23969.9306043 marshal: 143691.72582 xmlrpclib: 2781.3083894 Marshal is very quick for most cases, but still has this warning in the documentation. """Warning: The marshal module is not intended to be secure against erroneous or maliciously constructed data.
Never unmarshal data received from an untrusted or unauthenticated source.""" -Sw From greg.ewing at canterbury.ac.nz Thu Oct 12 01:30:26 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 12 Oct 2006 12:30:26 +1300 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: References: <20061010130901.09B1.JCARLSON@uci.edu> <452CAA0C.6030306@egenix.com> <20061011090701.09CA.JCARLSON@uci.edu> Message-ID: <452D7E92.4050206@canterbury.ac.nz>

Fredrik Lundh wrote:
> if you're aware of a way to do that faster than the current marshal
> implementation, maybe you could work on speeding up marshal instead?

Even if it weren't faster than marshal, it could still be useful to have something nearly as fast that used a python-version-independent protocol. -- Greg From fredrik at pythonware.com Thu Oct 12 07:22:00 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 12 Oct 2006 07:22:00 +0200 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: <452D7E92.4050206@canterbury.ac.nz> References: <20061010130901.09B1.JCARLSON@uci.edu> <452CAA0C.6030306@egenix.com> <20061011090701.09CA.JCARLSON@uci.edu> <452D7E92.4050206@canterbury.ac.nz> Message-ID:

Greg Ewing wrote:
>> if you're aware of a way to do that faster than the current marshal
>> implementation, maybe you could work on speeding up marshal instead?
>
> Even if it weren't faster than marshal, it could still
> be useful to have something nearly as fast that used
> a python-version-independent protocol.

marshal hasn't changed in many years:

$ python1.5
>>> x = 1, 2.0, "three", [4, 5, 6, "seven"], {8: 9}, None
>>> import marshal
>>> marshal.dump(x, open("x.dat", "w"))
>>>

$ python2.5
>>> import marshal
>>> marshal.load(open("x.dat"))
(1, 2.0, 'three', [4, 5, 6, 'seven'], {8: 9}, None)

which is a good thing, because there are external non-Python tools that generate marshalled data streams. maybe you were thinking about marshalled code objects?
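[Editor's illustration: Fredrik's two-interpreter session condenses to a self-contained round-trip check. Modern spelling is assumed here — today marshal.dump requires a binary-mode file object.]

```python
import marshal
import os
import tempfile

x = (1, 2.0, "three", [4, 5, 6, "seven"], {8: 9}, None)

# Dump to a file and reload it, as in the python1.5/python2.5 session above.
# For these basic container types the on-disk format has stayed compatible.
path = os.path.join(tempfile.mkdtemp(), "x.dat")
with open(path, "wb") as f:
    marshal.dump(x, f)
with open(path, "rb") as f:
    assert marshal.load(f) == x
```

The compatibility claim holds for the simple types listed above; marshalled code objects, by contrast, are tied to the bytecode format of a specific interpreter version.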
From dave at boost-consulting.com Thu Oct 12 09:00:16 2006 From: dave at boost-consulting.com (Dave Abrahams) Date: Thu, 12 Oct 2006 07:00:16 +0000 (UTC) Subject: [Python-Dev] Plea to distribute debugging lib References: <20051104202824.GA19678@discworld.dyndns.org> <20051202025557.GA22377@ActiveState.com> Message-ID: Trent Mick ActiveState.com> writes: > > [Thomas Heller wrote] > > Anyway, AFAIK, the activestate distribution contains Python debug dlls. > > [Er, a month late, but I was in flitting around Australia at the time. :)] > > Yes, as a separate download. > > ftp://ftp.activestate.com/ActivePython/etc/ > ActivePython--win32-ix86-debug.zip > > And those should be binary compatible with the equivalent python.org > installs as well. Note that the simple "install.py" script in those > packages bails if the Python installation isn't ActivePython, but I > could easily remove that if you think that would be useful for your > users. The only problem here is that there appears to be a lag in the release of ActivePython after Python itself is released. Is there any chance of putting up just the debugging libraries a little earlier? Thanks again, Dave From anthony at python.org Thu Oct 12 09:33:02 2006 From: anthony at python.org (Anthony Baxter) Date: Thu, 12 Oct 2006 17:33:02 +1000 Subject: [Python-Dev] RELEASED Python 2.4.4, release candidate 1 Message-ID: <200610121733.11507.anthony@python.org> On behalf of the Python development team and the Python community, I'm happy to announce the release of Python 2.4.4 (release candidate 1). Python 2.4.4 is a bug-fix release. While Python 2.5 is the latest version of Python, we're making this release for people who are still running Python 2.4. See the release notes at the website (also available as Misc/NEWS in the source distribution) for details of the more than 80 bugs squished in this release, including a number found by the Coverity and Klocwork static analysis tools. 
We'd like to offer our thanks to both these companies for making this available for open source projects.

* Python 2.4.4 contains a fix for PSF-2006-001, a buffer overrun *
* in repr() of unicode strings in wide unicode (UCS-4) builds. *
* See http://www.python.org/news/security/PSF-2006-001/ for more. *

Assuming no major problems crop up, a final release of Python 2.4.4 will follow in about a week's time. This will be the last planned release in the Python 2.4 series - future maintenance releases will be in the 2.5 line. For more information on Python 2.4.4, including download links for various platforms, release notes, and known issues, please see: http://www.python.org/2.4.4/

Highlights of this new release include:
- Bug fixes. According to the release notes, at least 80 have been fixed.
- A fix for PSF-2006-001, a bug in repr() for unicode strings on UCS-4 (wide unicode) builds.

Highlights of the previous major Python release (2.4) are available from the Python 2.4 page, at http://www.python.org/2.4/highlights.html Enjoy this release, Anthony Anthony Baxter anthony at python.org Python Release Manager (on behalf of the entire python-dev team) -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 191 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061012/c1778e7d/attachment.pgp From anthony at interlink.com.au Thu Oct 12 10:08:46 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Thu, 12 Oct 2006 18:08:46 +1000 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun Message-ID: <200610121808.47010.anthony@interlink.com.au> I've had a couple of queries about whether PSF-2006-001 merits a 2.3.6. Personally, I lean towards "no" - 2.4 was nearly two years ago now. But I'm open to other opinions - I guess people see the phrase "buffer overrun" and they get scared.
Plus once 2.4.4 final is out next week, I'll have cut 12 releases since March. Assuming a 2.5.1 before March (very likely) that'll be 14 releases in 12 months. 16 releases in 12 months would just about make me go crazy. From fredrik at pythonware.com Thu Oct 12 10:18:33 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 12 Oct 2006 10:18:33 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun References: <200610121808.47010.anthony@interlink.com.au> Message-ID: Anthony Baxter wrote: > 16 releases in 12 months would just about make me go crazy. is there any way we could further automate or otherwise streamline or distribute the release process ? ideally, releasing (earlier release + well-defined patch set) should be fairly trivial, compared to releasing (new release from trunk). what do we have to do to make it easier to handle that case? From nick at craig-wood.com Thu Oct 12 13:35:31 2006 From: nick at craig-wood.com (Nick Craig-Wood) Date: Thu, 12 Oct 2006 12:35:31 +0100 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <200610121808.47010.anthony@interlink.com.au> References: <200610121808.47010.anthony@interlink.com.au> Message-ID: <20061012113531.GA32477@craig-wood.com> On Thu, Oct 12, 2006 at 06:08:46PM +1000, Anthony Baxter wrote: > I've had a couple of queries about whether PSF-2006-001 merits a 2.3.6. > Personally, I lean towards "no" - 2.4 was nearly two years ago now. But I'm > open to other opinions - I guess people see the phrase "buffer overrun" and > they get scared. As a data point: python 2.3 is the shipped version of python in current stable Debian release (sarge). It is also vulnerable by default (sys.maxunicode == 1114111). I'm sure the debian maintainers are capable of picking up the patch and sending out a security update themselves, but by releasing a fixed 2.3 you'll send a stronger message to all the distributions hopefully! 
> Plus once 2.4.4 final is out next week, I'll have cut 12 releases > since March. Assuming a 2.5.1 before March (very likely) that'll be > 14 releases in 12 months. 16 releases in 12 months would just about > make me go crazy. I sympathise! I do releases for my current workplace and it is time-consuming and exacting work. -- Nick Craig-Wood -- http://www.craig-wood.com/nick From arigo at tunes.org Thu Oct 12 14:12:49 2006 From: arigo at tunes.org (Armin Rigo) Date: Thu, 12 Oct 2006 14:12:49 +0200 Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS? In-Reply-To: References: <34FE2A7A34BC3544BC3127D023DF3D12128746@EWTEXCH.office.bhtrader.com> <452C6FD8.8070403@v.loewis.de> Message-ID: <20061012121248.GA25659@code0.codespeak.net> Hi Fredrik, On Wed, Oct 11, 2006 at 12:35:23PM +0200, Fredrik Lundh wrote: > NoddyType = PyType_Setup("noddy.Noddy", sizeof(Noddy)); It doesn't address the problem Martin explained (you can put neither NULLs nor stubs in tp_xxx fields that are beyond the C extension module's sizeof(Nobby)). But I imagine it could with a bit more tweaking. A bientot, Armin From fredrik at pythonware.com Thu Oct 12 14:37:25 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 12 Oct 2006 14:37:25 +0200 Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS? References: <34FE2A7A34BC3544BC3127D023DF3D12128746@EWTEXCH.office.bhtrader.com><452C6FD8.8070403@v.loewis.de> <20061012121248.GA25659@code0.codespeak.net> Message-ID: Armin Rigo wrote: >> NoddyType = PyType_Setup("noddy.Noddy", sizeof(Noddy)); > > It doesn't address the problem Martin explained (you can put neither > NULLs nor stubs in tp_xxx fields that are beyond the C extension > module's sizeof(Nobby)). But I imagine it could with a bit more > tweaking. umm. last time I checked, the tp fields lived in the type object, not in the instance.
From nmm1 at cus.cam.ac.uk Thu Oct 12 15:22:30 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Thu, 12 Oct 2006 14:22:30 +0100 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: Your message of "Wed, 11 Oct 2006 10:23:40 +0200." <452CAA0C.6030306@egenix.com> Message-ID: "M.-A. Lemburg" wrote: > > This is hard to believe. I've been in that business for a few > years and so far have not found an OS/hardware/network combination > with the mentioned features. Surely you must have - unless there is another M.-A. Lemburg in IT! Some of the specialist systems, especially those used for communication, were like that, and it is very likely that many still are. But they aren't currently in Python's domain. I have never used any, but have colleagues who have. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From nmm1 at cus.cam.ac.uk Thu Oct 12 15:25:26 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Thu, 12 Oct 2006 14:25:26 +0100 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: Your message of "Wed, 11 Oct 2006 09:46:39 PDT." <20061011084824.09C7.JCARLSON@uci.edu> Message-ID: Josiah Carlson wrote: > > It would be convenient, yes, but the question isn't always 'threads or > processes?' In my experience (not to say that it is more or better than > anyone else's), when going multi-process, the expense on some platforms > is significant enough to want to persist the process (this is counter to > my previous forking statement, but its all relative). And sometimes one > *wants* multiple threads running in a single process handling multiple > requests. Yes, indeed. This is all confused by the way that POSIX (and Microsoft) threads have become essentially just processes with shared resources. If one had a system with real, lightweight threads, the same might well not be so. 
Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From nmm1 at cus.cam.ac.uk Thu Oct 12 16:10:27 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Thu, 12 Oct 2006 15:10:27 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: Your message of "Wed, 13 Sep 2006 05:36:34 +0200." <45077CC2.9070601@v.loewis.de> Message-ID: Sorry. I was on holiday, and then buried this when sorting out my thousands of Emails on my return, partly because I had to look up the information! "Martin v. Löwis" wrote: > > >> | afaik the kernel only sends signals to threads that don't have them blocked. > >> | If python doesn't want anyone but the main thread to get signals, it > >> should just > >> | block signals on all but the main thread and then by nature, all > >> signals will go > >> | to the main thread.... > > > > Well, THAT'S wrong, I am afraid! Things ain't that simple :-(> > > Yes, POSIX implies that things work that way, but there are so many > > get-out clauses and problems with trying to implement that specification > > that such behaviour can't be relied on. > > Can you please give one example for each (one get-out clause, and > one problem with trying to implement that). http://www.opengroup.org/onlinepubs/009695399/toc.htm 2.4.1 Signal Generation and Delivery It is extremely unclear what that means, but it talks about the generation and delivery of signals to both threads and processes. I can tell you (from speaking to system developers) that they understand that to mean that they are allowed to send signals to specific threads when that is appropriate. But they are as confused by POSIX's verbiage as I am! > I fail to see why it isn't desirable to make all signals occur > in the main thread, on systems where this is possible. Oh, THAT's easy.
Consider a threaded application running on a multi-CPU machine and consider hardware generated signals (e.g. SIGFPE, SIGSEGV etc.) Sending them to the master thread involves either moving them between CPUs or moving the master thread; both are inefficient and neither may be possible. [ I have brought systems down with signals that did have to be handled on a particular CPU, by flooding that with signals from dozens of others (yes, big SMPs) and blocking out high-priority interrupts. The efficiency point can be serious. ] That also applies to many of the signals that do not reach programs, such as TLB misses, ECC failure etc. But, in those cases, what does Python or even POSIX need to know about them? Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From nmm1 at cus.cam.ac.uk Thu Oct 12 16:15:47 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Thu, 12 Oct 2006 15:15:47 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions Message-ID: "Martin v. Löwis" wrote: Michael Hudson schrieb: > > >> According to [1], all python needs to do to avoid this problem is > >> block all signals in all but the main thread; > > > > Argh, no: then people who call system() from non-main threads end up > > running subprocesses with all signals masked, which breaks other > > things in very mysterious ways. Been there... > > Python should register a pthread_atfork handler then, which clears > the signal mask. Would that not work? No. It's not the only such problem. Personally, I think that anyone who calls system(), fork(), spawn() or whatever from threads is cuckoo. It is precisely the sort of thing that is asking for trouble, because there are so many ways of doing it 'right' that you can't be sure exactly what mental model the system developers will have.
Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From Audun.Ostrem.Nordal at cern.ch Tue Oct 10 16:28:31 2006 From: Audun.Ostrem.Nordal at cern.ch (Audun Ostrem Nordal) Date: Tue, 10 Oct 2006 16:28:31 +0200 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: Message-ID: You may already know about a similar project a friend of mine (hi, Steffen!) did a few years ago called Python Object Sharing (POSH). This was however unix specific and relied on fork and SYSV IPC iirc. I see he has a SF project page here: http://poshmodule.sourceforge.net/ (doesn't seem to be a lot of activity there, though). Best regards __ Audun Ostrem Nordal tel: +41.22.76.74427 CERN IT/IS 1211 Geneve 23 Switzerland

> -----Original Message-----
> From: python-dev-bounces+audun=cern.ch at python.org
> [mailto:python-dev-bounces+audun=cern.ch at python.org] On
> Behalf Of Richard Oudkerk
> Sent: Monday, October 09, 2006 1:59 PM
> To: python-dev at python.org
> Subject: [Python-Dev] Cloning threading.py using proccesses
>
> I am not sure how sensible the idea is, but I have had a
> first stab at writing a module processing.py which is a near
> clone of threading.py but uses processes and sockets for
> communication. (It is one way of avoiding the GIL.)
>
> I have tested it on unix and windows and it seems to work pretty well.
> (Getting round the lack of os.fork on windows is a bit
> awkward.) There is also another module dummy_processing.py
> which has the same api but is just a wrapper round threading.py.
>
> Queues, Locks, RLocks, Conditions, Semaphores and some other
> shared objects are implemented.
>
> People are welcome to try out the tests in test_processing.py
> contained in the zipfile. More information is included in
> the README file.
>
> As a quick example, the code
>
> . from processing import Process, Queue, ObjectManager
> .
> . def f(token):
> .     q = proxy(token)
> .     for i in range(10):
> .         q.put(i*i)
> .     q.put('STOP')
> .
> . if __name__ == '__main__':
> .     manager = ObjectManager()
> .     token = manager.new(Queue)
> .     queue = proxy(token)
> .
> .     t = Process(target=f, args=[token])
> .     t.start()
> .
> .     result = None
> .     while result != 'STOP':
> .         result = queue.get()
> .         print result
> .
> .     t.join()
>
> is not very different from the normal threaded equivalent
>
> . from threading import Thread
> . from Queue import Queue
> .
> . def f(q):
> .     for i in range(10):
> .         q.put(i*i)
> .     q.put('STOP')
> .
> . if __name__ == '__main__':
> .     queue = Queue()
> .
> .     t = Thread(target=f, args=[queue])
> .     t.start()
> .
> .     result = None
> .     while result != 'STOP':
> .         result = queue.get()
> .         print result
> .
> .     t.join()
>
> Richard
>

From gregwillden at gmail.com Tue Oct 10 21:40:59 2006 From: gregwillden at gmail.com (Greg Willden) Date: Tue, 10 Oct 2006 14:40:59 -0500 Subject: [Python-Dev] ConfigParser: whitespace leading comment lines Message-ID: <903323ff0610101240p2f4e0a18g18d34d1a800624ec@mail.gmail.com> Hello all, I'd like to propose the following change to ConfigParser.py. I won't call it a bug-fix because I don't know the relevant standards. This change will enable multiline comments as follows:

[section]
item=value
    ;first of multiline comment
    ;second of multiline comment

Right now the behaviour is

In [19]: cfg.get('section','item')
Out[19]: 'value\n;second of multiline comment'

It's a one-line change. RawConfigParser._read lines 434-437

      # comment or blank line?
-     if line.strip() == '' or line[0] in '#;':
+     if line.strip() == '' or line.strip()[0] in '#;':
          continue

Regards, Greg Willden (Not a member of python-dev) -- Linux. Because rebooting is for adding hardware. -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20061010/76a9e204/attachment-0001.html From kristjan at ccpgames.com Wed Oct 11 13:29:00 2006 From: kristjan at ccpgames.com (Kristján V. Jónsson) Date: Wed, 11 Oct 2006 11:29:00 -0000 Subject: [Python-Dev] Python 2.5 performance Message-ID: <129CEF95A523704B9D46959C922A280002FE99F3@nemesis.central.ccp.cc> Hello there. I just got round to do some comparative runs of 2.5 32 bit Release, built with visual studio 2003 and 2005. Here the figures (pybench with default arguments)

.NET 2003:

Test minimum average operation overhead
-------------------------------------------------------------------------------
BuiltinFunctionCalls: 262ms 304ms 0.60us 0.251ms
BuiltinMethodLookup: 232ms 267ms 0.25us 0.312ms
CompareFloats: 148ms 170ms 0.14us 0.377ms
CompareFloatsIntegers: 183ms 216ms 0.24us 0.261ms
CompareIntegers: 144ms 163ms 0.09us 0.527ms
CompareInternedStrings: 157ms 186ms 0.12us 1.606ms
CompareLongs: 153ms 174ms 0.17us 0.300ms
CompareStrings: 156ms 198ms 0.20us 1.166ms
CompareUnicode: 180ms 205ms 0.27us 0.731ms
ConcatStrings: 410ms 457ms 0.91us 0.579ms
ConcatUnicode: 473ms 610ms 2.03us 0.466ms
CreateInstances: 248ms 290ms 2.59us 0.432ms
CreateNewInstances: 206ms 243ms 2.89us 0.352ms
CreateStringsWithConcat: 164ms 200ms 0.20us 0.971ms
CreateUnicodeWithConcat: 268ms 295ms 0.74us 0.343ms
DictCreation: 152ms 186ms 0.47us 0.358ms
DictWithFloatKeys: 378ms 410ms 0.46us 0.660ms
DictWithIntegerKeys: 133ms 161ms 0.13us 0.907ms
DictWithStringKeys: 152ms 184ms 0.15us 0.927ms
ForLoops: 125ms 133ms 5.32us 0.069ms
IfThenElse: 109ms 131ms 0.10us 1.019ms
ListSlicing: 193ms 223ms 15.90us 0.072ms
NestedForLoops: 147ms 164ms 0.11us 0.021ms
NormalClassAttribute: 176ms 195ms 0.16us 0.579ms
NormalInstanceAttribute: 171ms 198ms 0.17us 0.598ms
PythonFunctionCalls: 207ms 240ms 0.73us 0.326ms
PythonMethodCalls: 234ms 287ms 1.27us 0.163ms
Recursion: 294ms 328ms 6.56us 0.563ms
SecondImport: 191ms 210ms 2.10us 0.241ms
SecondPackageImport: 197ms 220ms 2.20us 0.217ms
SecondSubmoduleImport: 257ms 276ms 2.76us 0.213ms
SimpleComplexArithmetic: 191ms 208ms 0.24us 0.445ms
SimpleDictManipulation: 158ms 178ms 0.15us 0.625ms
SimpleFloatArithmetic: 183ms 211ms 0.16us 0.703ms
SimpleIntFloatArithmetic: 122ms 133ms 0.10us 0.745ms
SimpleIntegerArithmetic: 106ms 121ms 0.09us 0.680ms
SimpleListManipulation: 132ms 149ms 0.13us 0.750ms
SimpleLongArithmetic: 170ms 198ms 0.30us 0.322ms
SmallLists: 246ms 274ms 0.40us 0.437ms
SmallTuples: 204ms 235ms 0.43us 0.497ms
SpecialClassAttribute: 177ms 201ms 0.17us 0.561ms
SpecialInstanceAttribute: 257ms 290ms 0.24us 0.598ms
StringMappings: 881ms 949ms 3.77us 0.584ms
StringPredicates: 321ms 366ms 0.52us 3.207ms
StringSlicing: 243ms 286ms 0.51us 1.032ms
TryExcept: 87ms 110ms 0.05us 0.957ms
TryRaiseExcept: 164ms 197ms 3.08us 0.434ms
TupleSlicing: 195ms 230ms 0.88us 0.065ms
UnicodeMappings: 158ms 187ms 5.20us 0.699ms
UnicodePredicates: 191ms 233ms 0.43us 3.954ms
UnicodeProperties: 209ms 251ms 0.63us 3.234ms
UnicodeSlicing: 306ms 345ms 0.70us 0.933ms
-------------------------------------------------------------------------------
Totals: 11202ms 12875ms

.NET 2005:

Test minimum average operation overhead
-------------------------------------------------------------------------------
BuiltinFunctionCalls: 254ms 279ms 0.55us 0.280ms
BuiltinMethodLookup: 269ms 290ms 0.28us 0.327ms
CompareFloats: 136ms 147ms 0.12us 0.375ms
CompareFloatsIntegers: 158ms 178ms 0.20us 0.268ms
CompareIntegers: 118ms 141ms 0.08us 0.603ms
CompareInternedStrings: 152ms 203ms 0.14us 1.666ms
CompareLongs: 152ms 171ms 0.16us 0.335ms
CompareStrings: 118ms 140ms 0.14us 1.374ms
CompareUnicode: 160ms 180ms 0.24us 0.730ms
ConcatStrings: 430ms 472ms 0.94us 0.681ms
ConcatUnicode: 488ms 535ms 1.78us 0.458ms
CreateInstances: 249ms 286ms 2.56us 0.437ms
CreateNewInstances: 220ms 254ms 3.02us 0.356ms
CreateStringsWithConcat: 174ms 204ms 0.20us 1.123ms
CreateUnicodeWithConcat: 271ms 294ms 0.74us 0.348ms
DictCreation: 151ms 169ms 0.42us 0.365ms
DictWithFloatKeys: 350ms 387ms 0.43us 0.666ms
DictWithIntegerKeys: 140ms 151ms 0.13us 1.020ms
DictWithStringKeys: 154ms 176ms 0.15us 1.070ms
ForLoops: 96ms 111ms 4.42us 0.069ms
IfThenElse: 115ms 130ms 0.10us 0.697ms
ListSlicing: 221ms 261ms 18.66us 0.093ms
NestedForLoops: 146ms 167ms 0.11us 0.022ms
NormalClassAttribute: 182ms 205ms 0.17us 0.502ms
NormalInstanceAttribute: 174ms 192ms 0.16us 0.457ms
PythonFunctionCalls: 203ms 221ms 0.67us 0.337ms
PythonMethodCalls: 266ms 309ms 1.37us 0.149ms
Recursion: 286ms 329ms 6.57us 0.459ms
SecondImport: 170ms 197ms 1.97us 0.181ms
SecondPackageImport: 187ms 215ms 2.15us 0.178ms
SecondSubmoduleImport: 243ms 275ms 2.75us 0.215ms
SimpleComplexArithmetic: 177ms 199ms 0.23us 0.370ms
SimpleDictManipulation: 159ms 185ms 0.15us 0.498ms
SimpleFloatArithmetic: 177ms 196ms 0.15us 1.502ms
SimpleIntFloatArithmetic: 109ms 126ms 0.10us 0.574ms
SimpleIntegerArithmetic: 108ms 124ms 0.09us 0.611ms
SimpleListManipulation: 145ms 169ms 0.14us 0.619ms
SimpleLongArithmetic: 167ms 190ms 0.29us 0.324ms
SmallLists: 247ms 274ms 0.40us 0.339ms
SmallTuples: 204ms 224ms 0.42us 0.429ms
SpecialClassAttribute: 193ms 216ms 0.18us 0.558ms
SpecialInstanceAttribute: 255ms 280ms 0.23us 0.470ms
StringMappings: 297ms 321ms 1.28us 0.474ms
StringPredicates: 229ms 274ms 0.39us 3.892ms
StringSlicing: 238ms 258ms 0.46us 0.962ms
TryExcept: 86ms 102ms 0.05us 0.755ms
TryRaiseExcept: 155ms 173ms 2.70us 0.357ms
TupleSlicing: 188ms 217ms 0.83us 0.050ms
UnicodeMappings: 103ms 118ms 3.29us 0.595ms
UnicodePredicates: 176ms 207ms 0.38us 3.950ms
UnicodeProperties: 187ms 212ms 0.53us 3.228ms
UnicodeSlicing: 312ms 342ms 0.70us 0.834ms
-------------------------------------------------------------------------------
Totals: 10343ms 11677ms

This is an improvement of more than 7%.
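[Editor's note: the claimed percentages can be sanity-checked against the Totals lines (minimum column) of the runs quoted in this message.]

```python
# "Totals" minimum-column figures from the three pybench runs above.
vc2003, vc2005, vc2005_pgo = 11202, 10343, 9974

gain_2005 = (1 - vc2005 / vc2003) * 100       # ~7.7%: "more than 7%"
gain_pgo = (1 - vc2005_pgo / vc2005) * 100    # ~3.6%: "another 3.5 %"
gain_total = (1 - vc2005_pgo / vc2003) * 100  # ~11.0%: "more than 10%"
```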
In addition, here is a run of the PGO optimized .NET 2005:

Test minimum average operation overhead
-------------------------------------------------------------------------------
BuiltinFunctionCalls: 232ms 250ms 0.49us 0.330ms
BuiltinMethodLookup: 276ms 296ms 0.28us 0.382ms
CompareFloats: 130ms 142ms 0.12us 0.451ms
CompareFloatsIntegers: 150ms 166ms 0.18us 0.326ms
CompareIntegers: 130ms 155ms 0.09us 0.729ms
CompareInternedStrings: 152ms 197ms 0.13us 1.947ms
CompareLongs: 136ms 146ms 0.14us 0.390ms
CompareStrings: 151ms 174ms 0.17us 1.583ms
CompareUnicode: 131ms 167ms 0.22us 0.965ms
ConcatStrings: 417ms 485ms 0.97us 0.681ms
ConcatUnicode: 483ms 551ms 1.84us 0.484ms
CreateInstances: 224ms 252ms 2.25us 0.600ms
CreateNewInstances: 186ms 216ms 2.58us 0.407ms
CreateStringsWithConcat: 155ms 175ms 0.18us 1.264ms
CreateUnicodeWithConcat: 275ms 306ms 0.76us 0.437ms
DictCreation: 160ms 186ms 0.47us 0.443ms
DictWithFloatKeys: 349ms 375ms 0.42us 0.924ms
DictWithIntegerKeys: 143ms 173ms 0.14us 1.296ms
DictWithStringKeys: 157ms 177ms 0.15us 1.184ms
ForLoops: 140ms 155ms 6.21us 0.074ms
IfThenElse: 107ms 127ms 0.09us 0.955ms
ListSlicing: 217ms 256ms 18.29us 0.103ms
NestedForLoops: 166ms 194ms 0.13us 0.018ms
NormalClassAttribute: 163ms 179ms 0.15us 0.564ms
NormalInstanceAttribute: 151ms 169ms 0.14us 0.536ms
PythonFunctionCalls: 210ms 235ms 0.71us 0.313ms
PythonMethodCalls: 237ms 260ms 1.15us 0.167ms
Recursion: 285ms 334ms 6.68us 0.538ms
SecondImport: 147ms 169ms 1.69us 0.243ms
SecondPackageImport: 155ms 200ms 2.00us 0.215ms
SecondSubmoduleImport: 202ms 234ms 2.34us 0.203ms
SimpleComplexArithmetic: 162ms 187ms 0.21us 0.446ms
SimpleDictManipulation: 162ms 181ms 0.15us 0.627ms
SimpleFloatArithmetic: 171ms 201ms 0.15us 1.335ms
SimpleIntFloatArithmetic: 119ms 137ms 0.10us 0.659ms
SimpleIntegerArithmetic: 114ms 128ms 0.10us 0.668ms
SimpleListManipulation: 145ms 161ms 0.14us 0.764ms
SimpleLongArithmetic: 161ms 178ms 0.27us 0.423ms
SmallLists: 234ms 271ms 0.40us 0.454ms
SmallTuples: 182ms 203ms 0.38us 0.497ms
SpecialClassAttribute: 174ms 201ms 0.17us 0.716ms
SpecialInstanceAttribute: 230ms 252ms 0.21us 0.558ms
StringMappings: 285ms 313ms 1.24us 0.514ms
StringPredicates: 233ms 275ms 0.39us 3.475ms
StringSlicing: 225ms 242ms 0.43us 1.037ms
TryExcept: 78ms 89ms 0.04us 0.961ms
TryRaiseExcept: 133ms 156ms 2.44us 0.454ms
TupleSlicing: 186ms 202ms 0.77us 0.078ms
UnicodeMappings: 103ms 118ms 3.29us 0.520ms
UnicodePredicates: 186ms 216ms 0.40us 3.414ms
UnicodeProperties: 180ms 214ms 0.54us 2.530ms
UnicodeSlicing: 299ms 318ms 0.65us 0.815ms
-------------------------------------------------------------------------------
Totals: 9974ms 11345ms

This is an improvement of another 3.5 %. In all, we have a performance increase of more than 10%. Granted, this is from a single set of runs, but I think we should start considering to make PCBuild8 a "supported" build. Cheers, Kristján -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061011/49a0c38e/attachment-0001.htm From r.m.oudkerk at googlemail.com Wed Oct 11 17:14:27 2006 From: r.m.oudkerk at googlemail.com (Richard Oudkerk) Date: Wed, 11 Oct 2006 16:14:27 +0100 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: References: <20061010084306.09AE.JCARLSON@uci.edu> Message-ID:

On 10/10/06, Fredrik Lundh wrote:
> Josiah Carlson wrote:
>
> > Presumably with this library you have created, you have also written a
> > fast object encoder/decoder (like marshal or pickle). If it isn't any
> > faster than cPickle or marshal, then users may bypass the module and opt
> > for fork/etc. + XML-RPC
>
> XML-RPC isn't close to marshal and cPickle in performance, though, so
> that statement is a bit misleading.
>
> the really interesting thing here is a ready-made threading-style API, I
> think. reimplementing queues, locks, and semaphores can be a reasonable
> amount of work; might as well use an existing implementation.
> The module uses cPickle. As for speed, on my old laptop I get maybe 1300 objects through a queue a second. For many purposes this might be too slow, in which cases you are better off sticking to threading; for many other cases that should not be a problem. It should be quite possible to connect to an ObjectServer on a different machine, though I have not tried it. Although I reuse Queue, I wrote locks, semaphores and conditions from scratch -- I could not see a sensible way to use the original implementations. (The implementations of those classes are actually quite a bit shorter than the ones in threading.py.) By the way, on windows the example files currently need to be executed from commandline rather than clicked on (but that is easily fixable). From r.m.oudkerk at googlemail.com Wed Oct 11 17:20:44 2006 From: r.m.oudkerk at googlemail.com (Richard Oudkerk) Date: Wed, 11 Oct 2006 16:20:44 +0100 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: <20061010130901.09B1.JCARLSON@uci.edu> References: <20061010084306.09AE.JCARLSON@uci.edu> <20061010130901.09B1.JCARLSON@uci.edu> Message-ID: On 10/10/06, Josiah Carlson wrote: > > the really interesting thing here is a ready-made threading-style API, I > > think. reimplementing queues, locks, and semaphores can be a reasonable > > amount of work; might as well use an existing implementation. > > Really, it is a matter of asking what kind of API is desirable. Do we > want to have threading plus other stuff be the style of API that we want > to replicate? Do we want to have shared queue objects, or would an > XML-RPC-esque remote.queue_put('queue_X', value) and > remote.queue_get('queue_X', blocking=1) be better? Whatever the API is, I think it is useful if you can swap between threads and processes just by changing the import line. That way you can write applications without deciding upfront which to use.
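[Editor's illustration: the "swap by changing the import line" idea can be made concrete by writing the worker against the common threading-style API and injecting the concrete classes. This sketch uses the standard library's threading.Thread and queue.Queue (modern names); Richard's Process and proxied Queue would slot into the same place.]

```python
import queue
import threading


def run_with(task_cls, queue_cls):
    # Code written against the shared threading-style API; which concrete
    # classes get passed in decides whether threads or processes are used.
    q = queue_cls()

    def f():
        for i in range(10):
            q.put(i * i)
        q.put('STOP')

    t = task_cls(target=f)
    t.start()
    results = []
    while True:
        item = q.get()
        if item == 'STOP':
            break
        results.append(item)
    t.join()
    return results


print(run_with(threading.Thread, queue.Queue))  # squares 0, 1, 4, ..., 81
```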
From python at rcn.com Thu Oct 12 17:12:43 2006 From: python at rcn.com (Raymond Hettinger) Date: Thu, 12 Oct 2006 08:12:43 -0700 Subject: [Python-Dev] Python 2.5 performance References: <129CEF95A523704B9D46959C922A280002FE99F3@nemesis.central.ccp.cc> Message-ID: <00df01c6ee10$ded9c920$ea146b0a@RaymondLaptop1> > From: Kristján V. Jónsson > I think we should start considering to make PCBuild8 a "supported" build. +1 and not just for the free speed-up. VC8 is what more and more Windows developers will have on their machines. Without a supported build, it becomes much harder to make patches or build compatible extensions. Raymond From snaury at gmail.com Thu Oct 12 17:15:07 2006 From: snaury at gmail.com (Alexey Borzenkov) Date: Thu, 12 Oct 2006 19:15:07 +0400 Subject: [Python-Dev] Why spawnvp not implemented on Windows? Message-ID: Hi all, I've been looking at python 2.5 today and what I noticed is the absence of spawnvp with this comment in os.py:

# At the moment, Windows doesn't implement spawnvp[e],
# so it won't have spawnlp[e] either.

I'm wondering, why so? Searching MSDN I can see that these functions are implemented in CRT: spawnvp: http://msdn2.microsoft.com/en-us/library/275khfab.aspx spawnvpe: http://msdn2.microsoft.com/en-us/library/h565xwht.aspx I can also see that spawnvp and spawnvpe are currently wrapped in posixmodule.c, but for some reason on OS/2 only. Forgive me if I'm wrong but shouldn't it work when #if defined(PYOS_OS2) is changed to #if defined(PYOS_OS2) || defined(MS_WINDOWS) around spawnvp and spawnvpe wrappers and in posix_methods? At least when I did it with my copy, nt.spawnvp seems to work fine...
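[Editor's illustration: pending such a change in posixmodule.c, the PATH-searching behaviour of the CRT's spawnvp can be emulated in pure Python on top of os.spawnv. spawnvp_emulated is a hypothetical helper for illustration, not a stdlib function.]

```python
import os


def spawnvp_emulated(mode, file, args):
    # Emulate spawnvp: if 'file' has a directory component, spawn it
    # directly; otherwise search each PATH entry for an executable match.
    if os.path.dirname(file):
        return os.spawnv(mode, file, args)
    for d in os.environ.get("PATH", "").split(os.pathsep):
        candidate = os.path.join(d, file)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return os.spawnv(mode, candidate, args)
    raise OSError("%s not found on PATH" % file)
```

With os.P_WAIT the call blocks and returns the child's exit status, mirroring the CRT semantics Alexey points at.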
From barry at python.org Thu Oct 12 17:36:37 2006 From: barry at python.org (Barry Warsaw) Date: Thu, 12 Oct 2006 11:36:37 -0400 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <200610121808.47010.anthony@interlink.com.au> References: <200610121808.47010.anthony@interlink.com.au> Message-ID: <2514DA1C-F5A1-4144-9068-006A933C516C@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Oct 12, 2006, at 4:08 AM, Anthony Baxter wrote: > I've had a couple of queries about whether PSF-2006-001 merits a > 2.3.6. > Personally, I lean towards "no" - 2.4 was nearly two years ago now. > But I'm > open to other opinions - I guess people see the phrase "buffer > overrun" and > they get scared. > > Plus once 2.4.4 final is out next week, I'll have cut 12 releases > since > March. Assuming a 2.5.1 before March (very likely) that'll be 14 > releases > in 12 months. 16 releases in 12 months would just about make me go > crazy. I've offered in the past to dust off my release manager cap and do a 2.3.6 release. Having not done one in a long while, the most daunting part for me is getting the website updated, since I have none of those tools installed. I'm still willing to do a 2.3.6, though the last time this came up the response was too underwhelming to care. I'm not sure this advisory is enough to change people's minds about that -- I'm sure any affected downstream distro is fully capable of patching and re-releasing their own packages. Since this doesn't affect the binaries /we/ release, I'm not sure I care enough either. 
- -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRS5hD3EjvBPtnXfVAQIlLgP/Rz5ahaeus0VLJT0HmyZUYBf07Crr2e1K KgCoEDqXZq+LyF7B8bqokXZ4uFisBbQTREM3d+8vYEHC9kcQpt0FurkSFc47G0gj rJvm0XbGkhXFGdPqrTwUoT033f/bhabpEILDkNJx6bB+Jk5G23EyTKRRDB531QvY qC6ttgGRfVA= =dECg -----END PGP SIGNATURE----- From tjreedy at udel.edu Thu Oct 12 19:34:09 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 12 Oct 2006 13:34:09 -0400 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun References: <200610121808.47010.anthony@interlink.com.au> <2514DA1C-F5A1-4144-9068-006A933C516C@python.org> Message-ID: "Barry Warsaw" wrote in message news:2514DA1C-F5A1-4144-9068-006A933C516C at python.org... > -----BEGIN PGP SIGNED MESSAGE----- > I've offered in the past to dust off my release manager cap and do a > 2.3.6 release. Having not done one in a long while, the most > daunting part for me is getting the website updated, since I have > none of those tools installed. > > I'm still willing to do a 2.3.6, though the last time this came up > the response was too underwhelming to care. I'm not sure this > advisory is enough to change people's minds about that -- I'm sure > any affected downstream distro is fully capable of patching and re- > releasing their own packages. Since this doesn't affect the > binaries /we/ release, I'm not sure I care enough either. Perhaps all that is needed from both a practical and public relations viewpoint is the release of a 2.3.5U4 security patch as a separate file listed just after 2.3.5 on the source downloads page (if this has not been done already). Add a note (or link to a note) to the effect that it should be applied if one has or is going to compile a wide Unicode build for use in an environment exposed to untrusted Unicode text. 
tjr From barry at python.org Thu Oct 12 19:55:17 2006 From: barry at python.org (Barry Warsaw) Date: Thu, 12 Oct 2006 13:55:17 -0400 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <2514DA1C-F5A1-4144-9068-006A933C516C@python.org> Message-ID: <00467286-2218-460F-9B46-54A59F9CC312@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Oct 12, 2006, at 1:34 PM, Terry Reedy wrote: > Perhaps all that is needed from both a practical and public relations > viewpoint is the release of a 2.3.5U4 security patch as a separate > file > listed just after 2.3.5 on the source downloads page (if this has > not been > done already). I don't currently have the ability to update the website, but I think the download page should have a big red star that points to the security patch. The 2.3.5 page should probably be updated with a link to the patch too. - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRS6BjnEjvBPtnXfVAQKssAQAnrgoMFbuAQRFSAReCYLBovsXJK481NdB gTk/gaAAXe15Ko+HN0gr1EF7Mpd9a8h+5UaWyiQo+2dEJFPYr8LKcLhVLRO75jwK A7oeXl859cUjwVK1Lc6uR/gFXUIhCsd8kujKb3lE71K6ygVtcqHwxr4OcMlMe/+j YExPu6zELjk= =NcuJ -----END PGP SIGNATURE----- From snaury at gmail.com Thu Oct 12 20:32:23 2006 From: snaury at gmail.com (Alexey Borzenkov) Date: Thu, 12 Oct 2006 22:32:23 +0400 Subject: [Python-Dev] Why spawnvp not implemented on Windows? In-Reply-To: References: Message-ID: On 10/12/06, Alexey Borzenkov wrote: > At least when I did it with my copy, nt.spawnvp seems to work fine... Hi everyone again. I've created patch for spawn*p*, as well as for exec*p* against trunk, so that when possible it uses crt's execvp[e] (defined via HAVE_EXECVP, if there are other platforms that have it they will need to define HAVE_EXECVP and HAVE_SPAWNVP). 
Fix is in os.py and posixmodule.c: http://snaury.googlepages.com/python-win32-spawn_p_.patch Should I submit it to sourceforge as a patch, or someone can review it as is? From aahz at pythoncraft.com Thu Oct 12 20:39:41 2006 From: aahz at pythoncraft.com (Aahz) Date: Thu, 12 Oct 2006 11:39:41 -0700 Subject: [Python-Dev] Why spawnvp not implemented on Windows? In-Reply-To: References: Message-ID: <20061012183940.GA13499@panix.com> On Thu, Oct 12, 2006, Alexey Borzenkov wrote: > > Should I submit it to sourceforge as a patch, or someone can review it as is? Always submit patches; that guarantees your work won't get lost. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "If you don't know what your program is supposed to do, you'd better not start writing it." --Dijkstra From anthony at interlink.com.au Thu Oct 12 21:27:59 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri, 13 Oct 2006 05:27:59 +1000 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> Message-ID: <200610130528.01672.anthony@interlink.com.au> On Thursday 12 October 2006 18:18, Fredrik Lundh wrote: > Anthony Baxter wrote: > > 16 releases in 12 months would just about make me go crazy. > > is there any way we could further automate or otherwise streamline or > distribute the release process ? It's already pretty heavily automated (see welease.py in the SVN sandbox). The killer problem is pyramid (the system for the website). Here's (roughly) a breakdown of the workload: - Update the 10 or so files that need the date and version number (about 3m) - Run welease.py, select the branch, enter the version number, press 4 buttons, one after the other. It complains and stops if something goes wrong. 
(elapsed time about 5-10m, actual "work" time < 30s) - Wait for the Mac/Win/Doc builders (elapsed, 6-12h, depending on timezones, actual "work" time 0s) - Sign binaries and put in place on website (maybe 2m work, plus 5-10m to scp up to dinsdale) - Update webpages (between 30m and an hour, depending on how much I have to fight with pyramid. I still need to go update the old release pages putting the warnings on them, so there's probably another hour of work today) I've mentioned this on pydotorg enough times, I don't feel I can continue to complain about it (because I can't offer the time to make it better) but pyramid is *not* *good* from my point of view. The older system with Makefiles, ht2html and rsync took maybe 1/4 to 1/3 as long. > ideally, releasing (earlier release + well-defined patch set) should be > fairly trivial, compared to releasing (new release from trunk). what do > we have to do to make it easier to handle that case? Mostly it is easy for me, with the one huge caveat. As far as I know, the Mac build is a single command to run for Ronald, and the Doc build similarly for Fred. I don't know what Martin has to do for the Windows build. -- Anthony Baxter It's never too late to have a happy childhood. From barry at python.org Thu Oct 12 22:25:55 2006 From: barry at python.org (Barry Warsaw) Date: Thu, 12 Oct 2006 16:25:55 -0400 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <200610130528.01672.anthony@interlink.com.au> References: <200610121808.47010.anthony@interlink.com.au> <200610130528.01672.anthony@interlink.com.au> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Oct 12, 2006, at 3:27 PM, Anthony Baxter wrote: > Mostly it is easy for me, with the one huge caveat. As far as I > know, the Mac > build is a single command to run for Ronald, and the Doc build > similarly for > Fred. I don't know what Martin has to do for the Windows build. Why can't we get buildbot to do most or all of this? 
At work, we have buildbot slaves that post installers to a share after successful checkout, build, and test on all our supported platforms. - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRS6k2HEjvBPtnXfVAQJeawP7BTqVw7tN80h5lB5UZp4MuN2Q/3KWapIi lYZeBqoaiouXKIkKsHbCVb1/OeZQRnDwEqWPu0xKfzlteYUchmDh2h53nzfynyyS PdJ5FaKcAk0LBjR0JsSZKd6TEWxKZZHs04V2LiKZpmsICG8g7uH954wleyGLTl2h 7VZ1aVxGuko= =1Ito -----END PGP SIGNATURE----- From rasky at develer.com Thu Oct 12 22:29:46 2006 From: rasky at develer.com (Giovanni Bajo) Date: Thu, 12 Oct 2006 22:29:46 +0200 Subject: [Python-Dev] Python 2.5 performance References: <129CEF95A523704B9D46959C922A280002FE99F3@nemesis.central.ccp.cc> Message-ID: <11ac01c6ee3d$2830a450$e303030a@trilan> Kristj?n V. J?nsson wrote: > This is an improvement of another 3.5 %. > In all, we have a performance increase of more than 10%. > Granted, this is from a single set of runs, but I think we should > start considering to make PCBuild8 a "supported" build. Kristj?n, I wonder if the performance improvement comes from ceval.c only (or maybe a few other selected files). Is it possible to somehow link the PGO-optimized ceval.obj into the VS2003 project? -- Giovanni Bajo From ronaldoussoren at mac.com Thu Oct 12 22:38:28 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Thu, 12 Oct 2006 22:38:28 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <200610130528.01672.anthony@interlink.com.au> Message-ID: <0EFD836B-CB42-4F2D-8B82-883758487D87@mac.com> On Oct 12, 2006, at 10:25 PM, Barry Warsaw wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On Oct 12, 2006, at 3:27 PM, Anthony Baxter wrote: > >> Mostly it is easy for me, with the one huge caveat. As far as I >> know, the Mac >> build is a single command to run for Ronald, and the Doc build >> similarly for >> Fred. I don't know what Martin has to do for the Windows build. 
> > Why can't we get buildbot to do most or all of this? At work, we > have buildbot slaves that post installers to a share after successful > checkout, build, and test on all our supported platforms. The windows build is a single command, but I test the output on 3 different platforms (10.3/ppc, 10.4/ppc and 10.4/x86). If buildbot supports such a configuration I'd be very interested (and not just for Python itself). Ronald -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3562 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061012/85d9353c/attachment.bin From anthony at interlink.com.au Thu Oct 12 22:43:40 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri, 13 Oct 2006 06:43:40 +1000 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <200610130528.01672.anthony@interlink.com.au> Message-ID: <200610130643.46552.anthony@interlink.com.au> On Friday 13 October 2006 06:25, Barry Warsaw wrote: > On Oct 12, 2006, at 3:27 PM, Anthony Baxter wrote: > > Mostly it is easy for me, with the one huge caveat. As far as I > > know, the Mac > > build is a single command to run for Ronald, and the Doc build > > similarly for > > Fred. I don't know what Martin has to do for the Windows build. > > Why can't we get buildbot to do most or all of this? At work, we > have buildbot slaves that post installers to a share after successful > checkout, build, and test on all our supported platforms. Speaking for myself, I'd rather do it by hand, if it's not a lot of work (which it isn't) - I don't like the idea of "official" releases just being an automated thing. If you're instead just talking about daily builds, maybe, but we'd need to have some new way to do versioning for these. -- Anthony Baxter It's never too late to have a happy childhood. 
From martin at v.loewis.de Thu Oct 12 22:49:32 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 12 Oct 2006 22:49:32 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <200610130528.01672.anthony@interlink.com.au> Message-ID: <452EAA5C.9090806@v.loewis.de> Barry Warsaw schrieb: > Why can't we get buildbot to do most or all of this? Very easy. Because somebody has to set it up. I estimate a man month or so before it works. Regards, Martin From martin at v.loewis.de Thu Oct 12 22:50:54 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 12 Oct 2006 22:50:54 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <200610130528.01672.anthony@interlink.com.au> References: <200610121808.47010.anthony@interlink.com.au> <200610130528.01672.anthony@interlink.com.au> Message-ID: <452EAAAE.2050200@v.loewis.de> Anthony Baxter schrieb: > Mostly it is easy for me, with the one huge caveat. As far as I know, the Mac > build is a single command to run for Ronald, and the Doc build similarly for > Fred. I don't know what Martin has to do for the Windows build. Actually, for 2.3.x, I wouldn't do the Windows builds. I think Thomas Heller did the 2.3.x series. Regards, Martin From g.brandl at gmx.net Thu Oct 12 21:30:49 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 12 Oct 2006 21:30:49 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <2514DA1C-F5A1-4144-9068-006A933C516C@python.org> References: <200610121808.47010.anthony@interlink.com.au> <2514DA1C-F5A1-4144-9068-006A933C516C@python.org> Message-ID: Barry Warsaw wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On Oct 12, 2006, at 4:08 AM, Anthony Baxter wrote: > >> I've had a couple of queries about whether PSF-2006-001 merits a >> 2.3.6. 
>> Personally, I lean towards "no" - 2.4 was nearly two years ago now. >> But I'm >> open to other opinions - I guess people see the phrase "buffer >> overrun" and >> they get scared. >> >> Plus once 2.4.4 final is out next week, I'll have cut 12 releases >> since >> March. Assuming a 2.5.1 before March (very likely) that'll be 14 >> releases >> in 12 months. 16 releases in 12 months would just about make me go >> crazy. > > I've offered in the past to dust off my release manager cap and do a > 2.3.6 release. Having not done one in a long while, the most > daunting part for me is getting the website updated, since I have > none of those tools installed. I'm I the only one who feels that the website is a big workflow problem? Georg From martin at v.loewis.de Thu Oct 12 22:57:38 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 12 Oct 2006 22:57:38 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> Message-ID: <452EAC42.5040703@v.loewis.de> Fredrik Lundh schrieb: > ideally, releasing (earlier release + well-defined patch set) should be > fairly trivial, compared to releasing (new release from trunk). what do > we have to do to make it easier to handle that case? For the Windows release, I doubt there is much one can do. The time-consuming part is to run the MSI file, on three different architectures, and in various combinations (admin/no-admin, default directory/Program Files, upgrade/no-upgrade). I don't always do all of them, but still it takes a while; I usually need an hour to make a release. Plus, sometimes something goes wrong: there might a backport that doesn't work on Windows, or it might be that I broke my build environment somehow (which I normally keep across releases - if I have to start from scratch on a fresh machine, it takes much longer: a day or so). 
Regards, Martin From martin at v.loewis.de Thu Oct 12 23:00:09 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 12 Oct 2006 23:00:09 +0200 Subject: [Python-Dev] Python 2.5 performance In-Reply-To: <129CEF95A523704B9D46959C922A280002FE99F3@nemesis.central.ccp.cc> References: <129CEF95A523704B9D46959C922A280002FE99F3@nemesis.central.ccp.cc> Message-ID: <452EACD9.9090001@v.loewis.de> Kristj?n V. J?nsson schrieb: > This is an improvement of another 3.5 %. > In all, we have a performance increase of more than 10%. > Granted, this is from a single set of runs, but I think we should start > considering to make PCBuild8 a "supported" build. What do you mean by that? That Python 2.5.1 should be compiled with VC 2005? Something else (if so, what)? Regards, Martin From greg at electricrain.com Thu Oct 12 23:03:10 2006 From: greg at electricrain.com (Gregory P. Smith) Date: Thu, 12 Oct 2006 14:03:10 -0700 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <200610130643.46552.anthony@interlink.com.au> References: <200610121808.47010.anthony@interlink.com.au> <200610130528.01672.anthony@interlink.com.au> <200610130643.46552.anthony@interlink.com.au> Message-ID: <20061012210310.GC6015@zot.electricrain.com> On Fri, Oct 13, 2006 at 06:43:40AM +1000, Anthony Baxter wrote: > On Friday 13 October 2006 06:25, Barry Warsaw wrote: > > On Oct 12, 2006, at 3:27 PM, Anthony Baxter wrote: > > > Mostly it is easy for me, with the one huge caveat. As far as I > > > know, the Mac > > > build is a single command to run for Ronald, and the Doc build > > > similarly for > > > Fred. I don't know what Martin has to do for the Windows build. > > > > Why can't we get buildbot to do most or all of this? At work, we > > have buildbot slaves that post installers to a share after successful > > checkout, build, and test on all our supported platforms. 
> > Speaking for myself, I'd rather do it by hand, if it's not a lot of work > (which it isn't) - I don't like the idea of "official" releases just being > an automated thing. IMHO thats a backwards view; I'm with Barry. Requiring human intervention to do anything other than press the big green "go" button to launch the "official" release build process is an opportunity for human error. the same goes for testing the built release installers and tarballs. three macs with some virtual machines could take care of this (damn apple for not allowing their stupid OS to be virtualized). that said, i'm not volunteering to setup an automated system for this but i've got good ideas how to do it if i ever find time or someone wants to chat offline. :( as for buildbot, i haven't looked at its design but from the chatter i've seen i was under the impression that it operates on a continually updated sandbox rather than a 100% fresh checkout for each build? if thats true (is it?) i'd prefer to see a build system setup to do a fresh checkout+build of everything (including externals) in a new directory for each build in use. thats what we do at work. none of the above even considers the web site updating problem.. greg From greg at electricrain.com Thu Oct 12 23:04:31 2006 From: greg at electricrain.com (Gregory P. Smith) Date: Thu, 12 Oct 2006 14:04:31 -0700 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <2514DA1C-F5A1-4144-9068-006A933C516C@python.org> Message-ID: <20061012210431.GD6015@zot.electricrain.com> On Thu, Oct 12, 2006 at 09:30:49PM +0200, Georg Brandl wrote: > Barry Warsaw wrote: > > I've offered in the past to dust off my release manager cap and do a > > 2.3.6 release. Having not done one in a long while, the most > > daunting part for me is getting the website updated, since I have > > none of those tools installed. 
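[The fresh-checkout-per-build policy Greg describes is easy to sketch in outline. The checkout and build steps below are placeholders (callables), not real svn or make invocations, and the function name is hypothetical:]

```python
import os
import tempfile
import time

def fresh_build(checkout, steps):
    """Run a build the way Greg suggests: a brand-new timestamped
    directory and a pristine checkout for every build, never an
    incremental update of an existing sandbox."""
    workdir = tempfile.mkdtemp(prefix=time.strftime("build-%Y%m%d%H%M%S-"))
    source = os.path.join(workdir, "source")
    checkout(source)        # e.g. 'svn checkout <url> source', externals included
    for step in steps:      # e.g. configure, make, make test, package
        step(source)
    return source
```

[Because `workdir` is freshly created on every call, no state from a previous build can leak into the next one, which is the property Greg is after.]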
> > I'm I the only one who feels that the website is a big workflow problem? nope, you're not. From martin at v.loewis.de Thu Oct 12 23:04:52 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 12 Oct 2006 23:04:52 +0200 Subject: [Python-Dev] Why spawnvp not implemented on Windows? In-Reply-To: References: Message-ID: <452EADF4.9010800@v.loewis.de> Alexey Borzenkov schrieb: > Should I submit it to sourceforge as a patch, or someone can review it as is? Please consider also exposing _wspawnvp, depending on whether path argument is a Unicode object or not. See PEP 277 for guidance. Since this would go into 2.6, support for Windows 95 isn't mandatory. Regards, Martin From greg at electricrain.com Thu Oct 12 23:07:44 2006 From: greg at electricrain.com (Gregory P. Smith) Date: Thu, 12 Oct 2006 14:07:44 -0700 Subject: [Python-Dev] Python 2.5 performance In-Reply-To: <452EACD9.9090001@v.loewis.de> References: <129CEF95A523704B9D46959C922A280002FE99F3@nemesis.central.ccp.cc> <452EACD9.9090001@v.loewis.de> Message-ID: <20061012210744.GE6015@zot.electricrain.com> On Thu, Oct 12, 2006 at 11:00:09PM +0200, "Martin v. L?wis" wrote: > Kristj?n V. J?nsson schrieb: > > This is an improvement of another 3.5 %. > > In all, we have a performance increase of more than 10%. > > Granted, this is from a single set of runs, but I think we should start > > considering to make PCBuild8 a "supported" build. > > What do you mean by that? That Python 2.5.1 should be compiled with > VC 2005? Something else (if so, what)? i read that as just suggesting that updates should be checked into the release25-maint tree to get PCBuild8 working out of the box for anyone who wants to build python from source with vs2005. Since 2.5 has already shipped built with vs2003 all of the 2.5.x releases should continue to use that so that third party binary modules continue to work across 2.5.x versions. 
-g From martin at v.loewis.de Thu Oct 12 23:07:56 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 12 Oct 2006 23:07:56 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <20061012210310.GC6015@zot.electricrain.com> References: <200610121808.47010.anthony@interlink.com.au> <200610130528.01672.anthony@interlink.com.au> <200610130643.46552.anthony@interlink.com.au> <20061012210310.GC6015@zot.electricrain.com> Message-ID: <452EAEAC.1020007@v.loewis.de> Gregory P. Smith schrieb: > three macs with some virtual machines could take care of this (damn > apple for not allowing their stupid OS to be virtualized). that said, > i'm not volunteering to setup an automated system for this but i've > got good ideas how to do it if i ever find time or someone wants to > chat offline. :( Of course, that makes the idea die here and now. Without volunteers to do the actual work, it just won't happen. > as for buildbot, i haven't looked at its design but from the chatter > i've seen i was under the impression that it operates on a continually > updated sandbox rather than a 100% fresh checkout for each build? if > thats true (is it?) i'd prefer to see a build system setup to do a > fresh checkout+build of everything (including externals) in a new > directory for each build in use. thats what we do at work. Buildbot could do that easily; in fact, I had to explicitly configure it to not start from scratch each time, to reduce the network traffic of the donated machines. Regards, Martin From theller at python.net Thu Oct 12 22:15:05 2006 From: theller at python.net (Thomas Heller) Date: Thu, 12 Oct 2006 22:15:05 +0200 Subject: [Python-Dev] Exceptions and slicing In-Reply-To: References: <45119D6C.2050005@v.loewis.de> Message-ID: Thomas Heller schrieb: > Martin v. L?wis schrieb: >> Thomas Heller schrieb: >>> 1. The __str__ of a WindowsError instance hides the 'real' windows >>> error number. 
So, in 2.4 "print error_instance" would print >>> for example: >>> >>> [Errno 1002] Das Fenster kann die gesendete Nachricht nicht verarbeiten. >>> >>> while in 2.5: >>> >>> [Error 22] Das Fenster kann die gesendete Nachricht nicht verarbeiten. >> >> That's a bug. I changed the string deliberately from Errno to error to >> indicate that it is not an errno, but a GetLastError. Can you come up >> with a patch? > > Yes, but not today. I submitted a patch for this issue: http://python.org/sf/1576174 Thomas From anthony at interlink.com.au Thu Oct 12 23:12:49 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri, 13 Oct 2006 07:12:49 +1000 Subject: [Python-Dev] Python 2.5 performance In-Reply-To: <452EACD9.9090001@v.loewis.de> References: <129CEF95A523704B9D46959C922A280002FE99F3@nemesis.central.ccp.cc> <452EACD9.9090001@v.loewis.de> Message-ID: <200610130712.51127.anthony@interlink.com.au> On Friday 13 October 2006 07:00, Martin v. L?wis wrote: > Kristj?n V. J?nsson schrieb: > > This is an improvement of another 3.5 %. > > In all, we have a performance increase of more than 10%. > > Granted, this is from a single set of runs, but I think we should start > > considering to make PCBuild8 a "supported" build. > > What do you mean by that? That Python 2.5.1 should be compiled with > VC 2005? Something else (if so, what)? I don't think we should switch the "official" compiler for a point release. I'm happy to say something like "we make the PCbuild8 environment a supported compiler", which means we need, at a bare minimum, a buildbot slave for that compiler/platform. Kristj?n, is this something you can offer? Without a buildbot for that compiler, I don't think we can claim it's supported. There's plenty of platforms we "support" which don't have buildslaves, but they're all variants of Unix - I'm happy that they are all mostly[1] sane. Anthony [1] Offer void on some versions of HP/UX, Irix, AIX -- Anthony Baxter It's never too late to have a happy childhood. 
From anthony at interlink.com.au Thu Oct 12 23:13:58 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri, 13 Oct 2006 07:13:58 +1000 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <2514DA1C-F5A1-4144-9068-006A933C516C@python.org> Message-ID: <200610130714.00673.anthony@interlink.com.au> On Friday 13 October 2006 05:30, Georg Brandl wrote: > I'm I the only one who feels that the website is a big workflow problem? Assuming you meant "Am I", then I absolutely agree with you. -- Anthony Baxter It's never too late to have a happy childhood. From barry at python.org Thu Oct 12 23:34:37 2006 From: barry at python.org (Barry Warsaw) Date: Thu, 12 Oct 2006 17:34:37 -0400 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <20061012210310.GC6015@zot.electricrain.com> References: <200610121808.47010.anthony@interlink.com.au> <200610130528.01672.anthony@interlink.com.au> <200610130643.46552.anthony@interlink.com.au> <20061012210310.GC6015@zot.electricrain.com> Message-ID: <1C323968-BA50-4D36-B5E2-5B4B10306627@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Oct 12, 2006, at 5:03 PM, Gregory P. Smith wrote: > IMHO thats a backwards view; I'm with Barry. Requiring human > intervention to do anything other than press the big green "go" button > to launch the "official" release build process is an opportunity for > human error. the same goes for testing the built release installers > and tarballs. Oh yes, that's an important step I forgot to mention. At work, we also run automated tests of the built installers, so we have a high degree of confidence that what our buildbot farm produces at least passes the sniff test (/then/ our QA dept takes over from there). The files we upload then are named by product, platform, version, revision id, and date. 
It takes a manual step to delete old builds, but we have big disks so we generally don't do that except for EOL'd versions. The nice thing about that is that you can go back to almost any build and pull down a working installer. Greg hints at a major benefit of this: the knowledge for how to successfully build products is contained in scripts that are themselves revision controlled. A wiki page providing an overview and the starting points are still needed but rarely consulted. > i'm not volunteering to setup an automated system for this but i've > got good ideas how to do it if i ever find time or someone wants to > chat offline. :( I wish I had the cycles to volunteer to help out implementing this. :( - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRS608nEjvBPtnXfVAQIypAQAtiantWJkvStPYR8tnl+AU+HzI7bZ54s1 oX8Ni0/1IbZQwYloV6UMmhwisirZ5bwAtNWfZnd3UQXFhrCC1MGlRMOWP/y6AwS2 /gSzUV9A1dxUE9iVdPy50gEMFrzrZ32g16+FsHzae/9FgklB+GjogAuYmr2vbxd4 SrB1dgEHnXg= =6rIv -----END PGP SIGNATURE----- From barry at python.org Thu Oct 12 23:38:26 2006 From: barry at python.org (Barry Warsaw) Date: Thu, 12 Oct 2006 17:38:26 -0400 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <452EAEAC.1020007@v.loewis.de> References: <200610121808.47010.anthony@interlink.com.au> <200610130528.01672.anthony@interlink.com.au> <200610130643.46552.anthony@interlink.com.au> <20061012210310.GC6015@zot.electricrain.com> <452EAEAC.1020007@v.loewis.de> Message-ID: <3191856D-5AFD-4418-B99C-8BE07BA9F1F7@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Oct 12, 2006, at 5:07 PM, Martin v. L?wis wrote: > Of course, that makes the idea die here and now. Without volunteers > to do the actual work, it just won't happen. True, and there's no carrot/stick of a salary to entice people into doing what is mostly thankless grunt work. 
;) OTOH, there's always new blood with lots of time on their hands coming into the community looking for a way to distinguish themselves (read: students :). Maybe someone will step forward and win a little lemony slice of net.fame. - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRS610nEjvBPtnXfVAQKtiwP/a+BfIhLupcDQwfY6AhXNxjvnh+scjqTd nutSPfHR8qdbDxAxq6YcBkMeIh55XP0QSu+gYSdDDj9dGkIP0FGhurpZVW1WFrye KEBapAmnPUnC8X5kAj0Wrw6BXacchilrH3cpC1psDtlT58TgAsUxtjmYsSKEI0ZP l+tx3jlp2Ck= =vbwS -----END PGP SIGNATURE----- From snaury at gmail.com Thu Oct 12 23:53:10 2006 From: snaury at gmail.com (Alexey Borzenkov) Date: Fri, 13 Oct 2006 01:53:10 +0400 Subject: [Python-Dev] Why spawnvp not implemented on Windows? In-Reply-To: <452EADF4.9010800@v.loewis.de> References: <452EADF4.9010800@v.loewis.de> Message-ID: On 10/13/06, "Martin v. Löwis" wrote: > Please consider also exposing _wspawnvp, depending on whether path > argument is a Unicode object or not. See PEP 277 for guidance. > Since this would go into 2.6, support for Windows 95 isn't mandatory. Umm... do you mean that spawn*p* on python 2.5 is an absolute no? From brett at python.org Thu Oct 12 23:55:03 2006 From: brett at python.org (Brett Cannon) Date: Thu, 12 Oct 2006 14:55:03 -0700 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <200610130714.00673.anthony@interlink.com.au> References: <200610121808.47010.anthony@interlink.com.au> <2514DA1C-F5A1-4144-9068-006A933C516C@python.org> <200610130714.00673.anthony@interlink.com.au> Message-ID: On 10/12/06, Anthony Baxter wrote: > > On Friday 13 October 2006 05:30, Georg Brandl wrote: > > I'm I the only one who feels that the website is a big workflow problem? > > Assuming you meant "Am I", then I absolutely agree with you. I have not touched the web site since the Pyramid switch and thus am not that active, so what I am about to say may be slightly off, but ... 
I know AMK was experimenting with rest2web as a possible way to do the web site. There has also been talk about trying out another system. But I also know some people would rather put the effort into improving Pyramid. Once again, it's a matter of people putting the time in to make a switch happen to a system that the site maintainers would be happy with. -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061012/cf302456/attachment.html From ronaldoussoren at mac.com Thu Oct 12 23:59:16 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Thu, 12 Oct 2006 23:59:16 +0200 Subject: [Python-Dev] Why spawnvp not implemented on Windows? In-Reply-To: References: <452EADF4.9010800@v.loewis.de> Message-ID: <22924045-80C8-4BAA-A15B-964F4A44841C@mac.com> On Oct 12, 2006, at 11:53 PM, Alexey Borzenkov wrote: > On 10/13/06, "Martin v. Löwis" wrote: >> Please consider also exposing _wspawnvp, depending on whether path >> argument is a Unicode object or not. See PEP 277 for guidance. >> Since this would go into 2.6, support for Windows 95 isn't mandatory. > > Umm... do you mean that spawn*p* on python 2.5 is an absolute no? Unless you have a time machine and manage to get it in before 2.5.0 is released :-). Micro releases (2.5.1, 2.5.2, ...) only contain bugfixes, not new features. Ronald -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3562 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061012/9d4bec90/attachment.bin From martin at v.loewis.de Fri Oct 13 00:03:18 2006 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Fri, 13 Oct 2006 00:03:18 +0200 Subject: [Python-Dev] Why spawnvp not implemented on Windows?
In-Reply-To: References: <452EADF4.9010800@v.loewis.de> Message-ID: <452EBBA6.8040106@v.loewis.de> Alexey Borzenkov schrieb: > On 10/13/06, "Martin v. Löwis" wrote: >> Please consider also exposing _wspawnvp, depending on whether path >> argument is a Unicode object or not. See PEP 277 for guidance. >> Since this would go into 2.6, support for Windows 95 isn't mandatory. > > Umm... do you mean that spawn*p* on python 2.5 is an absolute no? Yes. No new features can be added to Python 2.5.x; Python 2.5 has already been released. Regards, Martin From martin at v.loewis.de Fri Oct 13 00:06:09 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 13 Oct 2006 00:06:09 +0200 Subject: [Python-Dev] Python 2.5 performance In-Reply-To: <20061012210744.GE6015@zot.electricrain.com> References: <129CEF95A523704B9D46959C922A280002FE99F3@nemesis.central.ccp.cc> <452EACD9.9090001@v.loewis.de> <20061012210744.GE6015@zot.electricrain.com> Message-ID: <452EBC51.2050509@v.loewis.de> Gregory P. Smith schrieb: > i read that as just suggesting that updates should be checked into the > release25-maint tree to get PCBuild8 working out of the box for anyone > who wants to build python from source with vs2005. That's passive voice ("should be checked"). I think it is unrealistic to expect that anybody making changes will make them to PCbuild8 as well if they are relevant; in many cases, no changes are made to the Windows build process at all. Fortunately, Kristjan has volunteered to maintain PCbuild8, and that's fine with me.
Regards, Martin From fuzzyman at voidspace.org.uk Fri Oct 13 00:07:08 2006 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Thu, 12 Oct 2006 23:07:08 +0100 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <2514DA1C-F5A1-4144-9068-006A933C516C@python.org> <200610130714.00673.anthony@interlink.com.au> Message-ID: <452EBC8C.4080800@voidspace.org.uk> Brett Cannon wrote: > On 10/12/06, *Anthony Baxter* > wrote: > > On Friday 13 October 2006 05:30, Georg Brandl wrote: > > I'm I the only one who feels that the website is a big workflow > problem? > > Assuming you meant "Am I", then I absolutely agree with you. > > > I have touched the web site since the Pyramid switch and thus am not > that active, so what I am about to say may be slightly off, but ... > > I know AMK was experimenting with rest2web as a possible way to do the > web site. +1 for rest2web ;-) > There has also been talk about trying out another system. But I also > know some people would rather put the effort into improving Pyramid. > Actually from the little I looked at it, pyramid seemed a very good system. Particularly the SVN integration. If rest2web is a serious option and needs any customisation, I'd be happy to look into it. Michael Foord > Once again, it's a matter of people putting the time in to make a > switch happen to a system that the site maintainers would be happy with. > > -Brett > > ------------------------------------------------------------------------ > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk > > ------------------------------------------------------------------------ > > No virus found in this incoming message. > Checked by AVG Free Edition. 
> Version: 7.1.408 / Virus Database: 268.13.2/472 - Release Date: 11/10/2006 From martin at v.loewis.de Fri Oct 13 00:14:55 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 13 Oct 2006 00:14:55 +0200 Subject: [Python-Dev] Plea to distribute debugging lib In-Reply-To: References: <20051104202824.GA19678@discworld.dyndns.org> <20051202025557.GA22377@ActiveState.com> Message-ID: <452EBE5F.2040609@v.loewis.de> Dave Abrahams schrieb: > The only problem here is that there appears to be a lag in the release of > ActivePython after Python itself is released. > > Is there any chance of putting up just the debugging libraries a little earlier? I may be out of context here: what is the precise problem in producing them yourself? Why do you need somebody else to do it for you? Regards, Martin From anthony at interlink.com.au Fri Oct 13 01:06:49 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri, 13 Oct 2006 09:06:49 +1000 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <1C323968-BA50-4D36-B5E2-5B4B10306627@python.org> References: <200610121808.47010.anthony@interlink.com.au> <20061012210310.GC6015@zot.electricrain.com> <1C323968-BA50-4D36-B5E2-5B4B10306627@python.org> Message-ID: <200610130906.55129.anthony@interlink.com.au> On Friday 13 October 2006 07:34, Barry Warsaw wrote: > > i'm not volunteering to setup an automated system for this but i've > > got good ideas how to do it if i ever find time or someone wants to > > chat offline. :( > > I wish I had the cycles to volunteer to help out implementing this. :( Well, regardless of anything else, without someone doing it, it's not going to happen. I don't have the time to spend doing this. Right now, the amount of work this would save me is minimal, so I also have little or no incentive to do it.
The thing that does take the time is the website - fixing that is a major investment of time, which I also don't have. Yes, had I spent the probably 20+ hours I've spent doing website stuff I could have made it a bit better, but that's what I know _now_ :) -- Anthony Baxter It's never too late to have a happy childhood. From greg.ewing at canterbury.ac.nz Fri Oct 13 01:27:27 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 13 Oct 2006 12:27:27 +1300 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: References: <20061010130901.09B1.JCARLSON@uci.edu> <452CAA0C.6030306@egenix.com> <20061011090701.09CA.JCARLSON@uci.edu> <452D7E92.4050206@canterbury.ac.nz> Message-ID: <452ECF5F.7040204@canterbury.ac.nz> Fredrik Lundh wrote: > marshal hasn't changed in many years: Maybe not, but I was given to understand that it's regarded as a private format that's not guaranteed to remain constant across versions. So even if it happens not to change, it wouldn't be wise to rely on that. -- Greg From dave at boost-consulting.com Fri Oct 13 00:53:55 2006 From: dave at boost-consulting.com (David Abrahams) Date: Thu, 12 Oct 2006 18:53:55 -0400 Subject: [Python-Dev] Plea to distribute debugging lib In-Reply-To: <452EBE5F.2040609@v.loewis.de> (Martin v. =?utf-8?Q?L=C3=B6wi?= =?utf-8?Q?s's?= message of "Fri, 13 Oct 2006 00:14:55 +0200") References: <20051104202824.GA19678@discworld.dyndns.org> <20051202025557.GA22377@ActiveState.com> <452EBE5F.2040609@v.loewis.de> Message-ID: <8764eprvr0.fsf@pereiro.luannocracy.com> "Martin v. Löwis" writes: > Dave Abrahams schrieb: >> The only problem here is that there appears to be a lag in the release of >> ActivePython after Python itself is released. >> >> Is there any chance of putting up just the debugging libraries a little earlier? > > I may be out of context here: what is the precise problem in producing > them yourself? Why do you need somebody else to do it for you?
At the moment I have too weak a server to provide those files, but that will change very soon. All that said, the Python and ActiveState teams need to be aware of each and every Python release and go through a standard release procedure anyway, whereas -- except for this problem -- I would not. I'm willing to try to add it if that's what works, and of course it's easy for me to say, but I think it adds a lot more overhead for me than it would for the other two groups. -- Dave Abrahams Boost Consulting www.boost-consulting.com From snaury at gmail.com Fri Oct 13 02:46:23 2006 From: snaury at gmail.com (Alexey Borzenkov) Date: Fri, 13 Oct 2006 04:46:23 +0400 Subject: [Python-Dev] Why spawnvp not implemented on Windows? In-Reply-To: <452EBBA6.8040106@v.loewis.de> References: <452EADF4.9010800@v.loewis.de> <452EBBA6.8040106@v.loewis.de> Message-ID: Forgot to include python-dev... On 10/13/06, "Martin v. Löwis" wrote: > > Umm... do you mean that spawn*p* on python 2.5 is an absolute no? > Yes. No new features can be added to Python 2.5.x; Python 2.5 has > already been released. Ugh... that's just not fair. Because of this there will be no spawn*p* in python for another two years. x_x I have a workaround for this, that tweaks os module: [...snip wrong code...]
It should have been:

if (not (hasattr(os, 'spawnvpe') or hasattr(os, 'spawnvp'))
        and hasattr(os, 'spawnve') and hasattr(os, 'spawnv')):
    def _os__spawnvpe(mode, file, args, env=None):
        import sys
        from errno import ENOENT, ENOTDIR
        from os import path, spawnve, spawnv, environ, defpath, pathsep, error
        if env is not None:
            func = spawnve
            argrest = (args, env)
        else:
            func = spawnv
            argrest = (args,)
            env = environ
        head, tail = path.split(file)
        if head:
            return func(mode, file, *argrest)
        if 'PATH' in env:
            envpath = env['PATH']
        else:
            envpath = defpath
        PATH = envpath.split(pathsep)
        if os.name == 'nt' or os.name == 'os2':
            PATH.insert(0, '')
        saved_exc = None
        saved_tb = None
        for dir in PATH:
            fullname = path.join(dir, file)
            try:
                return func(mode, fullname, *argrest)
            except error, e:
                tb = sys.exc_info()[2]
                if (e.errno != ENOENT and e.errno != ENOTDIR
                        and saved_exc is None):
                    saved_exc = e
                    saved_tb = tb
        if saved_exc:
            raise error, saved_exc, saved_tb
        raise error, e, tb

    def _os_spawnvp(mode, file, args):
        return os._spawnvpe(mode, file, args)

    def _os_spawnvpe(mode, file, args, env):
        return os._spawnvpe(mode, file, args, env)

    def _os_spawnlp(mode, file, *args):
        return os._spawnvpe(mode, file, args)

    def _os_spawnlpe(mode, file, *args):
        return os._spawnvpe(mode, file, args[:-1], args[-1])

    os._spawnvpe = _os__spawnvpe
    os.spawnvp = _os_spawnvp
    os.spawnvpe = _os_spawnvpe
    os.spawnlp = _os_spawnlp
    os.spawnlpe = _os_spawnlpe
    os.__all__.extend(["spawnvp", "spawnvpe", "spawnlp", "spawnlpe"])

But the fact that I have to use similar code anywhere I need to use spawnlp is not fair. Notice that _spawnvpe is simply a clone of _execvpe from os.py, maybe if the problem is new API in c source, this approach could be used in os.py? P.S. Although it's a bit stretching, one might also say that implementing spawn*p* on windows is not actually a new feature, and rather is a bugfix for misfeature. Why every other platform can benefit from spawn*p* and only Windows can't?
This just makes os.spawn*p* useless: it becomes unreliable and can't be used in portable code at all. From barry at python.org Fri Oct 13 03:03:47 2006 From: barry at python.org (Barry Warsaw) Date: Thu, 12 Oct 2006 21:03:47 -0400 Subject: [Python-Dev] Why spawnvp not implemented on Windows? In-Reply-To: References: <452EADF4.9010800@v.loewis.de> <452EBBA6.8040106@v.loewis.de> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Oct 12, 2006, at 8:46 PM, Alexey Borzenkov wrote: > Ugh... that's just not fair. Because of this there will be no spawn*p* > in python for another two years. x_x Correct, but don't let that stop you. That's what distutils and the Cheeseshop are for. - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRS7l+XEjvBPtnXfVAQJy6gP/RkGcTXDCBYM/WL/X+sNiTp6ydvFPg20u SrxUb/vQpNVkjA2GkFJJAXArnsxn8LB2MC+rPDRkkNMYcFw5JAUcf0IR1L+AdFnC h+68f03XDzbeB8uqVrQ6xObEPXmanvhx1uCrApqFq+zOzqMNlbzUlyGCTLu0Cw9v CYLa+aaKFAA= =dX0B -----END PGP SIGNATURE----- From tim.peters at gmail.com Fri Oct 13 03:04:04 2006 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 12 Oct 2006 21:04:04 -0400 Subject: [Python-Dev] Why spawnvp not implemented on Windows? In-Reply-To: References: <452EADF4.9010800@v.loewis.de> <452EBBA6.8040106@v.loewis.de> Message-ID: <1f7befae0610121804y539d5571wa9b717b10b0d80da@mail.gmail.com> [Alexey Borzenkov] >>> Umm... do you mean that spawn*p* on python 2.5 is an absolute no? [Martin v. Löwis] >> Yes. No new features can be added to Python 2.5.x; Python 2.5 has >> already been released. [Alexey Borzenkov] > Ugh... that's just not fair. Because of this there will be no spawn*p* > in python for another two years. x_x Or the last 15 years. Yet somehow people still have kids ;-) > ... > But the fact that I have to use similar code anywhere I need to use > spawnlp is not fair. "Fair" is a very strange word here. Pain in the ass, sure, but not fair? Doesn't make sense. > ... > P.S.
Although it's a bit stretching, one might also say that > implementing spawn*p* on windows is not actually a new feature, and > rather is a bugfix for misfeature. No. Introducing any new function is obviously a new feature, which would become acutely and catastrophically visible as soon as someone released code using the new function in 2.5.1, and someone tried to /use/ that new code under 2.5.0. Micro releases of Python do not introduce new features -- take that as given. It's been tried before, for what appeared to be "very good reasons" at the time, and we lived to regret it deeply. It won't happen again. > Why every other platform can benefit from spawn*p* and only Windows can't? Just the obvious reason: because so far nobody cared enough to do the work of writing code, docs and tests for some of these functions on Windows. > This just makes os.spawn*p* useless: it becomes unreliable and can't be > used in portable code at all. It's certainly true that it can't be used in portable code, at least not before Python 2.6. From steve at holdenweb.com Fri Oct 13 04:56:35 2006 From: steve at holdenweb.com (Steve Holden) Date: Fri, 13 Oct 2006 03:56:35 +0100 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <452EBC8C.4080800@voidspace.org.uk> References: <200610121808.47010.anthony@interlink.com.au> <2514DA1C-F5A1-4144-9068-006A933C516C@python.org> <200610130714.00673.anthony@interlink.com.au> <452EBC8C.4080800@voidspace.org.uk> Message-ID: Michael Foord wrote: > Brett Cannon wrote: > >>On 10/12/06, *Anthony Baxter* >> wrote: >> >> On Friday 13 October 2006 05:30, Georg Brandl wrote: >> > I'm I the only one who feels that the website is a big workflow >> problem? >> >> Assuming you meant "Am I", then I absolutely agree with you. >> >> >>I have touched the web site since the Pyramid switch and thus am not >>that active, so what I am about to say may be slightly off, but ... 
>> >>I know AMK was experimenting with rest2web as a possible way to do the >>web site. > > +1 for rest2web ;-) > > >>There has also been talk about trying out another system. But I also >>know some people would rather put the effort into improving Pyramid. >> > > Actually from the little I looked at it, pyramid seemed a very good > system. Particularly the SVN integration. > The real problem is the more or less complete lack of incremental rebuild, which does make site generation time-consuming. The advantage of pyramid implementation was the regularisation of the site data. I think we probably need to look at taking the now more-or-less regular data structures used to drive pyramid and find some way to use them (still with source control, but hopefully with much less verbiage) to drive something like Django. To retain the advantages of source control this might mean using scripts to generate database content from SVN-controlled data files. Or something [waves hands vaguely and steps back hopefully]. regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From martin at v.loewis.de Fri Oct 13 06:01:16 2006 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Fri, 13 Oct 2006 06:01:16 +0200 Subject: [Python-Dev] Plea to distribute debugging lib In-Reply-To: <8764eprvr0.fsf@pereiro.luannocracy.com> References: <20051104202824.GA19678@discworld.dyndns.org> <20051202025557.GA22377@ActiveState.com> <452EBE5F.2040609@v.loewis.de> <8764eprvr0.fsf@pereiro.luannocracy.com> Message-ID: <452F0F8C.10708@v.loewis.de> David Abrahams schrieb: > At the moment I have too weak a server to provide those files, but > that will change very soon. 
All that said, the Python and ActiveState > teams need to be aware of each and every Python release and go through > a standard release procedure anyway, whereas -- except for this > problem -- I would not. I'm willing to try to add it if that's what > works, and of course it's easy for me to say, but I think it adds a > lot more overhead for me than it would for the other two groups. It's a significant amount of work, either way. It will be larger for you when you do it the first time; after that, it will be the same amount of work for you that it would be for me. It will be easier for you than for me as you won't be acting under time pressure (whereas additional actions from me will delay the entire Python release, which, due to timezones, already significantly suffers from the need to create Windows binaries). I'm not sure whether you are requesting these for yourself or for somebody else. If for somebody else, that somebody else should seriously consider building Python himself, and publishing the result. Regards, Martin From martin at v.loewis.de Fri Oct 13 06:20:44 2006 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Fri, 13 Oct 2006 06:20:44 +0200 Subject: [Python-Dev] Why spawnvp not implemented on Windows? In-Reply-To: References: <452EADF4.9010800@v.loewis.de> <452EBBA6.8040106@v.loewis.de> Message-ID: <452F141C.3050801@v.loewis.de> Alexey Borzenkov schrieb: > On 10/13/06, "Martin v. Löwis" wrote: >> > Umm... do you mean that spawn*p* on python 2.5 is an absolute no? >> Yes. No new features can be added to Python 2.5.x; Python 2.5 has >> already been released. > > Ugh... that's just not fair. Because of this there will be no spawn*p* > in python for another two years. x_x It may be inconvenient, but it is certainly fair: the same rule is applied to *all* proposed new features. It would be unfair if that feature was accepted, and other features were rejected. Please try to see this from "our" view.
If new features are added to a bugfix release (say, 2.5.1), then users (programmers) would quickly consider Python as unstable, moving target. They would use the feature, claiming that you need Python 2.5, and not knowing that it is really 2.5.*1* that you need. Users would try to run the program, and find out that it doesn't work, and complain to the author. Unhappy users, unhappy programmers, and unhappy maintainers (as the programmers would then complain which idiot allowed that feature in - they do use strong language at times). It happened once, in 2.2.1 (IIRC) with the introduction of True and False. It was very painful and lead to a lot of bad code, and it still hasn't settled. As you already have a work-around: what is the problem waiting for 2.6, for you personally? If you want to see the feature eventually, please do submit it to sourceforge, anyway. Regards, Martin From anthony at interlink.com.au Fri Oct 13 07:05:22 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri, 13 Oct 2006 15:05:22 +1000 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <452EBC8C.4080800@voidspace.org.uk> Message-ID: <200610131505.24258.anthony@interlink.com.au> On Friday 13 October 2006 12:56, Steve Holden wrote: > The real problem is the more or less complete lack of incremental > rebuild, which does make site generation time-consuming. That's _part_ of it. There's other issues. For instance, there's probably 4 places where the "list of releases" is stored. Every time I do a release, I need to update all of these. If it's a new release, I also have to update the apache config for the /X.Y.Z redirect (anyone who thinks a default URL of www.python.org/download/releases/X.Y.Z is a good idea needs to quit drinking before lunchtime ) Creating a new release area, or hell, even a new page, is a whole pile of fiddly files. 
These still don't make sense to me - I end up copying an existing page each time, then reading through them looking for the relevant pieces of text. Personally, I can mostly deal with the reST now, although it still trips me up on a regular basis. YAML as well is just way more complexity - I don't understand the syntax, but it appears to offer massively more than we actually use. > The advantage of pyramid implementation was the regularisation of the > site data. Sure - and hopefully if we go down another path we can get that out. > To retain the advantages of source control this might mean using scripts > to generate database content from SVN-controlled data files. Or > something [waves hands vaguely and steps back hopefully]. The other thing to watch out for is that I (or whoever) can still do local work on a bunch of different files, then check it in all in one hit once it's done and checked. This was an issue I had with the various wiki-based proposals, I haven't seen many wikis that allow this. -- Anthony Baxter It's never too late to have a happy childhood. From anthony at interlink.com.au Fri Oct 13 07:11:21 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri, 13 Oct 2006 15:11:21 +1000 Subject: [Python-Dev] Why spawnvp not implemented on Windows? In-Reply-To: References: <452EBBA6.8040106@v.loewis.de> Message-ID: <200610131511.23131.anthony@interlink.com.au> On Friday 13 October 2006 10:46, Alexey Borzenkov wrote: > But the fact that I have to use similar code anywhere I need to use > spawnlp is not fair. Notice that _spawnvpe is simply a clone of > _execvpe from os.py, maybe if the problem is new API in c source, this > approach could be used in os.py? Oddly, "fair" isn't a constraint in PEP-0006. Backwards and forwards compatibility between all point releases in a major release is. Adding it to os.py rather than C code doesn't make a difference. > P.S. 
Although it's a bit stretching, one might also say that > implementing spawn*p* on windows is not actually a new feature, and > rather is a bugfix for misfeature. Why every other platform can > benefit from spawn*p* and only Windows can't? This just makes > os.spawn*p* useless: it becomes unreliable and can't be used in > portable code at all. "One" might say that. I wouldn't. It stays out until 2.6. Sorry Anthony -- Anthony Baxter It's never too late to have a happy childhood. From warner at lothar.com Fri Oct 13 07:18:33 2006 From: warner at lothar.com (Brian Warner) Date: Thu, 12 Oct 2006 22:18:33 -0700 (PDT) Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun Message-ID: <20061012.221833.74733401.warner@lothar.com> "Gregory P. Smith" writes: > as for buildbot, i haven't looked at its design but from the chatter > i've seen i was under the impression that it operates on a continually > updated sandbox rather than a 100% fresh checkout for each build? It's a configuration option. If you use mode="update" then your builds will re-use the same source directory over and over, if you use mode="clobber" then your builds will get a brand new checkout each time, and if you use mode="copy" then the source is updated in-place in one directory, but each build is performed from a copy of that checkout. Each offers different tradeoffs between disk usage, network usage, and which sorts of Makefile bugs they are likely to discover. 
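The three checkout modes Brian describes correspond to the `mode` argument of a source step in the buildbot master configuration. As a sketch only (class and argument names follow the buildbot-0.7-era API, and the repository URL is a made-up placeholder — check the docs for the version actually deployed):

```python
# master.cfg fragment; only meaningful inside a running buildbot master.
from buildbot.process import factory
from buildbot.steps.source import SVN
from buildbot.steps.shell import Compile, Test

f = factory.BuildFactory()
# mode="update" reuses one source tree across builds (fast, least disk and
# network use, but stale files can mask Makefile bugs); mode="clobber"
# checks out from scratch every build; mode="copy" keeps one pristine
# updated checkout and builds from a copy of it, trading disk space for
# clean builds without a full re-checkout.
f.addStep(SVN(svnurl="http://svn.example.org/python/trunk", mode="copy"))
f.addStep(Compile(command=["make"]))
f.addStep(Test(command=["make", "test"]))
```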
cheers, -Brian (Buildbot author) From fredrik at pythonware.com Fri Oct 13 07:35:23 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 13 Oct 2006 07:35:23 +0200 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: <452ECF5F.7040204@canterbury.ac.nz> References: <20061010130901.09B1.JCARLSON@uci.edu> <452CAA0C.6030306@egenix.com> <20061011090701.09CA.JCARLSON@uci.edu> <452D7E92.4050206@canterbury.ac.nz> <452ECF5F.7040204@canterbury.ac.nz> Message-ID: Greg Ewing wrote: > Fredrik Lundh wrote: > >> marshal hasn't changed in many years: > > Maybe not, but I was given to understand that it's > regarded as a private format that's not guaranteed > to remain constant across versions. So even if > it happens not to change, it wouldn't be wise to > rely on that. but given that the format *has* been stable for many years, surely it would make more sense to just codify that fact, rather than developing Yet Another Serialization Format instead? From fredrik at pythonware.com Fri Oct 13 08:37:00 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 13 Oct 2006 08:37:00 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <2514DA1C-F5A1-4144-9068-006A933C516C@python.org> <200610130714.00673.anthony@interlink.com.au> Message-ID: Brett Cannon wrote: > I know AMK was experimenting with rest2web as a possible way to do the > web site. There has also been talk about trying out another system. > But I also know some people would rather put the effort into improving > Pyramid. You forgot the ponies! > Once again, it's a matter of people putting the time in to make a switch > happen to a system that the site maintainers would be happy with. The people behind the current system and process has invested way too much energy and prestige in the current system to ever accept that the result is pretty lousy as a site, and complete rubbish as technology. 
It's about sunk costs, not cost- and time-effective solutions. For reference, here's my effbot.org release procedure: 1) upload the distribution files one by one, as soon as they're available. all links and stuff will appear automatically 2) update the associated description text through the web, when necessary, as an HTML fragment. click "save" to publish. 3) mail out an announcement when everything looks good. Maybe I should offer Anthony to do the releases via effbot.org instead? From fredrik at pythonware.com Fri Oct 13 08:40:40 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 13 Oct 2006 08:40:40 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <200610131505.24258.anthony@interlink.com.au> References: <200610121808.47010.anthony@interlink.com.au> <452EBC8C.4080800@voidspace.org.uk> <200610131505.24258.anthony@interlink.com.au> Message-ID: Anthony Baxter wrote: > The other thing to watch out for is that I (or whoever) can still do local > work on a bunch of different files the point of my previous post is that you *shouldn't* have to edit a bunch of different files to make a new release. From anthony at interlink.com.au Fri Oct 13 08:44:54 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri, 13 Oct 2006 16:44:54 +1000 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> Message-ID: <200610131644.57560.anthony@interlink.com.au> > For reference, here's my effbot.org release procedure: > > 1) upload the distribution files one by one, as soon as they're > available. all links and stuff will appear automatically > > 2) update the associated description text through the web, when > necessary, as an HTML fragment. click "save" to publish. > > 3) mail out an announcement when everything looks good. > > Maybe I should offer Anthony to do the releases via effbot.org instead? 
First off - I'm not going to be posting 10M or 16M files through a web-browser. That's insane :-) The bit of the website that's dealing with the actual files is not the tricky bit - I have a dinky little python script that generates the download table. The problems are with the other bits of the pages. I keep thinking "next release, I'll automate it further", but never have time on the day. -- Anthony Baxter It's never too late to have a happy childhood. From fredrik at pythonware.com Fri Oct 13 08:45:12 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 13 Oct 2006 08:45:12 +0200 Subject: [Python-Dev] Why spawnvp not implemented on Windows? In-Reply-To: References: <452EADF4.9010800@v.loewis.de> <452EBBA6.8040106@v.loewis.de> Message-ID: Alexey Borzenkov wrote: > P.S. Although it's a bit stretching, one might also say that > implementing spawn*p* on windows is not actually a new feature, and > rather is a bugfix for misfeature. Why every other platform can > benefit from spawn*p* and only Windows can't? This just makes > os.spawn*p* useless: it becomes unreliable and can't be used in > portable code at all. any reason you cannot just use the "subprocess" module instead, like everyone else? From fredrik at pythonware.com Fri Oct 13 08:59:46 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 13 Oct 2006 08:59:46 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <200610131644.57560.anthony@interlink.com.au> References: <200610121808.47010.anthony@interlink.com.au> <200610131644.57560.anthony@interlink.com.au> Message-ID: Anthony Baxter wrote: >> For reference, here's my effbot.org release procedure: >> >> 1) upload the distribution files one by one, as soon as they're >> available. all links and stuff will appear automatically >> >> 2) update the associated description text through the web, when >> necessary, as an HTML fragment. click "save" to publish. 
>> >> 3) mail out an announcement when everything looks good. >> >> Maybe I should offer Anthony to do the releases via effbot.org instead? > > First off - I'm not going to be posting 10M or 16M files through a > web-browser. That's insane :-) oh, I only edit the pages through the web, not the files. there's nothing wrong with scp or sftp or rsync-over-ssh or whatever you're using today. > The bit of the website that's dealing with the actual files is not the tricky > bit - I have a dinky little python script that generates the download table. yeah, but *you* are doing it. if the server did that, Martin and other trusted contributors could upload the files as soon as they're available, instead of first transferring them to you, and then waiting for you to find yet another precious time slot to spend on this release. > The problems are with the other bits of the pages. I keep thinking "next > release, I'll automate it further", but never have time on the day. that's why you have to have an overall infrastructure that lets you make incremental tweaks to the tool chain, so things can get a little better all the time. Pyramid obviously isn't such a system. From anthony at interlink.com.au Fri Oct 13 10:02:53 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri, 13 Oct 2006 18:02:53 +1000 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <200610131644.57560.anthony@interlink.com.au> Message-ID: <200610131802.58681.anthony@interlink.com.au> On Friday 13 October 2006 16:59, Fredrik Lundh wrote: > yeah, but *you* are doing it. if the server did that, Martin and > other trusted contributors could upload the files as soon as they're > available, instead of first transferring them to you, and then waiting > for you to find yet another precious time slot to spend on this release. Sure - I get that. There's a couple of reasons for me doing it. 
First is gpg signing the release files, which has to happen on my local machine. There's also the variation in who actually builds the releases; at least one of the Mac builds was done by Bob I. But there could be ways around this. I don't want to have to ensure every builder has scp, and I'd also prefer for it to all "go live" at once. A while back, the Mac installer would follow up "some time" after the Windows and source builds. Every release, I'd get emails saying "where's the mac build?!" > > The problems are with the other bits of the pages. I keep thinking "next > > release, I'll automate it further", but never have time on the day. > > that's why you have to have an overall infrastructure that lets you make > incremental tweaks to the tool chain, so things can get a little better > all the time. Pyramid obviously isn't such a system. I can't disagree with this. -- Anthony Baxter It's never too late to have a happy childhood. From larry at hastings.org Fri Oct 13 10:10:52 2006 From: larry at hastings.org (Larry Hastings) Date: Fri, 13 Oct 2006 01:10:52 -0700 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: <452A16B1.9070109@egenix.com> References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> <17702.13238.684094.6289@montanaro.dyndns.org> <4529E28E.3070800@hastings.org> <452A16B1.9070109@egenix.com> Message-ID: <452F4A0C.7070101@hastings.org> I've uploaded a new patch to Sourceforge in response to feedback: * I purged all // comments and fixed all > 80 characters added by my patch, as per Neil Norwitz. * I added a definition of max() for those who don't already have one, as per skip at pobox.com. It now compiles cleanly on Linux again without modification; sorry for not checking that since the original patch. I've also uploaded my hacked-together benchmark script, for all that's worth. 
That patch tracker page again: http://sourceforge.net/tracker/index.php?func=detail&aid=1569040&group_id=5470&atid=305470 M.-A. Lemburg wrote: > When comparing results, please look at the minimum runtime. > The average times are just given to indicate how much the mintime > differs from the average of all runs. > I'll do that next time. In the meantime, I've also uploaded a zip file containing the results of my benchmarking, including the stdout from the run and the "-f" file which contains the pickled output. So you can examine my results yourself, including doing analysis on the pickled data if you like. > If however the speedups are not consistent across several runs of > pybench, then it's likely that you have some background activity > going on on the machine which causes a slowdown in the unmodified > run you chose as basis for the comparison. > The machine is dual-core, and was quiescent at the time. XP's scheduler is hopefully good enough to just leave the process running on one core. I ran the benchmarks just once on my Linux 2.6 machine; it's a dual-CPU P3 933EB (or maybe just 866EB, I forget). It's faster overall there too, by 1.9% (minimum run-time). The two tests I expected to be faster ("ConcatStrings" and "CreateStringsWithConcat") were consistently much faster; beyond that the results don't particularly resemble the results from my XP machine. (I uploaded those .txt and .pickle files too.) The mystery overall speedup continues, not that I find it unwelcome. :) > Just to make sure: you are using pybench 2.0, right ? > I sure was. And I used stringbench.py downloaded from here: http://svn.python.org/projects/sandbox/branches/jim-fix-setuptools-cli/stringbench/stringbench.py Cheers, /larry/ From ncoghlan at gmail.com Fri Oct 13 11:07:12 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 13 Oct 2006 19:07:12 +1000 Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS? 
In-Reply-To: References: <34FE2A7A34BC3544BC3127D023DF3D12128746@EWTEXCH.office.bhtrader.com> <452C6FD8.8070403@v.loewis.de> Message-ID: <452F5740.8000504@gmail.com> Fredrik Lundh wrote: > Martin v. Löwis wrote: > >> Of course, if everybody would always recompile all extension modules >> for a new Python feature release, those flags weren't necessary. > > a dynamic registration approach would be even better, with a single entry point > used to register all methods and hooks your C extension has implemented, and > code on the other side that builds a properly initialized type descriptor from that > set, using fallback functions and error stubs where needed. > > e.g. the impossible-to-write-from-scratch NoddyType struct initialization in > > http://docs.python.org/ext/node24.html > > would collapse to > > static PyTypeObject NoddyType; Wouldn't that have to be a pointer to allow the Python runtime complete control of the structure size without recompiling the extension?: static PyTypeObject *NoddyType; NoddyType = PyType_Alloc("noddy.Noddy"); if (!NoddyType) return; PyType_Register(NoddyType, PY_TP_DEALLOC, Noddy_dealloc); PyType_Register(NoddyType, PY_TP_DOC, "Noddy objects"); PyType_Register(NoddyType, PY_TP_TRAVERSE, Noddy_traverse); PyType_Register(NoddyType, PY_TP_CLEAR, Noddy_clear); PyType_Register(NoddyType, PY_TP_METHODS, Noddy_methods); PyType_Register(NoddyType, PY_TP_MEMBERS, Noddy_members); PyType_Register(NoddyType, PY_TP_INIT, Noddy_init); PyType_Register(NoddyType, PY_TP_NEW, Noddy_new); if (PyType_Ready(NoddyType) < 0) return; Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From fredrik at pythonware.com Fri Oct 13 11:22:09 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 13 Oct 2006 11:22:09 +0200 Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS?
References: <34FE2A7A34BC3544BC3127D023DF3D12128746@EWTEXCH.office.bhtrader.com> <452C6FD8.8070403@v.loewis.de> <452F5740.8000504@gmail.com> Message-ID: Nick Coghlan wrote: > > would collapse to > > > > static PyTypeObject NoddyType; > > Wouldn't that have to be a pointer to allow the Python runtime complete > control of the structure size without recompiling the extension?: > > static PyTypeObject *NoddyType; yeah, that's a silly typo. or maybe I was thinking of something really clever that I can no longer remember. > NoddyType = PyType_Alloc("noddy.Noddy"); > if (!NoddyType) > return; the fewer places you have to check for an error, the less chance you have to forget to do it. my proposal implied that the NULL check should be done in Ready. I've posted a slightly cleaned up version of my rough proposal here: http://effbot.org/zone/idea-register-type.htm From ncoghlan at gmail.com Fri Oct 13 11:25:38 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 13 Oct 2006 19:25:38 +1000 Subject: [Python-Dev] Python 2.5 performance In-Reply-To: <00df01c6ee10$ded9c920$ea146b0a@RaymondLaptop1> References: <129CEF95A523704B9D46959C922A280002FE99F3@nemesis.central.ccp.cc> <00df01c6ee10$ded9c920$ea146b0a@RaymondLaptop1> Message-ID: <452F5B92.8060702@gmail.com> Raymond Hettinger wrote: >> From: Kristján V. Jónsson >> I think we should start considering to make PCBuild8 a "supported" build. > > +1 and not just for the free speed-up. VC8 is what more and more Windows > developers will have on their machines. Without a supported build, it becomes > much harder to make patches or build compatible extensions. It also makes hobbyist hacking on the core more straightforward, as it makes it possible to use VC++ Express Edition to try out changes locally. Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From rasky at develer.com Fri Oct 13 11:48:22 2006 From: rasky at develer.com (Giovanni Bajo) Date: Fri, 13 Oct 2006 11:48:22 +0200 Subject: [Python-Dev] Proposal: No more standard library additions References: <452F26A3.7060506@acm.org> <2ab101c6ee90$83796470$9a4c2a97@bagio> <52208.62.39.9.251.1160728013.squirrel@webmail.nerim.net> Message-ID: <2f5401c6eeac$b8861b60$9a4c2a97@bagio> Antoine wrote: >> The standard library is not about ease of installation. It is >> about having >> a consistent fixed codebase to work with. I don't want to go >> Perl/CPAN, where you have 3-4 alternatives to do thing A which will >> never interoperate >> with whatever you chose among the 3-4 alternatives to do thing B. > > Currently in Python: > http://docs.python.org/lib/module-xml.dom.html > http://docs.python.org/lib/module-xml.dom.minidom.html > http://docs.python.org/lib/module-xml.sax.html > http://docs.python.org/lib/module-xml.parsers.expat.html > http://docs.python.org/lib/module-xml.etree.ElementTree.html > > The problem of "consistent fixed codebase" is that standards get > higher, so eventually those old stable modules lose popularity in > favor of newer, better modules. Those are different paradigms of "doing XML". For instance, the standard library was missing a "pythonic" library to do XML processing, and several arose. ElementTree (fortunately) won and joined the standard distribution. This should alleviate the need for other libraries in the future. Instead of looking at what we have inside, look outside. There are dozens of different XML "pythonic" libraries. I have fought in the past with programs that required large XML frameworks, that in turn required to be downloaded, built, installed, and *understood* to make the required modifications to the programs themselves.
This slowed down my own development, and caused infinite headaches because of version incompatibilities (A requires the XML library B, but only versions < 1.2, otherwise you can use A 2.0, which needs Python 2.4+, and then you can use the latest B; etc. etc. repeat and complicate ad libitum). A single version number (that of Python) and a large fixed set of libraries anybody can use is a *strong* PLUS. Then, there is the opposite phenomenon, which is interesting as well. I met many perl programmers who simply re-invented their little wheel every time. They were mostly system administrators, so they *knew* very well what a hell the dependency chains are for both programmers and users. Thus, since perl does not have a standard library, they simply did not import *any* module. This way, the program is "easier" to ship, distribute and use, but it's harder to code, read and fix, and contains unnecessary duplication with everybody else's scripts. Need to send an e-mail? Why use a library: just paste chunks of cut&pasted mail headers (with MIME, etc.) and do some basic string substitution; and the SMTP protocol is easy, just open a socket and dump some strings to it; or you can use 'sendmail' which is available on any UNIX (and there goes portability, just because they did not want to evaluate and choose one of the 6 Perl SMTP libraries... and rightfully so!). > Therefore, you have to obsolete old stuff if you want there to be > only One Obvious Way To Do It. I'm totally in favor of obsoletion and removal of old cruft from the standard library. I'm totally against *not* having a standard library.
Giovanni Bajo From rasky at develer.com Fri Oct 13 11:50:23 2006 From: rasky at develer.com (Giovanni Bajo) Date: Fri, 13 Oct 2006 11:50:23 +0200 Subject: [Python-Dev] [py3k] Re: Proposal: No more standard library additions References: <452F26A3.7060506@acm.org> <2ab101c6ee90$83796470$9a4c2a97@bagio><52208.62.39.9.251.1160728013.squirrel@webmail.nerim.net> <2f5401c6eeac$b8861b60$9a4c2a97@bagio> Message-ID: <2f9401c6eead$00be4420$9a4c2a97@bagio> I apologize, this had to go to python-3000 at . From bob at redivi.com Fri Oct 13 12:35:46 2006 From: bob at redivi.com (Bob Ippolito) Date: Fri, 13 Oct 2006 03:35:46 -0700 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <200610131802.58681.anthony@interlink.com.au> References: <200610121808.47010.anthony@interlink.com.au> <200610131644.57560.anthony@interlink.com.au> <200610131802.58681.anthony@interlink.com.au> Message-ID: <6a36e7290610130335u179c5929r65d04c121550d8a4@mail.gmail.com> On 10/13/06, Anthony Baxter wrote: > On Friday 13 October 2006 16:59, Fredrik Lundh wrote: > > yeah, but *you* are doing it. if the server did that, Martin and > > other trusted contributors could upload the files as soon as they're > > available, instead of first transferring them to you, and then waiting > > for you to find yet another precious time slot to spend on this release. > > Sure - I get that. There's a couple of reasons for me doing it. First is gpg > signing the release files, which has to happen on my local machine. There's > also the variation in who actually builds the releases; at least one of the > Mac builds was done by Bob I. But there could be ways around this. I don't > want to have to ensure every builder has scp, and I'd also prefer for it to > all "go live" at once. A while back, the Mac installer would follow up "some > time" after the Windows and source builds. Every release, I'd get emails > saying "where's the mac build?!" 
With most consumer connections it's a lot faster to download than to upload. Perhaps it would save you a few minutes if the contributors uploaded directly to the destination (or to some other fast server) and you could download and sign it, rather than having to scp it back up somewhere from your home connection. To be fair, (thanks to Ronald) the Mac build is entirely automated by a script with the caveat that you should be a little careful about what your environment looks like (e.g. don't install fink or macports, or to move them out of the way when building). It downloads all of the third party dependencies, builds them with some special flags to make it universal, builds Python, and then wraps it up in an installer package. Given any Mac OS X 10.4 machine, the builds could happen automatically. Apple could probably provide one if someone asked. They did it for Twisted. Or maybe the Twisted folks could appropriate part of that machine's time to also build Python. -bob From fredrik at pythonware.com Fri Oct 13 13:06:16 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 13 Oct 2006 13:06:16 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun References: <200610121808.47010.anthony@interlink.com.au><200610131644.57560.anthony@interlink.com.au><200610131802.58681.anthony@interlink.com.au> <6a36e7290610130335u179c5929r65d04c121550d8a4@mail.gmail.com> Message-ID: Anthony: >> Sure - I get that. There's a couple of reasons for me doing it. First is gpg >> signing the release files, which has to happen on my local machine. There's >> also the variation in who actually builds the releases; at least one of the >> Mac builds was done by Bob I. But there could be ways around this. I don't >> want to have to ensure every builder has scp scp or scp access? the former isn't much of a requirement, really. 
I would be surprised to find a developer that didn't already have it on all machines, or knew how to run it off the internet (type "putty download" into google and click "I feel lucky"). >> all "go live" at once. A while back, the Mac installer would follow up "some >> time" after the Windows and source builds. Every release, I'd get emails >> saying "where's the mac build?!" that's a worthwhile goal, now that we have plenty of build volunteers, but I think that could be solved simply by delaying the *public* announcement until everything is in place. this is open source, after all - we don't need to hide how we're doing things. Bob Ippolito wrote: > With most consumer connections it's a lot faster to download than to > upload. Perhaps it would save you a few minutes if the contributors > uploaded directly to the destination (or to some other fast server) > and you could download and sign it, rather than having to scp it back > up somewhere from your home connection. that's another interesting advantage of a more asynchronous release process. if we can reduce the costly parts to a few 8-minute slots, it's a lot easier for any busy developer to find the time, even on a hectic day. and if we can dis- tribute those slots, things will be even easier. From anthony at interlink.com.au Fri Oct 13 13:09:06 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri, 13 Oct 2006 21:09:06 +1000 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <6a36e7290610130335u179c5929r65d04c121550d8a4@mail.gmail.com> References: <200610121808.47010.anthony@interlink.com.au> <200610131802.58681.anthony@interlink.com.au> <6a36e7290610130335u179c5929r65d04c121550d8a4@mail.gmail.com> Message-ID: <200610132109.11329.anthony@interlink.com.au> On Friday 13 October 2006 20:35, Bob Ippolito wrote: > With most consumer connections it's a lot faster to download than to > upload. 
Perhaps it would save you a few minutes if the contributors > uploaded directly to the destination (or to some other fast server) > and you could download and sign it, rather than having to scp it back > up somewhere from your home connection. I actually pull them down to both dinsdale and home, then verify they're the same with SHA and MD5 before signing, and uploading the keys. The only thing I upload directly are the keys and the source tarballs. > Given any Mac OS X 10.4 machine, the builds could happen > automatically. Apple could probably provide one if someone asked. They > did it for Twisted. Or maybe the Twisted folks could appropriate part > of that machine's time to also build Python. We have one, macteagle. For some reason builds fail on it right now - Ronald might be able to supply more details as to why. Anthony -- Anthony Baxter It's never too late to have a happy childhood. From snaury at gmail.com Fri Oct 13 13:22:58 2006 From: snaury at gmail.com (Alexey Borzenkov) Date: Fri, 13 Oct 2006 15:22:58 +0400 Subject: [Python-Dev] Why spawnvp not implemented on Windows? In-Reply-To: References: <452EADF4.9010800@v.loewis.de> <452EBBA6.8040106@v.loewis.de> Message-ID: On 10/13/06, Fredrik Lundh wrote: > any reason you cannot just use the "subprocess" module instead, like > everyone else? Oh! Wow! I just simply didn't know of its existence (I'm pretty much new to python), and both distutils and SCons (I was looking inside them because they are major build systems and surely had to execute compilers somehow), and upon seeing that each of them invented their own method of searching path created a delusion as if inventing custom workarounds was the only way... Sorry... x_x From fredrik at pythonware.com Fri Oct 13 13:26:46 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 13 Oct 2006 13:26:46 +0200 Subject: [Python-Dev] Why spawnvp not implemented on Windows?
References: <452EADF4.9010800@v.loewis.de><452EBBA6.8040106@v.loewis.de> Message-ID: Alexey Borzenkov wrote: >> any reason you cannot just use the "subprocess" module instead, like >> everyone else? > > Oh! Wow! I just simply didn't know of its existence (I'm pretty much > new to python), and both distutils and SCons (I was looking inside > them because they are major build systems and surely had to execute > compilers somehow), and upon seeing that each of them invented their > own method of searching path created a delusion as if inventing custom > workarounds was the only way... Sorry... x_x no problem. someone should really update the documentation to make sure that os.spawn and os.popen and commands and popen2 and all the other 80%-solutions at least point to the subprocess module... (and if the library reference had been stored in a wiki, I'd have fixed that before anyone else even got this mail...) From theller at python.net Fri Oct 13 13:30:14 2006 From: theller at python.net (Thomas Heller) Date: Fri, 13 Oct 2006 13:30:14 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <452EAAAE.2050200@v.loewis.de> References: <200610121808.47010.anthony@interlink.com.au> <200610130528.01672.anthony@interlink.com.au> <452EAAAE.2050200@v.loewis.de> Message-ID: Martin v. Löwis schrieb: > Anthony Baxter schrieb: >> Mostly it is easy for me, with the one huge caveat. As far as I know, the Mac >> build is a single command to run for Ronald, and the Doc build similarly for >> Fred. I don't know what Martin has to do for the Windows build. > > Actually, for 2.3.x, I wouldn't do the Windows builds. I think Thomas > Heller did the 2.3.x series. Yes. But I've switched machines since I last built an installer, and I do not have all of the needed software installed any longer, for example the Wise Installer.
Thomas From ronaldoussoren at mac.com Fri Oct 13 13:37:05 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Fri, 13 Oct 2006 13:37:05 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <200610132109.11329.anthony@interlink.com.au> References: <200610121808.47010.anthony@interlink.com.au> <200610131802.58681.anthony@interlink.com.au> <6a36e7290610130335u179c5929r65d04c121550d8a4@mail.gmail.com> <200610132109.11329.anthony@interlink.com.au> Message-ID: <15446848.1160739425294.JavaMail.ronaldoussoren@mac.com> On Friday, October 13, 2006, at 01:10PM, Anthony Baxter wrote: >On Friday 13 October 2006 20:35, Bob Ippolito wrote: >> With most consumer connections it's a lot faster to download than to >> upload. Perhaps it would save you a few minutes if the contributors >> uploaded directly to the destination (or to some other fast server) >> and you could download and sign it, rather than having to scp it back >> up somewhere from your home connection. > >I actually pull them down to both dinsdale and home, then verify they're the >same with SHA and MD5 before signing, and uploading the keys. The only thing >I upload directly are the keys and the source tarballs. > > >> Given any Mac OS X 10.4 machine, the builds could happen >> automatically. Apple could probably provide one if someone asked. They >> did it for Twisted. Or maybe the Twisted folks could appropriate part >> of that machine's time to also build Python. > >We have one, macteagle. For some reason builds fail on it right now - Ronald >might be able to supply more details as to why. IIRC it has the wrong version of Xcode installed (or rather another one than I use and test with). It also has darwinports installed at the default location, which can cause problems because the setup.py adds that directory to the include/link paths. 
I don't want to release installers that require that the user has darwinports installed :-) I can supply a newer version of Xcode if someone with an admin account is willing to install that. I don't know if the admin of that machine has GUI access to the machine; if not, I'd have to investigate how to ensure that the proper subpackages get installed using a command-line install (using RemoteDesktop to administer servers has spoiled me a bit in that regard). I guess this comes down to the usual problem: I have a working setup for building the mac installer and fixing macteagle takes time which I don't have available in great amounts (who does?). Ronald From ronaldoussoren at mac.com Fri Oct 13 13:44:15 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Fri, 13 Oct 2006 13:44:15 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <6a36e7290610130335u179c5929r65d04c121550d8a4@mail.gmail.com> References: <200610121808.47010.anthony@interlink.com.au> <200610131644.57560.anthony@interlink.com.au> <200610131802.58681.anthony@interlink.com.au> <6a36e7290610130335u179c5929r65d04c121550d8a4@mail.gmail.com> Message-ID: <7635335.1160739855167.JavaMail.ronaldoussoren@mac.com> On Friday, October 13, 2006, at 12:36PM, Bob Ippolito wrote: > >To be fair, (thanks to Ronald) the Mac build is entirely automated by >a script with the caveat that you should be a little careful about >what your environment looks like (e.g. don't install fink or macports, >or to move them out of the way when building). That (the "don't install Fink or macports" part) is because setup.py explicitly adds those directories to the library and include search path. IMHO that is a misfeature because it is much too easy to accidentally contaminate a build that way. Fink and macports can easily add their directories to the search paths using OPTS and LDFLAGS, there's no need to automate this in setup.py.
The beauty of macports is that /opt/local is the default prefix, but you can easily pick another prefix and most ports work fine that way (or rather not worse than with the default prefix). Ronald From steve at holdenweb.com Fri Oct 13 13:53:18 2006 From: steve at holdenweb.com (Steve Holden) Date: Fri, 13 Oct 2006 12:53:18 +0100 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <452EBC8C.4080800@voidspace.org.uk> <200610131505.24258.anthony@interlink.com.au> Message-ID: Fredrik Lundh wrote: > Anthony Baxter wrote: > > >>The other thing to watch out for is that I (or whoever) can still do local >>work on a bunch of different files > > > the point of my previous post is that you *shouldn't* have to edit a > bunch of different files to make a new release. > Indeed. I seem to remember suggesting a while ago on pydotorg that whatever replaces pyramid should cater to groups such as the release team by allowing everything necessary to be generated from a simple set of data that wouldn't be difficult to maintain. Anthony has enough on his plate without having to fight the web server too ... regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From steve at holdenweb.com Fri Oct 13 14:00:36 2006 From: steve at holdenweb.com (Steve Holden) Date: Fri, 13 Oct 2006 13:00:36 +0100 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <2514DA1C-F5A1-4144-9068-006A933C516C@python.org> <200610130714.00673.anthony@interlink.com.au> Message-ID: Fredrik Lundh wrote: > Brett Cannon wrote: > > >>I know AMK was experimenting with rest2web as a possible way to do the >>web site. There has also been talk about trying out another system. 
>>But I also know some people would rather put the effort into improving Pyramid. > > You forgot the ponies! > >>Once again, it's a matter of people putting the time in to make a switch >>happen to a system that the site maintainers would be happy with. > > The people behind the current system and process have invested way too > much energy and prestige in the current system to ever accept that the > result is pretty lousy as a site, and complete rubbish as technology. > It's about sunk costs, not cost- and time-effective solutions. > I don't believe that's true, but I'm certainly not the one with the most time invested in pyramid. Tim Parkin is on record as saying he'd be willing to help with a(nother) migration project. I think there's a general appreciation of pyramid's strengths *and* deficiencies. > For reference, here's my effbot.org release procedure: > > 1) upload the distribution files one by one, as soon as they're > available. all links and stuff will appear automatically > > 2) update the associated description text through the web, when > necessary, as an HTML fragment. click "save" to publish. > > 3) mail out an announcement when everything looks good. > > Maybe I should offer Anthony to do the releases via effbot.org instead? > You can try. Or you can start to promote Django again ... regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From dave at boost-consulting.com Fri Oct 13 14:36:47 2006 From: dave at boost-consulting.com (David Abrahams) Date: Fri, 13 Oct 2006 08:36:47 -0400 Subject: [Python-Dev] Plea to distribute debugging lib In-Reply-To: <452F0F8C.10708@v.loewis.de> (Martin v.
Löwis's message of "Fri, 13 Oct 2006 06:01:16 +0200") References: <20051104202824.GA19678@discworld.dyndns.org> <20051202025557.GA22377@ActiveState.com> <452EBE5F.2040609@v.loewis.de> <8764eprvr0.fsf@pereiro.luannocracy.com> <452F0F8C.10708@v.loewis.de> Message-ID: <87iriowfxc.fsf@pereiro.luannocracy.com> "Martin v. Löwis" writes: > I'm not sure whether you are requesting these for yourself or for > somebody else. If for somebody else, that somebody else should seriously > consider building Python himself, and publishing the result. I'm requesting it for the many Boost.Python (heck, all Python 'C' API) users who find it a usability hurdle when their first Visual Studio projects fail to work properly in the default mode (debug) just because they don't have the right Python libraries. -- Dave Abrahams Boost Consulting www.boost-consulting.com From rasky at develer.com Fri Oct 13 18:21:05 2006 From: rasky at develer.com (Giovanni Bajo) Date: Fri, 13 Oct 2006 18:21:05 +0200 Subject: [Python-Dev] Why spawnvp not implemented on Windows? References: <452EADF4.9010800@v.loewis.de><452EBBA6.8040106@v.loewis.de> Message-ID: <04b701c6eee3$9525d7f0$e303030a@trilan> Alexey Borzenkov wrote: > Oh! Wow! I just simply didn't know of its existence (I'm pretty much > new to python), and both distutils and SCons (I was looking inside > them because they are major build systems and surely had to execute > compilers somehow), and upon seeing that each of them invented their > own method of searching path created a delusion as if inventing custom > workarounds was the only way... Sorry... x_x SCons is still compatible with Python 1.5. Distutils was written in the 1.5-1.6 timeframe; it has been updated since, but it is basically unmaintained at this point (if you exclude the setuptools stuff, which is its disputed maintenance/evolution). subprocess was introduced in Python 2.4.
-- Giovanni Bajo From theller at python.net Fri Oct 13 20:20:55 2006 From: theller at python.net (Thomas Heller) Date: Fri, 13 Oct 2006 20:20:55 +0200 Subject: [Python-Dev] Modulefinder Message-ID: I have patched Lib/modulefinder.py to work with absolute and relative imports. It is also faster now, and has basic unittests in Lib/test/test_modulefinder.py. The work was done in a theller_modulefinder SVN branch. If nobody objects, I will merge this into trunk, and possibly also into release25-maint, when I have time. Thanks, Thomas From jcarlson at uci.edu Fri Oct 13 21:02:06 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri, 13 Oct 2006 12:02:06 -0700 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: <452F4A0C.7070101@hastings.org> References: <452A16B1.9070109@egenix.com> <452F4A0C.7070101@hastings.org> Message-ID: <20061013115748.09F2.JCARLSON@uci.edu> Larry Hastings wrote: [snip] > The machine is dual-core, and was quiescent at the time. XP's scheduler > is hopefully good enough to just leave the process running on one core. It's not. Go into the task manager (accessible via Ctrl+Alt+Del by default) and change the process' affinity to the second core. In my experience, running on the second core (in both 2k and XP) tends to produce slightly faster results. Linux tends to keep processes on a single core for a few seconds at a time. - Josiah From pandyacus at gmail.com Fri Oct 13 21:44:40 2006 From: pandyacus at gmail.com (Chetan Pandya) Date: Fri, 13 Oct 2006 12:44:40 -0700 Subject: [Python-Dev] Cloning threading.py using processes Message-ID: I just got around to reading the messages. When I first saw this, I thought it was so that processes could share and work on shared objects. That is where the locks are required. However, all shared objects are managed by the object manager and thus all such operations are in effect sequential, even acquires on different locks.
Thus other shared objects in the object manager will actually not require any (additional) synchronization. Of course, the argument here is that it is still possible to use that code. Cleanup of shared objects seems to be another thing to look out for. This is a problem that subprocesses seem to avoid and has been already suggested. -Chetan On 10/11/06, python-dev-request at python.org wrote: > > Message: 5 > Date: Wed, 11 Oct 2006 10:23:40 +0200 > From: "M.-A. Lemburg" > Subject: Re: [Python-Dev] Cloning threading.py using proccesses > To: Josiah Carlson > Cc: python-dev at python.org > Message-ID: <452CAA0C.6030306 at egenix.com> > Content-Type: text/plain; charset=ISO-8859-1 > > Josiah Carlson wrote: > > Fredrik Lundh wrote: > >> Josiah Carlson wrote: > >> > >>> Presumably with this library you have created, you have also written a > >>> fast object encoder/decoder (like marshal or pickle). If it isn't any > >>> faster than cPickle or marshal, then users may bypass the module and > opt > >>> for fork/etc. + XML-RPC > >> XML-RPC isn't close to marshal and cPickle in performance, though, so > >> that statement is a bit misleading. > > > > You are correct, it is misleading, and relies on a few unstated > > assumptions. > > > > In my own personal delving into process splitting, RPC, etc., I usually > > end up with one of two cases; I need really fast call/return, or I need > > not slow call/return. The not slow call/return is (in my opinion) > > satisfactorally solved with XML-RPC. But I've personally not been > > satisfied with the speed of any remote 'fast call/return' packages, as > > they usually rely on cPickle or marshal, which are slow compared to > > even moderately fast 100mbit network connections. When we are talking > > about local connections, I have even seen cases where the > > cPickle/marshal calls can make it so that forking the process is faster > > than encoding the input to a called function. > > This is hard to believe. 
I've been in that business for a few > years and so far have not found an OS/hardware/network combination > with the mentioned features. > > Usually the worst part in performance breakdown for RPC is network > latency, ie. time to connect, waiting for the packets to come through, > etc. and this parameter doesn't really depend on the OS or hardware > you're running the application on, but is more a factor of which > network hardware, architecture and structure is being used. > > It also depends a lot on what you send as arguments, of course, > but I assume that you're not pickling a gazillion objects :-) > > > I've had an idea for a fast object encoder/decoder (with limited support > > for certain built-in Python objects), but I haven't gotten around to > > actually implementing it as of yet. > > Would be interesting to look at. > > BTW, did you know about http://sourceforge.net/projects/py-xmlrpc/ ? > > -- > Marc-Andre Lemburg > eGenix.com > > Professional Python Services directly from the Source (#1, Oct 11 2006) > >>> Python/Zope Consulting and Support ... http://www.egenix.com/ > >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ > >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ > ________________________________________________________________________ > > ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: > > > ------------------------------ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/python-dev/attachments/20061013/b836ab79/attachment.html From martin at v.loewis.de Sat Oct 14 00:10:52 2006 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Sat, 14 Oct 2006 00:10:52 +0200 Subject: [Python-Dev] Plea to distribute debugging lib In-Reply-To: <87iriowfxc.fsf@pereiro.luannocracy.com> References: <20051104202824.GA19678@discworld.dyndns.org> <20051202025557.GA22377@ActiveState.com> <452EBE5F.2040609@v.loewis.de> <8764eprvr0.fsf@pereiro.luannocracy.com> <452F0F8C.10708@v.loewis.de> <87iriowfxc.fsf@pereiro.luannocracy.com> Message-ID: <45300EEC.7080003@v.loewis.de> David Abrahams schrieb: >> I'm not sure whether you are requesting these for yourself or for >> somebody else. If for somebody else, that somebody else should seriously >> consider building Python himself, and publishing the result. > > I'm requesting it for the many Boost.Python (heck, all Python 'C' API) > users who find it a usability hurdle when their first visual studio > projects fail to work properly in the default mode (debug) just > because they don't have the right Python libraries. And there is not one of them who would be willing and able to build a debug release, and distribute that???? Regards, Martin From martin at v.loewis.de Sat Oct 14 00:23:38 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 14 Oct 2006 00:23:38 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <452EBC8C.4080800@voidspace.org.uk> <200610131505.24258.anthony@interlink.com.au> Message-ID: <453011EA.2090800@v.loewis.de> Steve Holden schrieb: >>> The other thing to watch out for is that I (or whoever) can still do local >>> work on a bunch of different files >> >> the point of my previous post is that you *shouldn't* have to edit a >> bunch of different files to make a new release. >> > Indeed. 
I seem to remember suggesting a while ago on pydotorg that > whatever replaces pyramid should cater to groups such as the release > team by allowing everything necessary to be generated from a simple set > of data that wouldn't be difficult to maintain. Anthony has enough on > his plate without having to fight the web server too ... There is always some sort of text that accompanies a release. That has to be edited to be correct; a machine can't do that. Regards, Martin From martin at v.loewis.de Sat Oct 14 00:24:39 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 14 Oct 2006 00:24:39 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <200610130528.01672.anthony@interlink.com.au> <452EAAAE.2050200@v.loewis.de> Message-ID: <45301227.5020805@v.loewis.de> Thomas Heller schrieb: > Yes. But I've switched machines since I last build an installer, and I do not > have all of the needed software installed any longer, for example the Wise Installer. Ok. So we are technically incapable of producing the Windows binaries of another 2.3.x release, then? Regards, Martin From jcarlson at uci.edu Sat Oct 14 01:46:20 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri, 13 Oct 2006 16:46:20 -0700 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <45301227.5020805@v.loewis.de> References: <45301227.5020805@v.loewis.de> Message-ID: <20061013164246.09F8.JCARLSON@uci.edu> "Martin v. L?wis" wrote: > > Thomas Heller schrieb: > > Yes. But I've switched machines since I last build an installer, and I do not > > have all of the needed software installed any longer, for example the Wise Installer. > > Ok. So we are technically incapable of producing the Windows binaries of > another 2.3.x release, then? I've got a build setup for 2.3.x, but I lack the Wise Installer. 
It may be possible to use the 2.4 or 2.5 .msi creation tools, if that was sufficient. - Josiah From tim.peters at gmail.com Sat Oct 14 01:53:12 2006 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 13 Oct 2006 19:53:12 -0400 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <45301227.5020805@v.loewis.de> References: <200610121808.47010.anthony@interlink.com.au> <200610130528.01672.anthony@interlink.com.au> <452EAAAE.2050200@v.loewis.de> <45301227.5020805@v.loewis.de> Message-ID: <1f7befae0610131653h52fd3bfcnd33af7c08f6fe9d@mail.gmail.com> [Thomas Heller] >> Yes. But I've switched machines since I last build an installer, and I do not >> have all of the needed software installed any longer, for example the Wise >> Installer. [Martin v. L?wis] > Ok. So we are technically incapable of producing the Windows binaries of > another 2.3.x release, then? FYI, I still have the Wise Installer. But since my understanding is that the "Unicode buffer overrun" thingie is a non-issue on Windows, I've got no interest in wrestling with a 2.3.6 for Windows. From dave at boost-consulting.com Sat Oct 14 02:51:44 2006 From: dave at boost-consulting.com (David Abrahams) Date: Fri, 13 Oct 2006 20:51:44 -0400 Subject: [Python-Dev] Plea to distribute debugging lib In-Reply-To: <45300EEC.7080003@v.loewis.de> (Martin v. =?utf-8?Q?L=C3=B6wi?= =?utf-8?Q?s's?= message of "Sat, 14 Oct 2006 00:10:52 +0200") References: <20051104202824.GA19678@discworld.dyndns.org> <20051202025557.GA22377@ActiveState.com> <452EBE5F.2040609@v.loewis.de> <8764eprvr0.fsf@pereiro.luannocracy.com> <452F0F8C.10708@v.loewis.de> <87iriowfxc.fsf@pereiro.luannocracy.com> <45300EEC.7080003@v.loewis.de> Message-ID: <877iz3u3bz.fsf@pereiro.luannocracy.com> "Martin v. L?wis" writes: > David Abrahams schrieb: >>> I'm not sure whether you are requesting these for yourself or for >>> somebody else. 
If for somebody else, that somebody else should seriously >>> consider building Python himself, and publishing the result. >> >> I'm requesting it for the many Boost.Python (heck, all Python 'C' API) >> users who find it a usability hurdle when their first visual studio >> projects fail to work properly in the default mode (debug) just >> because they don't have the right Python libraries. > > And there is not one of them who would be willing and able to build > a debug release, and distribute that???? I don't know. -- Dave Abrahams Boost Consulting www.boost-consulting.com From martin at v.loewis.de Sat Oct 14 07:58:59 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 14 Oct 2006 07:58:59 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <1f7befae0610131653h52fd3bfcnd33af7c08f6fe9d@mail.gmail.com> References: <200610121808.47010.anthony@interlink.com.au> <200610130528.01672.anthony@interlink.com.au> <452EAAAE.2050200@v.loewis.de> <45301227.5020805@v.loewis.de> <1f7befae0610131653h52fd3bfcnd33af7c08f6fe9d@mail.gmail.com> Message-ID: <45307CA3.1070100@v.loewis.de> Tim Peters schrieb: > FYI, I still have the Wise Installer. But since my understanding is > that the "Unicode buffer overrun" thingie is a non-issue on Windows, > I've got no interest in wrestling with a 2.3.6 for Windows. In 2.3.6, there wouldn't just be that change, but also a few other changes that have been collected, some relevant for Windows as well: there are several updates to the email package, and a fix to pcre to prevent a buffer overrun. I'm not saying that you should produce a Windows binary then, just that it would be good if one was produced if there was another release. Of course, people might also get the binaries from ActiveState should they produce some. 
Regards, Martin From martin at v.loewis.de Sat Oct 14 07:50:43 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 14 Oct 2006 07:50:43 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <20061013164246.09F8.JCARLSON@uci.edu> References: <45301227.5020805@v.loewis.de> <20061013164246.09F8.JCARLSON@uci.edu> Message-ID: <45307AB3.8010504@v.loewis.de> Josiah Carlson schrieb: > I've got a build setup for 2.3.x, but I lack the Wise Installer. It may > be possible to use the 2.4 or 2.5 .msi creation tools, if that was > sufficient. I don't think that would be appropriate. There are differences in usage which might be significant to some users, e.g. in automated install scenarios. We should attempt not to break this. Regards, Martin From aahz at pythoncraft.com Sat Oct 14 19:38:04 2006 From: aahz at pythoncraft.com (Aahz) Date: Sat, 14 Oct 2006 10:38:04 -0700 Subject: [Python-Dev] ConfigParser: whitespace leading comment lines In-Reply-To: <903323ff0610101240p2f4e0a18g18d34d1a800624ec@mail.gmail.com> References: <903323ff0610101240p2f4e0a18g18d34d1a800624ec@mail.gmail.com> Message-ID: <20061014173804.GA25333@panix.com> On Tue, Oct 10, 2006, Greg Willden wrote: > > I'd like to propose the following change to ConfigParser.py. > I won't call it a bug-fix because I don't know the relevant standards. Go ahead and submit a patch; it's guaranteed you won't get progress without it. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "If you don't know what your program is supposed to do, you'd better not start writing it." --Dijkstra From arigo at tunes.org Sun Oct 15 11:30:25 2006 From: arigo at tunes.org (Armin Rigo) Date: Sun, 15 Oct 2006 11:30:25 +0200 Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS? 
In-Reply-To: References: <34FE2A7A34BC3544BC3127D023DF3D12128746@EWTEXCH.office.bhtrader.com> <452F5740.8000504@gmail.com> Message-ID: <20061015093020.GA2162@code0.codespeak.net> Hi Fredrik, On Fri, Oct 13, 2006 at 11:22:09AM +0200, Fredrik Lundh wrote: > > > static PyTypeObject NoddyType; > > static PyTypeObject *NoddyType; > > yeah, that's a silly typo. Ah, then ignore my previous remark. Armin From steve at holdenweb.com Sun Oct 15 13:23:57 2006 From: steve at holdenweb.com (Steve Holden) Date: Sun, 15 Oct 2006 12:23:57 +0100 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <453011EA.2090800@v.loewis.de> References: <200610121808.47010.anthony@interlink.com.au> <452EBC8C.4080800@voidspace.org.uk> <200610131505.24258.anthony@interlink.com.au> <453011EA.2090800@v.loewis.de> Message-ID: Martin v. L?wis wrote: > Steve Holden schrieb: > >>>>The other thing to watch out for is that I (or whoever) can still do local >>>>work on a bunch of different files >>> >>>the point of my previous post is that you *shouldn't* have to edit a >>>bunch of different files to make a new release. >>> >> >>Indeed. I seem to remember suggesting a while ago on pydotorg that >>whatever replaces pyramid should cater to groups such as the release >>team by allowing everything necessary to be generated from a simple set >>of data that wouldn't be difficult to maintain. Anthony has enough on >>his plate without having to fight the web server too ... > > > There is always some sort of text that accompanies a release. That has > to be edited to be correct; a machine can't do that. > OK. 
^everything^the content structure and many of the files^ regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From ocean at m2.ccsnet.ne.jp Sun Oct 15 13:21:18 2006 From: ocean at m2.ccsnet.ne.jp (ocean) Date: Sun, 15 Oct 2006 20:21:18 +0900 Subject: [Python-Dev] VC6 support on release25-maint Message-ID: <000d01c6f04c$092be450$0300a8c0@whiterabc2znlh> Hello. I noticed VisualC++6 support came back. I'm glad with that, but still it seems incomplete. (for example, _sqlite3 support) Maybe does this patch help process? On my machine, testcases other than distutils runs fine. http://sourceforge.net/tracker/?func=detail&aid=1457736&group_id=5470&atid=305470 From anthony at interlink.com.au Sun Oct 15 13:42:05 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Sun, 15 Oct 2006 21:42:05 +1000 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <453011EA.2090800@v.loewis.de> Message-ID: <200610152142.07869.anthony@interlink.com.au> On Sunday 15 October 2006 21:23, Steve Holden wrote: > Martin v. L?wis wrote: > > Steve Holden schrieb: > >>>>The other thing to watch out for is that I (or whoever) can still do > >>>> local work on a bunch of different files > >>> > >>>the point of my previous post is that you *shouldn't* have to edit a > >>>bunch of different files to make a new release. > >> > >>Indeed. I seem to remember suggesting a while ago on pydotorg that > >>whatever replaces pyramid should cater to groups such as the release > >>team by allowing everything necessary to be generated from a simple set > >>of data that wouldn't be difficult to maintain. Anthony has enough on > >>his plate without having to fight the web server too ... > > > > There is always some sort of text that accompanies a release. 
That has > > to be edited to be correct; a machine can't do that. > > OK. > > ^everything^the content structure and many of the files^ If you compare the various pieces that make up the release pages, you'll see that much of it is boilerplate, true. There's two cases worth mentioning: First release of a new series (2.4.4c1, 2.5a1). This involves making the new directory and all the little fiddly files. In practice, this is done by recursively copying the previous release and removing the .ssh directories so that it can be re-added. I then go through and update the files. Subsequent release. This is still largely a manual process - I search for all the references to the previous release, update them, then read through it for missed bits. I then update the text bits that need to be changed. There's all sorts of minor variations there - for instance, often in a non-final release, we don't have an unpacked version of the documentation (but sometimes we do, wah). The killer bits for me are all the other places. For instance, updating the sidebar menu quicklinks for 2.4.4 to 2.5. There's just too many files, and the structure of pyramid's files still doesn't make sense to me. Anthony -- Anthony Baxter It's never too late to have a happy childhood. From martin at v.loewis.de Sun Oct 15 13:49:01 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 15 Oct 2006 13:49:01 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <200610152142.07869.anthony@interlink.com.au> References: <200610121808.47010.anthony@interlink.com.au> <453011EA.2090800@v.loewis.de> <200610152142.07869.anthony@interlink.com.au> Message-ID: <4532202D.2080607@v.loewis.de> Anthony Baxter schrieb: > Subsequent release. This is still largely a manual process - I search for all > the references to the previous release, update them, then read through it for > missed bits. I then update the text bits that need to be changed. 
There's all > sorts of minor variations there - for instance, often in a non-final release, > we don't have an unpacked version of the documentation (but sometimes we do, > wah). If that's a source of pain, we can standardize (assuming you are talking about the .chm file). Which way would you like it? It really doesn't matter to me either way - I just didn't think of it causing problems. Regards, Martin From martin at v.loewis.de Sun Oct 15 14:05:32 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 15 Oct 2006 14:05:32 +0200 Subject: [Python-Dev] VC6 support on release25-maint In-Reply-To: <000d01c6f04c$092be450$0300a8c0@whiterabc2znlh> References: <000d01c6f04c$092be450$0300a8c0@whiterabc2znlh> Message-ID: <4532240C.3060906@v.loewis.de> ocean schrieb: > Hello. I noticed VisualC++6 support came back. I'm glad with that, > but still it seems incomplete. (for example, _sqlite3 support) Maybe > does this patch help process? These changes were all contributed by Larry Hastings. For some reason, I missed/forgot about your patch. Can you please update it? Regards, Martin From martin at v.loewis.de Sun Oct 15 14:59:57 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 15 Oct 2006 14:59:57 +0200 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: References: <20061010130901.09B1.JCARLSON@uci.edu> <452CAA0C.6030306@egenix.com> <20061011090701.09CA.JCARLSON@uci.edu> <452D7E92.4050206@canterbury.ac.nz> <452ECF5F.7040204@canterbury.ac.nz> Message-ID: <453230CD.60408@v.loewis.de> Fredrik Lundh schrieb: > but given that the format *has* been stable for many years, surely it > would make more sense to just codify that fact, rather than developing > Yet Another Serialization Format instead? There have been minor changes over time, e.g. 
r26146 (gvanrossum) introduced TYPE_TRUE and TYPE_FALSE, r36242 (loewis) introduced TYPE_INTERNED and TYPE_STRINGREF, and r38266 (rhettinger) introduced TYPE_SET and TYPE_FROZENSET. With these changes, old dumps can load in new versions, but not vice versa. Furthermore, r27219 (nnorwitz) changed the co_argcount, co_nlocals, co_stacksize, co_flags, and co_firstlineno fields from short to long; unmarshalling from an old version would just crash/read garbage. So how would you propose to deal with such changes in the future? Regards, Martin From martin at v.loewis.de Sun Oct 15 15:13:21 2006 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 15 Oct 2006 15:13:21 +0200 Subject: [Python-Dev] os.utime on directories: bug fix or new feature? Message-ID: <453233F1.2070202@v.loewis.de> In Python 2.5.0 and earlier, it is not possible to modify the time stamps of a directory (mtime and atime) on Windows. The reason is that you cannot "open" (CreateFile) a directory. On W9x, it isn't possible, period. On WNT+, it's possible if you pass FILE_FLAG_BACKUP_SEMANTICS to CreateFile. I just applied patch #1576166 to the trunk which does that. Should I backport the patch to 2.5, as it is a bug that you can modify the time stamps of regular files but not directories? Or should I not backport as it is a new feature that you can now adjust the time stamps of a directory, and couldn't before? Anthony, can you please pronounce? Regards, Martin From aahz at pythoncraft.com Sun Oct 15 15:35:21 2006 From: aahz at pythoncraft.com (Aahz) Date: Sun, 15 Oct 2006 06:35:21 -0700 Subject: [Python-Dev] os.utime on directories: bug fix or new feature? In-Reply-To: <453233F1.2070202@v.loewis.de> References: <453233F1.2070202@v.loewis.de> Message-ID: <20061015133521.GA22874@panix.com> On Sun, Oct 15, 2006, "Martin v. L?wis" wrote: > > Should I backport the patch to 2.5, as it is a bug that you can modify > the time stamps of regular files but not directories? 
Or should I > not backport as it is a new feature that you can now adjust the time > stamps of a directory, and couldn't before? My vote is that it's a bugfix but should be treated as a new feature and rejected for 2.5, based on the standard argument about capabilities and the problems with bugfix releases having new capabilities. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "If you don't know what your program is supposed to do, you'd better not start writing it." --Dijkstra From anthony at interlink.com.au Sun Oct 15 16:01:42 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Mon, 16 Oct 2006 00:01:42 +1000 Subject: [Python-Dev] os.utime on directories: bug fix or new feature? In-Reply-To: <20061015133521.GA22874@panix.com> References: <453233F1.2070202@v.loewis.de> <20061015133521.GA22874@panix.com> Message-ID: <200610160001.45322.anthony@interlink.com.au> On Sunday 15 October 2006 23:35, Aahz wrote: > On Sun, Oct 15, 2006, "Martin v. L?wis" wrote: > > Should I backport the patch to 2.5, as it is a bug that you can modify > > the time stamps of regular files but not directories? Or should I > > not backport as it is a new feature that you can now adjust the time > > stamps of a directory, and couldn't before? > > My vote is that it's a bugfix but should be treated as a new feature and > rejected for 2.5, based on the standard argument about capabilities and > the problems with bugfix releases having new capabilities. Since it wasn't possible in earlier than 2.5 either, I'd say it's on the edge of being a bugfix. Let's be conservative and not backport it, since it's also a pretty marginal feature. Anthony -- Anthony Baxter It's never too late to have a happy childhood. 
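The directory-timestamp behavior under discussion is easy to check from Python. A small sketch (my illustration, not from the thread) of applying os.utime to a directory; on POSIX this has always worked, while on Windows it needed the FILE_FLAG_BACKUP_SEMANTICS change from patch #1576166:

```python
# Sketch: setting a directory's access/modification times with os.utime.
# Works on POSIX everywhere; on Windows only with the CreateFile
# FILE_FLAG_BACKUP_SEMANTICS fix discussed above.
import os
import tempfile

d = tempfile.mkdtemp()
try:
    os.utime(d, (1160000000, 1160000000))  # (atime, mtime), seconds since epoch
    assert int(os.stat(d).st_mtime) == 1160000000
finally:
    os.rmdir(d)
```

Before the fix, the os.utime call above raised an access-denied error on Windows for directories, even though it succeeded for regular files.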
From barry at barrys-emacs.org Sun Oct 15 20:50:22 2006 From: barry at barrys-emacs.org (Barry Scott) Date: Sun, 15 Oct 2006 19:50:22 +0100 Subject: [Python-Dev] Problem building module against Mac Python 2.4 and Python 2.5 Message-ID: <94B4C274-1414-4AD0-AE70-E16DB2290E65@barrys-emacs.org> This may be down to my lack of knowledge of Mac OS X development. I want to build my python extension for Python 2.3, 2.4 and 2.5 on the same Mac. Building against Python 2.3 and Python 2.4 has been working well for a long time.
But > after I installed Python 2.5 it seems that I can no longer link a > against Python 2.4 > without changing sym link /Library/Frameworks/Python.framework/ > Versions/Current > to point at the one I want to build against. > > The problem did not arise with Python 2.3 and Python 2.4 because > Python 2.3 > is in /System/Library and Python 2.4 is in /LIbrary. Telling ld which > framework > folder to look in allows both to be linked against. > > Is there a way to force ld to use a particular version of the python > framework or do > I have to change the symlink each time I build against a different > version? > > This type of problem does not happen on Windows or Unix by design. Use an absolute path to the library rather than -framework. Or use distutils! -bob From ronaldoussoren at mac.com Sun Oct 15 22:11:12 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Sun, 15 Oct 2006 22:11:12 +0200 Subject: [Python-Dev] Problem building module against Mac Python 2.4 and Python 2.5 In-Reply-To: <6a36e7290610151241y55e1078dx5f11126e31bbb01f@mail.gmail.com> References: <94B4C274-1414-4AD0-AE70-E16DB2290E65@barrys-emacs.org> <6a36e7290610151241y55e1078dx5f11126e31bbb01f@mail.gmail.com> Message-ID: <58929AAE-9357-4EE5-BD46-8597A343AACE@mac.com> On Oct 15, 2006, at 9:41 PM, Bob Ippolito wrote: > On 10/15/06, Barry Scott wrote: >> This may be down to my lack of knowledge of Mac OS X development. >> >> I want to build my python extension for Python 2.3, 2.4 and 2.5 on >> the same Mac. >> Build Python 2.3 and Python 2.4 has been working well for a long >> time. But >> after I installed Python 2.5 it seems that I can no longer link a >> against Python 2.4 >> without changing sym link /Library/Frameworks/Python.framework/ >> Versions/Current >> to point at the one I want to build against. >> >> The problem did not arise with Python 2.3 and Python 2.4 because >> Python 2.3 >> is in /System/Library and Python 2.4 is in /LIbrary. 
Telling ld which >> framework >> folder to look in allows both to be linked against. >> >> Is there a way to force ld to use a particular version of the python >> framework or do >> I have to change the symlink each time I build against a different >> version? >> >> This type of problem does not happen on Windows or Unix by design. > > Use an absolute path to the library rather than -framework. That is, add '/Library/Frameworks/Python.framework/Versions/2.4/ Python' to the link command instead of '-framework Python'. > > Or use distutils! That's definitely advisable anyway, that way you'll automaticly get the right flags to compile and link the extension :-) Ronald -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3562 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061015/646f0311/attachment.bin From kbk at shore.net Mon Oct 16 04:20:15 2006 From: kbk at shore.net (Kurt B. 
Kaiser) Date: Sun, 15 Oct 2006 22:20:15 -0400 (EDT) Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200610160220.k9G2KFNN020854@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches :  431 open ( +3) / 3425 closed ( +8) / 3856 total (+11)
Bugs    :  916 open (-23) / 6273 closed (+44) / 7189 total (+21)
RFE     :  244 open ( +4) /  240 closed ( +1) /  484 total ( +5)

New / Reopened Patches
______________________

typo in PC/_msi.c (2006-10-07)
CLOSED http://python.org/sf/1572724  opened by  jose nazario

Fix for segfault in ISO 2022 codecs (2006-10-07)
CLOSED http://python.org/sf/1572832  opened by  Ray Chason

let quit and exit really exit (2006-10-09)
CLOSED http://python.org/sf/1573835  opened by  Gerrit Holl

urllib2 - Fix line breaks in authorization headers (2006-10-09)
       http://python.org/sf/1574068  opened by  Scott Dial

Add %var% support to ntpath.expandvars (2006-10-09)
       http://python.org/sf/1574252  opened by  Chip Norkus

Mailbox will lock properly after flush() (2006-10-11)
       http://python.org/sf/1575506  opened by  Philippe Gauthier

Support spawnvp[e] + use native execvp[e] on win32 (2006-10-12)
       http://python.org/sf/1576120  opened by  Snaury

os.utime acess denied with directories on win32 (2006-10-12)
CLOSED http://python.org/sf/1576166  opened by  Snaury

os.execvp[e] on win32 fails for current directory (2006-10-13)
       http://python.org/sf/1576313  opened by  Snaury

Fix VC6 build, remove redundant files for VC7 build (2006-10-14)
CLOSED http://python.org/sf/1576954  opened by  Larry Hastings

Fix VC6 build, remove redundant files for VC7 build (2006-10-14)
CLOSED http://python.org/sf/1577078  opened by  Larry Hastings

Add _ctypes, _ctypes_test, and _elementtree to VC6 build (2006-10-15)
CLOSED http://python.org/sf/1577551  opened by  Larry Hastings

newline in -DSVNVERSION=\"`LANG=C svnversion .`\" (2006-10-15)
CLOSED http://python.org/sf/1577756  opened by  Daniel Str?nger

Patches Closed
______________

typo in PC/_msi.c (2006-10-07)
       http://python.org/sf/1572724  closed by  gbrandl

Fix for segfault in ISO 2022 codecs (2006-10-08)
       http://python.org/sf/1572832  closed by  perky

fix crash with continue in nested try/finally (2006-08-18)
       http://python.org/sf/1542451  closed by  gbrandl

let quit and exit really exit (2006-10-09)
       http://python.org/sf/1573835  closed by  mwh

Fix for Lib/test/crashers/gc_inspection.py (2006-07-04)
       http://python.org/sf/1517042  closed by  gbrandl

os.utime acess denied with directories on win32 (2006-10-12)
       http://python.org/sf/1576166  closed by  loewis

Fix VC6 build, remove redundant files for VC7 build (2006-10-14)
       http://python.org/sf/1576954  closed by  loewis

Fix VC6 build, remove redundant files for VC7 build (2006-10-14)
       http://python.org/sf/1577078  deleted by  lhastings

Add _ctypes, _ctypes_test, and _elementtree to VC6 build (2006-10-15)
       http://python.org/sf/1577551  closed by  loewis

newline in -DSVNVERSION=\"`LANG=C svnversion .`\" (2006-10-15)
       http://python.org/sf/1577756  deleted by  schmaller

New / Reopened Bugs
___________________

cElementTree.SubElement doesn't recognize keyword "attrib" (2006-10-07)
CLOSED http://python.org/sf/1572710  opened by  Mark Stephens

import org.python.core imports local org.py (2006-10-08)
CLOSED http://python.org/sf/1573180  opened by  E.-O. Le Bigot

ctypes unit test fails (test_macholib.py) under MacOS 10.4.7 (2006-08-21)
CLOSED http://python.org/sf/1544102  reopened by  ronaldoussoren

struct module doesn't use weakref for cache (2006-10-08)
CLOSED http://python.org/sf/1573394  opened by  Mark Flacy

sqlite3 documentation on rowcount is contradictory (2006-10-10)
       http://python.org/sf/1573854  opened by  Seo Sanghyeon

if(status = ERROR_MORE_DATA) (2006-10-09)
CLOSED http://python.org/sf/1573928  opened by  Helmut Grohne

WSGI, cgi.FieldStorage incompatibility (2006-10-09)
       http://python.org/sf/1573931  opened by  Michael Kerrin

isinstance swallows exceptions (2006-10-09)
       http://python.org/sf/1574217  opened by  Brian Harring

os.popen with os.close gives error message (2006-10-10)
       http://python.org/sf/1574310  opened by  dtrosset

Error with callback function and as_parameter with NumPy ndp (2006-10-10)
       http://python.org/sf/1574584  opened by  Albert Strasheim

ctypes: Pointer-to-pointer unchanged in callback (2006-10-10)
       http://python.org/sf/1574588  opened by  Albert Strasheim

ctypes: Returning c_void_p from callback doesn't work (2006-10-10)
       http://python.org/sf/1574593  opened by  Albert Strasheim

Request wave support > 16 bit samples (2006-10-11)
       http://python.org/sf/1575020  opened by  Murray Lang

isSequenceType returns True for dict subclasses (<> 2.3) (2006-10-11)
       http://python.org/sf/1575169  opened by  Martin Gfeller

typo: section 2.1 -> property (2006-10-12)
CLOSED http://python.org/sf/1575746  opened by  Antoine De Groote

Missing notice on environment setting LD_LIBRARY_PATH (2006-10-12)
CLOSED http://python.org/sf/1575803  opened by  Anastasios Hatzis

from_param and _as_parameter_ truncating 64-bit value (2006-10-12)
       http://python.org/sf/1575945  opened by  Albert Strasheim

str(WindowsError) wrong (2006-10-12)
       http://python.org/sf/1576174  opened by  Thomas Heller

ConfigParser: whitespace leading comment lines (2006-10-12)
       http://python.org/sf/1576208  opened by  gregwillden

functools.wraps fails on builtins (2006-10-12)
       http://python.org/sf/1576241  opened by  kajiuma

Example typo in section 4 of 'Installing Python Modules' (2006-10-13)
       http://python.org/sf/1576348  opened by  ytrewq1

enable-shared .dso location (2006-10-12)
CLOSED http://python.org/sf/1576394  opened by  Mike Klaas

cStringIO misbehaving with unicode (2006-10-13)
CLOSED http://python.org/sf/1576443  opened by  Yang Zhang

ftplib doesn't follow standard (2006-10-13)
       http://python.org/sf/1576598  opened by  Denis S. Otkidach

dict keyerror formatting and tuples (2006-10-13)
       http://python.org/sf/1576657  opened by  M.-A. Lemburg

potential buffer overflow in complexobject.c (2006-10-13)
       http://python.org/sf/1576861  opened by  Jochen Voss

GetFileAttributesExA and Win95 (2006-09-29)
CLOSED http://python.org/sf/1567666  reopened by  giomach

Bugs Closed
___________

csv "dialect = 'excel-tab'" to use excel_tab (2006-10-06)
       http://python.org/sf/1572471  closed by  montanaro

cElementTree.SubElement doesn't recognize keyword "attrib" (2006-10-07)
       http://python.org/sf/1572710  closed by  effbot

tabs missing in idle options configure (2006-09-28)
       http://python.org/sf/1567450  closed by  kbk

IDLE doesn't load - apparently without firewall problems (2006-09-22)
       http://python.org/sf/1563630  closed by  kbk

Let assign to as raise SyntaxWarning as well (2003-02-23)
       http://python.org/sf/691733  closed by  nnorwitz

cvs update warnings (2003-07-02)
       http://python.org/sf/764447  closed by  nnorwitz

Minor floatobject.c bug (2003-08-15)
       http://python.org/sf/789159  closed by  nnorwitz

another threads+readline+signals nasty (2004-06-11)
       http://python.org/sf/971213  closed by  nnorwitz

init_types (2006-09-30)
       http://python.org/sf/1568243  closed by  gbrandl

import org.python.core imports local org.py (2006-10-08)
       http://python.org/sf/1573180  closed by  gbrandl

PGIRelease linkage fails on pgodb80.dll (2006-10-02)
       http://python.org/sf/1569517  closed by  krisvale

missing _typesmodule.c,Visual Studio 2005 pythoncore.vcproj (2006-09-29)
       http://python.org/sf/1567910  closed by  krisvale

ctypes unit test fails (test_macholib.py) under MacOS 10.4.7 (2006-08-21)
       http://python.org/sf/1544102  closed by  ronaldoussoren

Tutorial: incorrect info about package importing and mac (2006-09-17)
       http://python.org/sf/1560114  closed by  gbrandl

struct module doesn't use weakref for cache (2006-10-08)
       http://python.org/sf/1573394  closed by  etrepum

setup() keyword have to be list (doesn't work with tuple) (2006-08-23)
       http://python.org/sf/1545341  closed by  akuchling

if(status = ERROR_MORE_DATA) (2006-10-09)
       http://python.org/sf/1573928  closed by  gbrandl

os.stat() subsecond file mode time is incorrect on Windows (2006-09-25)
       http://python.org/sf/1565150  closed by  loewis

2.6 changes stomp on 2.5 docs (2006-09-23)
       http://python.org/sf/1564039  closed by  sf-robot

Cannot use high-numbered sockets in 2.4.3 (2006-05-24)
       http://python.org/sf/1494314  closed by  anthonybaxter

typo: section 2.1 -> property (2006-10-12)
       http://python.org/sf/1575746  closed by  gbrandl

-Qnew switch doesn't work (2003-09-26)
       http://python.org/sf/813342  closed by  gbrandl

sets missing from standard types list in ref (2006-09-26)
       http://python.org/sf/1565919  closed by  gbrandl

make plistlib.py available in every install (2006-09-25)
       http://python.org/sf/1565129  closed by  gbrandl

inspect module and class startlineno (2006-09-01)
       http://python.org/sf/1550524  closed by  gbrandl

Pdb parser bug (2006-08-30)
       http://python.org/sf/1549574  closed by  gbrandl

Missing notice on environment setting LD_LIBRARY_PATH (2006-10-12)
       http://python.org/sf/1575803  closed by  loewis

shlex (or perhaps cStringIO) and unicode strings (2006-08-29)
       http://python.org/sf/1548891  closed by  gbrandl

Build of 2.4.3 on fedora core 5 fails to find asm/msr.h (2006-09-03)
       http://python.org/sf/1551238  closed by  gbrandl

urlparse.urljoin odd behaviour (2006-08-25)
       http://python.org/sf/1546628  closed by  gbrandl

inconsistent treatment of NULs in int() (2006-08-23)
       http://python.org/sf/1545497  closed by  gbrandl

Move fpectl elsewhere in library reference (2006-09-11)
       http://python.org/sf/1556261  closed by  gbrandl

Fix Lib/test/test___all__.py (2006-08-22)
       http://python.org/sf/1544295  closed by  gbrandl

smeared title when installing (2004-10-18)
       http://python.org/sf/1049615  closed by  gbrandl

nit for builtin sum doc (2005-09-07)
       http://python.org/sf/1283491  closed by  gbrandl

site-packages & build-dir python (2002-07-25)
       http://python.org/sf/586700  closed by  gbrandl

__name__ doesn't show up in dir() of class (2006-08-03)
       http://python.org/sf/1534014  closed by  gbrandl

Interpreter crash: filter() + gc.get_referrers() (2006-07-05)
       http://python.org/sf/1517663  closed by  gbrandl

Better/faster implementation of os.path.basename/dirname (2006-09-17)
       http://python.org/sf/1560179  closed by  gbrandl

enable-shared .dso location (2006-10-13)
       http://python.org/sf/1576394  closed by  loewis

cStringIO misbehaving with unicode (2006-10-13)
       http://python.org/sf/1576443  closed by  gbrandl

GetFileAttributesExA and Win95 (2006-09-29)
       http://python.org/sf/1567666  closed by  loewis

site-packages isn't created before install_egg_info (2006-09-27)
       http://python.org/sf/1566719  closed by  sf-robot

New / Reopened RFE
__________________

release GIL while doing I/O operations in the mmap module (2006-10-08)
       http://python.org/sf/1572968  opened by  Lukas Lalinsky

RFE Closed
__________

Print identical floats consistently (2006-08-05)
       http://python.org/sf/1534942  closed by  gbrandl

From kristjan at ccpgames.com Mon Oct 16 15:07:09 2006 From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_V=2E_J=F3nsson?=) Date: Mon, 16 Oct 2006 13:07:09 -0000 Subject: [Python-Dev] Python 2.5 performance Message-ID: <129CEF95A523704B9D46959C922A280002FE99FE@nemesis.central.ccp.cc> Well, it ought to be possible. I can turn off the instrumentation on the other modules, and see what happens. K > -----Original Message----- > From: Giovanni Bajo [mailto:rasky at develer.com] > Sent: 12. okt?ber 2006 20:30 > To: Kristj?n V.
J?nsson > Cc: python-dev at python.org > Subject: Re: Python 2.5 performance > > Kristj?n V. J?nsson wrote: > > > This is an improvement of another 3.5 %. > > In all, we have a performance increase of more than 10%. > > Granted, this is from a single set of runs, but I think we should > > start considering to make PCBuild8 a "supported" build. > > Kristj?n, I wonder if the performance improvement comes from > ceval.c only (or maybe a few other selected files). Is it > possible to somehow link the PGO-optimized ceval.obj into the > VS2003 project? > -- > Giovanni Bajo > > From kristjan at ccpgames.com Mon Oct 16 15:09:44 2006 From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_V=2E_J=F3nsson?=) Date: Mon, 16 Oct 2006 13:09:44 -0000 Subject: [Python-Dev] Python 2.5 performance Message-ID: <129CEF95A523704B9D46959C922A280002FE99FF@nemesis.central.ccp.cc> I must confess that I am not familiar with the buildbots. I could imagine that it would be difficult to set up internally due to security concerns, but I can voice the issue here. K > -----Original Message----- > From: Anthony Baxter [mailto:anthony at interlink.com.au] > Sent: 12. okt?ber 2006 21:13 > To: python-dev at python.org > Cc: Martin v. L?wis; Kristj?n V. J?nsson > Subject: Re: [Python-Dev] Python 2.5 performance > > On Friday 13 October 2006 07:00, Martin v. L?wis wrote: > > Kristj?n V. J?nsson schrieb: > > > This is an improvement of another 3.5 %. > > > In all, we have a performance increase of more than 10%. > > > Granted, this is from a single set of runs, but I think we should > > > start considering to make PCBuild8 a "supported" build. > > > > What do you mean by that? That Python 2.5.1 should be > compiled with VC > > 2005? Something else (if so, what)? > > I don't think we should switch the "official" compiler for a > point release. 
> I'm happy to say something like "we make the PCbuild8 > environment a supported compiler", which means we need, at a > bare minimum, a buildbot slave for that compiler/platform. > Kristj?n, is this something you can offer? > > Without a buildbot for that compiler, I don't think we can > claim it's supported. There's plenty of platforms we > "support" which don't have buildslaves, but they're all > variants of Unix - I'm happy that they are all mostly[1] sane. > > Anthony > > [1] Offer void on some versions of HP/UX, Irix, AIX > -- > Anthony Baxter > It's never too late to have a happy childhood. > From martin at v.loewis.de Mon Oct 16 21:37:56 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 16 Oct 2006 21:37:56 +0200 Subject: [Python-Dev] Promoting PCbuild8 (Was: Python 2.5 performance) In-Reply-To: <129CEF95A523704B9D46959C922A280002FE99FF@nemesis.central.ccp.cc> References: <129CEF95A523704B9D46959C922A280002FE99FF@nemesis.central.ccp.cc> Message-ID: <4533DF94.9010404@v.loewis.de> Kristj?n V. J?nsson schrieb: > I must confess that I am not familiar with the buildbots. The challenge and work-load is primarily initially in setting it up; in this case (for PCbuild8), there is work for both the master and the slave sides (probably, new scripts in Tools/buildbot will have to be created). > I could > imagine that it would be difficult to set up internally due to > security concerns, but I can voice the issue here. It's not mandatory, of course: neither that there is a PCbuild8 buildbot at all, or that it is hosted at ccpgames. It just would reduce the chance that breakage of PCbuild8 goes unnoticed for long. As for the security concerns: the buildbot slave actively opens a networking connection to the master; you don't have to open any additional ports on your firewalls. 
Of course, the master can send the slave arbitrary commands to execute, so if the master is taken over by some attacker, that attacker could easily get control over all slaves also (except that you want to run the slave in a restricted account, so that the attacker would have to find a hole in the slave's operating system, also, before taking the machine over completely). As for making VS 2005 "more official": you also might have meant that the PCbuild directory should be converted to VS 2005. That would have a number of implications (on the buildbots, on changes to Tools/msi, and on potential usage of VS 2007 for Python 2.6), which need to be discussed when somebody actually proposes such a change. Regards, Martin From barry at barrys-emacs.org Mon Oct 16 21:57:04 2006 From: barry at barrys-emacs.org (Barry Scott) Date: Mon, 16 Oct 2006 20:57:04 +0100 Subject: [Python-Dev] Problem building module against Mac Python 2.4 and Python 2.5 In-Reply-To: <58929AAE-9357-4EE5-BD46-8597A343AACE@mac.com> References: <94B4C274-1414-4AD0-AE70-E16DB2290E65@barrys-emacs.org> <6a36e7290610151241y55e1078dx5f11126e31bbb01f@mail.gmail.com> <58929AAE-9357-4EE5-BD46-8597A343AACE@mac.com> Message-ID: >> >> Use an absolute path to the library rather than -framework. > > That is, add '/Library/Frameworks/Python.framework/Versions/2.4/ > Python' to the link command instead of '-framework Python'. Thanks I'll update my builds to do that. >> >> Or use distutils! > > That's definitely advisable anyway, that way you'll automatically get > the right flags to compile and link the extension :-) I call distutils to get some information for CFLAGS and include dirs. I'll look at what I get back for libs and update my build script. All my code is C++ and in the past distutils lacked C++ support so I could not use it and have developed my own solution to the build problem. Does distutils work for C++ code these days?
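[For readers following the archive: distutils of this era does dispatch .cpp sources to the C++ compiler based on the file suffix, though correct C++ linking was historically its weak spot, which is presumably what Barry ran into. A minimal setup.py sketch, with an invented module and file name purely for illustration:]

```python
# Hypothetical setup.py sketch for a C++ extension built with distutils.
# The module name "_barrytest" and its source file are made up here;
# real flags would come from the interpreter's sysconfig data.
from distutils.core import setup, Extension

ext = Extension(
    "_barrytest",               # name of the resulting extension module
    sources=["barrytest.cpp"],  # the .cpp suffix selects the C++ driver
    language="c++",             # make the language choice explicit
)

setup(name="barrytest", version="0.1", ext_modules=[ext])
```

Run as `python setup.py build_ext`; whether the link step picks the C++ linker correctly is exactly the question Barry raises.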
Barry From fredrik at pythonware.com Tue Oct 17 10:54:29 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 17 Oct 2006 10:54:29 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun References: <200610121808.47010.anthony@interlink.com.au> <200610130528.01672.anthony@interlink.com.au> <452EAAAE.2050200@v.loewis.de> <45301227.5020805@v.loewis.de><1f7befae0610131653h52fd3bfcnd33af7c08f6fe9d@mail.gmail.com> <45307CA3.1070100@v.loewis.de> Message-ID: Martin v. L?wis wrote: > In 2.3.6, there wouldn't just be that change, but also a few other > changes that have been collected, some relevant for Windows as well why not just do a "2.3.5+security" source release, and leave the rest to the downstream maintainers? From anthony at interlink.com.au Tue Oct 17 11:02:10 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Tue, 17 Oct 2006 19:02:10 +1000 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <45307CA3.1070100@v.loewis.de> Message-ID: <200610171902.12840.anthony@interlink.com.au> On Tuesday 17 October 2006 18:54, Fredrik Lundh wrote: > Martin v. L?wis wrote: > > In 2.3.6, there wouldn't just be that change, but also a few other > > changes that have been collected, some relevant for Windows as well > > why not just do a "2.3.5+security" source release, and leave the rest to > the downstream maintainers? I think we'd need to renumber it to 2.3.6 at least, otherwise there's the problem of distinguishing between the two. I'd _hope_ that all the downstreams will have picked up the patch (if you know of someone who hasn't, let me know and I'll kick them for you if it would help). But I'm certainly thinking if there's a 2.3.6, it's going to be 2.3.5 with the email fix and the unicode repr() fix, and that's it. No windows or Mac binaries - they'll be pointed to the perfectly fine 2.3.5 binary installers. 
And no, I'm not doing another 2.2 release :) From fredrik at pythonware.com Tue Oct 17 11:03:43 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 17 Oct 2006 11:03:43 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun References: <200610121808.47010.anthony@interlink.com.au> <2514DA1C-F5A1-4144-9068-006A933C516C@python.org> <200610130714.00673.anthony@interlink.com.au> Message-ID: Steve Holden wrote: > Or you can start to promote Django again ... my original plan would still work, I think: http://effbot.org/zone/pydotorg-cache.htm#todo From fredrik at pythonware.com Tue Oct 17 11:09:20 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 17 Oct 2006 11:09:20 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun References: <200610121808.47010.anthony@interlink.com.au><45307CA3.1070100@v.loewis.de> <200610171902.12840.anthony@interlink.com.au> Message-ID: Anthony Baxter wrote: > > why not just do a "2.3.5+security" source release, and leave the rest to > > the downstream maintainers? > > I think we'd need to renumber it to 2.3.6 at least, otherwise there's the > problem of distinguishing between the two. I'd _hope_ that all the > downstreams will have picked up the patch (if you know of someone who hasn't, > let me know and I'll kick them for you if it would help). in my experience, downstream builders tend to deal with patches just fine; I'm more worried about people who build directly from tarballs (using the good old "wget, tar xvfz, configure, make" mental macro) > But I'm certainly thinking if there's a 2.3.6, it's going to be 2.3.5 with the > email fix and the unicode repr() fix, and that's it. sounds good to me. how much work would that be, and if you're willing to coordinate, is there anything we can do to help? 
From anthony at interlink.com.au Tue Oct 17 11:35:16 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Tue, 17 Oct 2006 19:35:16 +1000 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <200610171902.12840.anthony@interlink.com.au> Message-ID: <200610171935.18447.anthony@interlink.com.au> On Tuesday 17 October 2006 19:09, Fredrik Lundh wrote: > > But I'm certainly thinking if there's a 2.3.6, it's going to be 2.3.5 > > with the email fix and the unicode repr() fix, and that's it. > > sounds good to me. how much work would that be, and if you're willing to > coordinate, is there anything we can do to help? Less than a normal release, since I'm not going to worry about changing the docs, the windows installers or the mac installers. I'll look at it next week, once 2.4.4 final is done. Anthony -- Anthony Baxter It's never too late to have a happy childhood. From grig.gheorghiu at gmail.com Tue Oct 17 16:59:59 2006 From: grig.gheorghiu at gmail.com (Grig Gheorghiu) Date: Tue, 17 Oct 2006 07:59:59 -0700 Subject: [Python-Dev] svn.python.org down Message-ID: <3f09d5a00610170759y2dac4772t157ebeadb878ae1f@mail.gmail.com> FYI -- can't do svn checkouts/updates from the trunk at this point. starting svn operation svn update --revision HEAD in dir /home/twistbot/pybot/trunk.gheorghiu-x86/build (timeout 1200 secs) svn: PROPFIND request failed on '/projects/python/trunk' svn: PROPFIND of '/projects/python/trunk': could not connect to server (http://svn.python.org) Grig -- http://agiletesting.blogspot.com From kristjan at ccpgames.com Tue Oct 17 17:09:45 2006 From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_V=2E_J=F3nsson?=) Date: Tue, 17 Oct 2006 15:09:45 -0000 Subject: [Python-Dev] Promoting PCbuild8 (Was: Python 2.5 performance) Message-ID: <129CEF95A523704B9D46959C922A280002FE9A04@nemesis.central.ccp.cc> Okay, a buildbot then doesn't sound quite that scary. 
Any info somewhere on how to set one up on a windows box? I also wasn't suggesting that we change the PCBuild directory, since I think we definitely want to keep the old support. But I agree that getting regular builds running would be a good thing. An x64 box would be ideal to build both the x86 and x64 versions on. A single bot can manage many platforms, right? I would also need to get the _msi and _sqlite3 modules building (which I haven't yet, since I didn't get their sources.) Kristj?n > -----Original Message----- > From: "Martin v. L?wis" [mailto:martin at v.loewis.de] > Sent: 16. okt?ber 2006 19:38 > To: Kristj?n V. J?nsson > Cc: Anthony Baxter; python-dev at python.org > Subject: Re: [Python-Dev] Promoting PCbuild8 (Was: Python 2.5 > performance) > > Kristj?n V. J?nsson schrieb: > > I must confess that I am not familiar with the buildbots. > > The challenge and work-load is primarily initially in setting > it up; in this case (for PCbuild8), there is work for both > the master and the slave sides (probably, new scripts in > Tools/buildbot will have to be created). > > > I could > > imagine that it would be difficult to set up internally due to > > security concerns, but I can voice the issue here. > > It's not mandatory, of course: neither that there is a > PCbuild8 buildbot at all, or that it is hosted at ccpgames. > It just would reduce the chance that breakage of PCbuild8 > goes unnoticed for long. > > As for the security concerns: the buildbot slave actively > opens a networking connection to the master; you don't have > to open any additional ports on your firewalls.
> > As for making VS 2005 "more official": you also might have > meant that the PCbuild directory should be converted to VS 2005. > That would have a number of implications (on the buildbots, > on changes to Tools/msi, and on potential usage of VS 2007 > for Python 2.6), which need to be discussed when somebody > actually proposes such a change. > > Regards, > Martin > From anthony at interlink.com.au Tue Oct 17 17:28:53 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Wed, 18 Oct 2006 01:28:53 +1000 Subject: [Python-Dev] BRANCH FREEZE release24-maint, Wed 18th Oct, 00:00UTC Message-ID: <200610180128.57266.anthony@interlink.com.au> I'm declaring the branch frozen for 2.4.4 final from 00:00 UTC (that's about 8 hours from now). The release will either be Wednesday 18th or Thursday 19th. There's a blocking bug http://www.python.org/sf/1578513 - I've attached a patch for it, if someone with autoconf knowledge wants to review that it can be checked in. It _should_ be good, and probably needs to be applied to release25-maint and the trunk as well. Anthony -- Anthony Baxter It's never too late to have a happy childhood. From grig.gheorghiu at gmail.com Tue Oct 17 17:38:58 2006 From: grig.gheorghiu at gmail.com (Grig Gheorghiu) Date: Tue, 17 Oct 2006 08:38:58 -0700 Subject: [Python-Dev] Promoting PCbuild8 (Was: Python 2.5 performance) In-Reply-To: <129CEF95A523704B9D46959C922A280002FE9A04@nemesis.central.ccp.cc> References: <129CEF95A523704B9D46959C922A280002FE9A04@nemesis.central.ccp.cc> Message-ID: <3f09d5a00610170838w40217aa9s7f9ac713c6e0c866@mail.gmail.com> On 10/17/06, Kristj?n V. J?nsson wrote: > > Okay, a buildbot then doesn't sound quite that scary. Any info somewhere on how to set one up on a windows box? 
> http://wiki.python.org/moin/BuildbotOnWindows Grig From anthony at interlink.com.au Tue Oct 17 17:48:07 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Wed, 18 Oct 2006 01:48:07 +1000 Subject: [Python-Dev] svn.python.org down In-Reply-To: <3f09d5a00610170759y2dac4772t157ebeadb878ae1f@mail.gmail.com> References: <3f09d5a00610170759y2dac4772t157ebeadb878ae1f@mail.gmail.com> Message-ID: <200610180148.12006.anthony@interlink.com.au> On Wednesday 18 October 2006 00:59, Grig Gheorghiu wrote: > FYI -- can't do svn checkouts/updates from the trunk at this point. > > starting svn operation > svn update --revision HEAD > in dir /home/twistbot/pybot/trunk.gheorghiu-x86/build (timeout 1200 secs) > svn: PROPFIND request failed on '/projects/python/trunk' > svn: PROPFIND of '/projects/python/trunk': could not connect to server > (http://svn.python.org) It works for me. Can you connect to port 22 on svn.python.org? From grig.gheorghiu at gmail.com Tue Oct 17 17:51:07 2006 From: grig.gheorghiu at gmail.com (Grig Gheorghiu) Date: Tue, 17 Oct 2006 08:51:07 -0700 Subject: [Python-Dev] svn.python.org down In-Reply-To: <200610180148.12006.anthony@interlink.com.au> References: <3f09d5a00610170759y2dac4772t157ebeadb878ae1f@mail.gmail.com> <200610180148.12006.anthony@interlink.com.au> Message-ID: <3f09d5a00610170851ic91cdf2x7a9e6a5a687775df@mail.gmail.com> On 10/17/06, Anthony Baxter wrote: > On Wednesday 18 October 2006 00:59, Grig Gheorghiu wrote: > > FYI -- can't do svn checkouts/updates from the trunk at this point. > > > > starting svn operation > > svn update --revision HEAD > > in dir /home/twistbot/pybot/trunk.gheorghiu-x86/build (timeout 1200 secs) > > svn: PROPFIND request failed on '/projects/python/trunk' > > svn: PROPFIND of '/projects/python/trunk': could not connect to server > > (http://svn.python.org) > > It works for me. Can you connect to port 22 on svn.python.org? 
> I can connect with ssh, but svn checkouts fail across the board for all pybots buildslaves: http://www.python.org/dev/buildbot/community/all/ Grig From p.f.moore at gmail.com Tue Oct 17 17:51:18 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 17 Oct 2006 16:51:18 +0100 Subject: [Python-Dev] svn.python.org down In-Reply-To: <200610180148.12006.anthony@interlink.com.au> References: <3f09d5a00610170759y2dac4772t157ebeadb878ae1f@mail.gmail.com> <200610180148.12006.anthony@interlink.com.au> Message-ID: <79990c6b0610170851q7a0cec02h378a449d466fe7f1@mail.gmail.com> On 10/17/06, Anthony Baxter wrote: > On Wednesday 18 October 2006 00:59, Grig Gheorghiu wrote: > > FYI -- can't do svn checkouts/updates from the trunk at this point. > > > > starting svn operation > > svn update --revision HEAD > > in dir /home/twistbot/pybot/trunk.gheorghiu-x86/build (timeout 1200 secs) > > svn: PROPFIND request failed on '/projects/python/trunk' > > svn: PROPFIND of '/projects/python/trunk': could not connect to server > > (http://svn.python.org) > > It works for me. Can you connect to port 22 on svn.python.org? I think it's the HTTP side of things. The ViewCVS interface isn't working either. Paul. From anthony at interlink.com.au Tue Oct 17 17:54:33 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Wed, 18 Oct 2006 01:54:33 +1000 Subject: [Python-Dev] svn.python.org down In-Reply-To: <200610180148.12006.anthony@interlink.com.au> References: <3f09d5a00610170759y2dac4772t157ebeadb878ae1f@mail.gmail.com> <200610180148.12006.anthony@interlink.com.au> Message-ID: <200610180154.34925.anthony@interlink.com.au> Ah - the svn-apache server was down. I've restarted it. We should probably put some monitoring/restarting in place for those servers - if someone wants to volunteer a script I'll add it to cron, or I'll write it myself when I get a chance. 
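[A cron-able watchdog along the lines Anthony asks for could be as small as the sketch below. The probe URL, logger usage, and the apache init-script path are all assumptions for illustration, not the actual python.org setup:]

```shell
#!/bin/sh
# Watchdog sketch for the svn-apache front end: probe the HTTP interface
# and restart apache only when the probe fails.  Paths are assumptions.

URL="http://svn.python.org/projects/"

probe() {
    # succeed only if the URL answers within 30 seconds
    curl -fsS --max-time 30 -o /dev/null "$1"
}

restart_server() {
    logger "svn-apache not responding; restarting"
    /etc/init.d/apache2 restart
}

watchdog() {
    probe "$1" || restart_server
}

# watchdog "$URL"   # <- what the cron entry would invoke
```

Run from cron every few minutes, a healthy server costs one request per run and nothing is restarted.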
(I was testing with svn+ssh, it was the http version that was down) Anthony From p.f.moore at gmail.com Tue Oct 17 17:57:52 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 17 Oct 2006 16:57:52 +0100 Subject: [Python-Dev] svn.python.org down In-Reply-To: <200610180154.34925.anthony@interlink.com.au> References: <3f09d5a00610170759y2dac4772t157ebeadb878ae1f@mail.gmail.com> <200610180148.12006.anthony@interlink.com.au> <200610180154.34925.anthony@interlink.com.au> Message-ID: <79990c6b0610170857y268f7b40ye519c007c065dd59@mail.gmail.com> On 10/17/06, Anthony Baxter wrote: > Ah - the svn-apache server was down. I've restarted it. We should probably put > some monitoring/restarting in place for those servers - if someone wants to > volunteer a script I'll add it to cron, or I'll write it myself when I get a > chance. Working now. Thanks. Paul. From skip at pobox.com Tue Oct 17 18:32:26 2006 From: skip at pobox.com (skip at pobox.com) Date: Tue, 17 Oct 2006 11:32:26 -0500 Subject: [Python-Dev] svn.python.org down In-Reply-To: <200610180154.34925.anthony@interlink.com.au> References: <3f09d5a00610170759y2dac4772t157ebeadb878ae1f@mail.gmail.com> <200610180148.12006.anthony@interlink.com.au> <200610180154.34925.anthony@interlink.com.au> Message-ID: <17717.1434.974171.689688@montanaro.dyndns.org> Anthony> Ah - the svn-apache server was down. I've restarted it. We Anthony> should probably put some monitoring/restarting in place for Anthony> those servers - if someone wants to volunteer a script I'll add Anthony> it to cron, or I'll write it myself when I get a chance. Is this on a machine hosted by xs4all? If so, we can probably just ask them to monitor it from nagios (or whatever tool they use). Skip From brett at python.org Tue Oct 17 20:58:53 2006 From: brett at python.org (Brett Cannon) Date: Tue, 17 Oct 2006 11:58:53 -0700 Subject: [Python-Dev] who is interested on being on a python-dev panel at PyCon? 
Message-ID: For the past couple years there has been the suggestion of having a panel discussion made up of core developers at PyCon. Basically it would provide a way for the community to find out how we do things, where we are going, our views, etc. I have finally decided to step forward and try to organize such a panel. Steve Holden has already graciously stepped forward at my request to be the moderator. That means I just need to fill out the panel. =) Since I am organizing this I am also going to stick my neck out and be on the panel. AMK has also volunteered. Who else is interested? If you think you will be at PyCon (does not have to be a definite "yes" at the moment, just that you are hoping to) and are interested in participating, send me an email. Let me know how good your chances of attending PyCon are in case there are more people volunteering than would reasonably fit on the panel (I am guessing five people would be good, especially if we get folks who fill different roles on python-dev). -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061017/2e834f6d/attachment.html From martin at v.loewis.de Tue Oct 17 21:09:03 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 17 Oct 2006 21:09:03 +0200 Subject: [Python-Dev] Promoting PCbuild8 (Was: Python 2.5 performance) In-Reply-To: <129CEF95A523704B9D46959C922A280002FE9A04@nemesis.central.ccp.cc> References: <129CEF95A523704B9D46959C922A280002FE9A04@nemesis.central.ccp.cc> Message-ID: <45352A4F.2080203@v.loewis.de> Kristj?n V. J?nsson schrieb: > Okay, a buildbot then doesn't sound quite that scary. Any info > somewhere on how to set one up on a windows box? Sure. See http://wiki.python.org/moin/BuildbotOnWindows Feel free to make changes if you find the instructions need to be enhanced.
> I Also wasn't suggesting that we change the PCBuild directory, since > I think we definitely want to keep the old support. Well, at any point in time, there is one and only one "official" build procedure, which is the procedure used to make releases. If VS 2003 is not used anymore, a copy of it is made into PC, and the existing PCBuild directory is converted to the new procedure (or some other directory name is invented; that should *not* be PCbuild8 - we shouldn't have to rename directories each time we switch the compiler). > But I agree that > getting regular builds running would be a good thing. An x64 box > would be ideal to build both the x86 and x64 versions on. A single > bot can manage many platforms, right? A single machine, and a single buildbot installation, yes. But not a single build slave, since there can be only one build procedure per slave. So if we need different procedures (which we likely do: how else could it find out which of them it should do?), we would need two slaves. That should work fine, except that both slaves will typically start simultaneously on the machine, doubling the load. It's possible to tell the master not to build different branches on a single slave (i.e. 2.5 has to wait if trunk is building), but it's not possible to tell it that two slaves reside on the same machine (it might be possible, but I don't know how to do it). > I would also need to get the _msi and _sqlite3 modules building > (which I haven't yet, since I didn't get their sources.) You don't need any additional sources for _msi, and, in fact, my AMD64 and Itanium installers do provide _msi.pyd binaries. 
Regards, Martin From pandyacus at gmail.com Wed Oct 18 12:26:54 2006 From: pandyacus at gmail.com (Chetan Pandya) Date: Wed, 18 Oct 2006 03:26:54 -0700 Subject: [Python-Dev] PATCH submitted: Speed up + for string Re: PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom Message-ID: The discussion on this topic seems to have died down. However, I had a look at the patch and here are some comments: This has the potential to speed up simple string expressions like s = '1' + '2' + '3' + '4' + '5' + '6' + '7' + '8' However, if this is followed by s += '9' this (the 9th string) will cause rendering of the existing value of s and then create another concatenated string. This can, however, be changed, but I have not checked to see if it is worth it. The deallocation code needs to be robust for a complex tree - it is currently not recursive, but needs to be, like the concatenation code. Constructs like s = a + b + c + d + e, where a, b, etc. have been assigned string values earlier, will not benefit from the patch. If the values are generated and concatenated in a single expression, that is another type of construct that will benefit. There are some other changes needed that I can write up if needed. -Chetan On 10/13/06, python-dev-request at python.org wrote: > Date: Fri, 13 Oct 2006 12:02:06 -0700 > From: Josiah Carlson > Subject: Re: [Python-Dev] PATCH submitted: Speed up + for string > concatenation, now as fast as "".join(x) idiom > To: Larry Hastings , python-dev at python.org > Message-ID: <20061013115748.09F2.JCARLSON at uci.edu> > Content-Type: text/plain; charset="US-ASCII" > > > Larry Hastings wrote: > [snip] > > The machine is dual-core, and was quiescent at the time. XP's scheduler > > is hopefully good enough to just leave the process running on one core. > > It's not. Go into the task manager (accessible via Ctrl+Alt+Del by > default) and change the process' affinity to the second core.
In my > experience, running on the second core (in both 2k and XP) tends to > produce slightly faster results. Linux tends to keep processes on a > single core for a few seconds at a time. > > - Josiah > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061018/0aa8795b/attachment.html From kristjan at ccpgames.com Wed Oct 18 12:47:22 2006 From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_V=2E_J=F3nsson?=) Date: Wed, 18 Oct 2006 10:47:22 -0000 Subject: [Python-Dev] PATCH submitted: Speed up + for string Re: PATCHsubmitted: Speed up + for string concatenation, now as fast as "".join(x) idiom Message-ID: <129CEF95A523704B9D46959C922A280002FE9A10@nemesis.central.ccp.cc> Doesn't it end up in a call to PyString_Concat()? That should return a PyStringConcatenationObject too, right? K ________________________________ Construct like s = a + b + c + d + e , where a, b etc. have been assigned string values earlier will not benefit from the patch. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061018/e3d4f6ba/attachment.htm From pandyacus at gmail.com Wed Oct 18 20:01:35 2006 From: pandyacus at gmail.com (Chetan Pandya) Date: Wed, 18 Oct 2006 11:01:35 -0700 Subject: [Python-Dev] PATCH submitted: Speed up + for string Re: PATCHsubmitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: References: <129CEF95A523704B9D46959C922A280002FE9A10@nemesis.central.ccp.cc> Message-ID: My statement wasn't clear enough. Rendering occurs if the string being concatenated is already a concatenation object created by an earlier assignment. In s = a + b + c + d + e + f , there would be rendering of the source string if it is already a concatenation. 
Here is an example that would make it clear: a = "Value a =" a += "anything" # creates a concatenation c = a + b # This would cause rendering of a, and then c will become a concatenation between a and b. c += "Something" # This will not append to the concatenation object, but cause rendering of c and then it will create a concatenation between c and "Something", which will be assigned to c. Now if there is a series of assignments, (1) s = c + "something" # causes rendering of c (2) s += a # causes rendering of s and creates a new concatenation (3) s += b # causes rendering of s and creates a new concatenation (4) s += c # causes rendering of s and creates a new concatenation (5) print s # causes rendering of s If there is a list of strings created and then they are concatenated with +=, I would expect it to be slower because of the additional allocations involved in rendering. -Chetan On 10/18/06, Kristján V. Jónsson wrote: > > Doesn't it end up in a call to PyString_Concat()? That should return a > PyStringConcatenationObject too, right? > K > > ------------------------------ > > > Construct like s = a + b + c + d + e , where a, b etc. have been assigned > string values earlier will not benefit from the patch. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061018/18e0c506/attachment.html From scott+python-dev at scottdial.com Wed Oct 18 21:59:16 2006 From: scott+python-dev at scottdial.com (Scott Dial) Date: Wed, 18 Oct 2006 15:59:16 -0400 Subject: [Python-Dev] PATCH submitted: Speed up + for string Re: PATCHsubmitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: References: <129CEF95A523704B9D46959C922A280002FE9A10@nemesis.central.ccp.cc> Message-ID: <45368794.50508@scottdial.com> Chetan Pandya wrote: > My statement wasn't clear enough.
> > Rendering occurs if the string being concatenated is already a > concatenation object created by an earlier assignment. > I'm not sure how you came to that conclusion. My reading of the patch doesn't suggest that at all. The operation of string_concat would not produce what you suggest. I can't find anything anywhere that would cause the behavior you suggest. The only case where this is true is if the depth of the tree is too great. To revisit your example, I will notate concats as string pairs: a = "Value a =" # new string a += "anything" # ("Value a =", "anything") c = a + b # (("Value a =", "anything"), b) c += "Something" # ((("Value a =", "anything"), b), "Something") So again, for your other example of repeated right-hand concatenation, you do not continually render the concat object, you merely create a new one and attach to the leaves. Once the print is executed, you will force the rendering of the object, but only once that happens. So in contrast to your statement, there are actually fewer allocations of strings and smaller objects being allocated than the current trunk uses. -- Scott Dial scott at scottdial.com scodial at cs.indiana.edu From larry at hastings.org Wed Oct 18 22:04:14 2006 From: larry at hastings.org (Larry Hastings) Date: Wed, 18 Oct 2006 13:04:14 -0700 Subject: [Python-Dev] PATCH submitted: Speed up + for string Re: PATCHsubmitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: References: <129CEF95A523704B9D46959C922A280002FE9A10@nemesis.central.ccp.cc> Message-ID: <453688BE.2060509@hastings.org> Chetan Pandya wrote: > The deallocation code needs to be robust for a complex tree - it is > currently not recursive, but needs to be, like the concatenation code. It is already both those things. Deallocation is definitely recursive. See Objects/stringobject.c, function (*ahem*) recursive_dealloc.
That Py_DECREF() line is where it recurses into child string concatenation objects. You might have been confused because it is *optimized* for the general case, where the tree only recurses down the left-hand side. For the left-hand side it iterates, instead of recursing, which is both slightly faster and much more robust (unlikely to blow the stack). > Rendering occurs if the string being concatenated is already a > concatenation object created by an earlier assignment. Nope. Rendering only occurs when somebody asks for the string's value, not when merely concatenating. If you add nine strings together, the ninth one fails the "left side has room" test and creates a second object. Try stepping through it. Run Python interactively under the debugger. Let it get to the prompt. Execute some expression like "print 3", just so the interpreter creates its concatenated encoding object (I get "encodings.cp437"). Now, in the debugger, put a breakpoint in the rendering code in recursiveConcatenate(), and another on the "op = (PyStringConcatenationObject *)PyObject_MALLOC()" line in string_concat. Finally, go back to the Python console and concatenate nine strings with this code: x = "" for i in xrange(9): x += "a" You won't hit any breakpoints for rendering, and you'll hit the string concatenation object malloc line twice. (Note that for demonstration purposes, this code is more illustrative than running x = "a" + "b" ... + "i" because the peephole optimizer makes a constant folding pass. It's mostly harmless, but for my code it does mean I create concatenation objects more often.) In the interests of full disclosure, there is *one* scenario where pure string concatenation will cause it to render. Rendering or deallocating a recursive object that's too deep would blow the program stack, so I limit recursion depth on the right seven slots of the recursion object. That's what the "right recursion depth" field is used for. 
If you attempt to concatenate a string concatenation object that's already at the depth limit, it renders the deep object first. The depth limit is 2**14 right now. You can force this to happen by prepending like crazy: x = "" for i in xrange(2**15): x = "a" + x Since my code is careful to be only iterative when rendering and deallocating down the left-hand side of the tree, there is no depth limit for the left-hand side. Step before you leap, /larry/ From mike.klaas at gmail.com Wed Oct 18 22:02:42 2006 From: mike.klaas at gmail.com (Mike Klaas) Date: Wed, 18 Oct 2006 13:02:42 -0700 Subject: [Python-Dev] Segfault in python 2.5 Message-ID: <3d2ce8cb0610181302w5d87716btfd09da833e525c73@mail.gmail.com> [http://sourceforge.net/tracker/index.php?func=detail&aid=1579370&group_id=5470&atid=105470] Hello, I've managed to provoke a segfault in python2.5 (occasionally it's just an "invalid argument to internal function" error). I've posted a traceback and a general idea of what the code consists of in the sourceforge entry. Unfortunately, I've been attempting for hours to reduce the problem to a completely self-contained script, but it is resisting my efforts due to timing problems. Should I continue in that vein, or is it more useful to provide more detailed results from gdb? Thanks, -Mike From mwh at python.net Wed Oct 18 22:08:42 2006 From: mwh at python.net (Michael Hudson) Date: Wed, 18 Oct 2006 22:08:42 +0200 Subject: [Python-Dev] Segfault in python 2.5 In-Reply-To: <3d2ce8cb0610181302w5d87716btfd09da833e525c73@mail.gmail.com> (Mike Klaas's message of "Wed, 18 Oct 2006 13:02:42 -0700") References: <3d2ce8cb0610181302w5d87716btfd09da833e525c73@mail.gmail.com> Message-ID: <8764eh2xpx.fsf@starship.python.net> "Mike Klaas" writes: > [http://sourceforge.net/tracker/index.php?func=detail&aid=1579370&group_id=5470&atid=105470] > > Hello, > > I've managed to provoke a segfault in python2.5 (occasionally it's just an > "invalid argument to internal function" error).
I've posted a > traceback and a general idea of what the code consists of in the > sourceforge entry. I've been reading the bug report with interest, but unless I can reproduce it it's mighty hard for me to debug, as I'm sure you know. > Unfortunately, I've been attempting for hours to > reduce the problem to a completely self-contained script, but it is > resisting my efforts due to timing problems. > > Should I continue in that vein, or is it more useful to provide more > detailed results from gdb? Well, I don't think that there's much point in posting masses of details from gdb. You might want to try to fix the bug yourself, I guess, trying to figure out where the bad pointers come from, etc. Are you absolutely sure that the fault does not lie with any extension modules you may be using? Memory scribbling bugs have been known to cause arbitrarily confusing problems... Cheers, mwh -- I'm not sure that the ability to create routing diagrams similar to pretzels with mad cow disease is actually a marketable skill. -- Steve Levin -- http://home.xnet.com/~raven/Sysadmin/ASR.Quotes.html From Jack.Jansen at cwi.nl Thu Oct 19 00:23:43 2006 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Thu, 19 Oct 2006 00:23:43 +0200 Subject: [Python-Dev] Segfault in python 2.5 In-Reply-To: <8764eh2xpx.fsf@starship.python.net> References: <3d2ce8cb0610181302w5d87716btfd09da833e525c73@mail.gmail.com> <8764eh2xpx.fsf@starship.python.net> Message-ID: <7654C99F-5062-49AC-B604-0CBC9567A586@cwi.nl> On 18-Oct-2006, at 22:08 , Michael Hudson wrote: >> Unfortunately, I've been attempting for hours to >> reduce the problem to a completely self-contained script, but it is >> resisting my efforts due to timing problems. Has anyone ever tried to use helgrind (the valgrind module, not the heavy metal band:-) on Python?
-- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From mike.klaas at gmail.com Thu Oct 19 02:08:51 2006 From: mike.klaas at gmail.com (Mike Klaas) Date: Wed, 18 Oct 2006 17:08:51 -0700 Subject: [Python-Dev] Segfault in python 2.5 In-Reply-To: <8764eh2xpx.fsf@starship.python.net> References: <3d2ce8cb0610181302w5d87716btfd09da833e525c73@mail.gmail.com> <8764eh2xpx.fsf@starship.python.net> Message-ID: <3d2ce8cb0610181708i1eeb13b5qaf56488f406d4fc7@mail.gmail.com> On 10/18/06, Michael Hudson wrote: > "Mike Klaas" writes: > I've been reading the bug report with interest, but unless I can > reproduce it it's mighty hard for me to debug, as I'm sure you know. Indeed. > > Unfortunately, I've been attempting for hours to > > reduce the problem to a completely self-contained script, but it is > > resisting my efforts due to timing problems. > > > > Should I continue in that vein, or is it more useful to provide more > > detailed results from gdb? > > Well, I don't think that there's much point in posting masses of > details from gdb. You might want to try trying to fix the bug > yourself I guess, trying to figure out where the bad pointers come > from, etc. I've peered at the code, but my knowledge of the python core is superficial at best. The fact that it is occurring as a result of a long string of garbage collection/dealloc/etc. and involves threading lowers my confidence further. That said, I'm beginning to think that to reproduce this in a standalone script will require understanding the problem in greater depth regardless... > Are you absolutely sure that the fault does not lie with any extension > modules you may be using? Memory scribbling bugs have been known to > cause arbitrarily confusing problems... I've had sufficient experience being arbitrarily confused to never be sure about such things, but I am quite confident.
The script I posted in the bug report is all stock Python save for the operation in <>'s. That operation is pickling and unpickling (using pickle, not cPickle) a somewhat complicated pure-python instance several times. It's doing nothing with the actual instance--it just happens to take the right amount of time to trigger the segfault. It's still not perfect--this trimmed-down version segfaults only sporadically, while the original python script segfaults reliably. -Mike From pandyacus at gmail.com Thu Oct 19 02:36:43 2006 From: pandyacus at gmail.com (Chetan Pandya) Date: Wed, 18 Oct 2006 17:36:43 -0700 Subject: [Python-Dev] Python-Dev Digest, Vol 39, Issue 54 In-Reply-To: References: Message-ID: I got up in the middle of the night and wrote the email - and it shows. Apologies for creating confusion. My comments below. -Chetan On 10/18/06, python-dev-request at python.org > > Date: Wed, 18 Oct 2006 13:04:14 -0700 > From: Larry Hastings > Subject: Re: [Python-Dev] PATCH submitted: Speed up + for string Re: > PATCHsubmitted: Speed up + for string concatenation, now as fast > as > "".join(x) idiom > To: python-dev at python.org > Message-ID: <453688BE.2060509 at hastings.org> > Content-Type: text/plain; charset=ISO-8859-1; format=flowed > > Chetan Pandya wrote: > > The deallocation code needs to be robust for a complex tree - it is > > currently not recursive, but needs to be, like the concatenation code. > It is already both those things. > > Deallocation is definitely recursive. See Objects/stringobject.c, > function (*ahem*) recursive_dealloc. That Py_DECREF() line is where it > recurses into child string concatenation objects. > > You might have been confused because it is *optimized* for the general > case, where the tree only recurses down the left-hand side. For the > left-hand side it iterates, instead of recursing, which is both slightly > faster and much more robust (unlikely to blow the stack).
Actually, I looked at the setting of ob_sstrings to NULL in recursive_dealloc and thought none of the strings would get destroyed as the list is destroyed. However, it only sets the first element to NULL, which is fine. > Rendering occurs if the string being concatenated is already a > concatenation object created by an earlier assignment. > Nope. Rendering only occurs when somebody asks for the string's value, > not when merely concatenating. If you add nine strings together, the > ninth one fails the "left side has room" test and creates a second object. I don't know what I was thinking. In the whole of string_concat() there is no call to render the string, except for the right recursion case. Try stepping through it. Run Python interactively under the debugger. > Let it get to the prompt. Execute some expression like "print 3", just > so the interpreter creates its concatenated encoding object (I get > "encodings.cp437"). Now, in the debugger, put a breakpoint in the > rendering code in recursiveConcatenate(), and another on the "op = > (PyStringConcatenationObject *)PyObject_MALLOC()" line in > string_concat. Finally, go back to the Python console and concatenate > nine strings with this code: > x = "" > for i in xrange(9): > x += "a" > You won't hit any breakpoints for rendering, and you'll hit the string > concatenation object malloc line twice. (Note that for demonstration > purposes, this code is more illustrative than running x = "a" + "b" ... > + "i" because the peephole optimizer makes a constant folding pass. > It's mostly harmless, but for my code it does mean I create > concatenation objects more often.) I don't have a patch build, since I didn't download the revision used by the patch. However, I did look at values in the debugger and it looked like x in your example above had a reference count of 2 or more within string_concat even when there were no other assignments that would account for it.
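For reference, counts observed while an operation is in flight include the interpreter's own temporaries, so a count of 2 on x inside string_concat need not imply another assignment. The same effect is visible from pure Python via sys.getrefcount (a CPython-specific illustration):

```python
import sys

x = object()               # exactly one name bound to a fresh object
print(sys.getrefcount(x))  # typically 2 on CPython: the name x plus the
                           # temporary reference held by the call itself
```

As the sys docs note, getrefcount reports one more than you might expect for exactly this reason: the argument itself is a temporary reference.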
My idea was to investigate this, which was the whole reason for saying that the concatenation will create new objects. However, I ran on another machine under the debugger and got the reference count as 1, which is what I would expect. I need to find out what has happened to my work machine. In the interests of full disclosure, there is *one* scenario where pure > string concatenation will cause it to render. Rendering or deallocating > a recursive object that's too deep would blow the program stack, so I > limit recursion depth on the right seven slots of the recursion object. > That's what the "right recursion depth" field is used for. If you > attempt to concatenate a string concatenation object that's already at > the depth limit, it renders the deep object first. The depth limit is > 2**14 right now. You can force this to happen by prepending like crazy: > x = "" > for i in xrange(2**15): > x = "a" + x > > Since my code is careful to be only iterative when rendering and > deallocating down the left-hand side of the tree, there is no depth > limit for the left-hand side. The recursion limit seems to be optimistic, given the default stack limit, but of course, I haven't tried it. There is probably a depth limit on the left hand side as well, since recursiveConcatenate is recursive even on the left side. Step before you leap, > > > /larry/ > -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20061018/a3e48cc9/attachment.htm From tim.peters at gmail.com Thu Oct 19 03:02:31 2006 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 18 Oct 2006 21:02:31 -0400 Subject: [Python-Dev] Segfault in python 2.5 In-Reply-To: <3d2ce8cb0610181708i1eeb13b5qaf56488f406d4fc7@mail.gmail.com> References: <3d2ce8cb0610181302w5d87716btfd09da833e525c73@mail.gmail.com> <8764eh2xpx.fsf@starship.python.net> <3d2ce8cb0610181708i1eeb13b5qaf56488f406d4fc7@mail.gmail.com> Message-ID: <1f7befae0610181802y7e0fca9ateb7d43c2ae2cff01@mail.gmail.com> [Michael Hudson] >> I've been reading the bug report with interest, but unless I can >> reproduce it it's mighty hard for me to debug, as I'm sure you know. [Mike Klaas] > Indeed. Note that I just attached a much simpler pure-Python script that fails very quickly, on Windows, using a debug build. Read the new comment to learn why both "Windows" and "debug build" are essential to it failing reliably and quickly ;-) >>> Unfortunately, I've been attempting for hours to reduce the problem to a >>> completely self-contained script, but it is resisting my efforts due to timing >>> problems. Yes, but you did good! This is still just an educated guess on my part, but my education here is hard to match ;-): this new business of generators deciding to "clean up after themselves" if they're left hanging appears to have made it possible for a generator to hold on to a frame whose thread state has been free()'d, after the thread that created the generator has gone away. Then when the generator gets collected as trash, the new exception-based "clean up abandoned generator" gimmick tries to access the generator's frame's thread state, but that's just a raw C struct (not a Python object with reachability-based lifetime), and the thread free()'d that struct when the thread went away. 
The important timing-based vagary here is whether dead-thread cleanup gets performed before the main thread tries to clean up the trash generator. > I've peered at the code, but my knowledge of the python core is > superficial at best. The fact that it is occuring as a result of a > long string of garbage collection/dealloc/etc. and involves threading > lowers my confidence further. That said, I'm beginning to think that > to reproduce this in a standalone script will require understanding > the problem in greater depth regardless... Or upgrade to Windows ;-) >> Are you absolutely sure that the fault does not lie with any extension >> modules you may be using? Memory scribbling bugs have been known to >> cause arbitrarily confusing problems... Unless I've changed the symptom, it's been reduced to minimal pure Python. It does require a thread T, and creating a generator in T, where the generator object's lifetime is controlled by the main thread, and where T vanishes before the generator has exited of its own accord. Offhand I don't know how to repair it. Thread states /aren't/ Python objects, and there's no provision for a thread state to outlive the thread it represents. > I've had sufficient experience being arbitrarily confused to never be > sure about such things, but I am quite confident. The script I posted > in the bug report is all stock python save for the operation in <>'s. > That operation is pickling and unpickling (using pickle, not cPickle) > a somewhat complicated pure-python instance several times. FYI, in my whittled script, your `getdocs()` became simply: def getdocs(): while True: yield None and it's called only once, via self.docIter.next(). In fact, the "while True:" isn't needed there either (given that it's only resumed once now). 
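The failing pattern Tim describes (a generator created and started in a worker thread, then cleaned up by the main thread after the worker has exited) fits in a few lines. This is a modern-Python rendition for illustration; on 2.5 the spelling would be gen.next(), and on the affected builds this pattern was the crash recipe, while fixed interpreters run it cleanly:

```python
import threading

cleanup_ran = []

def getdocs():
    # Tim's whittled generator: left suspended at the yield when abandoned.
    try:
        while True:
            yield None
    finally:
        cleanup_ran.append(True)  # executed by the close() machinery

holder = {}

def worker():
    gen = getdocs()
    next(gen)             # start the generator inside the worker thread
    holder["gen"] = gen   # hand it off to the main thread and exit

t = threading.Thread(target=worker)
t.start()
t.join()                  # the creating thread (and its thread state) is gone

# The main thread now cleans up the abandoned generator, raising GeneratorExit
# at the yield -- the step that dereferenced a freed thread state on 2.5.
holder["gen"].close()
print(cleanup_ran)        # [True]
```

Note the worker's target is a plain function, not the generator itself, so the generator genuinely outlives its creating thread rather than being DECREF'd before the thread terminates.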
From mike.klaas at gmail.com Thu Oct 19 04:26:59 2006 From: mike.klaas at gmail.com (Mike Klaas) Date: Wed, 18 Oct 2006 19:26:59 -0700 Subject: [Python-Dev] Segfault in python 2.5 In-Reply-To: <1f7befae0610181802y7e0fca9ateb7d43c2ae2cff01@mail.gmail.com> References: <3d2ce8cb0610181302w5d87716btfd09da833e525c73@mail.gmail.com> <8764eh2xpx.fsf@starship.python.net> <3d2ce8cb0610181708i1eeb13b5qaf56488f406d4fc7@mail.gmail.com> <1f7befae0610181802y7e0fca9ateb7d43c2ae2cff01@mail.gmail.com> Message-ID: <3d2ce8cb0610181926g2e0915dama208d8839bd1cc5b@mail.gmail.com> On 10/18/06, Tim Peters wrote: > [Mike Klaas] > > Indeed. > > Note that I just attached a much simpler pure-Python script that fails > very quickly, on Windows, using a debug build. Read the new comment > to learn why both "Windows" and "debug build" are essential to it > failing reliably and quickly ;-) Thanks! Next time I find a bug, installing Windows will certainly be my first step . <> > Yes, but you did good! This is still just an educated guess on my > part, but my education here is hard to match ;-): this new business > of generators deciding to "clean up after themselves" if they're left > hanging appears to have made it possible for a generator to hold on to > a frame whose thread state has been free()'d, after the thread that > created the generator has gone away. Then when the generator gets > collected as trash, the new exception-based "clean up abandoned > generator" gimmick tries to access the generator's frame's thread > state, but that's just a raw C struct (not a Python object with > reachability-based lifetime), and the thread free()'d that struct when > the thread went away. The important timing-based vagary here is > whether dead-thread cleanup gets performed before the main thread > tries to clean up the trash generator. Indeed--and normally it doesn't happen that way. 
My/your script never crashes on the first iteration because the thread's target is the generator and thus it gets DECREF'd before the thread terminates. But the exception from the first iteration holds on to a reference to the frame/generator so when it gets cleaned up (in the second iteration, due to a new exception overwriting it) the generator is freed after the thread is destroyed. At least, I think... <> > Offhand I don't know how to repair it. Thread states /aren't/ Python > objects, and there's no provision for a thread state to outlive the > thread it represents. Take this with a grain of salt, but ISTM that the problem can be repaired by resetting the generator's frame threadstate to the current threadstate: (in genobject.c:gen_send_ex():80) Py_XINCREF(tstate->frame); assert(f->f_back == NULL); f->f_back = tstate->frame; + f->f_tstate = tstate; gen->gi_running = 1; result = PyEval_EvalFrameEx(f, exc); gen->gi_running = 0; Shouldn't the thread state generally be the same anyway? (I seem to recall some gloomy warning against resuming generators in separate threads). This solution is surely wrong--if f_tstate != tstate, then the generator _is_ being resumed in another thread and so the generated traceback will be wrong (among other issues which surely occur by fudging a frame's threadstate). Perhaps it could be set conditionally by gen_close before signalling the exception? A lie, but a smaller lie than a segfault. We could advertise that the exception occurring from generator .close() isn't guaranteed to have an accurate traceback in this case. Take all this with a grain of un-core-savvy salt.
Thanks again for investigating this, Tim, -Mike From larry at hastings.org Thu Oct 19 08:03:25 2006 From: larry at hastings.org (Larry Hastings) Date: Wed, 18 Oct 2006 23:03:25 -0700 Subject: [Python-Dev] Python-Dev Digest, Vol 39, Issue 54 In-Reply-To: References: Message-ID: <4537152D.7090900@hastings.org> Chetan Pandya wrote: > I don't have a patch build, since I didn't download the revision used > by the patch. > However, I did look at values in the debugger and it looked like x in > your example above had a reference count of 2 or more within > string_concat even when there were no other assignments that would > account for it. It could be the optimizer. If you concatenate hard-coded strings, the peephole optimizer does constant folding. It says "hey, look, this binary operator is performed on two constant objects". So it evaluates the expression itself and substitutes the result, in this case swapping (pseudotokens here) [PUSH "a" PUSH "b" PLUS] for [PUSH "ab"]. Oddly, it didn't seem to optimize away the whole expression. If you say "a" + "b" + "c" + "d" + "e", I would have expected the peephole optimizer to turn that whole shebang into [PUSH "abcde"]. But when I gave it a cursory glance it seemed to skip every-other; it constant-folded "a" + "b", then + "c" and optimized ("a" + "b" + "c") + "d", resulting ultimately I believe in [PUSH "ab" PUSH "cd" PLUS PUSH "e" PLUS]. But I suspect I missed something; it bears further investigation. But this is all academic, as real-world performance of my patch is not contingent on what the peephole optimizer does to short runs of hard-coded strings in simple test cases. > The recursion limit seems to be optimistic, given the default stack > limit, but of course, I haven't tried it. I've tried it, on exactly one computer (running Windows XP). The depth limit was arrived at experimentally. But it is probably too optimistic and should be winched down. 
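Whether the optimizer folds a whole chain of literals can be checked from Python itself by inspecting the compiled code's constants pool. For what it's worth, on modern CPython (where folding moved into the AST optimizer) the entire chain collapses to one constant, so the every-other folding Larry observed appears to have been a quirk of the 2.x peephole pass:

```python
# Compile a chain of literal concatenations and inspect the constants pool.
code = compile('"a" + "b" + "c" + "d" + "e"', "<demo>", "eval")
print(code.co_consts)         # the whole chain folded to a single constant
assert "abcde" in code.co_consts
print(eval(code) == "abcde")  # True
```

The same inspection against a 2.x interpreter would show the partially folded constants Larry describes, which is a quick way to verify what the peephole pass did without stepping through it in a debugger.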
On the other hand, right now when you do x = "a" + x ten zillion times there are always two references to the concatenation object stored in x: the interpreter holds one, and x itself holds the other. That means I have to build a new concatenation object each time, so it becomes a degenerate tree (one leaf and one subtree) recursing down the right-hand side. I plan to fix that in my next patch. There's already code that says "if the next instruction is a store, and the location we're storing to holds a reference to the left-hand side of the concatenation, make the location drop its reference". That was an optimization for the old-style concat code; when the left side only had one reference it would simply resize it and memcpy() in the right side. I plan to add support for dropping the reference when it's the *right*-hand side of the concatenation, as that would help prepending immensely. Once that's done, I believe it'll prepend ((depth limit) * (number of items in ob_sstrings - 1)) + 1 strings before needing to render. > There is probably a depth limit on the left hand side as well, since > recursiveConcatenate is recursive even on the left side. Let me again stress that recursiveConcatenate is *iterative* down the left side; it is *not* not *not* recursive. The outer loop iterates over "s->ob_sstrings[0]"s. The nested "for" loop iterates backwards, from the highest string used down to "s->ob_sstrings + 1", aka "&s->ob_sstrings[1]", recursing into them. It then sets "s" to "*s->ob_sstrings", aka "s->ob_sstrings[0]" and the outer loop repeats. This is iterative. As a personal favor to me, please step through my code before you tell me again how my code is recursive down the left-hand side. Passing the dutchie, /larry/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/python-dev/attachments/20061018/31d515fc/attachment-0001.htm From anthony at python.org Thu Oct 19 09:42:00 2006 From: anthony at python.org (Anthony Baxter) Date: Thu, 19 Oct 2006 17:42:00 +1000 Subject: [Python-Dev] RELEASED Python 2.4.4, Final. Message-ID: <200610191742.14093.anthony@python.org> On behalf of the Python development team and the Python community, I'm happy to announce the release of Python 2.4.4 (FINAL). Python 2.4.4 is a bug-fix release. While Python 2.5 is the latest version of Python, we're making this release for people who are still running Python 2.4. This is the final planned release from the Python 2.4 series. Future maintenance releases will be in the 2.5 series, beginning with 2.5.1. See the release notes at the website (also available as Misc/NEWS in the source distribution) for details of the more than 80 bugs squished in this release, including a number found by the Coverity and Klocwork static analysis tools. We'd like to offer our thanks to both these firms for making this available for open source projects. * Python 2.4.4 contains a fix for PSF-2006-001, a buffer overrun * * in repr() of unicode strings in wide unicode (UCS-4) builds. * * See http://www.python.org/news/security/PSF-2006-001/ for more. * There's only been one small change since the release candidate - a fix to "configure" to repair cross-compiling of Python under Unix. For more information on Python 2.4.4, including download links for various platforms, release notes, and known issues, please see: http://www.python.org/2.4.4 Highlights of this new release include: - Bug fixes. According to the release notes, at least 80 have been fixed. This includes a fix for PSF-2006-001, a bug in repr() for unicode strings on UCS-4 (wide unicode) builds. 
Enjoy this release, Anthony Anthony Baxter anthony at python.org Python Release Manager (on behalf of the entire python-dev team) -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 191 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061019/5bffd293/attachment.pgp From steve at holdenweb.com Thu Oct 19 10:34:28 2006 From: steve at holdenweb.com (Steve Holden) Date: Thu, 19 Oct 2006 09:34:28 +0100 Subject: [Python-Dev] Segfault in python 2.5 In-Reply-To: <3d2ce8cb0610181926g2e0915dama208d8839bd1cc5b@mail.gmail.com> References: <3d2ce8cb0610181302w5d87716btfd09da833e525c73@mail.gmail.com> <8764eh2xpx.fsf@starship.python.net> <3d2ce8cb0610181708i1eeb13b5qaf56488f406d4fc7@mail.gmail.com> <1f7befae0610181802y7e0fca9ateb7d43c2ae2cff01@mail.gmail.com> <3d2ce8cb0610181926g2e0915dama208d8839bd1cc5b@mail.gmail.com> Message-ID: Mike Klaas wrote: > On 10/18/06, Tim Peters wrote: [...] > Shouldn't the thread state generally be the same anyway? (I seem to > recall some gloomy warning against resuming generators in separate > threads). > Is this an indication that generators aren't thread-safe? regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From anthony at interlink.com.au Thu Oct 19 18:58:10 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri, 20 Oct 2006 02:58:10 +1000 Subject: [Python-Dev] state of the maintenance branches Message-ID: <200610200258.12460.anthony@interlink.com.au> OK - 2.4.4 is done. With that, the release24-maint branch moves into dignified old age, where we get to mostly ignore it, yay! Unless you really feel like it, I don't think there's much point to making the effort to backport fixes to this branch. 
Any future releases from that branch will be of the serious security flaw only variety, and are almost certainly only going to have those critical patches applied. Either this weekend or next week I'll cut a 2.3.6 off the release23-maint branch. As previously discussed, this will be a source-only release - I don't envisage making documentation packages or binaries for it. Although should we maybe have new doc packages with the newer version number, just to prevent confusion? Fred? What do you think? I don't think there's any need to do this for 2.3.6c1, but maybe for 2.3.6 final? For 2.3.6, it's just 2.3.5 plus the email fix and the PSF-2006-001 fix. As I feared, I've had a couple of people asking for a 2.3.6. Oh well. Only one person has (jokingly) suggested a new 2.2 release. That ain't going to happen :-) I don't even want to _think_ about 2.5.1 right now. I can't see us doing this before December at the earliest, and preferably early in 2007. As far as I can see so far, the generator+threads nasty that's popped up isn't going to affect so many people that it needs a rushed out 2.5.1 to cover it - although this may change as the problem and solution becomes better understood. Anyway, all of the above is open to disagreement or other opinions - if you have them, let me know. -- Anthony Baxter It's never too late to have a happy childhood. From p.f.moore at gmail.com Thu Oct 19 20:15:49 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 19 Oct 2006 19:15:49 +0100 Subject: [Python-Dev] state of the maintenance branches In-Reply-To: <200610200258.12460.anthony@interlink.com.au> References: <200610200258.12460.anthony@interlink.com.au> Message-ID: <79990c6b0610191115r2b020d03o2ddec9c6b3cc3b6b@mail.gmail.com> On 10/19/06, Anthony Baxter wrote: > Anyway, all of the above is open to disagreement or other opinions - if you > have them, let me know. My only thought is that you've done a fantastic job pushing through all the recent releases. Thanks! Paul. 
From brett at python.org Thu Oct 19 20:50:54 2006 From: brett at python.org (Brett Cannon) Date: Thu, 19 Oct 2006 11:50:54 -0700 Subject: [Python-Dev] state of the maintenance branches In-Reply-To: <79990c6b0610191115r2b020d03o2ddec9c6b3cc3b6b@mail.gmail.com> References: <200610200258.12460.anthony@interlink.com.au> <79990c6b0610191115r2b020d03o2ddec9c6b3cc3b6b@mail.gmail.com> Message-ID: On 10/19/06, Paul Moore wrote: > > On 10/19/06, Anthony Baxter wrote: > > Anyway, all of the above is open to disagreement or other opinions - if > you > > have them, let me know. > > My only thought is that you've done a fantastic job pushing through > all the recent releases. > > Thanks! Thanks from me as well! You showed great patience putting up with all of us during releases. -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061019/9d493102/attachment.html From rhettinger at ewtllc.com Thu Oct 19 22:07:31 2006 From: rhettinger at ewtllc.com (Raymond Hettinger) Date: Thu, 19 Oct 2006 13:07:31 -0700 Subject: [Python-Dev] Nondeterministic long-to-float coercion Message-ID: <34FE2A7A34BC3544BC3127D023DF3D12128751@EWTEXCH.office.bhtrader.com> My colleague got an odd result today that is reproducible on his build of Python (RedHat's distribution of Py2.4.2) but not any other builds I've checked (including an Ubuntu Py2.4.2 built with a later version of GCC). I hypothesized that this was a bug in the underlying GCC libraries, but the magnitude of the error is so large that that seems implausible. Does anyone have a clue what is going-on? Raymond ------------------------------------------------ Python 2.4.2 (#1, Mar 29 2006, 11:22:09) [GCC 4.0.2 20051125 (Red Hat 4.0.2-8)] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> set(-19400000000 * (1/100.0) for i in range(10000)) set([-194000000.0, -193995904.0, -193994880.0]) From skip at pobox.com Thu Oct 19 22:44:23 2006 From: skip at pobox.com (skip at pobox.com) Date: Thu, 19 Oct 2006 15:44:23 -0500 Subject: [Python-Dev] Nondeterministic long-to-float coercion In-Reply-To: <34FE2A7A34BC3544BC3127D023DF3D12128751@EWTEXCH.office.bhtrader.com> References: <34FE2A7A34BC3544BC3127D023DF3D12128751@EWTEXCH.office.bhtrader.com> Message-ID: <17719.58279.458064.680744@montanaro.dyndns.org> Raymond> My colleague got an odd result today that is reproducible on Raymond> his build of Python (RedHat's distribution of Py2.4.2) but not Raymond> any other builds I've checked (including an Ubuntu Py2.4.2 Raymond> built with a later version of GCC). I hypothesized that this Raymond> was a bug in the underlying GCC libraries, but the magnitude of Raymond> the error is so large that that seems implausible. Does anyone Raymond> have a clue what is going-on? Not off the top of my head (but then I'm not a guts of the implementation or gcc whiz). I noticed that you used both "nondeterministic" and "reproducible" though. Does your colleague always get the same result? If you remove the set constructor do the oddball values always wind up in the same spots on repeated calls? Are the specific values significant (e.g., do you really need range(10000) to demonstrate the problem)? Also, I can never remember exactly, but are even-numbered minor numbers in GCC releases supposed to be development releases (or is that for the Linux kernel)? Just a few questions that come to mind. Skip From grig.gheorghiu at gmail.com Thu Oct 19 23:19:40 2006 From: grig.gheorghiu at gmail.com (Grig Gheorghiu) Date: Thu, 19 Oct 2006 14:19:40 -0700 Subject: [Python-Dev] Python unit tests failing on Pybots farm Message-ID: <3f09d5a00610191419n38b0701akdf5e3485da4820ac@mail.gmail.com> The latest trunk checkin caused almost all Pybots to fail when running the Python unit tests. 
273 tests OK. 12 tests failed: test___all__ test_calendar test_capi test_datetime test_email test_email_renamed test_imaplib test_mailbox test_strftime test_strptime test_time test_xmlrpc Here's the status page: http://www.python.org/dev/buildbot/community/trunk/ Not sure why the official Python buildbot farm is all green and happy....maybe a difference in how the steps are running? Grig -- http://agiletesting.blogspot.com From martin at v.loewis.de Thu Oct 19 23:20:28 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 19 Oct 2006 23:20:28 +0200 Subject: [Python-Dev] Nondeterministic long-to-float coercion In-Reply-To: <34FE2A7A34BC3544BC3127D023DF3D12128751@EWTEXCH.office.bhtrader.com> References: <34FE2A7A34BC3544BC3127D023DF3D12128751@EWTEXCH.office.bhtrader.com> Message-ID: <4537EC1C.1060201@v.loewis.de> Raymond Hettinger schrieb: > My colleague got an odd result today that is reproducible on his build > of Python (RedHat's distribution of Py2.4.2) but not any other builds > I've checked (including an Ubuntu Py2.4.2 built with a later version of > GCC). I hypothesized that this was a bug in the underlying GCC > libraries, but the magnitude of the error is so large that that seems > implausible. Does anyone have a clue what is going-on? I'd say it's memory corruption. Look: r=array.array("d",[-194000000.0, -193995904.0, -193994880.0]).tostring() print map(ord,r[0:8]) print map(ord,r[8:16]) print map(ord,r[16:24]) gives [0, 0, 0, 0, 105, 32, 167, 193] [0, 0, 0, 0, 73, 32, 167, 193] [0, 0, 0, 0, 65, 32, 167, 193] It's only one byte that changes, and then that in only two bits (2**3 and 2**5). Could be faulty hardware, too. 
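The byte-by-byte comparison above can be generalized into a small diagnostic helper (a sketch added for illustration, not code from the thread):

```python
import struct

def bit_diff(a, b):
    """Return (byte_index, xor_mask) for each byte that differs between
    the little-endian IEEE-754 encodings of two doubles."""
    ba = struct.pack("<d", a)
    bb = struct.pack("<d", b)
    return [(i, x ^ y) for i, (x, y) in enumerate(zip(ba, bb)) if x != y]

# The three values from the set() above differ in a single byte, by one
# flipped bit each -- 2**5 and 2**3, matching the analysis above.
print(bit_diff(-194000000.0, -193995904.0))  # [(4, 32)]
print(bit_diff(-193995904.0, -193994880.0))  # [(4, 8)]
```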
Regards, Martin From bos at serpentine.com Thu Oct 19 23:08:33 2006 From: bos at serpentine.com (Bryan O'Sullivan) Date: Thu, 19 Oct 2006 14:08:33 -0700 Subject: [Python-Dev] Nondeterministic long-to-float coercion In-Reply-To: <34FE2A7A34BC3544BC3127D023DF3D12128751@EWTEXCH.office.bhtrader.com> References: <34FE2A7A34BC3544BC3127D023DF3D12128751@EWTEXCH.office.bhtrader.com> Message-ID: <4537E951.1080107@serpentine.com> Raymond Hettinger wrote: > My colleague got an odd result today that is reproducible on his build > of Python (RedHat's distribution of Py2.4.2) but not any other builds > I've checked (including an Ubuntu Py2.4.2 built with a later version of > GCC). I hypothesized that this was a bug in the underlying GCC > libraries, but the magnitude of the error is so large that that seems > implausible. These errors are due to a bit or two being flipped in either the long or double representation of the number. They could be due to a compiler bug, but other potential culprits include bad memory, a bum power supply introducing noise, or cooling problems. Has your colleague run memtest86 or other load tests for a day on their box? References: <34FE2A7A34BC3544BC3127D023DF3D12128751@EWTEXCH.office.bhtrader.com> Message-ID: <1f7befae0610191428t249d2007jdc2fea9ecd8410ee@mail.gmail.com> [Raymond Hettinger] > My colleague got an odd result today that is reproducible on his build > of Python (RedHat's distribution of Py2.4.2) but not any other builds > I've checked (including an Ubuntu Py2.4.2 built with a later version of > GCC). I hypothesized that this was a bug in the underlying GCC > libraries, but the magnitude of the error is so large that that seems > implausible. > > Does anyone have a clue what is going-on? > > Python 2.4.2 (#1, Mar 29 2006, 11:22:09) [GCC 4.0.2 20051125 (Red Hat > 4.0.2-8)] on linux2 Type "help", "copyright", "credits" or "license" for > more information. 
> >>> set(-19400000000 * (1/100.0) for i in range(10000)) > set([-194000000.0, -193995904.0, -193994880.0]) Note that the Hamming distance between -194000000.0 and -193995904.0 is 1, and ditto between -193995904.0 and -193994880.0, when viewed as IEEE-754 doubles. That is, 193995904.0 is "missing a bit" from -194000000.0, and -193994880.0 is missing the same bit plus an additional bit. Maybe clearer, writing a function to show the hex little-endian representation:

>>> def ashex(d):
...     return binascii.hexlify(struct.pack("<d", d))
...
>>> ashex(-194000000)
'000000006920a7c1'
>>> ashex(-193995904) # "the 2 bit" from "6" is missing, leaving 4
'000000004920a7c1'
>>> ashex(-193994880) # and "the 8 bit" from "9" is missing, leaving 1
'000000004120a7c1'

More than anything else that suggests flaky memory, or "weak bits" in a HW register or CPU<->FPU path. IOW, it looks like a hardware problem to me. Note that the missing bits here don't coincide with a "natural" software boundary -- screwing up a bit "in the middle of" a byte isn't something software is prone to do. You could try different inputs and see whether the same bits "go missing", e.g. starting with a double with a lot of 1 bits lit. Might also try using these as keys to a counting dict to see how often they go missing. From rhettinger at ewtllc.com Thu Oct 19 23:22:59 2006 From: rhettinger at ewtllc.com (Raymond Hettinger) Date: Thu, 19 Oct 2006 14:22:59 -0700 Subject: [Python-Dev] Nondeterministic long-to-float coercion Message-ID: <34FE2A7A34BC3544BC3127D023DF3D12128753@EWTEXCH.office.bhtrader.com> > I noticed that you used both "nondeterministic" and > "reproducible" though. LOL. The nondeterministic part is that the same calculation will give different answers and there doesn't appear to be a pattern to which of the several answers will occur. The reproducible part is that it happens from session to session. > Are the specific values significant (e.g., do > you really need range(10000) to demonstrate the problem)?
No, you just need to run the calculation several times at the command line: >>> -19400000000 * (1/100.0) -193994880.0 >>> -19400000000 * (1/100.0) -194000000.0 >>> -19400000000 * (1/100.0) -194000000.0 Raymond -----Original Message----- From: skip at pobox.com [mailto:skip at pobox.com] Sent: Thursday, October 19, 2006 1:44 PM To: Raymond Hettinger Cc: python-dev at python.org Subject: Re: [Python-Dev] Nondeterministic long-to-float coercion Raymond> My colleague got an odd result today that is reproducible on Raymond> his build of Python (RedHat's distribution of Py2.4.2) but not Raymond> any other builds I've checked (including an Ubuntu Py2.4.2 Raymond> built with a later version of GCC). I hypothesized that this Raymond> was a bug in the underlying GCC libraries, but the magnitude of Raymond> the error is so large that that seems implausible. Does anyone Raymond> have a clue what is going-on? Not off the top of my head (but then I'm not a guts of the implementation or gcc whiz). I noticed that you used both "nondeterministic" and "reproducible" though. Does your colleague always get the same result? If you remove the set constructor do the oddball values always wind up in the same spots on repeated calls? Are the specific values significant (e.g., do you really need range(10000) to demonstrate the problem)? Also, I can never remember exactly, but are even-numbered minor numbers in GCC releases supposed to be development releases (or is that for the Linux kernel)? Just a few questions that come to mind. 
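The counting-dict idea suggested above can be sketched with collections.Counter (not code from the thread; on healthy hardware every iteration rounds to the same value):

```python
from collections import Counter

counts = Counter()
for _ in range(10000):
    counts[-19400000000 * (1 / 100.0)] += 1

# On non-faulty hardware there is exactly one key:
print(counts)  # Counter({-194000000.0: 10000})
```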
Skip From facundobatista at gmail.com Thu Oct 19 23:24:46 2006 From: facundobatista at gmail.com (Facundo Batista) Date: Thu, 19 Oct 2006 18:24:46 -0300 Subject: [Python-Dev] Nondeterministic long-to-float coercion In-Reply-To: <34FE2A7A34BC3544BC3127D023DF3D12128751@EWTEXCH.office.bhtrader.com> References: <34FE2A7A34BC3544BC3127D023DF3D12128751@EWTEXCH.office.bhtrader.com> Message-ID: 2006/10/19, Raymond Hettinger : > My colleague got an odd result today that is reproducible on his build > of Python (RedHat's distribution of Py2.4.2) but not any other builds > ... > >>> set(-19400000000 * (1/100.0) for i in range(10000)) > set([-194000000.0, -193995904.0, -193994880.0]) I can't reproduce it on my Ubuntu either, but analyzing the problem... what about this?:

d = {}
for i in range(10000):
    val = -19400000000 * (1/100.0)
    d[val] = d.get(val, 0) + 1

or

d = {}
for i in range(10000):
    val = -19400000000 * (1/100.0)
    d.setdefault(val, []).append(i)

I think it would be interesting to know... - whether the problem still happens with these structures - how many values end up under each key, and which values. Regards, -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From pandyacus at gmail.com Fri Oct 20 00:25:01 2006 From: pandyacus at gmail.com (Chetan Pandya) Date: Thu, 19 Oct 2006 15:25:01 -0700 Subject: [Python-Dev] Python-Dev Digest, Vol 39, Issue 55 In-Reply-To: References: Message-ID: Larry Hastings wrote: > Chetan Pandya wrote: > > I don't have a patch build, since I didn't download the revision used > > by the patch. > > However, I did look at values in the debugger and it looked like x in > > your example above had a reference count of 2 or more within > > string_concat even when there were no other assignments that would > > account for it. > It could be the optimizer. If you concatenate hard-coded strings, the > peephole optimizer does constant folding. It says "hey, look, this > binary operator is performed on two constant objects".
So it evaluates > the expression itself and substitutes the result, in this case swapping > (pseudotokens here) [PUSH "a" PUSH "b" PLUS] for [PUSH "ab"]. > > Oddly, it didn't seem to optimize away the whole expression. If you say > "a" + "b" + "c" + "d" + "e", I would have expected the peephole > optimizer to turn that whole shebang into [PUSH "abcde"]. But when I > gave it a cursory glance it seemed to skip every-other; it > constant-folded "a" + "b", then + "c" and optimized ("a" + "b" + "c") + > "d", resulting ultimately I believe in [PUSH "ab" PUSH "cd" PLUS PUSH > "e" PLUS]. But I suspect I missed something; it bears further > investigation. I looked at the optimizer, but couldn't find any place where it does constant folding for strings. However, I am unable to set breakpoints for some mysterious reason, so investigation is somewhat hard. But I am not bothered about it anymore, since it does not behave the way I originally thought it did. But this is all academic, as real-world performance of my patch is not > contingent on what the peephole optimizer does to short runs of > hard-coded strings in simple test cases. > > > The recursion limit seems to be optimistic, given the default stack > > limit, but of course, I haven't tried it. > I've tried it, on exactly one computer (running Windows XP). The depth > limit was arrived at experimentally. But it is probably too optimistic > and should be winched down. On the other hand, right now when you do x = "a" + x ten zillion times > there are always two references to the concatenation object stored in x: > the interpreter holds one, and x itself holds the other. That means I > have to build a new concatenation object each time, so it becomes a > degenerate tree (one leaf and one subtree) recursing down the right-hand > side. This is the case I was thinking of (but not what I wrote). I plan to fix that in my next patch.
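The constant folding under discussion can be observed directly by inspecting the compiled code object (shown here on a modern CPython, where the AST optimizer folds the whole chain; the 2.5-era peephole pass behaved differently, as Larry notes):

```python
import dis

code = compile("s = 'a' + 'b' + 'c' + 'd' + 'e'", "<test>", "exec")
# On current CPython the entire chain folds to a single constant:
print("abcde" in code.co_consts)  # True
dis.dis(code)  # the chain compiles to one LOAD_CONST 'abcde'
```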
There's already code that says "if > the next instruction is a store, and the location we're storing to holds > a reference to the left-hand side of the concatenation, make the > location drop its reference". That was an optimization for the > old-style concat code; when the left side only had one reference it > would simply resize it and memcpy() in the right side. I plan to add > support for dropping the reference when it's the *right*-hand side of > the concatenation, as that would help prepending immensely. Once that's > done, I believe it'll prepend ((depth limit) * (number of items in > ob_sstrings - 1)) + 1 strings before needing to render. I am confused as to whether you are referring to the LHS or the concatenation operation or the assignment operation. But I haven't looked at how the reference counting optimizations are done yet. In general, there are caveats about removing references, but I plan to look at that later. There is another, possibly complementary way of reducing the recursion depth. While creating a new concatenation object, instead of inserting the two string references, the strings they reference can be inserted in the new object. This can be done if the number of strings they contain is small. In the x = "a" + x case, for example, this will reduce the recursion depth of the string tree (but not reduce the allocations). -Chetan -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20061019/ac0c060f/attachment.htm From brett at python.org Fri Oct 20 00:25:15 2006 From: brett at python.org (Brett Cannon) Date: Thu, 19 Oct 2006 15:25:15 -0700 Subject: [Python-Dev] Python unit tests failing on Pybots farm In-Reply-To: <3f09d5a00610191419n38b0701akdf5e3485da4820ac@mail.gmail.com> References: <3f09d5a00610191419n38b0701akdf5e3485da4820ac@mail.gmail.com> Message-ID: On 10/19/06, Grig Gheorghiu wrote: > > The latest trunk checkin caused almost all Pybots to fail when running > the Python unit tests. > > 273 tests OK. > 12 tests failed: > test___all__ test_calendar test_capi test_datetime test_email > test_email_renamed test_imaplib test_mailbox test_strftime > test_strptime test_time test_xmlrpc > > Here's the status page: > > http://www.python.org/dev/buildbot/community/trunk/ > > Not sure why the official Python buildbot farm is all green and > happy....maybe a difference in how the steps are running? Possibly. If you look at the reason those tests failed it is because time.strftime is missing for some odd reason. But none of recent checkins seem to have anything to do with the 'time' module, let alone with how methods are added to modules (Martin's recent checkins have been for PyArg_ParseTuple). -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061019/8a7688ad/attachment.html From grig.gheorghiu at gmail.com Fri Oct 20 00:30:01 2006 From: grig.gheorghiu at gmail.com (Grig Gheorghiu) Date: Thu, 19 Oct 2006 15:30:01 -0700 Subject: [Python-Dev] Python unit tests failing on Pybots farm In-Reply-To: References: <3f09d5a00610191419n38b0701akdf5e3485da4820ac@mail.gmail.com> Message-ID: <3f09d5a00610191530t7b05d353h851c2791ec2aac87@mail.gmail.com> On 10/19/06, Brett Cannon wrote: > > Possibly. If you look at the reason those tests failed it is because > time.strftime is missing for some odd reason. 
But none of recent checkins > seem to have anything to do with the 'time' module, let alone with how > methods are added to modules (Martin's recent checkins have been for > PyArg_ParseTuple). > > -Brett Could there possibly be a side effect of the PyArg_ParseTuple changes? Grig From brett at python.org Fri Oct 20 00:53:45 2006 From: brett at python.org (Brett Cannon) Date: Thu, 19 Oct 2006 15:53:45 -0700 Subject: [Python-Dev] Python unit tests failing on Pybots farm In-Reply-To: <3f09d5a00610191530t7b05d353h851c2791ec2aac87@mail.gmail.com> References: <3f09d5a00610191419n38b0701akdf5e3485da4820ac@mail.gmail.com> <3f09d5a00610191530t7b05d353h851c2791ec2aac87@mail.gmail.com> Message-ID: On 10/19/06, Grig Gheorghiu wrote: > > On 10/19/06, Brett Cannon wrote: > > > > Possibly. If you look at the reason those tests failed it is because > > time.strftime is missing for some odd reason. But none of recent > checkins > > seem to have anything to do with the 'time' module, let alone with how > > methods are added to modules (Martin's recent checkins have been for > > PyArg_ParseTuple). > > > > -Brett > > Could there possibly be a side effect of the PyArg_ParseTuple changes? I doubt that, especially since I just updated my pristine checkout and test_time passed fine. -Brett -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20061019/3ef71908/attachment.htm From grig.gheorghiu at gmail.com Fri Oct 20 01:48:35 2006 From: grig.gheorghiu at gmail.com (Grig Gheorghiu) Date: Thu, 19 Oct 2006 16:48:35 -0700 Subject: [Python-Dev] Python unit tests failing on Pybots farm In-Reply-To: References: <3f09d5a00610191419n38b0701akdf5e3485da4820ac@mail.gmail.com> <3f09d5a00610191530t7b05d353h851c2791ec2aac87@mail.gmail.com> Message-ID: <3f09d5a00610191648j43343802uf18bc8300f478a8d@mail.gmail.com> On 10/19/06, Brett Cannon wrote: > > > On 10/19/06, Grig Gheorghiu wrote: > > On 10/19/06, Brett Cannon wrote: > > > > > > Possibly. If you look at the reason those tests failed it is because > > > time.strftime is missing for some odd reason. But none of recent > checkins > > > seem to have anything to do with the 'time' module, let alone with how > > > methods are added to modules (Martin's recent checkins have been for > > > PyArg_ParseTuple). > > > > > > -Brett > > > > Could there possible be a side effect of the PyArg_ParseTuple changes? > > I doubt that, especially since I just updated my pristine checkout and > test_time passed fine. > > -Brett > > > > OK, I deleted the checkout directory on one of my buildslaves and re-ran the build steps. The tests passed. So my conclusion is that a full rebuild is needed for the tests to pass after the last checkins (which included files such as configure and configure.in). The Python buildbots are doing full rebuilds every time, that's why they're green and happy, but the Pybots are just doing incremental builds. Maybe the makefiles should be modified so that a full rebuild is triggered when the configure and configure.in files are changed? At this point, I'll have to tell all the Pybots owners to delete their checkout directories and start a new build.
Grig From warner at lothar.com Fri Oct 20 01:59:46 2006 From: warner at lothar.com (Brian Warner) Date: Thu, 19 Oct 2006 16:59:46 -0700 Subject: [Python-Dev] Promoting PCbuild8 In-Reply-To: <45352A4F.2080203@v.loewis.de> (Martin v. =?iso-8859-1?Q?L=F6?= =?iso-8859-1?Q?wis's?= message of "Tue, 17 Oct 2006 21:09:03 +0200") References: <129CEF95A523704B9D46959C922A280002FE9A04@nemesis.central.ccp.cc> <45352A4F.2080203@v.loewis.de> Message-ID: <87vemfc0wd.fsf@lothar.com> "Martin v. Löwis" writes: >> But I agree that >> getting regular builds running would be a good thing. An x64 box >> would be ideal to build both the x86 and x64 versions on. A single >> bot can manage many platforms, right? > > A single machine, and a single buildbot installation, yes. But not > a single build slave, since there can be only one build procedure > per slave. To be precise, you can have as many build procedures per slave as you like, but if the procedure depends upon running on a particular platform, then it is unlikely that a single slave can accommodate multiple platforms. Each Builder object in the buildbot config file is created with a BuildFactory (which defines the sequence of steps it will execute), and a list of buildslaves that it can run on. There is a many-to-many mapping from Builders to buildslaves. For example, you might have an "all-tests" Builder that does a compile and runs the unit-test suite, and a second "build-API-docs" Builder that just runs epydoc or something. Both of these Builders could easily run on the same slave. But if you have an x86 Builder and a PPC Builder, you'd be hard pressed to find a single buildslave that could usefully serve for both. If the x86 and the x64 builds can be run on the same machine, how do you control which kind of build you're doing? The decision about whether to run them in the same buildslave or in two separate buildslaves depends upon how you express this control.
One possibility is that you just pass some different CFLAGS to the configure or compile step.. in that case, putting them both in the same slave is easy, and the CFLAGS settings will appear in your BuildFactories. If instead you have to use a separate chroot environment (or whatever the equivalent is for this issue) for each, then it may be easiest to run two separate buildslaves (and your BuildFactories might be identical). > It's possible to tell the master not to build different branches on a > single slave (i.e. 2.5 has to wait if trunk is building), but it's not > possible to tell it that two slaves reside on the same machine (it might be > possible, but I don't know how to do it). You could create a MasterLock that is shared by just the two Builders which use slaves which share the same machine. That would prohibit the two Builders from running at the same time. (SlaveLocks wouldn't help here, because as you pointed out there is no way to tell the buildmaster that two slaves share a host). cheers, -Brian From brett at python.org Fri Oct 20 06:00:13 2006 From: brett at python.org (Brett Cannon) Date: Thu, 19 Oct 2006 21:00:13 -0700 Subject: [Python-Dev] Python unit tests failing on Pybots farm In-Reply-To: <3f09d5a00610191648j43343802uf18bc8300f478a8d@mail.gmail.com> References: <3f09d5a00610191419n38b0701akdf5e3485da4820ac@mail.gmail.com> <3f09d5a00610191530t7b05d353h851c2791ec2aac87@mail.gmail.com> <3f09d5a00610191648j43343802uf18bc8300f478a8d@mail.gmail.com> Message-ID: On 10/19/06, Grig Gheorghiu wrote: > > On 10/19/06, Brett Cannon wrote: > > > > > > On 10/19/06, Grig Gheorghiu wrote: > > > On 10/19/06, Brett Cannon wrote: > > > > > > > > Possibly. If you look at the reason those tests failed it is > because > > > > time.strftime is missing for some odd reason. 
But none of recent > > checkins > > > > seem to have anything to do with the 'time' module, let alone with > how > > > > methods are added to modules (Martin's recent checkins have been for > > > > PyArg_ParseTuple). > > > > > > > > -Brett > > > > > > Could there possible be a side effect of the PyArg_ParseTuple changes? > > > > I doubt that, especially since I just updated my pristine checkout and > > test_time passed fine. > > > > -Brett > > > > > > OK, I deleted the checkout directory on one of my buidslaves and > re-ran the build steps. The tests passed. So my conclusion is that a > full rebuild is needed for the tests to pass after the last checkins > (which included files such as configure and configure.in). > > The Python buildbots are doing full rebuilds every time, that's why > they're green and happy, but the Pybots are just doing incremental > builds. > > Maybe the makefiles should be modified so that a full rebuild is > triggered when the configure and configure.in files are changed? Maybe, but I don't know how to do that. -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061019/32ce9baf/attachment.html From martin at v.loewis.de Fri Oct 20 07:51:33 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 20 Oct 2006 07:51:33 +0200 Subject: [Python-Dev] Promoting PCbuild8 In-Reply-To: <87vemfc0wd.fsf@lothar.com> References: <129CEF95A523704B9D46959C922A280002FE9A04@nemesis.central.ccp.cc> <45352A4F.2080203@v.loewis.de> <87vemfc0wd.fsf@lothar.com> Message-ID: <453863E5.6060704@v.loewis.de> Brian Warner schrieb: > To be precise, you have have as many build procedures per slave as you like, > but if the procedure depends upon running on a particular platform, then it > is unlikely that a single slave can accomodate multiple platforms. Ah, right, I can have multiple builders per slave. That's good. 
For the case of x86 and AMD64, a single slave can indeed accommodate both platforms. > If the x86 and the x64 builds can be run on the same machine, how do you > control which kind of build you're doing? The decision about whether to run > them in the same buildslave or in two separate buildslaves depends upon how > you express this control. One possibility is that you just pass some > different CFLAGS to the configure or compile step.. in that case, putting > them both in the same slave is easy, and the CFLAGS settings will appear in > your BuildFactories. Most likely, there would be different batch files to run, although using environment variables might also work. So I guess I could use the same slave for both builders. > You could create a MasterLock that is shared by just the two Builders which > use slaves which share the same machine. That would prohibit the two Builders > from running at the same time. (SlaveLocks wouldn't help here, because as you > pointed out there is no way to tell the buildmaster that two slaves share a > host). Ah, ok. Regards, Martin From martin at v.loewis.de Fri Oct 20 07:58:13 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 20 Oct 2006 07:58:13 +0200 Subject: [Python-Dev] Python unit tests failing on Pybots farm In-Reply-To: <3f09d5a00610191648j43343802uf18bc8300f478a8d@mail.gmail.com> References: <3f09d5a00610191419n38b0701akdf5e3485da4820ac@mail.gmail.com> <3f09d5a00610191530t7b05d353h851c2791ec2aac87@mail.gmail.com> <3f09d5a00610191648j43343802uf18bc8300f478a8d@mail.gmail.com> Message-ID: <45386575.3000809@v.loewis.de> Grig Gheorghiu schrieb: > OK, I deleted the checkout directory on one of my buidslaves and > re-ran the build steps. The tests passed. So my conclusion is that a > full rebuild is needed for the tests to pass after the last checkins > (which included files such as configure and configure.in). Indeed, you had to re-run configure. 
There was a bug where -Werror was added to the build flags, causing several configure tests to fail (most notably, it would determine that there's no memmove on Linux).

> Maybe the makefiles should be modified so that a full rebuild is
> triggered when the configure and configure.in files are changed?

The makefiles already do that: if configure changes, a plain "make" will first re-run configure.

> At this point, I'll have to tell all the Pybots owners to delete their
> checkout directories and start a new build.

Not necessarily. You can also ask, at the buildbot GUI, that a non-existing branch is built. This should cause the checkouts to be deleted (and then the build to fail); the next regular build will check out from scratch.

Regards,
Martin

From larry at hastings.org Fri Oct 20 08:45:31 2006
From: larry at hastings.org (Larry Hastings)
Date: Thu, 19 Oct 2006 23:45:31 -0700
Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS?
In-Reply-To:
References: <34FE2A7A34BC3544BC3127D023DF3D12128746@EWTEXCH.office.bhtrader.com> <452C6FD8.8070403@v.loewis.de>
Message-ID: <4538708B.8070406@hastings.org>

Fredrik Lundh wrote:
> a dynamic registration approach would be even better, with a single entry point
> used to register all methods and hooks your C extension has implemented, and
> code on the other side that builds a properly initialized type descriptor from that
> set, using fallback functions and error stubs where needed.

I knocked out a prototype of this last week, emailed Mr. Lundh about it, then forgot about it. Would anyone be interested in taking a peek at it? I only changed one file to use this new-style initialization, sha256module.c.
The resulting init_sha256() looks like this:

    PyMODINIT_FUNC
    init_sha256(void)
    {
        PyObject *m;

        SHA224type = PyType_New("_sha256.sha224", sizeof(SHAobject), NULL);
        if (SHA224type == NULL)
            return;
        PyType_SetPointer(SHA224type, pte_dealloc, &SHA_dealloc);
        PyType_SetPointer(SHA224type, pte_methods, &SHA_methods);
        PyType_SetPointer(SHA224type, pte_members, &SHA_members);
        PyType_SetPointer(SHA224type, pte_getset, &SHA_getseters);
        if (PyType_Ready(SHA224type) < 0)
            return;

        SHA256type = PyType_New("_sha256.sha256", sizeof(SHAobject), NULL);
        if (SHA256type == NULL)
            return;
        PyType_SetPointer(SHA256type, pte_dealloc, &SHA_dealloc);
        PyType_SetPointer(SHA256type, pte_methods, &SHA_methods);
        PyType_SetPointer(SHA256type, pte_members, &SHA_members);
        PyType_SetPointer(SHA256type, pte_getset, &SHA_getseters);
        if (PyType_Ready(SHA256type) < 0)
            return;

        m = Py_InitModule("_sha256", SHA_functions);
        if (m == NULL)
            return;
    }

In a way this wasn't really a good showpiece for my code. The "methods", "members", and "getseters" structs still need to be passed in. However, I did change all four "as_" structures so you can set those directly. For instance, the "concat" as_sequence method for a PyString object would be set using

    PyType_SetPointer(PyString_Type, pte_sequence_concat, string_concat);

(I actually converted the PyString object to my new code, but had chicken-and-egg initialization problems as a result and backed out of it. The code is still in the branch, just commented out.)

Patch available for interested parties,

/larry/

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20061019/531206b1/attachment.htm From theller at ctypes.org Fri Oct 20 09:37:09 2006 From: theller at ctypes.org (Thomas Heller) Date: Fri, 20 Oct 2006 09:37:09 +0200 Subject: [Python-Dev] ctypes and win64 In-Reply-To: References: Message-ID: Thomas Heller schrieb (this was before Python 2.5 had been released): > The _ctypes extension module does currently not even build on Win64. > > I'm (slowly) working on this (for AMD64, not for itanium), but it may > take a good while before it is stable - It is not even fully implemented > currently. > > The win64 msi installer installs the ctypes package anyway, but it cannot be > imported. > > I suggest that it should be removed from the 2.5 win64 msi installers, so that > at least, when it is ready, can be installed as separate package. Then, Martin changed the win64 msi installer to exclude the ctypes package when the _ctypes.pyd extension does not exist because it was not built. In the meantime I have integrated patches (in the trunk) so that _ctypes can be built for win64/AMD64, and does even work. Can these changes be merged into release25-maint? IMO this is low-risk because they contain only small changes to the files in Modules/_ctypes/libffi_msvc, plus *some* changes to support the Windows LP64 model. I would prefer to merge these changes into release25-maint, because I want to also release the standalone ctypes packages from this branch (using it with svn:externals from somewhere else). The official Python 2.5.x win64/AMD64 windows installers should still *not* contain the ctypes package, but they could install it separately. 
Thanks,
Thomas

From grig.gheorghiu at gmail.com Fri Oct 20 17:31:45 2006
From: grig.gheorghiu at gmail.com (Grig Gheorghiu)
Date: Fri, 20 Oct 2006 08:31:45 -0700
Subject: [Python-Dev] Python unit tests failing on Pybots farm
In-Reply-To: <45386575.3000809@v.loewis.de>
References: <3f09d5a00610191419n38b0701akdf5e3485da4820ac@mail.gmail.com> <3f09d5a00610191530t7b05d353h851c2791ec2aac87@mail.gmail.com> <3f09d5a00610191648j43343802uf18bc8300f478a8d@mail.gmail.com> <45386575.3000809@v.loewis.de>
Message-ID: <3f09d5a00610200831na0ec613ya773f69646245742@mail.gmail.com>

On 10/19/06, "Martin v. Löwis" wrote:
> Grig Gheorghiu schrieb:
> > OK, I deleted the checkout directory on one of my buildslaves and
> > re-ran the build steps. The tests passed. So my conclusion is that a
> > full rebuild is needed for the tests to pass after the last checkins
> > (which included files such as configure and configure.in).
>
> Indeed, you had to re-run configure. There was a bug where -Werror was
> added to the build flags, causing several configure tests to fail
> (most notably, it would determine that there's no memmove on Linux).
>
> > Maybe the makefiles should be modified so that a full rebuild is
> > triggered when the configure and configure.in files are changed?
>
> The makefiles already do that: if configure changes, a plain
> "make" will first re-run configure.

Well, that didn't trigger a full rebuild on the Pybots buildslaves though.

> > At this point, I'll have to tell all the Pybots owners to delete their
> > checkout directories and start a new build.
>
> Not necessarily. You can also ask, at the buildbot GUI, that a
> non-existing branch is built. This should cause the checkouts
> to be deleted (and then the build to fail); the next regular
> build will check out from scratch.

OK, I'll try that next time.
Or I can add an extra 'clean checkout dir' step to the buildmaster -- but that would trigger a full rebuild every time, which is not what I want, since some of the buildslaves take a long time to do that. Grig From martin at v.loewis.de Fri Oct 20 19:56:48 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 20 Oct 2006 19:56:48 +0200 Subject: [Python-Dev] Python unit tests failing on Pybots farm In-Reply-To: <3f09d5a00610200831na0ec613ya773f69646245742@mail.gmail.com> References: <3f09d5a00610191419n38b0701akdf5e3485da4820ac@mail.gmail.com> <3f09d5a00610191530t7b05d353h851c2791ec2aac87@mail.gmail.com> <3f09d5a00610191648j43343802uf18bc8300f478a8d@mail.gmail.com> <45386575.3000809@v.loewis.de> <3f09d5a00610200831na0ec613ya773f69646245742@mail.gmail.com> Message-ID: <45390DE0.6050104@v.loewis.de> Grig Gheorghiu schrieb: >> > Maybe the makefiles should be modified so that a full rebuild is >> > triggered when the configure and configure.in files are changed? >> >> The makefiles already do that: if configure changes, a plain >> "make" will first re-run configure. > > Well, that didn't trigger a full rebuild on the Pybots buildslaves though. Can you provide more details? Did it not run configure again, or did that not cause a rebuild? There is an issue with setup.py/distutils not doing the rebuilding properly if header files change; contributions to fix this are welcome (quick-hacked work-arounds are not). Regards, Martin From martin at v.loewis.de Fri Oct 20 20:08:24 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 20 Oct 2006 20:08:24 +0200 Subject: [Python-Dev] ctypes and win64 In-Reply-To: References: Message-ID: <45391098.8020906@v.loewis.de> Thomas Heller schrieb: > I would prefer to merge these changes into release25-maint, because I want to > also release the standalone ctypes packages from this branch (using it with > svn:externals from somewhere else). 
That's not a good reason for back-porting. If you want a "maintenance" branch for ctypes, feel free to create one in the subversion, likewise for tags. OTOH, I can't comment on whether those changes would be acceptable for a backport to the 2.5 maintenance branch - if they don't introduce actual new features, it might be ok. > The official Python 2.5.x win64/AMD64 windows installers should still *not* > contain the ctypes package, but they could install it separately. I don't really understand. Are you planning to back-port PCbuild changes also? If so, how should including those extensions be suppressed? Regards, Martin From grig.gheorghiu at gmail.com Fri Oct 20 20:16:49 2006 From: grig.gheorghiu at gmail.com (Grig Gheorghiu) Date: Fri, 20 Oct 2006 11:16:49 -0700 Subject: [Python-Dev] Python unit tests failing on Pybots farm In-Reply-To: <45390DE0.6050104@v.loewis.de> References: <3f09d5a00610191419n38b0701akdf5e3485da4820ac@mail.gmail.com> <3f09d5a00610191530t7b05d353h851c2791ec2aac87@mail.gmail.com> <3f09d5a00610191648j43343802uf18bc8300f478a8d@mail.gmail.com> <45386575.3000809@v.loewis.de> <3f09d5a00610200831na0ec613ya773f69646245742@mail.gmail.com> <45390DE0.6050104@v.loewis.de> Message-ID: <3f09d5a00610201116we7b418fyb7b56fe88042af5@mail.gmail.com> On 10/20/06, "Martin v. L?wis" wrote: > Grig Gheorghiu schrieb: > >> > Maybe the makefiles should be modified so that a full rebuild is > >> > triggered when the configure and configure.in files are changed? > >> > >> The makefiles already do that: if configure changes, a plain > >> "make" will first re-run configure. > > > > Well, that didn't trigger a full rebuild on the Pybots buildslaves though. > > Can you provide more details? Did it not run configure again, or > did that not cause a rebuild? > > There is an issue with setup.py/distutils not doing the rebuilding > properly if header files change; contributions to fix this are welcome > (quick-hacked work-arounds are not). 
> Here are the steps that led to the unit test failures, after your checkin of configure and configure.in.

svn update:
http://www.python.org/dev/buildbot/community/all/x86%20Ubuntu%20Breezy%20trunk/builds/55/step-svn/0

configure:
http://www.python.org/dev/buildbot/community/all/x86%20Ubuntu%20Breezy%20trunk/builds/55/step-configure/0

compile:
http://www.python.org/dev/buildbot/community/all/x86%20Ubuntu%20Breezy%20trunk/builds/55/step-compile/0

test:
http://www.python.org/dev/buildbot/community/all/x86%20Ubuntu%20Breezy%20trunk/builds/55/step-test/0

HTH,

Grig

From martin at v.loewis.de Fri Oct 20 20:56:40 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 20 Oct 2006 20:56:40 +0200
Subject: [Python-Dev] Python unit tests failing on Pybots farm
In-Reply-To: <3f09d5a00610201116we7b418fyb7b56fe88042af5@mail.gmail.com>
References: <3f09d5a00610191419n38b0701akdf5e3485da4820ac@mail.gmail.com> <3f09d5a00610191530t7b05d353h851c2791ec2aac87@mail.gmail.com> <3f09d5a00610191648j43343802uf18bc8300f478a8d@mail.gmail.com> <45386575.3000809@v.loewis.de> <3f09d5a00610200831na0ec613ya773f69646245742@mail.gmail.com> <45390DE0.6050104@v.loewis.de>
Message-ID: <45391BE8.8090902@v.loewis.de>

Grig Gheorghiu schrieb:
> Here are the steps that led to the unit test failures, after your
> checkin of configure and configure.in.
>
> svn update: http://www.python.org/dev/buildbot/community/all/x86%20Ubuntu%20Breezy%20trunk/builds/55/step-svn/0
>
> configure: http://www.python.org/dev/buildbot/community/all/x86%20Ubuntu%20Breezy%20trunk/builds/55/step-configure/0
>
> compile: http://www.python.org/dev/buildbot/community/all/x86%20Ubuntu%20Breezy%20trunk/builds/55/step-compile/0
>
> test: http://www.python.org/dev/buildbot/community/all/x86%20Ubuntu%20Breezy%20trunk/builds/55/step-test/0

As you can see, it indeed re-ran configure, and it also rebuilt the interpreter.
It then did not rebuild any of the extensions except for pyexpat and elementtree. As I said, contributions to fix that are welcome. Regards, Martin From theller at ctypes.org Fri Oct 20 20:59:23 2006 From: theller at ctypes.org (Thomas Heller) Date: Fri, 20 Oct 2006 20:59:23 +0200 Subject: [Python-Dev] ctypes and win64 In-Reply-To: <45391098.8020906@v.loewis.de> References: <45391098.8020906@v.loewis.de> Message-ID: <45391C8B.1070605@ctypes.org> Martin v. L?wis schrieb: > Thomas Heller schrieb: [I was talking about patches to make ctypes work on 64-bit windows] >> I would prefer to merge these changes into release25-maint, because I want to >> also release the standalone ctypes packages from this branch (using it with >> svn:externals from somewhere else). > > That's not a good reason for back-porting. If you want a "maintenance" > branch for ctypes, feel free to create one in the subversion, likewise > for tags. > > OTOH, I can't comment on whether those changes would be acceptable for > a backport to the 2.5 maintenance branch - if they don't introduce > actual new features, it might be ok. > >> The official Python 2.5.x win64/AMD64 windows installers should still *not* >> contain the ctypes package, but they could install it separately. > > I don't really understand. Are you planning to back-port PCbuild changes > also? If so, how should including those extensions be suppressed? Let me try to put it in different words. The official Python-2.5.amd64.msi does *not* contain ctypes, so the official Python-2.5.x.amd64.msi should also not contain ctypes (I assume). Not many people (I assume again) are running 64-bit windows, and use the 64-bit Python version - but that will probably change soon. I would like to merge the 64-bit windows related ctypes changes in trunk, as soon as I'm sure that they work, back into the release25-maint branch. And also make separate ctypes releases from the release25-maint source code. 
I will only backport these changes if I'm convinced that they do not change the functionality of the current code. This way win64 Python users could install ctypes from the separate release. Also this way the source code for ctypes in the separate and the Python bundled releases are exactly the same, without creating too much work because of the different repositories.

Hope that makes the plan clear,

Thomas

From skip at pobox.com Fri Oct 20 21:39:58 2006
From: skip at pobox.com (skip at pobox.com)
Date: Fri, 20 Oct 2006 14:39:58 -0500
Subject: [Python-Dev] OT: fdopen on Windows question
Message-ID: <17721.9742.690489.77309@montanaro.dyndns.org>

Sorry for the off-topic post. I figured someone here would know the answer and I don't have access to Windows to check experimentally. The ocrad program opens its input like so:

    if ( std::strcmp( infile_name, "-" ) == 0 ) infile = stdin;
    else infile = std::fopen( infile_name, "r" );

(SpamBayes is starting to use ocrad and PIL to extract text from image spam). Ocrad fails on Windows because the input file is opened in text mode. That "r" should be "rb". What's not clear to me is whether we can do anything about stdin. Will this work:

    if ( std::strcmp( infile_name, "-" ) == 0 ) infile = std::fdopen( std::fileno(stdin), "rb" );
    else infile = std::fopen( infile_name, "rb" );

That is, can I change stdin from text to binary this way or is it destined to always be in text mode?

Thx,

Skip

From skip at pobox.com Fri Oct 20 22:04:55 2006
From: skip at pobox.com (skip at pobox.com)
Date: Fri, 20 Oct 2006 15:04:55 -0500
Subject: [Python-Dev] Massive test_sqlite failure on Mac OSX ... sometimes
Message-ID: <17721.11239.499820.841585@montanaro.dyndns.org>

I'm setting up a buildbot slave for sqlalchemy on one of my Macs at home. When it builds and tests Python's test suite the sqlite test fails. When I ran it alone like this:

    ./python.exe Lib/test/test_sqlite.py

and

    ./python.exe Lib/test/regrtest.py test_sqlite

it succeeded.
When I ran the full test suite it failed. I then tried adding -v as the error message suggested. It hung in test_pty waiting for a child process to complete. (Is this a known problem?) I finally redirected stdout and stderr like so:

    ./python.exe Lib/test/regrtest.py -l -v > test.out 2>&1

and it completed. It failed 146 out of 167 tests. Here is a sample of the failure messages:

    ...
    CheckClose (sqlite3.test.dbapi.ConnectionTests) ... ERROR
    CheckCommit (sqlite3.test.dbapi.ConnectionTests) ... ERROR
    CheckCommitAfterNoChanges (sqlite3.test.dbapi.ConnectionTests) ... ERROR
    CheckCursor (sqlite3.test.dbapi.ConnectionTests) ... ERROR
    CheckExceptions (sqlite3.test.dbapi.ConnectionTests) ... ERROR
    CheckFailedOpen (sqlite3.test.dbapi.ConnectionTests) ... ERROR
    CheckRollback (sqlite3.test.dbapi.ConnectionTests) ... ERROR
    CheckRollbackAfterNoChanges (sqlite3.test.dbapi.ConnectionTests) ... ERROR
    CheckArraySize (sqlite3.test.dbapi.CursorTests) ... ERROR
    CheckClose (sqlite3.test.dbapi.CursorTests) ... ERROR
    CheckCursorConnection (sqlite3.test.dbapi.CursorTests) ... ERROR
    CheckCursorWrongClass (sqlite3.test.dbapi.CursorTests) ... ERROR
    CheckExecuteArgFloat (sqlite3.test.dbapi.CursorTests) ... ERROR
    CheckExecuteArgInt (sqlite3.test.dbapi.CursorTests) ... ERROR
    CheckExecuteArgString (sqlite3.test.dbapi.CursorTests) ... ERROR
    CheckExecuteDictMapping (sqlite3.test.dbapi.CursorTests) ... ERROR
    CheckExecuteDictMappingNoArgs (sqlite3.test.dbapi.CursorTests) ... ERROR
    CheckExecuteDictMappingTooLittleArgs (sqlite3.test.dbapi.CursorTests) ... ERROR
    CheckExecuteDictMappingUnnamed (sqlite3.test.dbapi.CursorTests) ... ERROR
    CheckExecuteIllegalSql (sqlite3.test.dbapi.CursorTests) ... ERROR
    CheckExecuteManyGenerator (sqlite3.test.dbapi.CursorTests) ... ERROR
    CheckExecuteManyIterator (sqlite3.test.dbapi.CursorTests) ... ERROR
    CheckExecuteManyNotIterable (sqlite3.test.dbapi.CursorTests) ... ERROR
    ...
A quick check of the tracebacks shows all the errors are of this form (CheckClose is the first failure):

    ======================================================================
    ERROR: CheckClose (sqlite3.test.dbapi.ConnectionTests)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/Library/Buildbot/pybot/trunk.montanaro-g5/build/Lib/sqlite3/test/dbapi.py", line 85, in setUp
        self.cx = sqlite.connect(":memory:")
    ProgrammingError: library routine called out of sequence

That is, they all raise the same exception and all exceptions are raised on sqlite.connect(":memory:") calls. Sometimes there is a second parameter to the call. Anybody seen this before?

Skip

From martin at v.loewis.de Sat Oct 21 00:46:47 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 21 Oct 2006 00:46:47 +0200
Subject: [Python-Dev] svn.python.org down
In-Reply-To: <3f09d5a00610170759y2dac4772t157ebeadb878ae1f@mail.gmail.com>
References: <3f09d5a00610170759y2dac4772t157ebeadb878ae1f@mail.gmail.com>
Message-ID: <453951D7.2000900@v.loewis.de>

Grig Gheorghiu schrieb:
> FYI -- can't do svn checkouts/updates from the trunk at this point.
>
> starting svn operation
> svn update --revision HEAD
> in dir /home/twistbot/pybot/trunk.gheorghiu-x86/build (timeout 1200 secs)
> svn: PROPFIND request failed on '/projects/python/trunk'
> svn: PROPFIND of '/projects/python/trunk': could not connect to server
> (http://svn.python.org)

It turns out that there was a power surge at the colocation site where the machines are, and, due to an unfortunate series of events, power went out for about one second. When power came back, the machine rebooted, but, for some reason, the svn apache server did not.
Regards,
Martin

From larry at hastings.org Sat Oct 21 04:29:01 2006
From: larry at hastings.org (Larry Hastings)
Date: Fri, 20 Oct 2006 19:29:01 -0700
Subject: [Python-Dev] The "lazy strings" patch [was: PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom]
In-Reply-To: <4523F890.9060804@hastings.org>
References: <4523F890.9060804@hastings.org>
Message-ID: <453985ED.7050303@hastings.org>

I've significantly enhanced my string-concatenation patch, to the point where that name is no longer accurate. So I've redubbed it the "lazy strings" patch.

The major new feature is that string *slices* are also represented with a lazy-evaluation placeholder for the actual string, just as concatenated strings were in my original patch. The lazy slice object stores a reference to the original PyStringObject * it is sliced from, and the desired start and stop slice markers. (It only supports step = 1.) Its ob_sval is NULL until the string is rendered--but that rarely happens! Not only does this mean string slices are faster, but I bet this generally reduces overall memory usage for slices too.

Now, one rule of the Python programming API is that "all strings are zero-terminated". That part of the API makes the life of a Python extension author sane--they don't have to deal with some exotic Python string class, they can just assume C-style strings everywhere. Ordinarily, this means a string slice couldn't simply point into the original string; if it did, and you executed

    x = "abcde"
    y = x[1:4]

internally y->ob_sval[3] would not be 0, it would be 'e', breaking the API's rule about strings.

However! When a PyStringObject lives out its life purely within the Python VM, the only code that strenuously examines its internals is stringobject.c. And that code almost never needs the trailing zero*.
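The scheme described above — a placeholder that records the source string plus slice bounds, and fills in its buffer only when "rendered" — can be sketched in pure Python. This is only an illustrative model with invented names; the real patch does the equivalent bookkeeping inside CPython's stringobject.c:

```python
class LazySlice:
    """A slice records (source, start, stop) and defers copying.

    `rendered` stands in for the patch's NULL ob_sval: it stays empty
    until the characters are actually needed.
    """

    def __init__(self, source, start, stop):
        self.source = source   # keeps the original string alive
        self.start = start
        self.stop = stop
        self.rendered = None   # not materialized yet

    def render(self):
        # The one and only copy, made on first real use.
        if self.rendered is None:
            self.rendered = self.source[self.start:self.stop]
        return self.rendered


x = "abcde"
y = LazySlice(x, 1, 4)        # conceptually, y = x[1:4]
assert y.rendered is None     # no characters copied yet
assert y.render() == "bcd"    # rendered on demand
```

In the C patch, `render()` corresponds to filling in ob_sval, and as Larry's pybench statistics below show, the vast majority of slices are never rendered at all.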
So I've added a new static method in stringobject.c:

    char * PyString_AsUnterminatedString(PyStringObject *)

If you call it on a lazy-evaluation slice object, it gives you back a pointer into the original string's ob_sval. The s->ob_size'th element of this *might not* be zero, but if you call this function you're saying that's a-okay, you promise not to look at it. (If the PyStringObject * is any other variety, it calls into PyString_AsString, which renders string concatenation objects then returns ob_sval.)

Again: this behavior is *never* observed by anyone outside of stringobject.c. External users of PyStringObjects call PyString_AS_STRING(), which renders all lazy concatenation and lazy slices so they look just like normal zero-terminated PyStringObjects. With my patch applied, trunk still passes all expected tests.

Of course, lazy slice objects aren't just for literal slices created with [x:y]. There are lots of string methods that return what are effectively string slices, like lstrip() and split().

With this code in place, string slices that aren't examined by modules are very rarely rendered. I ran "pybench -n 2" (two rounds, warp 10 (whatever that means)) while collecting some statistics. When it finished, the interpreter had created a total of 640,041 lazy slices, of which only *19* were ever rendered.

Apart from lazy slices, there's only one more enhancement when compared with v1: string prepending now reuses lazy concatenation objects much more often. There was an optimization in string_concatenate (Python/ceval.c) that said: "if the left-side string has two references, and we're about to overwrite the second reference by storing this concatenation to an object, tell that object to drop its reference". That often meant the reference on the string dropped to 1, which meant PyString_Resize could just resize the left-side string in place and append the right-side. I modified it so it drops the reference to the right-hand operand too.
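The borrow-don't-copy behavior described here — handing out a window into another object's buffer rather than a fresh copy — has a rough analogue in today's Python in memoryview. This is an analogy only, not part of the patch:

```python
data = b"abcde"
view = memoryview(data)[1:4]    # no copy: shares data's storage

# The view sees b"bcd" but owns no buffer of its own...
assert view.tobytes() == b"bcd"

# ...it just keeps the original object alive, the way a lazy slice
# keeps a reference to the PyStringObject it was cut from.
assert view.obj is data
```

Like the lazy slice, the view carries (base object, offset, length) and never guarantees anything about the byte one past its end.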
With this change, even with a reduction in the allowable stack depth for right-hand recursion (so it's less likely to blow the stack), I was able to prepend over 86k strings before it forced a render. (Oh, for the record: I ensure depth limits are enforced when combining lazy slices and lazy concatenations, so you still won't blow your stack when you mix them together.)

Here are the highlights of a single apples-to-apples pybench run, 2.6 trunk revision 52413 ("this") versus that same revision with my patch applied ("other"):

    Test                       minimum run-time          average run-time
                                this   other     diff     this   other     diff
    ---------------------------------------------------------------------------
    ConcatStrings:             204ms    76ms  +168.4%    213ms    77ms  +177.7%
    CreateStringsWithConcat:   159ms   138ms   +15.7%    163ms   142ms   +15.1%
    StringSlicing:             142ms    86ms   +65.5%    145ms    88ms   +64.6%
    ---------------------------------------------------------------------------
    Totals:                   7976ms  7713ms    +3.4%   8257ms  7975ms    +3.5%

I also ran this totally unfair benchmark:

    x = "abcde" * (20000) # 100k characters
    for i in xrange(10000000):
        y = x[1:-1]

and found my patched version to be 9759% faster. (You heard that right, 98x faster.)

I'm ready to post the patch. However, as a result of this work, the description on the original patch page is really no longer accurate:

http://sourceforge.net/tracker/index.php?func=detail&aid=1569040&group_id=5470&atid=305470

Shall I close/delete that patch and submit a new patch with a more modern description? After all, there's not a lot of activity on the old patch page...

Cheers,

/larry/

* As I recall, stringobject.c needs the trailing zero in exactly *one* place: when comparing two zero-length strings. My patch ensures that zero-length slices and concatenations still return nullstring, so this still works as expected.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20061020/0a891b49/attachment.html

From skip at pobox.com Sat Oct 21 06:42:05 2006
From: skip at pobox.com (skip at pobox.com)
Date: Fri, 20 Oct 2006 23:42:05 -0500
Subject: [Python-Dev] Massive test_sqlite failure on Mac OSX ... sometimes
Message-ID: <17721.42269.950166.859760@montanaro.dyndns.org>

Following up on my earlier post... I svn up'd both my g5 and my g4 powerbook (both running OSX 10.4.8, gcc 4.0.0 apple build 5026), built and tested both. The test suite completed fine on my powerbook, failed on the g5. I tried running regrtest.py twice more on the g5 with the -r flag. It failed the first time, succeeded the second. I then made a series of runs with the -f flag (thank you once again for that Señor Peters). I whittled it down to the following reliably failing pair:

    $ ./python.exe Lib/test/regrtest.py -l -f tests
    test_ctypes
    test_sqlite
    test test_sqlite failed -- errors occurred; run in verbose mode for details
    1 test OK.
    1 test failed:
        test_sqlite

For confirmation, this pair works fine on my g4 powerbook. I've gone no further so far. It's bedtime. Maybe someone else can at least try to reproduce what I've come up with so far on other platforms or on another Mac g5.

Skip

From fredrik at pythonware.com Sat Oct 21 08:23:17 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sat, 21 Oct 2006 08:23:17 +0200
Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS?
In-Reply-To: <4538708B.8070406@hastings.org>
References: <34FE2A7A34BC3544BC3127D023DF3D12128746@EWTEXCH.office.bhtrader.com> <452C6FD8.8070403@v.loewis.de> <4538708B.8070406@hastings.org>
Message-ID:

Larry Hastings wrote:
> I knocked out a prototype of this last week, emailed Mr. Lundh about it,
> then forgot about it.

It's on my TODO list, so I haven't forgotten about it, but I've been (as usual) busy with other stuff. I'll get there, sooner or later.
Posting this to the patch tracker and posting a note to the Py3K mailing list could be a good idea. From talin at acm.org Sat Oct 21 08:37:45 2006 From: talin at acm.org (Talin) Date: Fri, 20 Oct 2006 23:37:45 -0700 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <453985ED.7050303@hastings.org> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> Message-ID: <4539C039.90406@acm.org> Interesting - is it possible that the same technique could be used to hide differences in character width? Specifically, if I concatenate an ascii string with a UTF-32 string, can the up-conversion to UTF-32 also be done lazily? If that could be done efficiently, it would resolve some outstanding issues that have come up on the Python-3000 list with regards to str/unicode convergence. Larry Hastings wrote: > > I've significantly enhanced my string-concatenation patch, to the point > where that name is no longer accurate. So I've redubbed it the "lazy > strings" patch. > > The major new feature is that string *slices* are also represented with > a lazy-evaluation placeholder for the actual string, just as > concatenated strings were in my original patch. The lazy slice object > stores a reference to the original PyStringObject * it is sliced from, > and the desired start and stop slice markers. (It only supports step = > 1.) Its ob_sval is NULL until the string is rendered--but that rarely > happens! Not only does this mean string slices are faster, but I bet > this generally reduces overall memory usage for slices too. > > Now, one rule of the Python programming API is that "all strings are > zero-terminated". That part of makes the life of a Python extension > author sane--they don't have to deal with some exotic Python string > class, they can just assume C-style strings everywhere. 
Ordinarily, > this means a string slice couldn't simply point into the original > string; if it did, and you executed > x = "abcde" > y = x[1:4] > internally y->ob_sval[3] would not be 0, it would be 'e', breaking the > API's rule about strings. > > However! When a PyStringObject lives out its life purely within the > Python VM, the only code that strenuously examines its internals is > stringobject.c. And that code almost never needs the trailing zero*. > So I've added a new static method in stringobject.c: > char * PyString_AsUnterminatedString(PyStringObject *) > If you call it on a lazy-evaluation slice object, it gives you back a > pointer into the original string's ob_sval. The s->ob_size'th element > of this *might not* be zero, but if you call this function you're saying > that's a-okay, you promise not to look at it. (If the PyStringObject * > is any other variety, it calls into PyString_AsString, which renders > string concatenation objects then returns ob_sval.) > > Again: this behavior is *never* observed by anyone outside of > stringobject.c. External users of PyStringObjects call > PyString_AS_STRING(), which renders all lazy concatenation and lazy > slices so they look just like normal zero-terminated PyStringObjects. > With my patch applied, trunk still passes all expected tests. > > Of course, lazy slice objects aren't just for literal slices created > with [x:y]. There are lots of string methods that return what are > effectively string slices, like lstrip() and split(). > > With this code in place, string slices that aren't examined by modules > are very rarely rendered. I ran "pybench -n 2" (two rounds, warp 10 > (whatever that means)) while collecting some statistics. When it > finished, the interpreter had created a total of 640,041 lazy slices, of > which only *19* were ever rendered. 
> > > Apart from lazy slices, there's only one more enhancement when compared > with v1: string prepending now reuses lazy concatenation objects much > more often. There was an optimization in string_concatenate > (Python/ceval.c) that said: "if the left-side string has two references, > and we're about to overwrite the second reference by storing this > concatenation to an object, tell that object to drop its reference". > That often meant the reference on the string dropped to 1, which meant > PyString_Resize could just resize the left-side string in place and > append the right-side. I modified it so it drops the reference to the > right-hand operand too. With this change, even with a reduction in the > allowable stack depth for right-hand recursion (so it's less likely to > blow the stack), I was able to prepend over 86k strings before it forced > a render. (Oh, for the record: I ensure depth limits are enforced when > combining lazy slices and lazy concatenations, so you still won't blow > your stack when you mix them together.)
>
> Here are the highlights of a single apples-to-apples pybench run, 2.6 > trunk revision 52413 ("this") versus that same revision with my patch > applied ("other"):
>
> Test                        minimum run-time         average run-time
>                             this    other    diff    this    other    diff
> -------------------------------------------------------------------------------
> ConcatStrings:              204ms    76ms  +168.4%   213ms    77ms  +177.7%
> CreateStringsWithConcat:    159ms   138ms   +15.7%   163ms   142ms   +15.1%
> StringSlicing:              142ms    86ms   +65.5%   145ms    88ms   +64.6%
> -------------------------------------------------------------------------------
> Totals:                    7976ms  7713ms    +3.4%  8257ms  7975ms    +3.5%
>
> I also ran this totally unfair benchmark:
>     x = "abcde" * (20000) # 100k characters
>     for i in xrange(10000000):
>         y = x[1:-1]
> and found my patched version to be 9759% faster. (You heard that right, > 98x faster.)
>
> I'm ready to post the patch.
However, as a result of this work, the > description on the original patch page is really no longer accurate: > > http://sourceforge.net/tracker/index.php?func=detail&aid=1569040&group_id=5470&atid=305470 > > Shall I close/delete that patch and submit a new patch with a more > modern description? After all, there's not a lot of activity on the old > patch page... > > > Cheers, > > > /larry/ > > * As I recall, stringobject.c needs the trailing zero in exactly *one* > place: when comparing two zero-length strings. My patch ensures that > zero-length slices and concatenations still return nullstring, so this > still works as expected. > > > ------------------------------------------------------------------------ > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/talin%40acm.org From fredrik at pythonware.com Sat Oct 21 09:10:19 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat, 21 Oct 2006 09:10:19 +0200 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <4539C039.90406@acm.org> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> <4539C039.90406@acm.org> Message-ID: Talin wrote: > Interesting - is it possible that the same technique could be used to > hide differences in character width? Specifically, if I concatenate an > ascii string with a UTF-32 string, can the up-conversion to UTF-32 also > be done lazily? of course. and if all you do with the result is write it to an UTF-8 stream, it doesn't need to be done at all. this requires a slightly more elaborate C-level API interface than today's PyString_AS_STRING API, though... 
(which is why this whole exercise belongs on the Python 3000 lists, not on python-dev for 2.X) From martin at v.loewis.de Sat Oct 21 09:59:30 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 21 Oct 2006 09:59:30 +0200 Subject: [Python-Dev] The "lazy strings" patch [was: PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom] In-Reply-To: <453985ED.7050303@hastings.org> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> Message-ID: <4539D362.9010909@v.loewis.de> Larry Hastings schrieb: > I've significantly enhanced my string-concatenation patch, to the point > where that name is no longer accurate. So I've redubbed it the "lazy > strings" patch. It's not clear to me what you want to achieve with these patches, in particular, whether you want to see them integrated into Python or not. > The major new feature is that string *slices* are also represented with > a lazy-evaluation placeholder for the actual string, just as > concatenated strings were in my original patch. The lazy slice object > stores a reference to the original PyStringObject * it is sliced from, > and the desired start and stop slice markers. (It only supports step = > 1.) I think this specific approach will find strong resistance. It has been implemented many times, e.g. (apparently) in NextStep's NSString, and in Java's string type (where a string holds a reference to a character array, a start index, and an end index). Most recently, it was discussed under the name "string view" on the Py3k list, see http://mail.python.org/pipermail/python-3000/2006-August/003282.html Traditionally, the biggest objection is that even small strings may consume insane amounts of memory. > Its ob_sval is NULL until the string is rendered--but that rarely > happens! Not only does this mean string slices are faster, but I bet > this generally reduces overall memory usage for slices too. 
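The traditional memory objection is easy to demonstrate with any slice-that-references scheme; a tiny hypothetical View class (invented for this example, not the patch's code) shows a five-character "slice" pinning a ten-megabyte buffer:

```python
class View:
    """Hypothetical slice-as-reference: stores source + bounds, no copy."""
    def __init__(self, source, start, stop):
        self.source, self.start, self.stop = source, start, stop


big = "x" * (10 * 1024 * 1024)   # a ~10 MB string
tiny = View(big, 0, 5)           # logically just five characters
del big                          # the 10 MB buffer is NOT freed:
assert len(tiny.source) == 10 * 1024 * 1024   # the view still pins it
```

As long as `tiny` is alive, the whole original buffer stays alive with it; that is the failure mode the NSString and Java precedents ran into.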
Channeling Guido: what real-world applications did you study with this patch to make such a claim? > I'm ready to post the patch. However, as a result of this work, the > description on the original patch page is really no longer accurate: > > http://sourceforge.net/tracker/index.php?func=detail&aid=1569040&group_id=5470&atid=305470 > Shall I close/delete that patch and submit a new patch with a more > modern description? After all, there's not a lot of activity on the old > patch page... Closing the issue and opening a new one is fine. Regards, Martin From martin at v.loewis.de Sat Oct 21 19:24:58 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 21 Oct 2006 19:24:58 +0200 Subject: [Python-Dev] OT: fdopen on Windows question In-Reply-To: <17721.9742.690489.77309@montanaro.dyndns.org> References: <17721.9742.690489.77309@montanaro.dyndns.org> Message-ID: <453A57EA.4020201@v.loewis.de> skip at pobox.com schrieb: > That is, can I change stdin from text to binary this way or is it destined > to always be in text mode? You can call _setmode on the file descriptor. Regards, Martin From skip at pobox.com Sat Oct 21 20:03:09 2006 From: skip at pobox.com (skip at pobox.com) Date: Sat, 21 Oct 2006 13:03:09 -0500 Subject: [Python-Dev] Massive test_sqlite failure on Mac OSX ... sometimes Message-ID: <17722.24797.806588.338942@montanaro.dyndns.org> Followup #2... Yesterday I whittled my problems with test_sqlite on my OSX g5 to test_ctypes and test_sqlite:

    ./python.exe Lib/test/regrtest.py -l -f tests
    test_ctypes
    test_sqlite
    test test_sqlite failed -- errors occurred; run in verbose mode for details
    1 test OK.
    1 test failed:
        test_sqlite

Today I refined things further. I renamed all the test_*.py files in Lib/ctypes/test/ until all I was left with was test_find.py.
It fails if that's the only ctypes test script run:

    $ ls -l *.py
    -rw------- 1 buildbot buildbot 6870 Oct 20 06:30 __init__.py
    -rw------- 1 buildbot buildbot 624 Oct 20 06:30 runtests.py
    -rw------- 1 buildbot buildbot 3463 Oct 21 12:52 test_find.py
    montanaro:~/pybot/trunk.montanaro-g5/build/Lib/ctypes/test buildbot$ cd -
    /Library/Buildbot/pybot/trunk.montanaro-g5/build
    montanaro:~/pybot/trunk.montanaro-g5/build buildbot$ ./python.exe Lib/test/regrtest.py -l -f tests
    test_ctypes
    test_sqlite
    test test_sqlite failed -- errors occurred; run in verbose mode for details
    1 test OK.
    1 test failed:
        test_sqlite

test_find.py contains checks for three OpenGL libraries on darwin: gl, glu and glut. If I comment out all those tests, test_sqlite succeeds. If any of them are enabled, test_sqlite fails. I've taken this about as far as I can. I submitted a bug report here: http://python.org/sf/1581906 Skip From janssen at parc.com Sat Oct 21 19:58:33 2006 From: janssen at parc.com (Bill Janssen) Date: Sat, 21 Oct 2006 10:58:33 PDT Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: Your message of "Fri, 20 Oct 2006 23:37:45 PDT." <4539C039.90406@acm.org> Message-ID: <06Oct21.105836pdt."58648"@synergy1.parc.xerox.com> See also the Cedar Ropes work: http://www.cs.ubc.ca/local/reading/proceedings/spe91-95/spe/vol25/issue12/spe986.pdf Bill From martin at v.loewis.de Sat Oct 21 20:10:10 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 21 Oct 2006 20:10:10 +0200 Subject: [Python-Dev] ctypes and win64 In-Reply-To: <45391C8B.1070605@ctypes.org> References: <45391098.8020906@v.loewis.de> <45391C8B.1070605@ctypes.org> Message-ID: <453A6282.5090804@v.loewis.de> Thomas Heller schrieb: > The official Python-2.5.amd64.msi does *not* contain ctypes, so > the official Python-2.5.x.amd64.msi should also not contain ctypes (I assume). That would be good, yes.
> Not many people (I assume again) are running 64-bit windows, and use the 64-bit Python > version I also agree. > - but that will probably change soon. It's speculation either way, but I disagree. It will take several years until people widely use Win64. For the foreseeable future, there are too many inconveniences to make it practical. > I would like to merge the 64-bit windows related ctypes changes in trunk, as soon as > I'm sure that they work, back into the release25-maint branch. And also make separate > ctypes releases from the release25-maint source code. I will only backport these changes > if I'm convinced that they do not change the functionality of the current code. I understand this. Still, integrating such changes formally introduces a new feature to the 2.5 branch (even though the feature isn't exposed readily). Whether or not this is ok is for the release manager to decide. What I don't understand is what the "64-bit windows related ctypes changes" are. Do they include changes to the PCbuild directory? Regards, Martin From jcarlson at uci.edu Sun Oct 22 00:02:14 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Sat, 21 Oct 2006 15:02:14 -0700 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <453985ED.7050303@hastings.org> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> Message-ID: <20061021111107.0A5F.JCARLSON@uci.edu> Larry Hastings wrote: > > I've significantly enhanced my string-concatenation patch, to the point > where that name is no longer accurate. So I've redubbed it the "lazy > strings" patch. [snip] Honestly, I don't believe that pure strings should be this complicated. The implementation of the standard string and unicode type should be as simple as possible. The current string and unicode implementations are, in my opinion, as simple as possible given Python's needs.
As such, I don't see a need to go mucking about with the standard string implementation to make it "lazy" so as to increase performance, reduce memory consumption, etc.. However, having written a somewhat "lazy" string slicing/etc operation class I called a "string view", whose discussion and implementation can be found in the py3k list, I do believe that having a related type, perhaps with the tree-based implementation you have written, or a simple pointer + length variant like I have written, would be useful to have available to Python. I also believe that it smells like a Py3k feature, which suggests that you should toss the whole string reliance and switch to unicode, as str and unicode become bytes and text in Py3k, with bytes being mutable. - Josiah From mark at pandapocket.com Sun Oct 22 00:54:03 2006 From: mark at pandapocket.com (=?utf-8?Q?Mark=20Roberts?=) Date: Sat, 21 Oct 2006 22:54:03 +0000 Subject: [Python-Dev] The "lazy strings" patch Message-ID: <20061021225403.4892.qmail@s402.sureserver.com> Hmm, I have not viewed the patch in question, but I'm curious why we wouldn't want to include such a patch if it were transparent to the user (Python based or otherwise). Especially if it increased performance without sacrificing maintainability or elegance. Further considering the common usage of strings in usual programming, I fail to see why an implementation like this would not be desirable? If there's a widely recognized argument against this, a link will likely sate my curiosity. Thanks, Mark > -------Original Message------- > From: Josiah Carlson > Subject: Re: [Python-Dev] The "lazy strings" patch > Sent: 21 Oct '06 22:02 > > > Larry Hastings wrote: > > > > I've significantly enhanced my string-concatenation patch, to the point > > where that name is no longer accurate. So I've redubbed it the "lazy > > strings" patch. > [snip] > > Honestly, I don't believe that pure strings should be this complicated. 
> The implementation of the standard string and unicode type should be as > simple as possible. The current string and unicode implementations are, > in my opinion, as simple as possible given Python's needs. > > As such, I don't see a need to go mucking about with the standard string > implementation to make it "lazy" so as to increase performance, reduce > memory consumption, etc.. However, having written a somewhat "lazy" > string slicing/etc operation class I called a "string view", whose > discussion and implementation can be found in the py3k list, I do > believe that having a related type, perhaps with the tree-based > implementation you have written, or a simple pointer + length variant > like I have written, would be useful to have available to Python. > > I also believe that it smells like a Py3k feature, which suggests that > you should toss the whole string reliance and switch to unicode, as str > and unicode become bytes and text in Py3k, with bytes being mutable. > > > - Josiah > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/mark%40pandapocket.com > From guido at python.org Sun Oct 22 01:50:12 2006 From: guido at python.org (Guido van Rossum) Date: Sat, 21 Oct 2006 16:50:12 -0700 Subject: [Python-Dev] Modulefinder In-Reply-To: References: Message-ID: Could you also prepare a patch for the p3yk branch? It's broken there too... On 10/13/06, Thomas Heller wrote: > I have patched Lib/modulefinder.py to work with absolute and relative imports. > It also is faster now, and has basic unittests in Lib/test/test_modulefinder.py. > > The work was done in a theller_modulefinder SVN branch. > If nobody objects, I will merge this into trunk, and possibly also into release25-maint, when I have time. 
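For context, the ModuleFinder API Thomas is patching (standard library, unchanged in its essentials since then) can be exercised like this under a modern Python:

```python
import os
import tempfile
from modulefinder import ModuleFinder

# Write a tiny script and ask ModuleFinder which modules it pulls in.
tmpdir = tempfile.mkdtemp()
script = os.path.join(tmpdir, "demo.py")
with open(script, "w") as f:
    f.write("import os\n")

finder = ModuleFinder()
finder.run_script(script)       # scans the bytecode, follows imports
assert "os" in finder.modules   # the absolute import was found
```

After `run_script` returns, `finder.modules` maps every discovered module name to a Module record, which is what tools like py2exe walk to decide what to bundle.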
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From jjl at pobox.com Sun Oct 22 02:04:10 2006 From: jjl at pobox.com (John J Lee) Date: Sun, 22 Oct 2006 00:04:10 +0000 (UTC) Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <20061021225403.4892.qmail@s402.sureserver.com> References: <20061021225403.4892.qmail@s402.sureserver.com> Message-ID: On Sat, 21 Oct 2006, Mark Roberts wrote: [...] > If there's a widely recognized argument against this, a link will likely > sate my curiosity. Quoting from Martin v. Loewis earlier on the same day you posted: """ I think this specific approach will find strong resistance. It has been implemented many times, e.g. (apparently) in NextStep's NSString, and in Java's string type (where a string holds a reference to a character array, a start index, and an end index). Most recently, it was discussed under the name "string view" on the Py3k list, see http://mail.python.org/pipermail/python-3000/2006-August/003282.html Traditionally, the biggest objection is that even small strings may consume insane amounts of memory. """ John From ndunn at ndunn.com Sat Oct 21 16:13:48 2006 From: ndunn at ndunn.com (Neil Dunn) Date: Sat, 21 Oct 2006 15:13:48 +0100 Subject: [Python-Dev] Optional type checking/pluggable type systems for Python Message-ID: Dear All I'm a Master's student at Imperial College London currently selecting a Master's thesis subject. I am exploring the possibility of "optional typing" and "pluggable type systems" (Bracha) for Python. Reading around I see that PEP 246 (object adaption) was dropped for "something better". Is this "something better" currently in production for Python 3000 or just a thinking ground. I'd like to know whether there would be any merit in exploring the project or whether this is something that is going to appear as implementation within the next 6 months (the length of my thesis). 
If you think it is still something worth exploring I'd plan to pick up the idea as a research project and explore implementations, probably in CPython or Jython. Any help with this would be great, could you please reply directly to ndunn at ndunn.com as I haven't subscribed to python-dev for a while now. Thanks, Neil Dunn From aahz at pythoncraft.com Sun Oct 22 05:58:25 2006 From: aahz at pythoncraft.com (Aahz) Date: Sat, 21 Oct 2006 20:58:25 -0700 Subject: [Python-Dev] Optional type checking/pluggable type systems for Python In-Reply-To: References: Message-ID: <20061022035825.GA20602@panix.com> On Sat, Oct 21, 2006, Neil Dunn wrote: > > Any help with this would be great, could you please reply directly to > ndunn at ndunn.com as I haven't subscribed to python-dev for a while now. You should also post this to the python-3000 list; the lists do not all have the same readership. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "If you don't know what your program is supposed to do, you'd better not start writing it." --Dijkstra From tjreedy at udel.edu Sun Oct 22 06:04:34 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 22 Oct 2006 00:04:34 -0400 Subject: [Python-Dev] Optional type checking/pluggable type systems for Python References: Message-ID: "Neil Dunn" wrote in message news:f56dda5c0610210713k7c500637w25483e473ed263bb at mail.gmail.com... > Dear All > > I'm a Master's student at Imperial College London currently selecting > a Master's thesis subject. I am exploring the possibility of "optional > typing" and "pluggable type systems" (Bracha) for Python. Reading > around I see that PEP 246 (object adaption) was dropped for "something > better". Is this "something better" currently in production for Python > 3000 or just a thinking ground. Thinking, as far as I know.
> I'd like to know whether there would be any merit in exploring the > project or whether this is something that is going to appear as > implementation within the next 6 months (the length of my thesis). > > If you think it is still something worth exploring I'd plan to pick up > the idea as a research project and explore implementations, probabaly > in CPython or Jython. > > Any help with this would be great, could you please reply directly to > ndunn at ndunn.com as I haven't subscribed to python-dev for a while now. You can follow both python-dev and py3000 lists as newsgroups via news.gmane.org. It also has archives. From ronaldoussoren at mac.com Sun Oct 22 10:06:05 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Sun, 22 Oct 2006 10:06:05 +0200 Subject: [Python-Dev] Massive test_sqlite failure on Mac OSX ... sometimes In-Reply-To: <17722.24797.806588.338942@montanaro.dyndns.org> References: <17722.24797.806588.338942@montanaro.dyndns.org> Message-ID: <660FA164-9592-44C9-9BF1-7B716366BBBC@mac.com> On Oct 21, 2006, at 8:03 PM, skip at pobox.com wrote: > Followup #2... > > Yesterday I whittled my problems with test_sqlite on my OSX g5 to > test_ctypes and test_sqlite: > > ./python.exe Lib/test/regrtest.py -l -f tests > test_ctypes > test_sqlite > test test_sqlite failed -- errors occurred; run in verbose mode > for details > 1 test OK. > 1 test failed: > test_sqlite > > Today I refined things further. I renamed all the test_*.py files in > Lib/ctypes/test/ until all I was left with was test_find.py. 
It > fails if > that's the only ctypes test script run: > > $ ls -l *.py > -rw------- 1 buildbot buildbot 6870 Oct 20 06:30 __init__.py > -rw------- 1 buildbot buildbot 624 Oct 20 06:30 runtests.py > -rw------- 1 buildbot buildbot 3463 Oct 21 12:52 test_find.py > montanaro:~/pybot/trunk.montanaro-g5/build/Lib/ctypes/test > buildbot$ cd - > /Library/Buildbot/pybot/trunk.montanaro-g5/build > montanaro:~/pybot/trunk.montanaro-g5/build buildbot$ ./ > python.exe Lib/test/regrtest.py -l -f tests > test_ctypes > test_sqlite > test test_sqlite failed -- errors occurred; run in verbose mode > for details > 1 test OK. > 1 test failed: > test_sqlite > > test_find.py contains checks for three OpenGL libraries on darwin: > gl, glu > and glut. If I comment out all those tests, test_sqlite succeeds. > If any > of them are enabled, test_sqlite fails. According to a comment in (IIRC) the pyOpenGL sources GLUT on OSX does a chdir() during initialization, that could be the problem here. Ronald -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3562 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061022/48cacba4/attachment.bin From ronaldoussoren at mac.com Sun Oct 22 12:54:54 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Sun, 22 Oct 2006 12:54:54 +0200 Subject: [Python-Dev] readlink and unicode strings (SF:1580674) Patch http://www.python.org/sf/1580674 fixes readlink's behaviour w.r.t. Unicode strings: without this patch this function uses the system default encoding instead of the filesystem encoding to convert Unicode objects to plain strings. Like os.listdir, os.readlink will now return a Unicode object when the argument is a Unicode object. What I'd like to know is if this can be backported to the 2.5 branch. 
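For what it's worth, the behaviour proposed in this patch is what Python 3 eventually standardised on: os.readlink decodes with the filesystem encoding and mirrors its argument type. A quick POSIX demonstration under a modern Python:

```python
import os
import tempfile

d = tempfile.mkdtemp()
target = os.path.join(d, "target")
link = os.path.join(d, "link")
open(target, "w").close()
os.symlink(target, link)

assert isinstance(os.readlink(link), str)                 # str in -> str out
assert isinstance(os.readlink(os.fsencode(link)), bytes)  # bytes in -> bytes out

# The realpath breakage described above goes away once readlink round-trips:
assert os.path.realpath(link) == os.path.realpath(target)
```

Passing a str path gives a str result decoded with `sys.getfilesystemencoding()`, and passing bytes gives raw bytes, so callers choose which world they live in.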
The first part of this patch (use filesystem encoding instead of the system encoding) is IMHO a bugfix, the second part might break existing applications (that might not expect a unicode result from os.readlink). The reason I did this patch is that os.path.realpath currently breaks when the path is a unicode string with non-ascii characters and at least one element of the path is a symlink. Ronald Message-ID: Patch http://www.python.org/sf/1580674 fixes readlink's behaviour w.r.t. Unicode strings: without this patch this function uses the system default encoding instead of the filesystem encoding to convert Unicode objects to plain strings. Like os.listdir, os.readlink will now return a Unicode object when the argument is a Unicode object. What I'd like to know is if this can be backported to the 2.5 branch. The first part of this patch (use filesystem encoding instead of the system encoding) is IMHO a bugfix, the second part might break existing applications (that might not expect a unicode result from os.readlink). The reason I did this patch is that os.path.realpath currently breaks when the path is a unicode string with non-ascii characters and at least one element of the path is a symlink. Ronald -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3562 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061022/b5aee3b7/attachment.bin From mal at egenix.com Sun Oct 22 13:02:21 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Sun, 22 Oct 2006 13:02:21 +0200 Subject: [Python-Dev] readlink and unicode strings (SF:1580674) In-Reply-To: References: Message-ID: <453B4FBD.2010504@egenix.com> Ronald Oussoren wrote: > Patch http://www.python.org/sf/1580674 fixes readlink's behaviour w.r.t. > Unicode strings: without this patch this function uses the system > default encoding instead of the filesystem encoding to convert Unicode > objects to plain strings. Like os.listdir, os.readlink will now return a > Unicode object when the argument is a Unicode object. > > What I'd like to know is if this can be backported to the 2.5 branch. > The first part of this patch (use filesystem encoding instead of the > system encoding) is IMHO a bugfix, the second part might break existing > applications (that might not expect a unicode result from os.readlink). > > The reason I did this patch is that os.path.realpath currently breaks > when the path is a unicode string with non-ascii characters and at least > one element of the path is a symlink. I don't think that an application that passes a Unicode object to os.readlink() would have problems dealing with a Unicode return value. +1 on backporting it to 2.5. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 22 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ...
http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From ronaldoussoren at mac.com Sun Oct 22 14:16:26 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Sun, 22 Oct 2006 14:16:26 +0200 Subject: [Python-Dev] readlink and unicode strings (SF:1580674) In-Reply-To: References: Message-ID: On Oct 22, 2006, at 12:54 PM, Ronald Oussoren wrote a message with an annoyingly large subject... Sorry about that, I guess it's time to book a course on basic computer usage :-( Ronald -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3562 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061022/c8f8260e/attachment.bin From skip at pobox.com Sun Oct 22 14:51:27 2006 From: skip at pobox.com (skip at pobox.com) Date: Sun, 22 Oct 2006 07:51:27 -0500 Subject: [Python-Dev] Massive test_sqlite failure on Mac OSX ... sometimes In-Reply-To: <660FA164-9592-44C9-9BF1-7B716366BBBC@mac.com> References: <17722.24797.806588.338942@montanaro.dyndns.org> <660FA164-9592-44C9-9BF1-7B716366BBBC@mac.com> Message-ID: <17723.26959.831139.587804@montanaro.dyndns.org> Ronald> According to a comment in (IIRC) the pyOpenGL sources GLUT on Ronald> OSX does a chdir() during initialization, that could be the Ronald> problem here. How would that explain that it fails on my g5 but not on my powerbook? They are at the same revision of the operating system and compiler. The checksums on the libraries are different though the file sizes are the same. The dates on the files are different as well. I suspect the checksum difference is caused by the different upgrade dates of the two machines and the resulting different times the two systems were "optimized". 
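Checksumming the suspect libraries on both machines is easy to script; a small helper along these lines (an illustrative sketch, not code from the thread) would pin down whether the two machines really have different binaries:

```python
import hashlib

def file_digest(path, algo="sha256"):
    """Hex digest of a file's contents, read in 64 KB chunks so large
    dylibs don't have to fit in memory."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()
```

Running `file_digest("/usr/lib/libsqlite3.0.dylib")` on the g5 and the powerbook and comparing the two strings would separate "different binaries" from "different link order" as the cause.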
Is there anyone else with a g5 who can do a vanilla Unix (not framework) build on an up-to-date g5 from an up-to-date Subversion repository? It would be nice if someone else could at least confirm or not confirm this problem. Skip From brett at python.org Sun Oct 22 17:20:17 2006 From: brett at python.org (Brett Cannon) Date: Sun, 22 Oct 2006 08:20:17 -0700 Subject: [Python-Dev] PSF Infrastructure has chosen Roundup as the issue tracker for Python development In-Reply-To: References: Message-ID: Forgot to send this to python-dev. =) ---------- Forwarded message ---------- From: Brett Cannon Date: Oct 20, 2006 1:35 PM Subject: PSF Infrastructure has chosen Roundup as the issue tracker for Python development To: python-list at python.org At the beginning of the month the PSF Infrastructure committee announced that we had reached the decision that JIRA was our recommendation for the next issue tracker for Python development. Realizing, though, that it was a tough call between JIRA and Roundup we said that we would be willing to switch our recommendation to Roundup if enough volunteers stepped forward to help administer the tracker, thus negating Atlassian's offer of free managed hosting. Well, the community stepped up to the challenge and we got plenty of volunteers! In fact, the call for volunteers has led to an offer for professional hosting for Roundup from Upfront Systems. The committee is currently evaluating that offer and will hopefully have a decision made soon. Once a decision has been made we will contact the volunteers as to whom we have selected to help administer the installation (regardless of who hosts the tracker). The administrators and python-dev can then begin working towards deciding what we want from the tracker and its configuration. Once again, thanks to the volunteers for stepping forward to make this happen! -Brett Cannon PSF Infrastructure committee chairman -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/python-dev/attachments/20061022/d560e566/attachment.htm From exarkun at divmod.com Sun Oct 22 17:48:23 2006 From: exarkun at divmod.com (Jean-Paul Calderone) Date: Sun, 22 Oct 2006 11:48:23 -0400 Subject: [Python-Dev] Massive test_sqlite failure on Mac OSX ... sometimes In-Reply-To: <17723.26959.831139.587804@montanaro.dyndns.org> Message-ID: <20061022154823.26151.349628906.divmod.quotient.10044@ohm> On Sun, 22 Oct 2006 07:51:27 -0500, skip at pobox.com wrote: > > Ronald> According to a comment in (IIRC) the pyOpenGL sources GLUT on > Ronald> OSX does a chdir() during initialization, that could be the > Ronald> problem here. > >How would that explain that it fails on my g5 but not on my powerbook? They >are at the same revision of the operating system and compiler. The >checksums on the libraries are different though the file sizes are the same. >The dates on the files are different as well. I suspect the checksum >difference is caused by the different upgrade dates of the two machines and >the resulting different times the two systems were "optimized". > >Is there anyone else with a g5 who can do a vanilla Unix (not framework) >build on an up-to-date g5 from an up-to-date Subversion repository? It >would be nice if someone else could at least confirm or not confirm this >problem. Robert Gravina has seen a problem which bears some resemblance to this one while using PySQLite in a real application on OS X. I've pointed him to this thread; hopefully it's the same issue and a second way of producing the issue will shed some more light on the matter. 
The top of that thread is available here: http://divmod.org/users/mailman.twistd/pipermail/divmod-dev/2006-October/000707.html Jean-Paul From anthony at interlink.com.au Sun Oct 22 18:03:13 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Mon, 23 Oct 2006 02:03:13 +1000 Subject: [Python-Dev] PSF Infrastructure has chosen Roundup as the issue tracker for Python development In-Reply-To: References: Message-ID: <200610230203.16907.anthony@interlink.com.au> Thanks to the folks involved in this process - I'm looking forward to getting the hell away from SF's bug tracker. :-) Anthony From talin at acm.org Sun Oct 22 20:29:26 2006 From: talin at acm.org (Talin) Date: Sun, 22 Oct 2006 11:29:26 -0700 Subject: [Python-Dev] PSF Infrastructure has chosen Roundup as the issue tracker for Python development In-Reply-To: <200610230203.16907.anthony@interlink.com.au> References: <200610230203.16907.anthony@interlink.com.au> Message-ID: <453BB886.4080402@acm.org> Anthony Baxter wrote: > Thanks to the folks involved in this process - I'm looking forward to getting > the hell away from SF's bug tracker. :-) Yes, let us know when the new tracker is up, I want to start using it :) From barry at python.org Mon Oct 23 02:53:55 2006 From: barry at python.org (Barry Warsaw) Date: Sun, 22 Oct 2006 20:53:55 -0400 Subject: [Python-Dev] Massive test_sqlite failure on Mac OSX ... sometimes In-Reply-To: <17723.26959.831139.587804@montanaro.dyndns.org> References: <17722.24797.806588.338942@montanaro.dyndns.org> <660FA164-9592-44C9-9BF1-7B716366BBBC@mac.com> <17723.26959.831139.587804@montanaro.dyndns.org> Message-ID: <39070ED3-0989-43B2-BC36-35199EF67CC8@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Oct 22, 2006, at 8:51 AM, skip at pobox.com wrote: > Is there anyone else with a g5 who can do a vanilla Unix (not > framework) > build on an up-to-date g5 from an up-to-date Subversion > repository?
It > would be nice if someone else could at least confirm or not confirm > this > problem. By "vanilla unix" you mean a pretty simple ./configure; make; make test? Works for me with Python 2.5 on both my G5s and Intel Macs, all running 10.4.8. Note though that I usually build with CPPFLAGS and LDFLAGS pointing to /opt/local in order to pick up DarwinPorts readline, and if you do the same and have a version of sqlite from there, you can have problems. For example, we were seeing some very odd infloops in our sqlite layer. We have our own version of sqlite that we expected to be dynamically linked against, but when I used otool -L to check it, I realized we were dynamically linked against a version of sqlite in DarwinPorts. Getting rid of the unnecessary DarwinPorts version and making sure that we were dynamically linking against our version eliminated the infloops. What do you get when you check _sqlite3?
% otool -L build/lib.macosx-10.3-ppc-2.5/_sqlite3.so
build/lib.macosx-10.3-ppc-2.5/_sqlite3.so:
        /usr/lib/libsqlite3.0.dylib (compatibility version 9.0.0, current version 9.6.0)
        /usr/lib/libmx.A.dylib (compatibility version 1.0.0, current version 92.0.0)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 88.1.7)
Any possibility something like that's going on? - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRTwSqHEjvBPtnXfVAQLvwQP/VuTQwwXwsauiuQt8E3k05scWsykarLaZ YMJyVwq++DH/X8C5RODG9seYhSMQLF8PKMStmhKWLmlQ9mfFPIobMgsFqXBuI+bD njUOh74O6vcJw1RNKXaERdQ6ABb2t79S6w+Psu5hGOP1NDy/e9GQazw05HpJWWvG 7Py+bDt24oE= =9TjL -----END PGP SIGNATURE----- From skip at pobox.com Mon Oct 23 05:24:07 2006 From: skip at pobox.com (skip at pobox.com) Date: Sun, 22 Oct 2006 22:24:07 -0500 Subject: [Python-Dev] Massive test_sqlite failure on Mac OSX ...
sometimes In-Reply-To: <39070ED3-0989-43B2-BC36-35199EF67CC8@python.org> References: <17722.24797.806588.338942@montanaro.dyndns.org> <660FA164-9592-44C9-9BF1-7B716366BBBC@mac.com> <17723.26959.831139.587804@montanaro.dyndns.org> <39070ED3-0989-43B2-BC36-35199EF67CC8@python.org> Message-ID: <17724.13783.912914.43233@montanaro.dyndns.org> Barry> What do you get when you check _sqlite3?
$ otool -L ./build/lib.mac-10.3-ppc-2.6/_sqlite3.so
./build/lib.macosx-10.3-ppc-2.6/_sqlite3.so:
        /usr/local/lib/libsqlite3.0.dylib (compatibility version 9.0.0, current version 9.6.0)
        /usr/lib/libmx.A.dylib (compatibility version 1.0.0, current version 93.0.0)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 88.1.7)
Which I apparently installed on Oct 15 but seem to have forgotten... According to the source in my directory, it's sqlite 3.3.8. On my powerbook it's linked against /usr/lib/libsqlite3.0.dylib... Make clean, run the failing test pair, now it's fine. Otool shows linkage against /usr/lib/libsqlite3.0.dylib...:
$ otool -L ./build/lib.macosx-10.3-ppc-2.6/_sqlite3.so
./build/lib.macosx-10.3-ppc-2.6/_sqlite3.so:
        /usr/lib/libsqlite3.0.dylib (compatibility version 9.0.0, current version 9.6.0)
        /usr/lib/libmx.A.dylib (compatibility version 1.0.0, current version 93.0.0)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 88.1.7)
According to /usr/include/sqlite3.h, what's installed by Apple is 3.1.3. Aside from the possibility that I somehow compiled against /usr/include/sqlite3.h and linked against /usr/local/lib/libsqlite3.0.dylib, what difference should 3.3.8 vs. 3.1.3 have made? Skip From barry at python.org Mon Oct 23 05:52:21 2006 From: barry at python.org (Barry Warsaw) Date: Sun, 22 Oct 2006 23:52:21 -0400 Subject: [Python-Dev] Massive test_sqlite failure on Mac OSX ...
sometimes In-Reply-To: <17724.13783.912914.43233@montanaro.dyndns.org> References: <17722.24797.806588.338942@montanaro.dyndns.org> <660FA164-9592-44C9-9BF1-7B716366BBBC@mac.com> <17723.26959.831139.587804@montanaro.dyndns.org> <39070ED3-0989-43B2-BC36-35199EF67CC8@python.org> Message-ID: <37C9161F-AD7B-44B7-9B81-2E8DE400E6EB@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Oct 22, 2006, at 11:24 PM, skip at pobox.com wrote: > According to /usr/include/sqlite3.h, what's installed by Apple is > 3.1.3. > Aside from the possibility that I somehow compiled against > /usr/include/sqlite3.h and linked against /usr/local/lib/ > libsqlite3.0.dylib, > what difference should 3.3.8 vs. 3.1.3 have made? Dunno, but as much as I love SQLite, I've also found it to be pretty finicky. For example, I once tried to upgrade us from 3.2.1 to 3.2.8 but that caused us a world of hurt, so I reverted back to the last known good version. At some point I'll try to get us on the latest release, but I'm a little gunshy about it. - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRTw8e3EjvBPtnXfVAQJbKgP+MjAz/NfUOaDd+ZEg9haJVr7v5JsKTHEl i9n7pLLFToIE81RX3iGHMZwIZyIGHqT9d3gqan8INrvcAtL7hxVvkqAAFRJTmX2Z XVLAjWLYCp9nY6Q3K+yXls798RDoHhZIWvHnNXZJ7Ya2wwSVQoADFdV1GN0pIB07 PnNHa/S83+Q= =4fX8 -----END PGP SIGNATURE----- From larry at hastings.org Mon Oct 23 05:56:31 2006 From: larry at hastings.org (Larry Hastings) Date: Sun, 22 Oct 2006 20:56:31 -0700 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <4539D362.9010909@v.loewis.de> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> <4539D362.9010909@v.loewis.de> Message-ID: <453C3D6F.4060107@hastings.org> Martin v. Löwis wrote: > It's not clear to me what you want to achieve with these patches, > in particular, whether you want to see them integrated into Python or > not.
> I would be thrilled if they were, but it seems less likely with every passing day. If you have some advice on how I might increase the patch's chances I would be all ears. It was/is my understanding that the early days of a new major revision was the most judicious time to introduce big changes. If I had offered these patches six months ago for 2.5, they would have had zero chance of acceptance. But 2.6 is in its infancy, and so I assumed now was the time to discuss sea-change patches like this. Anyway, it was my intent to post the patch and see what happened. Being a first-timer at this, and not having even read the core development mailing lists for very long, I had no idea what to expect. Though I genuinely didn't expect it to be this brusque. > I think this specific approach will find strong resistance. I'd say the "lazy strings" patch is really two approaches, "lazy concatenation" and "lazy slices". You are right, though, *both* have "found strong resistance". > Most recently, it was discussed under the name "string view" on the Py3k list, see > http://mail.python.org/pipermail/python-3000/2006-August/003282.html > Traditionally, the biggest objection is that even small strings may > consume insane amounts of memory. > Let's be specific: when there is at least one long-lived small lazy slice of a large string, and the large string itself would otherwise have been dereferenced and freed, and this small slice is never examined by code outside of stringobject.c, this approach means the large string becomes long-lived too and thus Python consumes more memory overall. In pathological scenarios this memory usage could be characterized as "insane". True dat. Then again, I could suggest some scenarios where this would save memory (multiple long-lived large slices of a large string), and others where memory use would be a wash (long-lived slices containing all or almost all of a large string, or any scenario where slices are short-lived).
While I think it's clear lazy slices are *faster* on average, its overall effect on memory use in real-world Python is not yet known. Read on. >> I bet this generally reduces overall memory usage for slices too. >> > Channeling Guido: what real-world applications did you study with > this patch to make such a claim? > I didn't; I don't have any. I must admit to being only a small-scale Python user. Memory use remains about the same in pybench, the biggest Python app I have handy. But, then, it was pretty clearly speculation, not a claim. Yes, I *think* it'd use less memory overall. But I wouldn't *claim* anything yet. The "stringview" discussion you cite was largely speculation, and as I recall there were users in both camps ("it'll use more memory overall" vs "no it won't"). And, while I saw a test case with microbenchmarks, and a "proof-of-concept" where a stringview was a separate object from a string, I didn't see any real-word applications tested with this approach. Rather than start in on speculation about it, I have followed that old maxim of "show me the code". I've produced actual code that works with real strings in Python. I see this as an opportunity for Pythonistas to determine the facts for themselves. Now folks can try the patch with these real-world applications you cite and find out how it really behaves. (Although I realize the Python community is under no obligation to do so.) If experimentation is the best thing here, I'd be happy to revise the patch to facilitate it. For instance, I could add command-line arguments letting you tweak the run-time behavior of the patch, like changing the minimum size of a lazy slice. Perhaps add code so there's a tweakable minimum size of a lazy concatenation too. Or a tweakable minimum *ratio* necessary for a lazy slice. I'm open to suggestions. 
Cheers, /larry/ From talin at acm.org Mon Oct 23 06:07:42 2006 From: talin at acm.org (Talin) Date: Sun, 22 Oct 2006 21:07:42 -0700 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <453C3D6F.4060107@hastings.org> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> <4539D362.9010909@v.loewis.de> <453C3D6F.4060107@hastings.org> Message-ID: <453C400E.5070106@acm.org> Larry Hastings wrote: > Martin v. Löwis wrote: > Let's be specific: when there is at least one long-lived small lazy > slice of a large string, and the large string itself would otherwise > have been dereferenced and freed, and this small slice is never examined > by code outside of stringobject.c, this approach means the large string > becomes long-lived too and thus Python consumes more memory overall. In > pathological scenarios this memory usage could be characterized as "insane". > > True dat. Then again, I could suggest some scenarios where this would > save memory (multiple long-lived large slices of a large string), and > others where memory use would be a wash (long-lived slices containing > all or almost all of a large string, or any scenario where slices > are short-lived). While I think it's clear lazy slices are *faster* on > average, its overall effect on memory use in real-world Python is not > yet known. Read on. I wonder - how expensive would it be for the string slice to have a weak reference, and 'normalize' the slice when the big string is collected? Would the overhead of the weak reference swamp the savings?
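The memory pathology Larry describes can be demonstrated today with `memoryview`, which behaves like an explicit lazy slice over a buffer. A sketch, not from the thread (the `Buf` subclass exists only so the base can be weak-referenced):

```python
import weakref

class Buf(bytearray):
    """bytearray itself cannot be weak-referenced; a trivial subclass can."""

big = Buf(b"x" * (1 << 20))      # a 1 MB buffer
small = memoryview(big)[:16]     # a tiny zero-copy "slice"
ref = weakref.ref(big)

del big
# The 16-byte view pins the whole 1 MB buffer -- the pathological
# case discussed above.
print(ref() is not None)         # True

rendered = bytes(small)          # "rendering" copies the data out...
del small                        # ...and dropping the view releases the base
print(ref() is None)             # True on CPython, which frees it immediately
```

This also bears on Talin's weak-reference question: by the time a weakref callback fires, the referent is already unreachable, so the slice data would have to be copied out before the big string dies, not during its collection.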
-- Talin From martin at v.loewis.de Mon Oct 23 06:48:02 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 23 Oct 2006 06:48:02 +0200 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <453C3D6F.4060107@hastings.org> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> <4539D362.9010909@v.loewis.de> <453C3D6F.4060107@hastings.org> Message-ID: <453C4982.80909@v.loewis.de> Larry Hastings wrote: > Anyway, it was my intent to post the patch and see what happened. Being > a first-timer at this, and not having even read the core development > mailing lists for very long, I had no idea what to expect. Though I > genuinely didn't expect it to be this brusque. I could have told you :-) The "problem" really is that you are suggesting a major, significant change to the implementation of Python, and one that doesn't fix an obvious bug. The new code is an order of magnitude more complex than the old one, and the impact that it will have is unknown - but in the worst case, it could have serious negative impact, e.g. when the code is full of errors, and causes Python applications to crash in masses. This is, of course, FUD: it is the fear that this might happen, the uncertainty about the quality of the code and the doubt about the viability of the approach. There are many aspects to such a change, but my experience is that it primarily takes time. Fredrik Lundh suggested you give up on Python 2.6, and target Python 3.0 right away; it may indeed be the case that Python 2.6 is too close for that kind of change to find enough supporters. If your primary goal was to contribute to open source, you might want to look for other areas of Python: there are plenty of open bugs ("real bugs" :-), unreviewed patches, etc. For some time, it is more satisfying to work on these, since the likelihood of success is higher.
Regards, Martin From jcarlson at uci.edu Mon Oct 23 07:00:30 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Sun, 22 Oct 2006 22:00:30 -0700 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <453C3D6F.4060107@hastings.org> References: <4539D362.9010909@v.loewis.de> <453C3D6F.4060107@hastings.org> Message-ID: <20061022214126.0A6E.JCARLSON@uci.edu> Larry Hastings wrote: > It was/is my understanding that the early days of a new major revision > was the most judicious time to introduce big changes. If I had offered > these patches six months ago for 2.5, they would have had zero chance of > acceptance. But 2.6 is in its infancy, and so I assumed now was the > time to discuss sea-change patches like this. It would be a radical change for Python 2.6, and really the 2.x series, likely requiring nontrivial changes to extension modules that deal with strings, and the assumptions about strings that have held for over a decade. I think 2.6 as an option is a non-starter. Think Py3k, and really, think bytes and unicode. > The "stringview" discussion you cite was largely speculation, and as I > recall there were users in both camps ("it'll use more memory overall" > vs "no it won't"). And, while I saw a test case with microbenchmarks, > and a "proof-of-concept" where a stringview was a separate object from a > string, I didn't see any real-word applications tested with this approach. > > Rather than start in on speculation about it, I have followed that old > maxim of "show me the code". I've produced actual code that works with > real strings in Python. I see this as an opportunity for Pythonistas to > determine the facts for themselves. Now folks can try the patch with > these real-world applications you cite and find out how it really > behaves. (Although I realize the Python community is under no > obligation to do so.) One of the big concerns brought up in the stringview discussion was that of users expecting one thing and getting another. 
Slicing a larger string producing a 'view', which then keeps the larger string alive, would be a surprise. By making it a separate object that just *knows* about strings (or really, anything that offers a buffer interface), I was able to make an object that was 1) flexible, 2) usable in any Python, 3) doesn't change the core assumptions about Python, 4) is expandable to beyond just *strings*. Reason #4 was my primary reason for writing it, because str disappears in Py3k, which is closer to happening than most of us realize. > If experimentation is the best thing here, I'd be happy to revise the > patch to facilitate it. For instance, I could add command-line > arguments letting you tweak the run-time behavior of the patch, like > changing the minimum size of a lazy slice. Perhaps add code so there's > a tweakable minimum size of a lazy concatenation too. Or a tweakable > minimum *ratio* necessary for a lazy slice. I'm open to suggestions. I believe that would be a waste of time. The odds of it making it into Python 2.x without significant core developer support are pretty close to None, which in Python 2.x is less than 0. I've been down that road, nothing good lies that way. Want my advice? Aim for Py3k text as your primary target, but as a wrapper, not as the core type (I put the odds at somewhere around 0 for such a core type change). If you are good, and want to make guys like me happy, you could even make it support the buffer interface for non-text (bytes, array, mmap, etc.), unifying (via wrapper) the behavior of bytes and text. 
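A minimal sketch of the separate-view approach described here — the class name and details are illustrative, not Josiah's actual implementation; the point is that holding the base alive becomes a visible choice rather than a hidden side effect of slicing:

```python
class StrView:
    """An explicit view over a sliceable base (a str here, but the same
    shape works for anything offering buffer-like indexing)."""

    def __init__(self, base, start=0, stop=None):
        self.base = base                     # visibly keeps the base alive
        self.start = start
        self.stop = len(base) if stop is None else stop

    def __len__(self):
        return self.stop - self.start

    def __str__(self):
        # Materializing copies the data out of the base.
        return str(self.base[self.start:self.stop])

    def slice(self, start, stop):
        # Sub-views share the same base; nothing is copied until __str__.
        return StrView(self.base, self.start + start, self.start + stop)

v = StrView("hello world", 6, 11)
print(str(v))              # world
print(len(v))              # 5
print(str(v.slice(1, 3)))  # or
```

Because the reference to the base is an ordinary attribute, a user who wants the base freed simply materializes the view with `str()` and drops it — no surprise pinning.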
- Josiah From fredrik at pythonware.com Mon Oct 23 08:03:17 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 23 Oct 2006 08:03:17 +0200 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <20061022214126.0A6E.JCARLSON@uci.edu> References: <4539D362.9010909@v.loewis.de> <453C3D6F.4060107@hastings.org> <20061022214126.0A6E.JCARLSON@uci.edu> Message-ID: Josiah Carlson wrote: > It would be a radical change for Python 2.6, and really the 2.x series, > likely requiring nontrivial changes to extension modules that deal with > strings, and the assumptions about strings that have held for over a > decade. the assumptions hidden in everyone's use of the C-level string API is the main concern here, at least for me; radically changing the internal format is not a new idea, but it's always been held off because we have no idea how people are using the C API. From ncoghlan at gmail.com Mon Oct 23 11:49:50 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 23 Oct 2006 19:49:50 +1000 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <20061022214126.0A6E.JCARLSON@uci.edu> References: <4539D362.9010909@v.loewis.de> <453C3D6F.4060107@hastings.org> <20061022214126.0A6E.JCARLSON@uci.edu> Message-ID: <453C903E.7060608@gmail.com> Josiah Carlson wrote: > Want my advice? Aim for Py3k text as your primary target, but as a > wrapper, not as the core type (I put the odds at somewhere around 0 for > such a core type change). If you are good, and want to make guys like > me happy, you could even make it support the buffer interface for > non-text (bytes, array, mmap, etc.), unifying (via wrapper) the behavior > of bytes and text. This is still my preferred approach, too - for local optimisation of an algorithm, a string view type strikes me as an excellent idea. For the core data type, though, keeping the behaviour comparatively simple and predictable counterbalances the desire for more speed. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From skip at pobox.com Mon Oct 23 14:40:17 2006 From: skip at pobox.com (skip at pobox.com) Date: Mon, 23 Oct 2006 07:40:17 -0500 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <453C4982.80909@v.loewis.de> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> <4539D362.9010909@v.loewis.de> <453C3D6F.4060107@hastings.org> <453C4982.80909@v.loewis.de> Message-ID: <17724.47153.512897.828558@montanaro.dyndns.org> >> Anyway, it was my intent to post the patch and see what happened. >> Being a first-timer at this, and not having even read the core >> development mailing lists for very long, I had no idea what to >> expect. Though I genuinely didn't expect it to be this brusque. Martin> I could have told you :-) The "problem" really is that you are Martin> suggesting a major, significant change to the implementation of Martin> Python, and one that doesn't fix an obvious bug. Come on Martin. Give Larry a break. Lots of changes have been accepted to the Python core which weren't obvious "bug fixes". In fact, I seem to recall a sprint held recently in Reykjavik where the whole point was just to make Python faster. I believe that was exactly Larry's point in posting the patch. The "one obvious way to do" concatenation and slicing for one of the most heavily used types in python appears to be faster. That seems like a win to me.
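The slowness being alluded to is easy to reproduce: building a string with repeated `+=` historically did O(n²) total copying, which is exactly what lazy concatenation attacks. A sketch, not from the thread (note that later CPython releases special-case in-place `str` concatenation, so the measured gap varies by version):

```python
import timeit

def concat_loop(parts):
    s = ""
    for p in parts:
        s += p               # may copy the whole of s on every iteration
    return s

def concat_join(parts):
    return "".join(parts)    # one pass, one allocation

parts = ["x" * 10] * 1000
assert concat_loop(parts) == concat_join(parts)

loop_t = timeit.timeit(lambda: concat_loop(parts), number=50)
join_t = timeit.timeit(lambda: concat_join(parts), number=50)
print("loop: %.4fs  join: %.4fs" % (loop_t, join_t))
```

The lazy-concatenation patch aims to give the `+=` spelling join-like behaviour automatically, by deferring the copy until the result is actually needed.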
Skip From steve at holdenweb.com Mon Oct 23 15:51:35 2006 From: steve at holdenweb.com (Steve Holden) Date: Mon, 23 Oct 2006 14:51:35 +0100 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <17724.47153.512897.828558@montanaro.dyndns.org> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> <4539D362.9010909@v.loewis.de> <453C3D6F.4060107@hastings.org> <453C4982.80909@v.loewis.de> <17724.47153.512897.828558@montanaro.dyndns.org> Message-ID: <453CC8E7.1090000@holdenweb.com> skip at pobox.com wrote: > >> Anyway, it was my intent to post the patch and see what happened. > >> Being a first-timer at this, and not having even read the core > >> development mailing lists for very long, I had no idea what to > >> expect. Though I genuinely didn't expect it to be this brusque. > > Martin> I could have told you :-) The "problem" really is that you are > Martin> suggesting a major, significant change to the implementation of > Martin> Python, and one that doesn't fix an obvious bug. > The "obvious bug" that it fixes is slowness <0.75 wink>. > Come on Martin. Give Larry a break. Lots of changes have been accepted to > to the Python core which weren't obvious "bug fixes". In fact, I seem to > recall a sprint held recently in Reykjavik where the whole point was just to > make Python faster. I believe that was exactly Larry's point in posting the > patch. The "one obvious way to do" concatenation and slicing for one of the > most heavily used types in python appears to be faster. That seems like a > win to me. > I did point out to Larry when he went to c.l.py with the original patch that he would face resistance, so this hasn't blind-sided him. But it seems to me that the only major issue is the inability to provide zero-byte terminators with this new representation. 
Because Larry's proposal for handling this involves the introduction of a new API that can't already be in use in extensions it's obviously the extension writers who would be given most problems by this patch. I can understand resistance on that score, and I could understand resistance if there were other clear disadvantages to its implementation, but in their absence it seems like the extension modules are the killers. If there were any reliable way to make sure these objects never got passed to extension modules then I'd say "go for it". Without that it does seem like a potentially widespread change to the C API that could affect much code outside the interpreter. This is a great shame. I think Larry showed inventiveness and tenacity to get this far, and deserves credit for his achievements no matter whether or not they get into the core. regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden
From larry at hastings.org Mon Oct 23 16:58:25 2006 From: larry at hastings.org (Larry Hastings) Date: Mon, 23 Oct 2006 07:58:25 -0700 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <453CC8E7.1090000@holdenweb.com> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> <4539D362.9010909@v.loewis.de> <453C3D6F.4060107@hastings.org> <453C4982.80909@v.loewis.de> <17724.47153.512897.828558@montanaro.dyndns.org> <453CC8E7.1090000@holdenweb.com> Message-ID: <453CD891.8020003@hastings.org> Steve Holden wrote: > But it seems to me that the only major issue is the inability to provide > zero-byte terminators with this new representation. > I guess I wasn't clear in my description of the patch; sorry about that. Like "lazy concatenation objects", "lazy slices" render when you call PyString_AsString() on them. Before rendering, the lazy slice's ob_sval will be NULL. Afterwards it will point to a proper zero-terminated string, at which point the object behaves exactly like any other PyStringObject. The only function that *might* return a non-terminated char * is PyString_AsUnterminatedString(). This function is static to stringobject.c--and I would be shocked if it were ever otherwise. > If there were any reliable way to make sure these objects never got > passed to extension modules then I'd say "go for it". If external Python extension modules are as well-behaved as the shipping Python source tree, there simply wouldn't be a problem. Python source is delightfully consistent about using the macro PyString_AS_STRING() to get at the creamy char *center of a PyStringObject *. When code religiously uses that macro (or calls PyString_AsString() directly), all it needs is a recompile with the current stringobject.h and it will Just Work.
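The rendering model Larry describes can be sketched in pure Python — illustrative only, since the real patch does this at the C level inside stringobject.c: concatenation just records its operands, and the joined buffer is built once, on first access, then cached (the role `PyString_AsString()` plays in the patch):

```python
class LazyConcat:
    """Toy model of a lazy concatenation: '+' is O(1); the joined
    buffer is materialized ("rendered") once, on first use.
    (Concatenating onto an already-rendered object is not modeled.)"""

    def __init__(self, parts):
        self._parts = list(parts)
        self._rendered = None          # plays the role of a NULL ob_sval

    def __add__(self, other):
        tail = other._parts if isinstance(other, LazyConcat) else [other]
        return LazyConcat(self._parts + tail)

    def render(self):
        # Analogous to PyString_AsString(): after this, the object
        # behaves like an ordinary, fully-built string.
        if self._rendered is None:
            self._rendered = "".join(self._parts)
            self._parts = None         # operand list can now be freed
        return self._rendered

s = LazyConcat(["sea"]) + "-" + "change"
print(s.render())   # sea-change
print(s.render())   # cached: sea-change
```

The cache is why rendering is safe to trigger from any accessor: the cost is paid once, and every later access sees an ordinary zero-copy lookup.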
I genuinely don't know how many external Python extension modules are well-behaved in this regard. But in case it helps: I just checked PIL, NumPy, PyWin32, and SWIG, and all of them were well-behaved. Apart from stringobject.c, there was exactly one spot in the Python source tree which made assumptions about the structure of PyStringObjects (Mac/Modules/macos.c). It's in the block starting with the comment "This is a hack:". Note that this is unfixed in my patch, so just now all code using that self-avowed "hack" will break. Am I correct in understanding that changing the Python minor revision number (2.5 -> 2.6) requires external modules to recompile? (It certainly does on Windows.) If so, I could mitigate the problem by renaming ob_sval. That way, code making explicit reference to it would fail to compile, which I feel is better than silently recompiling unsafe code. Cheers, /larry/ -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061023/a083bf5c/attachment.htm From exarkun at divmod.com Mon Oct 23 17:28:31 2006 From: exarkun at divmod.com (Jean-Paul Calderone) Date: Mon, 23 Oct 2006 11:28:31 -0400 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <453CD891.8020003@hastings.org> Message-ID: <20061023152831.26151.1913366649.divmod.quotient.11082@ohm> On Mon, 23 Oct 2006 07:58:25 -0700, Larry Hastings wrote: > [snip] >If external Python extension modules are as well-behaved as the shipping >Python source tree, there simply wouldn't be a problem. Python source is >delightfully consistent about using the macro PyString_AS_STRING() to get at >the creamy char *center of a PyStringObject *. When code religiously uses >that macro (or calls PyString_AsString() directly), all it needs is a >recompile with the current stringobject.h and it will Just Work. > >I genuinely don't know how many external Python extension modules are well- >behaved in this regard. 
But in case it helps: I just checked PIL, NumPy, >PyWin32, and SWIG, and all of them were well-behaved. FWIW, http://www.google.com/codesearch?q=+ob_sval Jean-Paul From p.f.moore at gmail.com Mon Oct 23 17:42:35 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 23 Oct 2006 16:42:35 +0100 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <453CD891.8020003@hastings.org> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> <4539D362.9010909@v.loewis.de> <453C3D6F.4060107@hastings.org> <453C4982.80909@v.loewis.de> <17724.47153.512897.828558@montanaro.dyndns.org> <453CC8E7.1090000@holdenweb.com> <453CD891.8020003@hastings.org> Message-ID: <79990c6b0610230842k7a0a0facm3b4cc0a9f546b8ef@mail.gmail.com> On 10/23/06, Larry Hastings wrote: > > Steve Holden wrote: > > But it seems to me that the only major issue is the inability to provide > zero-byte terminators with this new representation. > > I guess I wasn't clear in my description of the patch; sorry about that. > > Like "lazy concatenation objects", "lazy slices" render when you call > PyString_AsString() on them. Before rendering, the lazy slice's ob_sval > will be NULL. Afterwards it will point to a proper zero-terminated string, > at which point the object behaves exactly like any other PyStringObject. I had picked up on this comment, and I have to say that I had been a little surprised by the resistance to the change based on the "code would break" argument, when you had made such a thorough attempt to address this. Perhaps others had missed this point, though. > I genuinely don't know how many external Python extension modules are > well-behaved in this regard. But in case it helps: I just checked PIL, > NumPy, PyWin32, and SWIG, and all of them were well-behaved. There's code out there which was written to the Python 1.4 API, and has not been updated since (I know, I wrote some of it!) 
I wouldn't call it "well-behaved" (it writes directly into the string's character buffer) but I don't believe it would fail (it only uses PyString_AsString to get the buffer address).
/* Allocate a Python string object, with uninitialised contents. We
 * must do it this way, so that we can modify the string in place
 * later. See the Python source, Objects/stringobject.c for details.
 */
result = PyString_FromStringAndSize(NULL, len);
if (result == NULL)
    return NULL;
p = PyString_AsString(result);
while (*str) {
    if (*str == '\n')
        *p = '\0';
    else
        *p = *str;
    ++p;
    ++str;
}
> Am I correct in understanding that changing the Python minor revision > number (2.5 -> 2.6) requires external modules to recompile? (It certainly > does on Windows.) If so, I could mitigate the problem by renaming ob_sval. > That way, code making explicit reference to it would fail to compile, which > I feel is better than silently recompiling unsafe code. I think you've covered pretty much all the possible backward compatibility bases. A sufficiently evil extension could blow up, I guess, but that's always going to be true. OTOH, I don't have a comment on the desirability of the patch per se, as (a) I've never been hit by the speed issue, and (b) I'm thoroughly indoctrinated, so I always use ''.join() :-) Paul. From skip at pobox.com Mon Oct 23 17:49:26 2006 From: skip at pobox.com (skip at pobox.com) Date: Mon, 23 Oct 2006 10:49:26 -0500 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <453CD891.8020003@hastings.org> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> <4539D362.9010909@v.loewis.de> <453C3D6F.4060107@hastings.org> <453C4982.80909@v.loewis.de> <17724.47153.512897.828558@montanaro.dyndns.org> <453CC8E7.1090000@holdenweb.com> <453CD891.8020003@hastings.org> Message-ID: <17724.58502.481218.773345@montanaro.dyndns.org> Larry> The only function that *might* return a non-terminated char * is Larry> PyString_AsUnterminatedString().
This function is static to Larry> stringobject.c--and I would be shocked if it were ever otherwise. If it's static to stringobject.c it doesn't need a PyString_ prefix. In fact, I'd argue that it shouldn't have one so that people reading the code won't miss the "static" and think it is part of the published API. Larry> Am I correct in understanding that changing the Python minor Larry> revision number (2.5 -> 2.6) requires external modules to Larry> recompile? Yes, in general, though you can often get away without it if you don't mind Python screaming at you about version mismatches. Skip From fredrik at pythonware.com Mon Oct 23 17:59:32 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 23 Oct 2006 17:59:32 +0200 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <453CD891.8020003@hastings.org> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> <4539D362.9010909@v.loewis.de> <453C3D6F.4060107@hastings.org> <453C4982.80909@v.loewis.de> <17724.47153.512897.828558@montanaro.dyndns.org> <453CC8E7.1090000@holdenweb.com> <453CD891.8020003@hastings.org> Message-ID: Larry Hastings wrote: > Am I correct in understanding that changing the Python minor revision > number (2.5 -> 2.6) requires external modules to recompile? not, in general, on Unix. it's recommended, but things usually work quite well anyway. 
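To make the render-on-demand idea from the lazy strings thread concrete, here is a toy Python sketch. It is purely illustrative and the class name and attributes are invented for this example; the actual patch works in C inside PyStringObject, where the unrendered state is an ob_sval of NULL until PyString_AsString() forces rendering:

```python
# Toy sketch of "render on demand": concatenation is deferred until the
# value is actually needed, then cached. Illustrative only -- not the
# real C implementation discussed in this thread.

class LazyConcat:
    """Concatenation that defers joining until the value is needed."""

    def __init__(self, left, right):
        self._parts = [left, right]   # unrendered state
        self._rendered = None         # plays the role of a NULL ob_sval

    def render(self):
        # The first access pays the cost of the join; later accesses
        # reuse the cached result.
        if self._rendered is None:
            self._rendered = "".join(self._parts)
        return self._rendered

a = LazyConcat("hello, ", "world")
assert a._rendered is None            # no join has happened yet
assert a.render() == "hello, world"   # forcing the value renders it
assert a._rendered == "hello, world"  # and caches it
```

The analogy is loose (the real patch also handles slices and renders into a zero-terminated buffer), but it shows why well-behaved code that always goes through the accessor never observes the unrendered state.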
From jcarlson at uci.edu Mon Oct 23 18:07:51 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Mon, 23 Oct 2006 09:07:51 -0700 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <79990c6b0610230842k7a0a0facm3b4cc0a9f546b8ef@mail.gmail.com> References: <453CD891.8020003@hastings.org> <79990c6b0610230842k7a0a0facm3b4cc0a9f546b8ef@mail.gmail.com> Message-ID: <20061023090040.0A7B.JCARLSON@uci.edu> "Paul Moore" wrote: > I had picked up on this comment, and I have to say that I had been a > little surprised by the resistance to the change based on the "code > would break" argument, when you had made such a thorough attempt to > address this. Perhaps others had missed this point, though. I'm also concerned about future usability. Word in the Py3k list is that Python 2.6 will be just about the last Python in the 2.x series, and by directing his implementation at only Python 2.x strings, he's just about guaranteeing obsolescence. By building with unicode and/or objects with a buffer interface in mind, Larry could build with both 2.x and 3.x in mind, and his code wouldn't be obsolete the moment it was released. - Josiah From exarkun at divmod.com Mon Oct 23 18:31:02 2006 From: exarkun at divmod.com (Jean-Paul Calderone) Date: Mon, 23 Oct 2006 12:31:02 -0400 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <20061023090040.0A7B.JCARLSON@uci.edu> Message-ID: <20061023163102.26151.1529007871.divmod.quotient.11135@ohm> On Mon, 23 Oct 2006 09:07:51 -0700, Josiah Carlson wrote: > >"Paul Moore" wrote: >> I had picked up on this comment, and I have to say that I had been a >> little surprised by the resistance to the change based on the "code >> would break" argument, when you had made such a thorough attempt to >> address this. Perhaps others had missed this point, though. > >I'm also concerned about future usability. Me too (perhaps in a different way though). 
>Word in the Py3k list is >that Python 2.6 will be just about the last Python in the 2.x series, >and by directing his implementation at only Python 2.x strings, he's >just about guaranteeing obsolescence. People will be using 2.x for a long time to come. And in the long run, isn't all software obsolete? :) >By building with unicode and/or >objects with a buffer interface in mind, Larry could build with both 2.x >and 3.x in mind, and his code wouldn't be obsolete the moment it was >released. (I'm not sure what the antecedent of "it" is in the above, I'm going to assume it's Python 3.x.) Supporting unicode strings and objects providing the buffer interface seems like a good idea in general, even disregarding Py3k. Starting with str is reasonable though, since there's still plenty of code that will benefit from this change, if it is indeed a beneficial change. Larry, I'm going to try to do some benchmarks against Twisted using this patch, but given my current time constraints, you may be able to beat me to this :) If you're interested, Twisted trunk at HEAD plus this trial plugin: http://twistedmatrix.com/trac/browser/sandbox/exarkun/merit/trunk will let you do some gross measurements using the Twisted test suite. I can give some more specific pointers if this sounds like something you'd want to mess with. Jean-Paul From martin at v.loewis.de Tue Oct 24 00:11:04 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 24 Oct 2006 00:11:04 +0200 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <17724.47153.512897.828558@montanaro.dyndns.org> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> <4539D362.9010909@v.loewis.de> <453C3D6F.4060107@hastings.org> <453C4982.80909@v.loewis.de> <17724.47153.512897.828558@montanaro.dyndns.org> Message-ID: <453D3DF8.2040304@v.loewis.de> skip at pobox.com schrieb: > >> Anyway, it was my intent to post the patch and see what happened. 
> >> Being a first-timer at this, and not having even read the core > >> development mailing lists for very long, I had no idea what to > >> expect. Though I genuinely didn't expect it to be this brusque. > > Martin> I could have told you :-) The "problem" really is that you are > Martin> suggesting a major, significant change to the implementation of > Martin> Python, and one that doesn't fix an obvious bug. > > Come on Martin. Give Larry a break. I'm seriously not complaining, I'm explaining. > Lots of changes have been accepted to the Python core which weren't obvious "bug fixes". Surely many new features have been implemented over time, but in many cases, they weren't really "big changes", in the sense that you could ignore them if you don't like them. This wouldn't be so in this case: as the string type is very fundamental, people feel a higher interest in its implementation. > In fact, I seem to recall a sprint held recently in Reykjavik where the whole point was just to > make Python faster. That's true. I also recall there were serious complaints about the outcome of this sprint, and the changes to the struct module in particular. Still, the struct module is of lesser importance than the string type, so the concerns were smaller. > I believe that was exactly Larry's point in posting the > patch. The "one obvious way to do" concatenation and slicing for one of the > most heavily used types in python appears to be faster. That seems like a > win to me. Have you reviewed the patch and can vouch for its correctness, even in boundary cases? Have you tested it in a real application and found a real performance improvement? I have done neither, so I can't speak on the advantages of the patch. I didn't actually object to the inclusion of the patch, either. I was merely stating what I think the problems with "that kind of" patch are.
Regards, Martin From martin at v.loewis.de Tue Oct 24 00:36:33 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 24 Oct 2006 00:36:33 +0200 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <453CD891.8020003@hastings.org> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> <4539D362.9010909@v.loewis.de> <453C3D6F.4060107@hastings.org> <453C4982.80909@v.loewis.de> <17724.47153.512897.828558@montanaro.dyndns.org> <453CC8E7.1090000@holdenweb.com> <453CD891.8020003@hastings.org> Message-ID: <453D43F1.8010104@v.loewis.de> Larry Hastings schrieb: > Am I correct in understanding that changing the Python minor revision > number (2.5 -> 2.6) requires external modules to recompile? (It > certainly does on Windows.) There is an ongoing debate on that. The original intent was that you normally *shouldn't* have to recompile modules just because the Python version changes. Instead, you should do so when PYTHON_API_VERSION changes. Of course, such a change would also cause a change to PYTHON_API_VERSION. Then, even if PYTHON_API_VERSION changes, you aren't *required* to recompile your extension modules. Instead, you get a warning that the API version is different and *might* require recompilation: it does require recompilation if the extension module relies on some of the changed API. With this change, people not recompiling their extension modules would likely see Python crash rather quickly after seeing the warning about incompatible APIs. Regards, Martin From anthony at python.org Tue Oct 24 04:57:42 2006 From: anthony at python.org (Anthony Baxter) Date: Tue, 24 Oct 2006 12:57:42 +1000 Subject: [Python-Dev] RELEASED Python 2.3.6, release candidate 1 Message-ID: <200610241257.51761.anthony@python.org> On behalf of the Python development team and the Python community, I'm announcing the release of Python 2.3.6 (release candidate 1). Python 2.3.6 is a security bug-fix release. 
While Python 2.5 is the latest version of Python, we're making this release for people who are still running Python 2.3. Unlike the recently released 2.4.4, this release only contains a small handful of security-related bugfixes. See the website for more. * Python 2.3.6 contains a fix for PSF-2006-001, a buffer overrun * in repr() of unicode strings in wide unicode (UCS-4) builds. * See http://www.python.org/news/security/PSF-2006-001/ for more. This is a **source only** release. The Windows and Mac binaries of 2.3.5 were built with UCS-2 unicode, and are therefore not vulnerable to the problem outlined in PSF-2006-001. The PCRE fix is for a long-deprecated module (you should use the 're' module instead) and the email fix can be obtained by downloading the standalone version of the email package. Most vendors who ship Python should have already released a patched version of 2.3.5 with the above fixes, this release is for people who need or want to build their own release, but don't want to mess around with patch or svn. Assuming no major problems crop up, a final release of Python 2.3.6 will follow in about a week's time. Python 2.3.6 will complete python.org's response to PSF-2006-001. If you're still on Python 2.2 for some reason and need to work with UCS-4 unicode strings, please obtain the patch from the PSF-2006-001 security advisory page. Python 2.4.4 and Python 2.5 have both already been released and contain the fix for this security problem. For more information on Python 2.3.6, including download links for source archives, release notes, and known issues, please see: http://www.python.org/2.3.6 Highlights of this new release include: - A fix for PSF-2006-001, a bug in repr() for unicode strings on UCS-4 (wide unicode) builds. - Two other, less critical, security fixes. 
Enjoy this release, Anthony Anthony Baxter anthony at python.org Python Release Manager (on behalf of the entire python-dev team) -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061024/2a934c64/attachment.pgp From mbk.lists at gmail.com Tue Oct 24 07:22:51 2006 From: mbk.lists at gmail.com (Mike Krell) Date: Mon, 23 Oct 2006 22:22:51 -0700 Subject: [Python-Dev] __str__ bug? Message-ID: Is this a bug? If not, how do I override __str__ on a unicode derived class?

class S(str):
    def __str__(self): return '__str__ overridden'

class U(unicode):
    def __str__(self): return '__str__ overridden'
    def __unicode__(self): return u'__unicode__ overridden'

s = S()
u = U()

print 's:', s
print "str(s):", str(s)
print 's substitued is "%s"\n' % s
print 'u:', u
print "str(u):", str(u)
print 'u substitued is "%s"' % u

-----------------------------------------------------

s: __str__ overridden
str(s): __str__ overridden
s substitued is "__str__ overridden"

u:
str(u): __str__ overridden
u substitued is ""

Results are identical for 2.4.2 and 2.5c2 (running under windows). Mike From Jack.Jansen at cwi.nl Tue Oct 24 11:09:12 2006 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Tue, 24 Oct 2006 11:09:12 +0200 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <453CD891.8020003@hastings.org> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> <4539D362.9010909@v.loewis.de> <453C3D6F.4060107@hastings.org> <453C4982.80909@v.loewis.de> <17724.47153.512897.828558@montanaro.dyndns.org> <453CC8E7.1090000@holdenweb.com> <453CD891.8020003@hastings.org> Message-ID: <95509FCF-D8E8-4102-A5B6-15F087723109@cwi.nl> On 23-Oct-2006, at 16:58 , Larry Hastings wrote: > I genuinely don't know how many external Python extension modules > are well-behaved in this regard.
But in case it helps: I just > checked PIL, NumPy, PyWin32, and SWIG, and all of them were well- > behaved. > > Apart from stringobject.c, there was exactly one spot in the Python > source tree which made assumptions about the structure of > PyStringObjects (Mac/Modules/macos.c). It's in the block starting > with the comment "This is a hack:". Note that this is unfixed in > my patch, so just now all code using that self-avowed "hack" will > break. As the author of that hack, that gives me an idea for where you should look for code that will break: code that tries to expose low-level C interfaces to Python. (That hack replaced an even earlier worse hack, that took the id() of a string in Python and added a fixed number to it to get at the address of the string, to fill it into a structure, blush). Look at packages such as win32, PyObjC, ctypes, bridges between Python and other languages, etc. That's where implementors are tempted to bend the rules of Official APIs for the benefit of serious optimizations. -- Jack Jansen, http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20061024/fe49d742/attachment.html From ronaldoussoren at mac.com Tue Oct 24 11:28:40 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Tue, 24 Oct 2006 11:28:40 +0200 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <95509FCF-D8E8-4102-A5B6-15F087723109@cwi.nl> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> <4539D362.9010909@v.loewis.de> <453C3D6F.4060107@hastings.org> <453C4982.80909@v.loewis.de> <17724.47153.512897.828558@montanaro.dyndns.org> <453CC8E7.1090000@holdenweb.com> <453CD891.8020003@hastings.org> <95509FCF-D8E8-4102-A5B6-15F087723109@cwi.nl> Message-ID: <3EF991DD-CDE6-4E1A-99E9-6FC24EAF2DCE@mac.com> On Oct 24, 2006, at 11:09 AM, Jack Jansen wrote: > > Look at packages such as win32, PyObjC, ctypes, bridges between > Python and other languages, etc. That's where implementors are > tempted to bend the rules of Official APIs for the benefit of > serious optimizations. PyObjC should be safe in this regard, I try to conform to the official rules :-) I do use PyString_AS_STRING outside of the GIL in other extensions though, the lazy strings patch would break that. My code is of course bending the rules here and can easily be fixed by introducing a temporary variable. Ronald > -- > Jack Jansen, , http://www.cwi.nl/~jack > If I can't dance I don't want to be part of your revolution -- Emma > Goldman > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ > ronaldoussoren%40mac.com -------------- next part -------------- A non-text attachment was scrubbed... 
Name: smime.p7s Type: application/pkcs7-signature Size: 3562 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061024/5370f4fc/attachment-0001.bin From ncoghlan at gmail.com Tue Oct 24 12:59:31 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 24 Oct 2006 20:59:31 +1000 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <20061023152831.26151.1913366649.divmod.quotient.11082@ohm> References: <20061023152831.26151.1913366649.divmod.quotient.11082@ohm> Message-ID: <453DF213.7080300@gmail.com> Jean-Paul Calderone wrote: > On Mon, 23 Oct 2006 07:58:25 -0700, Larry Hastings wrote: >> [snip] >> If external Python extension modules are as well-behaved as the shipping >> Python source tree, there simply wouldn't be a problem. Python source is >> delightfully consistent about using the macro PyString_AS_STRING() to get at >> the creamy char *center of a PyStringObject *. When code religiously uses >> that macro (or calls PyString_AsString() directly), all it needs is a >> recompile with the current stringobject.h and it will Just Work. >> >> I genuinely don't know how many external Python extension modules are well- >> behaved in this regard. But in case it helps: I just checked PIL, NumPy, >> PyWin32, and SWIG, and all of them were well-behaved. > > FWIW, http://www.google.com/codesearch?q=+ob_sval Possibly more enlightening (we *know* string objects play with this field!): http://www.google.com/codesearch?hl=en&lr=&q=ob_sval+-stringobject.%5Bhc%5D&btnG=Search Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From chtaylo3 at gmail.com Tue Oct 24 17:18:08 2006 From: chtaylo3 at gmail.com (Christopher Taylor) Date: Tue, 24 Oct 2006 11:18:08 -0400 Subject: [Python-Dev] Hunting down configure script error Message-ID: <2590773a0610240818q57cac275g3bda9fd4fb7a2b32@mail.gmail.com> Per my conversation with Martin v. Löwis on the python-list, I think I have found a problem with the configure script and Makefile.in. For Python 2.4.4 it seems that the argument --libdir does not change the Makefile. Specifically I need this to change the libdir to /usr/lib64 for RH on a x86_64 machine. I'd like to contribute a fix for this, but I'm relatively new so I would appreciate some guidance. In the Makefile, I tried setting LIBDIR to $(exec_prefix)/lib64 and SCRIPTDIR to $(prefix)/lib64 manually. Unfortunately that created an error when I ran python2.4:

Could not find platform independent libraries <prefix>
Could not find platform dependent libraries <exec_prefix>
Consider setting $PYTHONHOME to <prefix>[:<exec_prefix>]
'import site' failed; use -v for traceback

so I edited my /etc/profile and included: export PYTHONHOME="/usr" and reran python2.4 and now the only error is: 'import site' failed; use -v for traceback I poked around in /Modules/getpath.c and I'm starting to understand how things are coming together. My question is: how does $(prefix) from the configure script make it into PREFIX in the C code? I see on line 106 of /Modules/getpath.c that it checks to see if PREFIX is defined and if not sets it to "/usr/local". So I did a grep on PREFIX from the Python2.4.4 dir level and it didn't return anything that looks like PREFIX is being set based on the input to the configure script. Where might this be happening?
I'm assuming there's also a similar disconnect for LIBDIR (even though it never gets set properly in the Makefile, even when I edited it by hand, those changes don't make it into the code .... but I don't know where it should be changed in the code.) Respectfully, Christopher Taylor From ncoghlan at gmail.com Tue Oct 24 12:04:22 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 24 Oct 2006 20:04:22 +1000 Subject: [Python-Dev] __str__ bug? In-Reply-To: References: Message-ID: <453DE526.4020107@gmail.com> Mike Krell wrote: > Is this a bug? If not, how do I override __str__ on a unicode derived class? Based on the behaviour of str and the fact that overriding unicode.__repr__ works just fine, I'd say file a bug on SF. I think this bit in PyUnicode_Format needs to use PyUnicode_CheckExact instead of PyUnicode_Check:

case 's':
case 'r':
    if (PyUnicode_Check(v) && c == 's') {

The corresponding code in PyString_Format makes a call to _PyObject_Str which deals with subclasses correctly. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From mbk.lists at gmail.com Tue Oct 24 18:02:24 2006 From: mbk.lists at gmail.com (Mike Krell) Date: Tue, 24 Oct 2006 09:02:24 -0700 Subject: [Python-Dev] __str__ bug? In-Reply-To: <453DE526.4020107@gmail.com> References: <453DE526.4020107@gmail.com> Message-ID: > Based on the behaviour of str and the fact that overriding unicode.__repr__ > works just fine, I'd say file a bug on SF. Done. This is item 1583863.
Mike From chtaylo3 at gmail.com Tue Oct 24 19:47:33 2006 From: chtaylo3 at gmail.com (Christopher Taylor) Date: Tue, 24 Oct 2006 13:47:33 -0400 Subject: [Python-Dev] Hunting down configure script error In-Reply-To: <2590773a0610240818q57cac275g3bda9fd4fb7a2b32@mail.gmail.com> References: <2590773a0610240818q57cac275g3bda9fd4fb7a2b32@mail.gmail.com> Message-ID: <2590773a0610241047i12c5661esf997ec9df031e18b@mail.gmail.com> Ok, here's what I found: In addition to the configure script not acting on changes to LIBDIR, Modules/getpath.c uses a hardcoded value for static char lib_python[] = "lib/python" VERSION; which appears on line 134. So even if the configure script changes LIBDIR it won't do much good because the value is hardcoded in (as opposed to using LIBDIR). So I can tell that this would need to be changed ... anyone else know much about this? I'm wondering if my posts are going through?? Respectfully, Christopher Taylor From skip at pobox.com Tue Oct 24 20:10:34 2006 From: skip at pobox.com (skip at pobox.com) Date: Tue, 24 Oct 2006 13:10:34 -0500 Subject: [Python-Dev] Hunting down configure script error In-Reply-To: <2590773a0610241047i12c5661esf997ec9df031e18b@mail.gmail.com> References: <2590773a0610240818q57cac275g3bda9fd4fb7a2b32@mail.gmail.com> <2590773a0610241047i12c5661esf997ec9df031e18b@mail.gmail.com> Message-ID: <17726.22298.840098.330357@montanaro.dyndns.org> Christopher> I'm wondering if my posts are going through?? Yup. Sorry, but I've no useful comments to make on your problems though.
Skip From steve at holdenweb.com Tue Oct 24 20:13:07 2006 From: steve at holdenweb.com (Steve Holden) Date: Tue, 24 Oct 2006 19:13:07 +0100 Subject: [Python-Dev] Hunting down configure script error In-Reply-To: <2590773a0610241047i12c5661esf997ec9df031e18b@mail.gmail.com> References: <2590773a0610240818q57cac275g3bda9fd4fb7a2b32@mail.gmail.com> <2590773a0610241047i12c5661esf997ec9df031e18b@mail.gmail.com> Message-ID: Christopher Taylor wrote: > Ok, here's what I found: In addition to the configure script not > acting on changes to LIBDIR, Modules/getpath.c uses a hardcoded value for > static char lib_python[] = "lib/python" VERSION; > which appears on line 134. So even if the configure script changes > LIBDIR it won't do much good because the value is hardcoded in (as > opposed to using LIBDIR). > > So I can tell that this would need to be changed ... anyone else know > much about this? > > I'm wondering if my posts are going through?? > Your posts are making it. It's just that everyone's ignoring you :) regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From chtaylo3 at gmail.com Tue Oct 24 20:28:05 2006 From: chtaylo3 at gmail.com (Christopher Taylor) Date: Tue, 24 Oct 2006 14:28:05 -0400 Subject: [Python-Dev] Hunting down configure script error In-Reply-To: References: <2590773a0610240818q57cac275g3bda9fd4fb7a2b32@mail.gmail.com> <2590773a0610241047i12c5661esf997ec9df031e18b@mail.gmail.com> Message-ID: <2590773a0610241128mf1b71e0r421986f586470f06@mail.gmail.com> > Your posts are making it. It's just that everyone's ignoring you :) I feel loved ..... Seriously, why would someone ignore this? This is obviously not a pebkac problem.....
Respectfully, Christopher Taylor From skip at pobox.com Tue Oct 24 20:38:45 2006 From: skip at pobox.com (skip at pobox.com) Date: Tue, 24 Oct 2006 13:38:45 -0500 Subject: [Python-Dev] Hunting down configure script error In-Reply-To: <2590773a0610241128mf1b71e0r421986f586470f06@mail.gmail.com> References: <2590773a0610240818q57cac275g3bda9fd4fb7a2b32@mail.gmail.com> <2590773a0610241047i12c5661esf997ec9df031e18b@mail.gmail.com> <2590773a0610241128mf1b71e0r421986f586470f06@mail.gmail.com> Message-ID: <17726.23989.775944.384634@montanaro.dyndns.org> >> Your posts are making it. It's just that everyone's ignoring you :) Christopher> I feel loved ..... Christopher> Seriously, why would somoene ignore this? this is Christopher> obviously not a pebkac problem..... I'm not sure what a "pebkac" problem is. I will attempt to channel the other members of the group (OMMMMMM...) and suggest that folks are either (like me) unfamiliar with the problem domain or too busy at the moment to look into the details. Your Best Bet (tm) would be to file a bug report on SourceForge so it doesn't get completely forgotten. Skip From fredrik at pythonware.com Tue Oct 24 20:46:18 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 24 Oct 2006 20:46:18 +0200 Subject: [Python-Dev] Hunting down configure script error In-Reply-To: <17726.23989.775944.384634@montanaro.dyndns.org> References: <2590773a0610240818q57cac275g3bda9fd4fb7a2b32@mail.gmail.com> <2590773a0610241047i12c5661esf997ec9df031e18b@mail.gmail.com> <2590773a0610241128mf1b71e0r421986f586470f06@mail.gmail.com> <17726.23989.775944.384634@montanaro.dyndns.org> Message-ID: skip at pobox.com wrote: > I'm not sure what a "pebkac" problem is. 
http://en.wikipedia.org/wiki/PEBKAC You'll learn some new nonsense every day ;-) From steve at holdenweb.com Tue Oct 24 21:32:11 2006 From: steve at holdenweb.com (Steve Holden) Date: Tue, 24 Oct 2006 20:32:11 +0100 Subject: [Python-Dev] Hunting down configure script error In-Reply-To: <17726.23989.775944.384634@montanaro.dyndns.org> References: <2590773a0610240818q57cac275g3bda9fd4fb7a2b32@mail.gmail.com> <2590773a0610241047i12c5661esf997ec9df031e18b@mail.gmail.com> <2590773a0610241128mf1b71e0r421986f586470f06@mail.gmail.com> <17726.23989.775944.384634@montanaro.dyndns.org> Message-ID: skip at pobox.com wrote: > >> Your posts are making it. It's just that everyone's ignoring you :) > > Christopher> I feel loved ..... > > Christopher> Seriously, why would someone ignore this? This is > Christopher> obviously not a pebkac problem..... > > I'm not sure what a "pebkac" problem is. Problem Exists Between Chair And Keyboard regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From RD6T-KJYM at asahi-net.or.jp Tue Oct 24 21:41:27 2006 From: RD6T-KJYM at asahi-net.or.jp (Tamito KAJIYAMA) Date: 25 Oct 2006 04:41:27 +0900 Subject: [Python-Dev] __str__ bug? Message-ID: <453E6C67.125074.001@leopold.j.asahi-net.or.jp> I believe you've overridden unicode.__str__ as you expect.

class S(str):
    def __str__(self): return "S.__str__"

class U(unicode):
    def __str__(self): return "U.__str__"

print str(S())
print str(U())

This script prints:

S.__str__
U.__str__

Regards, -- KAJIYAMA, Tamito >Is this a bug? If not, how do I override __str__ on a unicode derived class?
> >class S(str): > def __str__(self): return '__str__ overridden' > >class U(unicode): > def __str__(self): return '__str__ overridden' > def __unicode__(self): return u'__unicode__ overridden' > >s = S() >u = U() > >print 's:', s >print "str(s):", str(s) >print 's substitued is "%s"\n' % s >print 'u:', u >print "str(u):", str(u) >print 'u substitued is "%s"' % u > >----------------------------------------------------- > >s: __str__ overridden >str(s): __str__ overridden >s substitued is "__str__ overridden" > >u: >str(u): __str__ overridden >u substitued is "" > >Results are identical for 2.4.2 and 2.5c2 (running under windows). > > Mike From mbk.lists at gmail.com Tue Oct 24 22:14:22 2006 From: mbk.lists at gmail.com (Mike Krell) Date: Tue, 24 Oct 2006 13:14:22 -0700 Subject: [Python-Dev] __str__ bug? In-Reply-To: <453E6C67.125074.001@leopold.j.asahi-net.or.jp> References: <453E6C67.125074.001@leopold.j.asahi-net.or.jp> Message-ID: > class S(str): > def __str__(self): return "S.__str__" > > class U(unicode): > def __str__(self): return "U.__str__" > > print str(S()) > print str(U()) > > This script prints: > > S.__str__ > U.__str__ Yes, but "print U()" prints nothing, and the explicit str() should not be necessary. Mike From bjourne at gmail.com Wed Oct 25 02:11:48 2006 From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=) Date: Wed, 25 Oct 2006 02:11:48 +0200 Subject: [Python-Dev] PEP 355 status In-Reply-To: References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> Message-ID: <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> On 10/1/06, Guido van Rossum wrote: > On 9/30/06, Giovanni Bajo wrote: > > It would be terrific if you gave us some clue about what is wrong in PEP355, so > > that the next guy does not waste his time. 
For instance, I find PEP355 > > incredibly good for my own path manipulation (much cleaner and concise than the > > awful os.path+os+shutil+stat mix), and I have trouble understanding what is > > *so* wrong with it. > > > > You said "it's an amalgam of unrelated functionality", but you didn't say what > > exactly is "unrelated" for you. > > Sorry, no time. But others in this thread clearly agreed with me, so > they can guide you. I'd like to write a post mortem for PEP 355. But one important question that haven't been answered is if there is a possibility for a path-like PEP to succeed in the future? If so, does the path-object implementation have to prove itself in the wild before it can be included in Python? From earlier posts it seems like you don't like the concept of path objects, which others have found very interesting. If that is the case, then it would be nice to hear it explicitly. :) -- mvh Bj?rn From amk at amk.ca Wed Oct 25 03:02:48 2006 From: amk at amk.ca (A.M. Kuchling) Date: Tue, 24 Oct 2006 21:02:48 -0400 Subject: [Python-Dev] Python 2.4.4 docs? Message-ID: <20061025010248.GA805@Siri.local> Does someone need to unpack the 2.4.4 docs in the right place so that http://www.python.org/doc/2.4.4/ works? --amk From fdrake at acm.org Wed Oct 25 05:24:47 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Tue, 24 Oct 2006 23:24:47 -0400 Subject: [Python-Dev] Python 2.4.4 docs? In-Reply-To: <20061025010248.GA805@Siri.local> References: <20061025010248.GA805@Siri.local> Message-ID: <200610242324.47328.fdrake@acm.org> On Tuesday 24 October 2006 21:02, A.M. Kuchling wrote: > Does someone need to unpack the 2.4.4 docs in the right place so that > http://www.python.org/doc/2.4.4/ works? That would be me, and yes, and done. Sorry for the delay; life's just been busy lately. Time for me to go look at the release PEP again... -Fred -- Fred L. Drake, Jr. 
From talin at acm.org Wed Oct 25 05:42:59 2006 From: talin at acm.org (Talin) Date: Tue, 24 Oct 2006 20:42:59 -0700 Subject: [Python-Dev] PEP 355 status In-Reply-To: <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> Message-ID: <453EDD43.3050609@acm.org> BJörn Lindqvist wrote: > On 10/1/06, Guido van Rossum wrote: >> On 9/30/06, Giovanni Bajo wrote: >>> It would be terrific if you gave us some clue about what is wrong in PEP355, so >>> that the next guy does not waste his time. For instance, I find PEP355 >>> incredibly good for my own path manipulation (much cleaner and concise than the >>> awful os.path+os+shutil+stat mix), and I have trouble understanding what is >>> *so* wrong with it. >>> >>> You said "it's an amalgam of unrelated functionality", but you didn't say what >>> exactly is "unrelated" for you. >> Sorry, no time. But others in this thread clearly agreed with me, so >> they can guide you. > > I'd like to write a post mortem for PEP 355. But one important > question that hasn't been answered is if there is a possibility for a > path-like PEP to succeed in the future? If so, does the path-object > implementation have to prove itself in the wild before it can be > included in Python? From earlier posts it seems like you don't like > the concept of path objects, which others have found very interesting. > If that is the case, then it would be nice to hear it explicitly. :) Let me take a crack at it - I'm always good for spouting off an arrogant opinion :)

Part 1: "Amalgam of Unrelated Functionality"

To me, the Path module felt very much like the "swiss army knife" anti-pattern - a whole lot of functions that had little in common other than the fact that paths were involved.
More specifically, I think it's important to separate the notion of paths as abstract "reference" objects from filesystem manipulators. When I call a function that operates on a path, I want to clearly distinguish between a function that merely does a transformation on the path string, vs. one that actually hits the disk. This goes along with the "principle of least surprise" - it should never be the case that I cause an I/O operation to occur when I wasn't expecting it. For example, a function that computes the parent directory of a path should not IMHO be a sibling of a function which tests for the existence or readability of a file.

I tend to think of paths and filesystems as broken down into 3 distinct domains, which are locators, inodes, and files. I realize that not all file systems on all platforms use the term 'inode', and have somewhat different semantics, but they all have some object which fulfills that role.

-- A locator is an abstract description of how to "get to" a resource. A file path is a "locator" in exactly the sense that a URL is. Locators need not refer to 'real' resources in order to be valid. A locator to a non-existent resource still maintains a consistent structure, and can be manipulated and transformed without ever actually dereferencing it. A locator does not, however, have any properties or attributes - you cannot tell, for example, the creation date of a file by looking at its locator.

-- An inode is a descriptor that points to some actual content. It actually lives on the filesystem, and has attributes (such as creation date, last modified date, permissions, etc.)

-- 'Files' are raw content streams - they are the actual bytes that make up the data within the file. Files do not have 'names' or 'dates' in and of themselves - only the inodes that describe them do.

Now, I don't insist that everyone in the world should classify things the way I do - I'm just describing how I see it.
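Talin's three-way split can be sketched with Python's existing functions. The names below (`parent`, `child`, `read_bytes`, and the sample paths) are hypothetical illustrations of the separation he describes, not a proposed API:

```python
import os
import os.path

# Locator level: pure string transformations; no disk access occurs.
def parent(path):
    return os.path.dirname(path)

def child(path, name):
    return os.path.join(path, name)

# Inode level: metadata queries that actually touch the filesystem.
def exists(path):
    return os.path.exists(path)

def mtime(path):
    return os.stat(path).st_mtime

# File level: raw content streams.
def read_bytes(path):
    with open(path, 'rb') as f:
        return f.read()

# A locator to a nonexistent resource is still perfectly valid to
# manipulate -- no I/O happens until the inode or file level is used.
p = child('/no/such/dir', 'file.txt')
```

The point of the grouping is that only the second and third groups can raise I/O errors; the first group is safe to call on any string.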
Were I to come up with my own path-related APIs, they would most likely be divided into 3 sub-modules corresponding to the 3 subdivisions listed above. I would want to make it clear that when you are operating strictly at the locator level, you aren't touching inodes or files; when you are operating at the inode level, you aren't touching file content.

Part 2: Should paths be objects?

I should mention that while I appreciate the power of OOP, I am also very much against the kind of OOP-absolutism that has been taught in many schools of software engineering in the last two decades. There are a lot of really good, formal, well-thought-out systems of program organization, and OOP is only one of many. A classic example is relational algebra, which forms the basis for relational databases - the basic notion that all operations on tabular data can be "composed" or "chained" in exactly the way that mathematical formulas can be. In relational algebra, you can take a view of a view of a view, or a subquery of a query of a view of a table, and so on. Even single, scalar values - such as the count of the number of results of a query - are of the same data type as a 'relation', and can be operated on as such, or fed as input to a subsequent operation. I bring up the example of relational algebra because it applies to paths as well: There is a kind of "path algebra", where an operation on a path results in another path, which can be operated on further. Now, one way to achieve this kind of path algebra is to make paths an object, and to overload the various functions and operators so that they, too, return paths. However, path algebra can be implemented just as easily in a functional style as in an object style. Properly done, a functional design shouldn't be significantly more bulky or wordy than an object design; the fact that the existing legacy API fails this test has more to do with history than any inherent advantages of OOP vs. functional style.
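The "path algebra" point can be illustrated with the functional API that already exists: each operation maps a path string to another path string, so results compose like formulas, with no objects involved. A small sketch using posixpath (chosen so behavior is identical on every platform; the sample paths are arbitrary):

```python
import posixpath

p = '/usr/local/lib/python/os.py'

# Each intermediate result is itself a path (or path component)
# that can feed directly into the next operation.
stem = posixpath.splitext(posixpath.basename(p))[0]
sibling = posixpath.join(posixpath.dirname(p), 'sys.py')
normalized = posixpath.normpath(posixpath.join(p, '..', 'site.py'))
```

Nothing here requires an object; the closure property Talin describes (path in, path out) is what makes the composition work.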
(Actually, the OOP approach has a slight advantage in terms of the amount of syntactic sugar available, but that is [a] an artifact of the current Python feature set, and [b] not necessarily a good thing if it leads to gratuitous, Perl-ish cleverness.) As a point of comparison, the Java Path API and the C# .Net Path API have similar capabilities; however, the former is object-based whereas the latter is functional and operates on strings. Having used both of them extensively, I find I prefer the C# style, mainly due to the ease of inter-conversion with regular strings - being able to read strings from configuration files, for example, and immediately operate on them without having to convert to path form. I don't find "p.GetParent()" much harder or easier to type than "Path.GetParent( p )"; but I do prefer "Path.GetParent( string )" over "Path( string ).GetParent()". However, this is only a *mild* preference - I could go either way, and wouldn't put up much of a fight about it. (I should note that the Java Path API does *not* follow my scheme of separation between locators and inodes, while the C# API does, which is another reason why I prefer the C# approach.)

Part 3: Does this mean that the current API cannot be improved?

Certainly not! I think everyone (well, almost) agrees that there is much room for improvement in the current APIs. They certainly need to be refactored and recategorized. But I don't think that the solution is to take all of the path-related functions and drop them into a single class, or even a single module. --- Anyway, I hope that (a) this answers your questions, and (b) isn't too divergent from most people's views about Path.
-- Talin From talin at acm.org Wed Oct 25 05:51:02 2006 From: talin at acm.org (Talin) Date: Tue, 24 Oct 2006 20:51:02 -0700 Subject: [Python-Dev] PEP 355 status In-Reply-To: <453EDD43.3050609@acm.org> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> Message-ID: <453EDF26.4040309@acm.org> (one additional postscript - One thing I would be interested in is an approach that unifies file paths and URLs so that there is a consistent locator scheme for any resource, whether they be in a filesystem, on a web server, or stored in a zip file.) -- Talin From stephen at xemacs.org Wed Oct 25 07:33:22 2006 From: stephen at xemacs.org (stephen at xemacs.org) Date: Wed, 25 Oct 2006 14:33:22 +0900 Subject: [Python-Dev] PEP 355 status In-Reply-To: <453EDF26.4040309@acm.org> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453EDF26.4040309@acm.org> Message-ID: <17726.63266.556414.720992@uwakimon.sk.tsukuba.ac.jp> Talin writes: > (one additional postscript - One thing I would be interested in is an > approach that unifies file paths and URLs so that there is a consistent > locator scheme for any resource, whether they be in a filesystem, on a > web server, or stored in a zip file.) +1 But doesn't file:/// do that for files, and couldn't we do something like zipfile:///nantoka.zip#foo/bar/baz.txt? Of course, we'd want to do ziphttp://your.server.net/kantoka.zip#foo/bar/baz.txt, too. That way leads to madness.... 
From scott+python-dev at scottdial.com Wed Oct 25 07:34:12 2006 From: scott+python-dev at scottdial.com (Scott Dial) Date: Wed, 25 Oct 2006 01:34:12 -0400 Subject: [Python-Dev] PEP 355 status In-Reply-To: <17726.63266.556414.720992@uwakimon.sk.tsukuba.ac.jp> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453EDF26.4040309@acm.org> <17726.63266.556414.720992@uwakimon.sk.tsukuba.ac.jp> Message-ID: <453EF754.6040105@scottdial.com> stephen at xemacs.org wrote: > Talin writes: > > (one additional postscript - One thing I would be interested in is an > > approach that unifies file paths and URLs so that there is a consistent > > locator scheme for any resource, whether they be in a filesystem, on a > > web server, or stored in a zip file.) > > +1 > > But doesn't file:/// do that for files, and couldn't we do something > like zipfile:///nantoka.zip#foo/bar/baz.txt? Of course, we'd want to > do ziphttp://your.server.net/kantoka.zip#foo/bar/baz.txt, too. That > way leads to madness.... > It would make more sense to register protocol handlers to this magical unification of resource manipulation. But allow me to perform my first channeling of Guido.. YAGNI. 
-- Scott Dial scott at scottdial.com scodial at cs.indiana.edu From talin at acm.org Wed Oct 25 07:38:46 2006 From: talin at acm.org (Talin) Date: Tue, 24 Oct 2006 22:38:46 -0700 Subject: [Python-Dev] PEP 355 status In-Reply-To: <17726.63266.556414.720992@uwakimon.sk.tsukuba.ac.jp> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453EDF26.4040309@acm.org> <17726.63266.556414.720992@uwakimon.sk.tsukuba.ac.jp> Message-ID: <453EF866.1060503@acm.org> stephen at xemacs.org wrote: > Talin writes: > > (one additional postscript - One thing I would be interested in is an > > approach that unifies file paths and URLs so that there is a consistent > > locator scheme for any resource, whether they be in a filesystem, on a > > web server, or stored in a zip file.) > > +1 > > But doesn't file:/// do that for files, and couldn't we do something > like zipfile:///nantoka.zip#foo/bar/baz.txt? Of course, we'd want to > do ziphttp://your.server.net/kantoka.zip#foo/bar/baz.txt, too. That > way leads to madness.... file:/// does indeed do it, but only the network module understands strings in that format. Ideally, you should be able to pass "file:///..." to a regular "open" function. I wouldn't expect it to be able to understand "http://". But the "file:" protocol should always be supported. In other words, I'm not proposing that the built-in file I/O package suddenly grow an understanding of network scheme types. All I am proposing is a unified namespace.
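For what it's worth, the stdlib already covers the string-level half of this: a "file:" URL can be decomposed and its path component mapped back to a local filename. The sketch below uses the modern `urllib.parse`/`urllib.request` names (in the 2.x line of this era the same functions lived in the `urlparse` and `urllib` modules); the URL itself is a made-up example. This is only the locator-level mapping, not the `open()` integration Talin is proposing:

```python
from urllib.parse import urlparse
from urllib.request import url2pathname

url = 'file:///usr/local/data/config.txt'
parts = urlparse(url)

# The scheme identifies which handler applies; for "file" URLs the
# path component converts back to a platform-native filename.
local = url2pathname(parts.path)
```

So the pieces for a unified namespace exist; what is missing is an `open()` that accepts the URL form directly.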
- Talin From stephen at xemacs.org Wed Oct 25 09:44:59 2006 From: stephen at xemacs.org (stephen at xemacs.org) Date: Wed, 25 Oct 2006 16:44:59 +0900 Subject: [Python-Dev] PEP 355 status In-Reply-To: <453EF754.6040105@scottdial.com> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453EDF26.4040309@acm.org> <17726.63266.556414.720992@uwakimon.sk.tsukuba.ac.jp> <453EF754.6040105@scottdial.com> Message-ID: <17727.5627.756820.797525@uwakimon.sk.tsukuba.ac.jp> Scott Dial writes: > stephen at xemacs.org wrote: > > Talin writes: > > > (one additional postscript - One thing I would be interested in is an > > > approach that unifies file paths and URLs so that there is a consistent > > > locator scheme for any resource, whether they be in a filesystem, on a > > > web server, or stored in a zip file.) > > > > +1 > It would make more sense to register protocol handlers to this magical > unification of resource manipulation. I don't think it's that magical, and it's not manipulation, it's location. The question is, register where and on what? For example on my Mac there are some PDFs I want to open in Preview and others in Acrobat. To the extent that I have some classes which are one or the other, I might want to register the handler to a wildcard path object. > But allow me to perform my first channeling of Guido.. YAGNI. True, but only because when I do need that kind of stuff I'm normally writing Emacs Lisp, not Python. We have a wide variety of functions for manipulating path strings, and they make exactly the distinction between path and inode/content that Talin does (where a path is being manipulated, the function has "filename" in its name, where a file or its metadata is being accessed, the function's name contains "file"). 
Nonetheless there are two or three places where programmers I respect have chosen to invent path classes to handle hairy special cases. These classes are very useful in those special cases. One place where this gets especially hairy is in the TRAMP package, which allows you to construct "remote paths" involving (for example) logging into host A by ssh, from there to host B by ssh, and finally a "relay download" of the content from host C to the local host by scp. The net effect is that you can specify the path in your "open file" dialog, and Emacs does the rest automatically; the only differences the user sees between that and a local file is the length of the path string and the time it takes to actually access the contents. Once you've done that, that process is embedded into Emacs's notion of the "current directory", so you can list the directory containing the resource, or access siblings, very conveniently. I don't expect to reproduce that functionality in Python personally, but such use cases do exist. Whether a general path class can be invented that doesn't accumulate cruft faster than use cases is another issue. 
From talin at acm.org Wed Oct 25 09:50:49 2006 From: talin at acm.org (Talin) Date: Wed, 25 Oct 2006 00:50:49 -0700 Subject: [Python-Dev] PEP 355 status In-Reply-To: <453EF754.6040105@scottdial.com> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453EDF26.4040309@acm.org> <17726.63266.556414.720992@uwakimon.sk.tsukuba.ac.jp> <453EF754.6040105@scottdial.com> Message-ID: <453F1759.5050601@acm.org> Scott Dial wrote: > stephen at xemacs.org wrote: >> Talin writes: >> > (one additional postscript - One thing I would be interested in is >> an > approach that unifies file paths and URLs so that there is a >> consistent > locator scheme for any resource, whether they be in a >> filesystem, on a > web server, or stored in a zip file.) >> >> +1 >> >> But doesn't file:/// do that for files, and couldn't we do something >> like zipfile:///nantoka.zip#foo/bar/baz.txt? Of course, we'd want to >> do ziphttp://your.server.net/kantoka.zip#foo/bar/baz.txt, too. That >> way leads to madness.... >> > > It would make more sense to register protocol handlers to this magical > unification of resource manipulation. But allow me to perform my first > channeling of Guido.. YAGNI. > I'm thinking that it was a tactical error on my part to throw in the whole "unified URL / filename namespace" idea, which really has nothing to do with the topic. Lets drop it, or start another topic, and let this thread focus on critiques of the path module, which is probably more relevant at the moment. -- Talin From mal at egenix.com Wed Oct 25 11:40:06 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 25 Oct 2006 11:40:06 +0200 Subject: [Python-Dev] __str__ bug? 
In-Reply-To: References: <453E6C67.125074.001@leopold.j.asahi-net.or.jp> Message-ID: <453F30F6.9060002@egenix.com> Mike Krell wrote:
>> class S(str):
>>     def __str__(self): return "S.__str__"
>>
>> class U(unicode):
>>     def __str__(self): return "U.__str__"
>>
>> print str(S())
>> print str(U())
>>
>> This script prints:
>>
>> S.__str__
>> U.__str__
>
> Yes, but "print U()" prints nothing, and the explicit str() should not
> be necessary.

The main difference here is that the string object defines a tp_print slot, while Unicode doesn't. As a result, tp_print for the string subtype is called and this does an extra check for subtypes:

    if (! PyString_CheckExact(op)) {
        int ret;
        /* A str subclass may have its own __str__ method. */
        op = (PyStringObject *) PyObject_Str((PyObject *)op);
        if (op == NULL)
            return -1;
        ret = string_print(op, fp, flags);
        Py_DECREF(op);
        return ret;
    }

For Unicode, the PyObject_Print() API defaults to using PyObject_Str() which uses the tp_str slot. This maps directly to a Unicode API that works on the internals and doesn't apply any extra checks to see if it was called on a subtype. Note that this is true for many of the __special__ slot methods you can implement on subtypes of built-in types - they don't always work as you might expect. Now in this rather common case, I guess we could add support to the Unicode object to do extra checks like the string object does. Ditto for the %-formatting. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 25 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free !
:::: From ncoghlan at gmail.com Wed Oct 25 11:47:29 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 25 Oct 2006 19:47:29 +1000 Subject: [Python-Dev] PEP 355 status In-Reply-To: <453EDD43.3050609@acm.org> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> Message-ID: <453F32B1.4030101@gmail.com> Talin wrote: > Part 3: Does this mean that the current API cannot be improved? > > Certainly not! I think everyone (well, almost) agrees that there is much > room for improvement in the current APIs. They certainly need to be > refactored and recategorized. > > But I don't think that the solution is to take all of the path-related > functions and drop them into a single class, or even a single module. +1 from me. (for both the fraction I quoted and everything else you said, including the locator/inode/file distinction - although I'd also add that 'symbolic link' and 'directory' exist at a similar level as 'file'). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From martin at v.loewis.de Wed Oct 25 13:13:30 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 25 Oct 2006 13:13:30 +0200 Subject: [Python-Dev] __str__ bug? In-Reply-To: References: <453DE526.4020107@gmail.com> Message-ID: <453F46DA.4000700@v.loewis.de> Mike Krell schrieb: >> Based on the behaviour of str and the fact that overriding unicode.__repr__ >> works just fine, I'd say file a bug on SF. > > Done. This is item 1583863. Of course, it would be even better if you could also include a patch. 
Regards, Martin From talin at acm.org Wed Oct 25 18:49:44 2006 From: talin at acm.org (Talin) Date: Wed, 25 Oct 2006 09:49:44 -0700 Subject: [Python-Dev] PEP 355 status In-Reply-To: <453F32B1.4030101@gmail.com> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453F32B1.4030101@gmail.com> Message-ID: <453F95A8.5090201@acm.org> Nick Coghlan wrote: > Talin wrote: >> Part 3: Does this mean that the current API cannot be improved? >> >> Certainly not! I think everyone (well, almost) agrees that there is >> much room for improvement in the current APIs. They certainly need to >> be refactored and recategorized. >> >> But I don't think that the solution is to take all of the path-related >> functions and drop them into a single class, or even a single module. > > +1 from me. > > (for both the fraction I quoted and everything else you said, including > the locator/inode/file distinction - although I'd also add that > 'symbolic link' and 'directory' exist at a similar level as 'file'). I would tend towards classifying directory operations as inode-level operations, that you are working at the "filesystem as graph" level, rather than the "stream of bytes" level. When you iterate over a directory, what you are getting back is effectively inodes (well, directory entries are distinct from inodes in the underlying filesystem, but from Python there's no practical distinction.) If I could draw a UML diagram in ASCII, I would have "inode --> points to --> directory or file" and "directory --> contains * --> inode". That would hopefully make things clearer. Symbolic links, I am not so sure about; In some ways, hard links are easier to classify. 
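As a side note from a later vantage point: Python eventually grew an API that matches this classification closely. `os.scandir()` (added in Python 3.5, long after this thread) yields directory entries that sit at the inode level Talin describes - each carries a name, a locator (`entry.path`), and cached metadata, all distinct from the file's contents. The temp-directory setup below is just scaffolding for the illustration:

```python
import os
import tempfile

# Build a small directory to iterate over (names are arbitrary).
d = tempfile.mkdtemp()
open(os.path.join(d, 'a.txt'), 'w').close()
os.mkdir(os.path.join(d, 'sub'))

kinds = {}
for entry in os.scandir(d):
    # entry.path is the locator; is_dir()/stat() are inode-level queries.
    kinds[entry.name] = 'directory' if entry.is_dir() else 'file'
```

Iterating a directory thus hands back inode-like descriptors rather than bare path strings, exactly the "directory contains inodes" picture sketched above.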
--- Having done a path library myself (in C++, for our code base at work), the trickiest part is getting the Windows path manipulations right, and fitting them into a model that allows writing of platform-agnostic code. This is especially vexing when you realize that it's often useful to manipulate unix-style paths even when running under Win32 and vice versa. A prime example is that I have a lot of Python code at work that manipulates Perforce client spec files. The path specifications in these files are platform-agnostic, and use forward slashes regardless of the host platform, so "os.path.normpath" doesn't do the right thing for me. > Cheers, > Nick. From pje at telecommunity.com Wed Oct 25 18:56:37 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 25 Oct 2006 12:56:37 -0400 Subject: [Python-Dev] PEP 355 status In-Reply-To: <453F95A8.5090201@acm.org> References: <453F32B1.4030101@gmail.com> <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453F32B1.4030101@gmail.com> Message-ID: <5.1.1.6.0.20061025125608.0409f138@sparrow.telecommunity.com> At 09:49 AM 10/25/2006 -0700, Talin wrote: >Having done a path library myself (in C++, for our code base at work), >the trickiest part is getting the Windows path manipulations right, and >fitting them into a model that allows writing of platform-agnostic code. >This is especially vexing when you realize that it's often useful to >manipulate unix-style paths even when running under Win32 and vice >versa. A prime example is that I have a lot of Python code at work that >manipulates Perforce client spec files. The path specifications in >these files are platform-agnostic, and use forward slashes regardless of >the host platform, so "os.path.normpath" doesn't do the right thing for me.
You probably want to use the posixpath module directly in that case, though perhaps you've already discovered that. From talin at acm.org Wed Oct 25 19:16:48 2006 From: talin at acm.org (Talin) Date: Wed, 25 Oct 2006 10:16:48 -0700 Subject: [Python-Dev] PEP 355 status In-Reply-To: <5.1.1.6.0.20061025125608.0409f138@sparrow.telecommunity.com> References: <453F32B1.4030101@gmail.com> <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453F32B1.4030101@gmail.com> <5.1.1.6.0.20061025125608.0409f138@sparrow.telecommunity.com> Message-ID: <453F9C00.3090300@acm.org> Phillip J. Eby wrote: > At 09:49 AM 10/25/2006 -0700, Talin wrote: >> Having done a path library myself (in C++, for our code base at work), >> the trickiest part is getting the Windows path manipulations right, and >> fitting them into a model that allows writing of platform-agnostic code. >> This is especially vexing when you realize that it's often useful to >> manipulate unix-style paths even when running under Win32 and vice >> versa. A prime example is that I have a lot of Python code at work that >> manipulates Perforce client spec files. The path specifications in >> these files are platform-agnostic, and use forward slashes regardless of >> the host platform, so "os.path.normpath" doesn't do the right thing >> for me. > > You probably want to use the posixpath module directly in that case, > though perhaps you've already discovered that. Never heard of it. It's not in the standard library, is it? I don't see it in the table of contents or the index. From fdrake at acm.org Wed Oct 25 19:36:31 2006 From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 25 Oct 2006 13:36:31 -0400 Subject: [Python-Dev] PEP 355 status In-Reply-To: <453F9C00.3090300@acm.org> References: <453F32B1.4030101@gmail.com> <5.1.1.6.0.20061025125608.0409f138@sparrow.telecommunity.com> <453F9C00.3090300@acm.org> Message-ID: <200610251336.31398.fdrake@acm.org> On Wednesday 25 October 2006 13:16, Talin wrote: > Never heard of it. It's not in the standard library, is it? I don't see > it in the table of contents or the index. This is a documentation bug. :-( I'd thought they were mentioned *somewhere*, but it looks like I'm wrong. os.path is an alias for one of several different real modules; which one is selected depends on the platform. I see the following: macpath, ntpath, os2emxpath, riscospath. (ntpath is used for all Windows versions, not just NT.) -Fred -- Fred L. Drake, Jr. From pje at telecommunity.com Wed Oct 25 20:19:22 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 25 Oct 2006 14:19:22 -0400 Subject: [Python-Dev] PEP 355 status In-Reply-To: <453F9C00.3090300@acm.org> References: <5.1.1.6.0.20061025125608.0409f138@sparrow.telecommunity.com> <453F32B1.4030101@gmail.com> <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453F32B1.4030101@gmail.com> <5.1.1.6.0.20061025125608.0409f138@sparrow.telecommunity.com> Message-ID: <5.1.1.6.0.20061025141358.027248d8@sparrow.telecommunity.com> At 10:16 AM 10/25/2006 -0700, Talin wrote: >Phillip J. Eby wrote: > > At 09:49 AM 10/25/2006 -0700, Talin wrote: > >> Having done a path library myself (in C++, for our code base at work), > >> the trickiest part is getting the Windows path manipulations right, and > >> fitting them into a model that allows writing of platform-agnostic code.
> >> This is especially vexing when you realize that it's often useful to > >> manipulate unix-style paths even when running under Win32 and vice > >> versa. A prime example is that I have a lot of Python code at work that > >> manipulates Perforce client spec files. The path specifications in > >> these files are platform-agnostic, and use forward slashes regardless of > >> the host platform, so "os.path.normpath" doesn't do the right thing > >> for me. > > > > You probably want to use the posixpath module directly in that case, > > though perhaps you've already discovered that. > >Never heard of it. It's not in the standard library, is it? I don't see >it in the table of contents or the index. posixpath, ntpath, macpath, et al are the platform-specific path manipulation modules that are aliased to os.path. However, each of these modules' string path manipulation functions can be imported and used on any platform. See below:

Linux:

Python 2.3.5 (#1, Aug 25 2005, 09:17:44)
[GCC 3.4.3 20041212 (Red Hat 3.4.3-9.EL4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.path
<module 'posixpath' from '...'>
>>> import ntpath
>>> dir(ntpath)
['__all__', '__builtins__', '__doc__', '__file__', '__name__', 'abspath',
'altsep', 'basename', 'commonprefix', 'curdir', 'defpath', 'dirname',
'exists', 'expanduser', 'expandvars', 'extsep', 'getatime', 'getctime',
'getmtime', 'getsize', 'isabs', 'isdir', 'isfile', 'islink', 'ismount',
'join', 'normcase', 'normpath', 'os', 'pardir', 'pathsep', 'realpath',
'sep', 'split', 'splitdrive', 'splitext', 'splitunc', 'stat',
'supports_unicode_filenames', 'sys', 'walk']

Windows:

Python 2.3.4 (#53, May 25 2004, 21:17:02) [MSC v.1200 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import os
>>> os.path
<module 'ntpath' from '...'>
>>> import posixpath
>>> dir(posixpath)
['__all__', '__builtins__', '__doc__', '__file__', '__name__', '_varprog',
'abspath', 'altsep', 'basename', 'commonprefix', 'curdir', 'defpath',
'dirname', 'exists', 'expanduser', 'expandvars', 'extsep', 'getatime',
'getctime', 'getmtime', 'getsize', 'isabs', 'isdir', 'isfile', 'islink',
'ismount', 'join', 'normcase', 'normpath', 'os', 'pardir', 'pathsep',
'realpath', 'samefile', 'sameopenfile', 'samestat', 'sep', 'split',
'splitdrive', 'splitext', 'stat', 'supports_unicode_filenames', 'sys',
'walk']

Note, therefore, that any "path object" system should also allow you to create and manipulate foreign paths. That is, it should have variants for each path type, rather than being locked to the local platform's path strings. Of course, the most common need for this is manipulating posix paths on non-posix platforms, but sometimes one must deal with Windows paths on Unix, too. From fredrik at pythonware.com Wed Oct 25 21:23:32 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 25 Oct 2006 21:23:32 +0200 Subject: [Python-Dev] PEP 355 status In-Reply-To: <453F9C00.3090300@acm.org> References: <453F32B1.4030101@gmail.com> <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453F32B1.4030101@gmail.com> <5.1.1.6.0.20061025125608.0409f138@sparrow.telecommunity.com> <453F9C00.3090300@acm.org> Message-ID: Talin wrote: >> You probably want to use the posixpath module directly in that case, >> though perhaps you've already discovered that. > > Never heard of it. It's not in the standard library, is it? I don't see > it in the table of contents or the index.
http://effbot.org/librarybook/posixpath.htm From mahs at telcopartners.com Wed Oct 25 23:06:21 2006 From: mahs at telcopartners.com (Michael Spencer) Date: Wed, 25 Oct 2006 14:06:21 -0700 Subject: [Python-Dev] Fwd: Re: ANN compiler2 : Produce bytecode from Python 2.5 AST In-Reply-To: <453d4890$0$22566$9b622d9e@news.freenet.de> References: <453d4890$0$22566$9b622d9e@news.freenet.de> Message-ID: Martin v. Löwis wrote: > Georg Brandl schrieb: >> Perhaps you can bring up a discussion on python-dev about your improvements >> and how they could be integrated into the standard library... > > Let me second this. The compiler package is largely unmaintained and > was known to be broken (and perhaps still is). A replacement > implementation, especially if it comes with a new maintainer, would > be welcome. > > Regards, > Martin Hello python-dev. I use AST-based code inspection and manipulation, and I've been looking forward to using v2.5 ASTs for their increased accuracy, consistency and speed. However, there is as yet no Python-exposed mechanism for compiling v2.5 ASTs to bytecode. So to meet my own need and interest I've been implementing 'compiler2', similar in scope to the stdlib compiler package, but generating code from Python 2.5 _ast.ASTs. The code has evolved considerably from the compiler package: in aggregate the changes amount to a re-write. More about the package and its status below. I'm introducing this project here to discuss whether and how these changes should be integrated with the stdlib. I believe there is a prima facie need to have a builtin/stdlib capability for compiling v2.5 ASTs from Python, and there is some advantage to having that be implemented in Python. There is also a case for deprecating the v2.4 ASTs to ease maintenance and reduce the confusion associated with two different AST formats. If there is interest, I'm willing to make compiler2 stdlib-ready. I'm also open to alternative approaches, including doing nothing.
compiler2 Objectives and Status =============================== My goal is to get compiler2 to produce identical output to __builtin__.compile (at least optionally), while also providing an accessible framework for AST-manipulation, experimental compiler optimizations and customization. compiler2 is not finished - there are some unresolved bugs, and open questions on interface design - but already it produces identical output to __builtin__.compile for all of the stdlib modules and their tests (except for the stackdepth attribute which is different in 12 cases). All but three stdlib modules pass their tests after being compiled using compiler2. More on goals, status, known issues etc... in the project readme.txt at: http://svn.brownspencer.com/pycompiler/branches/new_ast/readme.txt Code is available in Subversion at http://svn.brownspencer.com/pycompiler/branches/new_ast/ The main test script is test/test_compiler.py which compiles all the modules in /Lib and /Lib/test and compares the output with __builtin__.compile. Best regards Michael Spencer From greg.ewing at canterbury.ac.nz Thu Oct 26 01:39:24 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 26 Oct 2006 12:39:24 +1300 Subject: [Python-Dev] PEP 355 status In-Reply-To: <453EDD43.3050609@acm.org> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> Message-ID: <453FF5AC.4060500@canterbury.ac.nz> Talin wrote: > (Actually, the OOP approach has a slight advantage in terms of the > amount of syntactic sugar available, Even if you don't use any operator overloading, there's still the advantage that an object provides a namespace for its methods. Without that, you either have to use fairly verbose function names or keep qualifying them with a module name. 
Code that uses the current path functions tends to contain a lot of os.path.this(os.path.that(...)) stuff which is quite tedious to write and read. Another consideration is that having paths be a distinct data type allows for the possibility of file system references that aren't just strings. In Classic MacOS, for example, the definitive way of referencing a file is by a (vRefNum, dirID, name) tuple, and textual paths aren't guaranteed to be unique or even to exist. > (I should note that the Java Path API does *not* follow my scheme of > separation between locators and inodes, while the C# API does, which is > another reason why I prefer the C# approach.) A compromise might be to have all the "path algebra" operations be methods, and everything else functions which operate on path objects. That would make sense, because the path algebra ought to be a closed set of operations that's tightly coupled to the platform's path semantics. -- Greg From talin at acm.org Thu Oct 26 04:48:32 2006 From: talin at acm.org (Talin) Date: Wed, 25 Oct 2006 19:48:32 -0700 Subject: [Python-Dev] PEP 355 status In-Reply-To: <453FF5AC.4060500@canterbury.ac.nz> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453FF5AC.4060500@canterbury.ac.nz> Message-ID: <45402200.1010308@acm.org> Greg Ewing wrote: > Talin wrote: >> (Actually, the OOP approach has a slight advantage in terms of the >> amount of syntactic sugar available, > > Even if you don't use any operator overloading, there's > still the advantage that an object provides a namespace > for its methods. Without that, you either have to use > fairly verbose function names or keep qualifying them > with a module name.
> Code that uses the current path > functions tends to contain a lot of > os.path.this(os.path.that(...)) stuff which is quite > tedious to write and read. Given the flexibility that Python allows in naming the modules that you import, I'm not sure that this is a valid objection -- you can make the module name as short as you feel comfortable with. > Another consideration is that having paths be a > distinct data type allows for the possibility of file > system references that aren't just strings. In > Classic MacOS, for example, the definitive way of > referencing a file is by a (vRefNum, dirID, name) > tuple, and textual paths aren't guaranteed to be > unique or even to exist. That's true of textual paths in general - i.e. even on unix, textual paths aren't guaranteed to be unique or exist. It's been a while since I used classic MacOS - how do you handle things like configuration files with path names in them? >> (I should note that the Java Path API does *not* follow my scheme of >> separation between locators and inodes, while the C# API does, which >> is another reason why I prefer the C# approach.) > A compromise might be to have all the "path algebra" > operations be methods, and everything else functions > which operate on path objects. That would make sense, > because the path algebra ought to be a closed set > of operations that's tightly coupled to the platform's > path semantics. Personally, this is one of those areas where I am strongly tempted to violate TOOWTDI - I can see use cases where string-based paths would be more convenient and less typing, and other use cases where object-based paths would be more convenient and less typing. If I were designing a path library, I would create a string-based system as the lowest level, and an object based system on top of it (the reason for doing it that way is simply so that people who want to use strings don't have to suffer the cost of creating temporary path objects to do simple things like joins.)
Moreover, I would keep the naming conventions of the two systems similar, if at all possible - thus, the object methods would have the same (short) names as the functions within the module. So for example:

    # Import new, refactored module io.path
    from io import path

    # Case 1 using strings
    path1 = path.join( "/Libraries/Frameworks", "Python.Framework" )
    parent = path.parent( path1 )

    # Case 2 using objects
    pathobj = path.Path( "/Libraries/Frameworks" )
    pathobj += "Python.Framework"
    parent = pathobj.parent()

Let me riff on this just a bit more - don't take this all too seriously though: Refactored organization of path-related modules (under a new name so as not to conflict with existing modules):

    io.path -- path manipulations
    io.dir  -- directory functions, including dirwalk
    io.fs   -- dealing with filesystem objects (inodes, symlinks, etc.)
    io.file -- file read / write streams

    # Import directory module
    import io.dir

    # String based API
    for entry in io.dir.listdir( "/Library/Frameworks" ):
        print entry    # Entry is a string

    # Object based API
    dir = io.dir.Directory( "/Library/Frameworks" )
    for entry in dir:  # Iteration protocol on dir object
        print entry    # entry is an obj, but __str__() returns path text

    # Dealing with various filesystems: pass in a format parameter
    dir = io.dir.Directory( "/Library/Frameworks" )
    print entry.path( format="NT" )  # entry printed in NT format

    # Or you can just use a format specifier for PEP 3101 string format:
    print "Path in local system format is {0}".format( entry )
    print "Path in NT format is {0:NT}".format( entry )
    print "Path in OS X format is {0:OSX}".format( entry )

Anyway, off the top of my head, that's what a refactored path API would look like if I were doing it :) (Yes, the names are bad, can't think of better ATM.)
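A minimal runnable sketch of the two-layer idea above - strings at the bottom, an object layer with the same short names on top. The Path class and its methods are illustrative names only, not a real library; posixpath is used so the behaviour is identical on any host:

```python
import posixpath


class Path(str):
    """Illustrative object layer over string-based path functions.

    Strings remain the lowest level; each method delegates to a
    posixpath function, so string users pay no object-creation
    cost unless they opt in to the object API.
    """

    def join(self, *parts):
        # Delegate to the string-level function, wrap the result.
        return Path(posixpath.join(self, *parts))

    def parent(self):
        return Path(posixpath.dirname(self))


p = Path("/Library/Frameworks").join("Python.framework")
print(p)           # /Library/Frameworks/Python.framework
print(p.parent())  # /Library/Frameworks
```

Because Path subclasses str, the object form still works anywhere a plain path string is expected, which is one way to get both conveniences without two incompatible APIs.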
-- Talin From greg.ewing at canterbury.ac.nz Thu Oct 26 02:52:29 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 26 Oct 2006 13:52:29 +1300 Subject: [Python-Dev] PEP 355 status In-Reply-To: <453EF866.1060503@acm.org> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453EDF26.4040309@acm.org> <17726.63266.556414.720992@uwakimon.sk.tsukuba.ac.jp> <453EF866.1060503@acm.org> Message-ID: <454006CD.7000006@canterbury.ac.nz> Talin wrote: > Ideally, you should be able to pass > "file:///..." to a regular "open" function. I'm not so sure about that. Consider that "file:///foo.bar" is a valid relative pathname on Unix to a file called "foo.bar" in a directory called "file:". That's not to say there shouldn't be a function available that understands it, but I wouldn't want it built into all functions that take pathnames. -- Greg From foom at fuhm.net Thu Oct 26 09:00:57 2006 From: foom at fuhm.net (James Y Knight) Date: Thu, 26 Oct 2006 03:00:57 -0400 Subject: [Python-Dev] PEP 355 status In-Reply-To: <45402200.1010308@acm.org> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453FF5AC.4060500@canterbury.ac.nz> <45402200.1010308@acm.org> Message-ID: On Oct 25, 2006, at 10:48 PM, Talin wrote: > That's true of textual paths in general - i.e. even on unix, textual > paths aren't guaranteed to be unique or exist. > > Its been a while since I used classic MacOS - how do you handle things > like configuration files with path names in them? You aren't supposed to use paths at all. You're supposed to use an Alias whenever you're doing long term storage of a reference to a file. 
This allows the user to move the file around on the disk without breaking the reference, which is nice. The alias is an opaque data structure which contains a bunch of redundant information used to locate the file. In particular, both pathname and (volumeId, dirId, name), as well as some other stuff like file size, etc. to help do fuzzy matching if the original file can't be found via the obvious locators. And for files on a file server, it also contains information on how to reconnect to the server if necessary. Much of the alias infrastructure carries over into OSX, although the strictures against using paths have been somewhat watered down. At least in OSX, you don't have the issue of the user renaming the boot volume and thus breaking every path someone ill-advisedly stored (since volume name was part of the path). For an example of aliases in OSX, open a file in TextEdit, see that it gets into the "recent items" menu. Now, move it somewhere else and rename it, and notice that it's still accessible from the menu. Separately, try deleting the file and renaming another to the same name. Notice that it also succeeds in referencing this new file. Hm, how's this related to python? I'm not quite sure. :) James From greg.ewing at canterbury.ac.nz Thu Oct 26 10:29:43 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 26 Oct 2006 21:29:43 +1300 Subject: [Python-Dev] PEP 355 status In-Reply-To: <45402200.1010308@acm.org> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453FF5AC.4060500@canterbury.ac.nz> <45402200.1010308@acm.org> Message-ID: <454071F7.70104@canterbury.ac.nz> Talin wrote: > That's true of textual paths in general - i.e. even on unix, textual > paths aren't guaranteed to be unique or exist.
What I mean is that it's possible for two different files to have the same pathname (since you can mount two volumes with identical names at the same time), or for a file to exist on disk yet not be accessible via any pathname (because it would exceed 255 characters). I'm not aware of any analogous situations in unix. > Its been a while since I used classic MacOS - how do you handle things > like configuration files with path names in them? True native classic MacOS software generally doesn't use pathnames. Things like textual config files are really a foreign concept to it. If you wanted to store config info, you'd probably store an alias, which points at the moral equivalent of the file's inode number, and use a GUI for editing it. However, all this is probably not very relevant now, since as far as I know, classic MacOS is no longer supported in current Python versions. I'm just pointing out that the flexibility would be there if any similarly offbeat platform needed to be supported in the future. > # Or you can just use a format specifier for PEP 3101 string format: > print "Path in local system format is {0}".format( entry ) > print "Path in NT format is {0:NT}".format( entry ) > print "Path in OS X format is {0:OSX}".format( entry ) I don't think that expressing one platform's pathnames in the format of another is something you can do in general, e.g. going from Windows to Unix, what do you do with the drive letter? You can only really do it if you have some sort of network file system connection, and then you need more information than just the path in order to do the translation.
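For what it's worth, the stdlib already exposes each platform's path syntax as an importable module (posixpath, ntpath), so foreign paths can at least be manipulated on any host; what it cannot do losslessly is translate an absolute path between conventions. A small sketch of both points:

```python
import ntpath
import posixpath

# Manipulating either platform's syntax works on any host:
print(posixpath.join("Library", "Frameworks"))  # Library/Frameworks
print(ntpath.join("Library", "Frameworks"))     # Library\Frameworks

# But an absolute Windows path carries a drive letter with no
# posix counterpart, so a general translation would be lossy:
drive, rest = ntpath.splitdrive(r"C:\Python25\python.exe")
print(drive)  # C:
print(rest)   # \Python25\python.exe
```

This is the legitimate cross-platform case (relative paths in a portable syntax); the drive letter is exactly the piece that has no home on the other side.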
-- Greg From talin at acm.org Thu Oct 26 11:12:15 2006 From: talin at acm.org (Talin) Date: Thu, 26 Oct 2006 02:12:15 -0700 Subject: [Python-Dev] PEP 355 status In-Reply-To: <454071F7.70104@canterbury.ac.nz> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453FF5AC.4060500@canterbury.ac.nz> <45402200.1010308@acm.org> <454071F7.70104@canterbury.ac.nz> Message-ID: <45407BEF.7010204@acm.org> Greg Ewing wrote: > Talin wrote: > >> That's true of textual paths in general - i.e. even on unix, textual >> paths aren't guaranteed to be unique or exist. > > What I mean is that it's possible for two different > files to have the same pathname (since you can mount > two volumes with identical names at the same time, or > for a file to exist on disk yet not be accessible via > any pathname (because it would exceed 255 characters). > I'm not aware of any analogous situations in unix. > >> Its been a while since I used classic MacOS - how do you handle things >> like configuration files with path names in them? > > True native classic MacOS software generally doesn't > use pathnames. Things like textual config files are > really a foreign concept to it. If you wanted to store > config info, you'd probably store an alias, which > points at the moral equivalent of the files inode > number, and use a GUI for editing it. > > However all this is probably not very relevant now, > since as far as I know, classic MacOS is no longer > supported in current Python versions. I'm just > pointing out that the flexibility would be there > if any similarly offbeat platform needed to be > supported in the future. I'm not sure that PEP 355 included any such support - IIRC, the path object was a subclass of string. 
That isn't, however, a defense against what you are saying - just because neither the current system nor the proposed improvement supports the kinds of file references you are speaking of, doesn't mean it shouldn't be done. However, this does kind of suck for a cross-platform scripting language like Python. It means that any cross-platform app which requires access to multiple data files that contain inter-file references essentially has to implement its own virtual file system. (Python module imports being a case in point.) One of the things that I really love about Python programming is that I can sit down and start hacking on a new project without first having to go through an agonizing political decision about what platforms I should support. It used to be that I would spend hours ruminating over things like "Well...if I want any market share at all, I really should implement this as a Windows program...but on the other hand, I won't enjoy writing it nearly as much." Then comes along Python and removes all of that bothersome hacker-angst. Because of this, I am naturally disinclined to incorporate into my programs any concept which doesn't translate to other platforms. I don't mind writing some platform-specific code, as long as it doesn't take over my program. It seems that any Python program that manipulated paths would have to be radically different in the environment that you describe. How about this: In my ontology of path APIs given earlier, I would tend to put the MacOS file reference in the category of "file locator schemes other than paths". In other words, what you are describing isn't IMHO a path at all, but it is like a path in that it describes how to get to a file. (It's almost like an inode or dirent in some ways.) An alternative approach is to try and come up with an encoding scheme that allows you to represent all of that platform-specific semantics in a string.
This leaves you with the unhappy choice of "inventing" a new path syntax for an old platform, however. >> # Or you can just use a format specifier for PEP 3101 string format: >> print "Path in local system format is {0}".format( entry ) >> print "Path in NT format is {0:NT}".format( entry ) >> print "Path in OS X format is {0:OSX}".format( entry ) > I don't think that expressing one platform's pathnames > in the format of another is something you can do in > general, e.g. going from Windows to Unix, what do you > do with the drive letter? Yeah, probably not. See, I told you not to take it too seriously! But I do feel that it's important to be able to manipulate posix-style path syntax on non-posix platforms, given how many cross-platform applications there are that have a cross-platform path syntax. In my own work, I find that drive letters are never explicitly specified in config files. Any application such as a parser, template generator, or resource manager (in other words, any application whose data files are routinely checked in to the source control system or shared across a network) tends to 'see' only relative paths in its input files, and embedding absolute paths is considered an error on the user's part. Of course, those same apps *do* internally convert all those relative paths to absolute, so that they can be compared and resolved with respect to some common base. Then again, in my opinion, the only *really* absolute paths are fully-qualified URLs. So there. :) > You can only really do it if you have some sort of > network file system connection, and then you need > more information than just the path in order to do > the translation.
> > -- > Greg From greg.ewing at canterbury.ac.nz Fri Oct 27 01:12:20 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 27 Oct 2006 12:12:20 +1300 Subject: [Python-Dev] PEP 355 status In-Reply-To: <45407BEF.7010204@acm.org> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453FF5AC.4060500@canterbury.ac.nz> <45402200.1010308@acm.org> <454071F7.70104@canterbury.ac.nz> <45407BEF.7010204@acm.org> Message-ID: <454140D4.3040100@canterbury.ac.nz> Talin wrote: > It seems that any Python program that manipulated paths > would have to be radically different in the environment that you describe. I can sympathise with that. The problem is really inherent in the nature of the platforms -- it's just not possible to do everything in a native classic MacOS way and be cross-platform at the same time. There has to be a compromise somewhere. With classic MacOS the compromise was usually to use pathnames and to heck with the consequences. You could get away with it most of the time. > In other words, what you are describing isn't IMHO a > path at all, but it is like a path in that it describes how to get to a > file. Yes, that's true. Calling it a "path" would be something of a historical misnomer. > An alternative approach is to try and come up with an encoding scheme > that allows you to represent all of that platform-specific semantics in > a string. Yes, I thought of that, too. That's what you would have to do under the current scheme if you ever encountered a platform which truly had no textual representation of file locations. But realistically, it seems unlikely that such a platform will be invented in the foreseeable future (even classic MacOS *had* a notion of paths, even if it wasn't the preferred representation). So all this is probably YAGNI. 
-- Greg From kbk at shore.net Fri Oct 27 03:30:00 2006 From: kbk at shore.net (Kurt B. Kaiser) Date: Thu, 26 Oct 2006 21:30:00 -0400 (EDT) Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200610270130.k9R1U0NN007852@bayview.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 434 open ( +3) / 3430 closed ( +5) / 3864 total ( +8) Bugs : 929 open (+13) / 6285 closed (+12) / 7214 total (+25) RFE : 245 open ( +1) / 240 closed ( +0) / 485 total ( +1) New / Reopened Patches ______________________ various datetime methods fail in restricted mode (2006-10-17) http://python.org/sf/1578643 opened by lplatypus PyErr_Format corrections (2006-10-17) http://python.org/sf/1578999 opened by Martin v. Löwis posix.readlink doesn't use filesystemencoding (2006-10-19) http://python.org/sf/1580674 opened by Ronald Oussoren Duplicated declaration of PyCallable_Check (2006-10-20) CLOSED http://python.org/sf/1580872 opened by Matthias Klose Allow textwrap to preserve leading and trailing whitespace (2006-10-20) http://python.org/sf/1581073 opened by Dwayne Bailey tarfile.py: 100-char filenames are truncated (2006-10-24) CLOSED http://python.org/sf/1583506 opened by Lars Gustäbel tarfile.py: better use of TarInfo objects with longnames (2006-10-24) http://python.org/sf/1583880 opened by Lars Gustäbel Tix: subwidget names (bug #1472877) (2006-10-25) http://python.org/sf/1584712 opened by Matthias Kievernagel Patches Closed ______________ patch for building trunk with VC6 (2006-03-24) http://python.org/sf/1457736 closed by loewis a faster Modulefinder (2005-11-11) http://python.org/sf/1353872 closed by theller Duplicated declaration of PyCallable_Check (2006-10-20) http://python.org/sf/1580872 closed by loewis Exec stacks in python 2.5 (2006-09-18) http://python.org/sf/1560695 closed by loewis tarfile.py: 100-char filenames are truncated (2006-10-24) http://python.org/sf/1583506 closed by gbrandl New / Reopened Bugs ___________________ 2.4.4c1 will not build
when cross compiling (2006-10-16) CLOSED http://python.org/sf/1578513 opened by smithj --disable-sunaudiodev --disable-tk does not work (2006-10-17) http://python.org/sf/1579029 opened by ThurnerRupert Segfault provoked by generators and exceptions (2006-10-17) http://python.org/sf/1579370 opened by Mike Klaas Use flush() before os.exevp() (2006-10-18) http://python.org/sf/1579477 opened by Thomas Guettler Wrong syntax for PyDateTime_IMPORT in documentation (2006-10-18) CLOSED http://python.org/sf/1579796 opened by David Faure not configured for tk (2006-10-18) http://python.org/sf/1579931 opened by Carl Wenrich glob.glob("c:\\[ ]\*) doesn't work (2006-10-19) http://python.org/sf/1580472 opened by Koblaid "make install" for Python 2.4.4 not working properly (2006-10-19) http://python.org/sf/1580563 opened by Andreas Jung Configure script does not work for RHEL 4 x86_64 (2006-10-19) http://python.org/sf/1580726 reopened by gbrandl Configure script does not work for RHEL 4 x86_64 (2006-10-19) http://python.org/sf/1580726 reopened by spotvt01 Configure script does not work for RHEL 4 x86_64 (2006-10-19) http://python.org/sf/1580726 opened by Chris httplib hangs reading too much data (2006-10-19) http://python.org/sf/1580738 opened by Dustin J. Mitchell Definition of a "character" is wrong (2006-10-20) http://python.org/sf/1581182 opened by Adam Olsen pickle protocol 2 failure on int subclass (2006-10-20) http://python.org/sf/1581183 opened by Anders J. 
Munch missing __enter__ + __getattr__ forwarding (2006-10-21) http://python.org/sf/1581357 opened by Hirokazu Yamamoto Text search gives bad count if called from variable trace (2006-10-20) http://python.org/sf/1581476 opened by Russell Owen test_sqlite fails on OSX G5 arch if test_ctypes is run (2006-10-21) http://python.org/sf/1581906 opened by Skip Montanaro email.header decode within word (2006-10-22) http://python.org/sf/1582282 opened by Tokio Kikuchi Python is dumping core after the test test_ctypes (2006-10-23) http://python.org/sf/1582742 opened by shashi Bulding source with VC6 fails due to missing files (2006-10-23) CLOSED http://python.org/sf/1582856 opened by Ulrich Hockenbrink class member inconsistancies (2006-10-23) CLOSED http://python.org/sf/1583060 opened by EricDaigno Different behavior when stepping through code w/ pdb (2006-10-24) http://python.org/sf/1583276 opened by John Ehresman tarfile incorrectly handles long filenames (2006-10-24) CLOSED http://python.org/sf/1583537 opened by Mike Looijmans yield+break stops tracing (2006-10-24) http://python.org/sf/1583862 opened by Lukas Lalinsky __str__ cannot be overridden on unicode-derived classes (2006-10-24) http://python.org/sf/1583863 opened by Mike K SSL "issuer" and "server" functions problems - security (2006-10-24) http://python.org/sf/1583946 opened by John Nagle remove() during iteration causes items to be skipped (2006-10-24) CLOSED http://python.org/sf/1584028 opened by Kevin Rabsatt os.tempnam fails on SUSE Linux to accept directory argument (2006-10-25) CLOSED http://python.org/sf/1584723 opened by Andreas Events in list return None not True on wait() (2006-10-26) CLOSED http://python.org/sf/1585135 opened by SpinMess Bugs Closed ___________ from_param and _as_parameter_ truncating 64-bit value (2006-10-12) http://python.org/sf/1575945 closed by theller 2.4.4c1 will not build when cross compiling (2006-10-16) http://python.org/sf/1578513 closed by loewis Error with callback function 
and as_parameter with NumPy ndp (2006-10-10) http://python.org/sf/1574584 closed by theller PyThreadState_Clear() docs incorrect (2003-04-17) http://python.org/sf/723205 deleted by theller Wrong syntax for PyDateTime_IMPORT in documentation (2006-10-18) http://python.org/sf/1579796 closed by akuchling Configure script does not work for RHEL 4 x86_64 (2006-10-19) http://python.org/sf/1580726 closed by loewis Example typo in section 4 of 'Installing Python Modules' (2006-10-12) http://python.org/sf/1576348 closed by akuchling Bulding source with VC6 fails due to missing files (2006-10-23) http://python.org/sf/1582856 closed by loewis class member inconsistancies (2006-10-23) http://python.org/sf/1583060 closed by gbrandl mac installer profile patch vs. .bash_login (2006-09-19) http://python.org/sf/1561243 closed by sf-robot idle in python 2.5c1 freezes on macos 10.3.9 (2006-08-18) http://python.org/sf/1542949 closed by sf-robot Launcher reset to factory button provides bad command-line (2006-10-03) http://python.org/sf/1570284 closed by sf-robot tarfile incorrectly handles long filenames (2006-10-24) http://python.org/sf/1583537 deleted by cdwave remove() during iteration causes items to be skipped (2006-10-24) http://python.org/sf/1584028 closed by rhettinger os.tempnam fails on SUSE Linux to accept directory argument (2006-10-25) http://python.org/sf/1584723 closed by gbrandl Events in list return None not True on wait() (2006-10-26) http://python.org/sf/1585135 closed by gbrandl New / Reopened RFE __________________ Add os.link() and os.symlink() support for Windows (2006-10-16) http://python.org/sf/1578269 opened by M.-A. Lemburg From steven.bethard at gmail.com Fri Oct 27 21:11:27 2006 From: steven.bethard at gmail.com (Steven Bethard) Date: Fri, 27 Oct 2006 13:11:27 -0600 Subject: [Python-Dev] DRAFT: python-dev summary for 2006-09-01 to 2006-09-15 Message-ID: Here's the summary for the first half of September. 
As always, comments and corrections are greatly appreciated! ============= Announcements ============= ---------------------------- QOTF: Quote of the Fortnight ---------------------------- Through a cross-posting slip-up, Jean-Paul Calderone managed to provide us with some inspiring thoughts on mailing-list archives: One could just as easily ask why no one bothers to read mailing list archives to see if their question has been answered before. No one will ever know, it is just one of the mysteries of the universe. Contributing thread: - `[Twisted-Python] Newbie question `__ ------------------------- Monthly Arlington sprints ------------------------- Jeffrey Elkner has arranged for monthly Arlington Python sprints. See the `Arlington sprint wiki`_ for more details. .. _Arlington sprint wiki: http://wiki.python.org/moin/ArlingtonSprint Contributing thread: - `Arlington sprints to occur monthly `__ ========= Summaries ========= ----------------------------------------- Signals, threads and blocking C functions ----------------------------------------- Gustavo Carneiro explained a problem that pygtk was running into. Their main loop function, ``gtk_main()``, blocks forever. If there are threads in the program, they cannot receive signals because Python catches the signal and calls ``Py_AddPendingCall()``, relying on the main thread to call ``Py_MakePendingCalls()``. Since with pygtk, the main thread is blocked calling a C function, it has no way other than polling to decide when ``Py_MakePendingCalls()`` needs to be called. Gustavo was hoping for some sort of API so that his blocking thread could get notified when ``Py_AddPendingCall()`` had been called. There was a long discussion about the feasibility of this and other solutions to his problem. One of the main problems is that almost nothing can safely be done from a signal handler context, so some people felt like having Python invoke arbitrary third-party code was a bad idea. 
Gustavo was reasonably confident that he could write to a pipe within that context, which was all he needed to do to solve his problem, but Nick Maclaren explained in detail some of the problems, e.g. writing proper synchronization primitives that are signal-handler safe. Jan Kanis suggested that threads in a pygtk program should occasionally check the signal handler flags and call PyGTK's callback to wake up the main thread. But Gustavo explained that things like the GnomeVFS library have their own thread pools and know nothing about Python so can't make such a callback. Adam Olsen suggested that Python could create a single non-blocking pipe for all signals. When a signal was handled, the signal number would be written to that pipe as a single byte. Third-party libraries, like pygtk, could poll the appropriate file descriptor, waking up and handing control back to Python when a signal was received. There were some disadvantages to this approach, e.g. if there is a large burst of signals, some of them would be lost, but folks seemed to think that these kinds of things would not cause many real-world problems. Gustavo and Adam then worked out the code in a little more detail. The `Py_signal_pipe patch`_ was posted to SourceForge. .. _Py_signal_pipe patch: http://bugs.python.org/1564547 Contributing thread: - `Signals, threads, blocking C functions `__ ------------------------ API for str.rpartition() ------------------------ Raymond Hettinger pointed out that in cases where the separator was not found, ``str.rpartition()`` was putting the remainder of the string in the wrong spot, e.g. ``str.rpartition()`` worked like::

    'axbxc'.rpartition('x') == ('axb', 'x', 'c')
    'axb'.rpartition('x') == ('a', 'x', 'b')
    'a'.rpartition('x') == ('a', '', '')  # should be ('', '', 'a')

Thus code that used ``str.rpartition()`` in a loop or recursively would likely never terminate.
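With the corrected semantics, the remainder lands in the final slot when the separator is absent, so the head shrinks to '' and a loop peeling pieces off the right terminates. A quick sketch (split_from_right is an illustrative helper, not a stdlib function):

```python
def split_from_right(s, sep):
    """Collect the pieces of s from the right using rpartition."""
    parts = []
    while s:
        head, found, tail = s.rpartition(sep)
        parts.append(tail)
        if not found:   # no separator left: tail held the remainder
            break
        s = head        # head strictly shrinks each iteration
    parts.reverse()
    return parts

print('a'.rpartition('x'))             # ('', '', 'a') -- the corrected result
print(split_from_right('axbxc', 'x'))  # ['a', 'b', 'c']
```

Under the buggy pre-fix behaviour, ``'a'.rpartition('x')`` returned ``('a', '', '')``, so a loop that assigned the head back to ``s`` would spin on ``'a'`` forever.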
Raymond checked in a fix for this, spawning an enormous discussion about how the three bits ``str.rpartition()`` returns should be named. There was widespread disagreement on which side was the "head" and which side was the "tail", and the only unambiguous one seemed to be "left, sep, right". Raymond and others were not as happy with this version because it was no longer suggestive of the use cases, but it looked like this might be the best compromise. Contributing threads: - `Problem withthe API for str.rpartition() `__ - `Fwd: Problem withthe API for str.rpartition() `__ --------------- Unicode Imports --------------- Kristján V. Jónsson submitted a `unicode import patch`_ that would allow unicode paths in sys.path and use the unicode file API on Windows. It got a definite "no" from the Python 2.5 release managers since it was already too late in the release process. Nonetheless there was a long discussion about whether or not it should be considered a bug or a feature. Martin v. Löwis explained that it was definitely a feature because it would break existing introspection tools expecting things like __file__ to be 8-bit strings (not unicode strings as they would be with the patch). .. _unicode import patch: http://bugs.python.org/1552880 Contributing thread: - `Unicode Imports `__ ------------------------- Exception and __unicode__ ------------------------- Marcin 'Qrczak' Kowalczyk reported a `TypeError from unicode()`_ when applied to an Exception class. Brett Cannon explained the source of this: BaseException defined a ``__unicode__`` descriptor which was complaining when it was handed a class, not an instance. The easiest solution seemed to be the best for Python 2.5: simply rip out the ``__unicode__`` method entirely. M.-A. Lemburg suggested that for Python 2.6 this should be fixed by introducing a tp_unicode slot. ..
_TypeError from unicode(): http://bugs.python.org/1551432 Contributing thread: - `2.5 status `__ -------------------------- Slowdown in inspect module -------------------------- Fernando Perez reported an enormous slowdown in Python 2.5's inspect module. Nick Coghlan figured out that this was a result of ``inspect.findsource()`` calling ``os.path.abspath()`` and ``os.path.normpath()`` repeatedly on the module's file name. Nick provided a `patch to speed things up`_ by caching the absolute, normalized file names. .. _patch to speed things up: http://bugs.python.org/1553314 Contributing thread: - `inspect.py very slow under 2.5 `__ -------------------------------- Cross-platform float consistency -------------------------------- Andreas Raab asked about trying to minimize some of the cross-platform differences in floating-point calculations, by using something like fdlibm_. Tim Peters pointed him to a `previous thread on this issue`_ and suggested that the best route was probably to package a Python wrapper for fdlibm_ and see how much interest there was. .. _fdlibm: http://www.netlib.org/fdlibm/ .. _previous thread on this issue: http://mail.python.org/pipermail/python-list/2005-July/290164.html Contributing thread: - `Cross-platform math functions? `__ ----------------------------------- Refcounting and errors in functions ----------------------------------- Mihai Ibanescu pointed out that the refcounting behavior of functions that can fail is generally poorly documented. Greg Ewing explained that refcounting behavior should be independent of whether the call succeeds or fails, but it was clear that this was not always the case. Mihai promised to file a low-severity bug so that this problem wouldn't be lost. Contributing thread: - `Py_BuildValue and decref `__ ------------ Python 2.3.6 ------------ Barry Warsaw offered to push out a Python 2.3.6 if folks were interested in getting some bugfixes out to the platforms which were still running Python 2.3.
After an underwhelming response, he retracted the offer. Contributing threads: - `Interest in a Python 2.3.6? `__ - `Interest in a Python 2.3.6? `__ - `Python 2.4.4 was: Interest in a Python 2.3.6? `__ ----------------------------------- Effbot Python library documentation ----------------------------------- Johann C. Rocholl asked about the status of http://effbot.org/lib/, Fredrik Lundh's alternative format and rendering for the Python library documentation. Fredrik indicated that due to the pushback from some folks on python-dev, they've been working mainly "under the radar" on this. (At least until some inconsiderate soul put them in the summary...) ;-) Contributing threads: - `That library reference, yet again `__ - `That library reference, yet again `__ ================ Deferred Threads ================ - `IronPython and AST branch `__ ================== Previous Summaries ================== - `Py2.5 issue: decimal context manager misimplemented, misdesigned, and misdocumented `__ - `Error while building 2.5rc1 pythoncore_pgo on VC8 `__ - `gcc 4.2 exposes signed integer overflows `__ - `no remaining issues blocking 2.5 release `__ - `new security doc using object-capabilities `__ =============== Skipped Threads =============== - `A test suite for unittest `__ - `Fwd: [Python-checkins] r51674 - python/trunk/Misc/Vim/vimrc `__ - `Weekly Python Patch/Bug Summary `__ - `Windows build slave down until Tuesday-ish `__ - `[Python-checkins] TRUNK IS UNFROZEN, available for 2.6 work if you are so inclined `__ - `Exception message for invalid with statement usage `__ - `buildbot breakage `__ - `Change in file() behavior in 2.5 `__ - `'with' bites Twisted `__ - `What windows tool chain do I need for python 2.5 extensions? `__ - `2.5c2 `__ - `_PyGILState_NoteThreadState should be static or not? 
`__ - `BRANCH FREEZE: release25-maint, 00:00UTC 12 September 2006 `__ - `datetime's strftime implementation: by design or bug `__ - `Subversion 1.4 `__ - `RELEASED Python 2.5 (release candidate 2) `__ - `Maybe we should have a C++ extension for testing... `__ - `.pyc file has different result for value "1.79769313486232e+308" than .py file `__ - `release is done, but release25-maint branch remains near-frozen `__ - `fun threading problem `__ - `Thank you all `__ From theller at ctypes.org Fri Oct 27 21:24:40 2006 From: theller at ctypes.org (Thomas Heller) Date: Fri, 27 Oct 2006 21:24:40 +0200 Subject: [Python-Dev] Modulefinder In-Reply-To: References: Message-ID: <45425CF8.7030606@ctypes.org> > On 10/13/06, Thomas Heller wrote: >> I have patched Lib/modulefinder.py to work with absolute and relative imports. >> It also is faster now, and has basic unittests in Lib/test/test_modulefinder.py. >> >> The work was done in a theller_modulefinder SVN branch. >> If nobody objects, I will merge this into trunk, and possibly also into release25-maint, when I have time. > Guido van Rossum schrieb: > Could you also prepare a patch for the p3yk branch? It's broken there too... > I'm currently looking into this now. IIUC, 'import foo' is an absolute import now - is this the only change to the import machinery? 
Thomas From tjreedy at udel.edu Fri Oct 27 21:45:43 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 27 Oct 2006 15:45:43 -0400 Subject: [Python-Dev] DRAFT: python-dev summary for 2006-09-01 to 2006-09-15 References: Message-ID: > Adam Olsen that Python could create a single non-blocking pipe for a /that/suggested that/ From theller at ctypes.org Fri Oct 27 21:54:53 2006 From: theller at ctypes.org (Thomas Heller) Date: Fri, 27 Oct 2006 21:54:53 +0200 Subject: [Python-Dev] Modulefinder In-Reply-To: References: Message-ID: <4542640D.9030104@ctypes.org> > On 10/13/06, Thomas Heller wrote: >> I have patched Lib/modulefinder.py to work with absolute and relative imports. >> It also is faster now, and has basic unittests in Lib/test/test_modulefinder.py. >> >> The work was done in a theller_modulefinder SVN branch. >> If nobody objects, I will merge this into trunk, and possibly also into release25-maint, when I have time. > Guido van Rossum schrieb: > Could you also prepare a patch for the p3yk branch? It's broken there too... > Patch uploaded, and assigned to you. http://www.python.org/sf/1585966 Oh, and BTW: py3k SVN doesn't compile under windows. Thomas From oliphant.travis at ieee.org Fri Oct 27 22:05:31 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Fri, 27 Oct 2006 14:05:31 -0600 Subject: [Python-Dev] PEP: Adding data-type objects to Python Message-ID: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: pep_dtypes.txt Url: http://mail.python.org/pipermail/python-dev/attachments/20061027/77410d52/attachment-0001.txt From steven.bethard at gmail.com Sat Oct 28 00:23:50 2006 From: steven.bethard at gmail.com (Steven Bethard) Date: Fri, 27 Oct 2006 16:23:50 -0600 Subject: [Python-Dev] DRAFT: python-dev summary for 2006-09-16 to 2006-09-30 Message-ID: Thanks to all of those who have already given me feedback on the last summary. Here's the next one (for the second half of September). 
I found the "OS X universal binaries" and "Finer-grained locking than the GIL" discussions particularly hard to follow, so I'd especially appreciate corrections on those. Thanks! ========= Summaries ========= --------------- Import features --------------- Fabio Zadrozny ran into the `previously reported relative import issues`_ where a ``from . import xxx`` always fails from a top-level module. This is because relative imports rely on the ``__name__`` of a module, so when it is just ``"__main__"``, they can't handle it properly. On the subject of imports, Guido said that one of the missing import features was to be able to say "*this* package lives *here*". Paul Moore whipped up a Python API to an import hook that could do this, but indicated that a full mechanism would need to pay more attention to the environment (e.g. PYTHONPATH and .pth files). There was also some discussion about trying to have a sort of per-module ``sys.path`` so that you could have multiple versions of the same module present, with different modules importing different versions. Phillip J. Eby suggested that this was probably not a very common need, and that implementing it would be quite difficult with things like C extensions only being able to be loaded once. In general, people seemed interested in a pure-Python implementation of the import mechanism so that they could play with some of these approaches. It looked like Brett Cannon would probably be working on that. .. _previously reported relative import issues: http://www.python.org/dev/summary/2006-06-16_2006-06-30/#relative-imports-and-pep-338-executing-modules-as-scripts Contributing thread: - `New relative import issue `__ ---------------------------- Python library documentation ---------------------------- A less-trolly-than-usual post from Xah Lee started a discussion about the Python documentation. 
Greg Ewing and others suggested following the documentation style of the Inside Macintosh series: first an "About this module" narrative explaining the concepts and how they fit together, followed by the extensive API reference. Most people agreed that simply extracting the documentation from the docstrings was a bad idea -- it lacks the high-level overview and gives equal importance to all functions, regardless of their use. Contributing thread: - `Python Doc problems `__ ----------------------- OS X universal binaries ----------------------- Jack Howarth asked about creating universal binaries for OS X that would support 32-bit or 64-bit on both PPC and x86. Ronald Oussoren pointed out that the 32-bit part of this was already supported, but indicated that adding 64-bit support simultaneously might be more difficult. Ronald seemed to think that modifications to pyconfig.h.in might solve the problem, though he was worried that this might cause distutils to detect some architecture features incorrectly. Contributing thread: - `python, lipo and the future? `__ ---------------------------------- Finer-grained locking than the GIL ---------------------------------- Martin Devera was looking into replacing the global interpreter lock (GIL) with finer-grained locking, tuned to minimize locking by assuming that most objects were used only by a single thread. For objects that were shared across multiple threads, this approach would allow non-blocking reads, but require all threads to "come home" before modifications could be made. Phillip J. Eby pointed out that most object accesses in Python are actually modifications too, due to reference counting, so it looked like Martin's proposal wouldn't work well with the current refcounting implementation of Python. After Martin v. Löwis found a bug in the locking algorithm, Martin Devera decided to take his idea back to the drawing board. Contributing thread: - `deja-vu ..
python locking `__ --------------------------- OS X and ssize_t formatting --------------------------- The buildbots spotted an OS X error in the itertools module. After Jack Diederich fixed a bug where ``size_t`` had been used instead of ``ssize_t``, Neal Norwitz noticed some problems with ``%zd`` on OS X. Despite documentation to the contrary in both the man page and the C99 Standard, using that specifier on OS X treats a negative number as an unsigned number. Ronald Oussoren and others reported the bug to Apple. Contributing thread: - `test_itertools fails for trunk on x86 OS X machine `__ ------------------- itertools.flatten() ------------------- Michael Foord asked about including a flatten function that would take a sequence with sub-sequences nested to an arbitrary depth and create a simple non-nested sequence from that. People were strongly opposed to adding this as a builtin, but even as an itertools function, there was disagreement. How should strings, dicts and other arbitrary iterables be flattened? Since there wasn't one clear answer, it looked like the proposal didn't have much of a chance. Contributing thread: - `Suggestion for a new built-in - flatten `__ ------------------------------- Class definition syntax changes ------------------------------- Fabio Zadrozny noted that in Python 2.5, classes can now be declared as:: class C(): ... Some folks wanted the result to be a new-style class, but the presence or absence of ``()`` was deemed too subtle of a cue to make the new-style/old-style distinction. For the Python 2.X series, explicit subclassing of ``object`` will still be necessary. Contributing thread: - `Grammar change in classdef `__ ---------------------- Python 2.5 and GCC 4.2 ---------------------- Armin Rigo found some more signed integer overflows when using GCC 4.2 like the ones `reported earlier`_. 
Because Python 2.5 final was scheduled to be released in 24 hours, and it looked like there wouldn't be too many people affected by these problems, they were deferred until 2.5.1. For the moment at least, the README indicates that GCC 4.1 and 4.2 shouldn't be used to compile Python. .. _reported earlier: http://www.python.org/dev/summary/2006-08-16_2006-08-31/#gcc-4-2-and-integer-overflows Contributing threads: - `Before 2.5 - More signed integer overflows `__ - `GCC 4.x incompatibility `__ ---------------------------------- Discard method for dicts and lists ---------------------------------- Gustavo Niemeyer and Greg Ewing suggested adding ``dict.discard()`` and ``list.discard()`` along the lines of ``set.discard()``. Fred L. Drake, Jr. explained that ``dict.discard(foo)`` is currently supported with ``dict.pop(foo, None)``. There was more debate about the ``list`` version, but most people seemed to think that wrapping ``list.remove()`` with the appropriate if-statement or try-except was fine. Contributing threads: - `dict.discard `__ - `list.discard? (Re: dict.discard) `__ -------------------- weakref enhancements -------------------- tomer filiba offered some additions to the weakref module, weakattr_ and weakmethod_. Raymond Hettinger questioned how frequently these would be useful in the real world, but both tomer and Alex Martelli assured him that they had real-world use-cases for these. However, there didn't generally seem to be enough support for them to include them in the standard library. .. _weakattr: http://sebulba.wikispaces.com/recipe+weakattr ..
_weakmethod: http://sebulba.wikispaces.com/recipe+weakmethod Contributing thread: - `weakref enhancements `__ ------------------------ AST structure guarantees ------------------------ Anthony Baxter asked that the AST structure get the same guarantees as the byte-code format, that is, that it would change as little as possible so that people who wanted to hack it wouldn't have to change their code for each release. Pretty much everyone agreed that this was a good idea. In a related thread, Sanghyeon Seo asked if the AST structure should become part of the Python specification so that other implementations like IronPython_ would use it as well. While most people felt like it would be good if the various specifications had similar AST representations, it seemed like mandating it as part of the implementation would lock things down too much. .. _IronPython: http://www.codeplex.com/Wiki/View.aspx?ProjectName=IronPython Contributing threads: - `IronPython and AST branch `__ - `IronPython and AST branch `__ - `AST structure and maintenance branches `__ ----------------------------- PEP 302: phase 2 import hooks ----------------------------- For his dissertation work, Brett Cannon needed to implement phase 2 of the `PEP 302`_ import hooks. He asked for feedback on whether it would be easier to do this within the current C code, or whether it would be better to rewrite the import mechanisms in Python first. Phillip J. Eby gave some advice on how to restructure things, and suggested that the C code was somewhat delicate and having a Python implementation around would be a Good Thing. Armin Rigo strongly recommended rewriting things in Python. .. _PEP 302: http://www.python.org/dev/peps/pep-0302/ Contributing thread: - `difficulty of implementing phase 2 of PEP 302 in Python source `__ ---------------------------------------------------- Testsuite fails on Windows if a space is in the path ---------------------------------------------------- Martin v. 
Löwis was trying to fix some bugs where spaces in Windows paths caused some of the testsuite to fail. For example, test_popen was getting an error because ``os.popen`` invoked:: cmd.exe /c "c:\Program Files\python25\python.exe" -c "import sys;print sys.version" which failed complaining that c:\Program is not a valid executable. Jean-Paul Calderone and Tim Peters explained that the ``cmd.exe`` part is necessary to force proper cmd.exe-style argument parsing and to allow environment variable substitution. After scrutinizing the MS quoting rules, it seemed like fixing this for Python 2.5 was too likely to introduce incompatibilities, so it was postponed to 2.6. Contributing thread: - `Testsuite fails on Windows if a space is in the path `__ ----------------------------------------- PEP 353: Backwards-compatibility #defines ----------------------------------------- David Abrahams suggested a modification to the suggested backwards-compatibility #define incantation of `PEP 353`_ so that the PY_SSIZE_T_MAX and PY_SSIZE_T_MIN would only ever get defined once. There was some discussion about whether or not this was absolutely necessary, but everyone agreed that the change was probably sensible regardless. .. _PEP 353: http://www.python.org/dev/peps/pep-0353/ Contributing thread: - `Pep 353: Py_ssize_t advice `__ ----------------- Bare-bones Python ----------------- Milan Krcmar asked about what he could drop from Python to make it small enough to fit on a platform with only 2 MiB of flash ROM and 16 MiB of RAM. Giovanni Bajo suggested dropping the CJK codecs (which account for about 800K), though he also noted that after that there weren't any really low-hanging fruit. Martin v. Löwis suggested that he might also get a gain out of dropping support for dynamic loading of extension modules, and linking all necessary modules statically. Gustavo Niemeyer pointed him to `Python for S60`_ and `Python for Maemo`_ which had to undergo similar stripping down. ..
_Python for S60: http://opensource.nokia.com/projects/pythonfors60/ .. _Python for Maemo: http://pymaemo.sf.net Contributing thread: - `Minipython `__ ================ Deferred Threads ================ - `Removing __del__ `__ - `Caching float(0.0) `__ - `PEP 355 status `__ - `PEP 351 - do while `__ ================== Previous Summaries ================== - `Signals, threads, blocking C functions `__ =============== Skipped Threads =============== - `Thank you all `__ - `BRANCH FREEZE/IMMINENT RELEASE: Python 2.5 (final). 2006-09-19, 00:00UTC `__ - `RELEASED Python 2.5 (FINAL) `__ - `release25-maint branch - please keep frozen for a day or two more. `__ - `Download URL typo `__ - `Exceptions and slicing `__ - `Weekly Python Patch/Bug Summary `__ - `release25-maint is UNFROZEN `__ - `Small Py3k task: fix modulefinder.py `__ - `win32 - results from Lib/test - 2.5 release-maint `__ - `Weekly Python Patch/Bug Summary ** REVISED ** `__ - `[Python-checkins] release25-maint is UNFROZEN `__ - `Python network Programmign `__ - `Relative import bug? `__ - `GCC patch for catching errors in PyArg_ParseTuple `__ - `Typo.pl scan of Python 2.5 source code `__ - `Maybe we should have a C++ extension for testing... `__ - `Python 2.5 bug? Changes in behavior of traceback module `__ - `Need help with C - problem in sqlite3 module `__ - `PyErr_CheckSignals error return value `__ - `python-dev summary for 2006-08-01 to 2006-08-15 `__ - `2.4.4c1 October 11, 2.4.4 final October 18 `__ - `[SECUNIA] "buffer overrun in repr() for unicode strings" Potential Vulnerability (fwd) `__ - `List of candidate 2.4.4 bugs? `__ - `openssl - was: 2.4.4c1 October 11, 2.4.4 final October 18 `__ - `Collecting 2.4.4 fixes `__ - `os.unlink() closes file? 
`__ - `Tix not included in 2.5 for Windows `__ - `Possible semantic changes for PEP 352 in 2.6 `__ From martin at v.loewis.de Sat Oct 28 01:44:21 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 28 Oct 2006 01:44:21 +0200 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: Message-ID: <454299D5.4090804@v.loewis.de> Travis E. Oliphant schrieb: > The datatype is an object that specifies how a certain block of > memory should be interpreted as a basic data-type. > > >>> datatype(float) > datatype('float64') I can't speak on the specific merits of this proposal, or whether this kind of functionality is desirable. However, I'm -1 on the addition of a builtin for this functionality (the PEP doesn't actually say that there is another builtin, but the examples suggest so). Instead, putting it into the sys, array, struct, or ctypes modules might be more appropriate, as might be the introduction of another module. Regards, Martin From anthony at interlink.com.au Sat Oct 28 01:48:44 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Sat, 28 Oct 2006 09:48:44 +1000 Subject: [Python-Dev] [Python-checkins] r52482 - in python/branches/release25-maint: Lib/urllib.py Lib/urllib2.py Misc/NEWS In-Reply-To: <20061027171334.28AD01E4003@bag.python.org> References: <20061027171334.28AD01E4003@bag.python.org> Message-ID: <200610280948.45016.anthony@interlink.com.au> On Saturday 28 October 2006 03:13, andrew.kuchling wrote: > 2.4 backport candidate, probably. FWIW, I'm not planning on doing any more "collect all the bugfixes" releases of 2.4. It's now in the same category as 2.3 - that is, only really serious bugs (in particular, security related bugs) will get a new release, and then only with the serious bugfixes applied. One active maintenance branch is quite enough to deal with, IMHO. -- Anthony Baxter It's never too late to have a happy childhood. 
From martin at v.loewis.de Sat Oct 28 01:50:03 2006 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Sat, 28 Oct 2006 01:50:03 +0200 Subject: [Python-Dev] DRAFT: python-dev summary for 2006-09-16 to 2006-09-30 In-Reply-To: References: Message-ID: <45429B2B.6070404@v.loewis.de> Steven Bethard schrieb: > Jack Howarth asked about creating universal binaries for OS X that > would support 32-bit or 64-bit on both PPC and x86. Ronald Oussoren > pointed out that the 32-bit part of this was already supported, but > indicated that adding 64-bit support simultaneously might be more > difficult. Ronald seemed to think that modifications to pyconfig.h.in > might solve the problem, though he was worried that this might cause > distutils to detect some architecture features incorrectly. Ronald can surely speak for himself, but I think the problem is slightly different. There were different strategies discussed for changing pyconfig.h (with an include, or with #ifdefs), and in all cases, distutils would fail to detect the architecture properly. That's not really a problem of pyconfig.h, but of the method that distutils uses to detect bitsizes - which inherently cannot work for universal binaries (i.e. you should look at the running interpreter, not at pyconfig.h). Regards, Martin From greg.ewing at canterbury.ac.nz Sat Oct 28 02:23:26 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 28 Oct 2006 13:23:26 +1300 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: Message-ID: <4542A2FE.9070409@canterbury.ac.nz> Travis E. Oliphant wrote: > PEP: > Title: Adding data-type objects to the standard library Not sure about having 3 different ways to specify the structure -- it smacks of Too Many Ways To Do It to me. Also, what if I want to refer to fields by name but don't want to have to work out all the offsets (which is tedious, error-prone and hostile to modification)?
-- Greg From typo_pl at hotmail.com Sat Oct 28 05:35:48 2006 From: typo_pl at hotmail.com (Johnny Lee) Date: Sat, 28 Oct 2006 03:35:48 +0000 Subject: [Python-Dev] Typo.pl scan of Python 2.5 source code Message-ID: I grabbed the latest Python2.5 code via subversion and ran my typo script on it. Weeding out the obvious false positives and Neal's comments leaves about 129 typos. See http://www.geocities.com/typopl/typoscan.htm Should I enter the typos as bugs in the Python bug db? J > Date: Fri, 22 Sep 2006 21:51:38 -0700> From: nnorwitz at gmail.com> To: typo_pl at hotmail.com> Subject: Re: [Python-Dev] Typo.pl scan of Python 2.5 source code> CC: python-dev at python.org> > On 9/22/06, Johnny Lee wrote:> >> > Hello,> > My name is Johnny Lee. I have developed a *ahem* perl script which scans> > C/C++ source files for typos.> > Hi Johnny.> > Thanks for running your script, even if it is written in Perl and ran> on Windows. :-)> > > The Python 2.5 typos can be classified into 7 types.> >> > 2) realloc overwrite src if NULL, i.e. p = realloc(p, new_size);> > If realloc() fails, it will return NULL. If you assign the return value to> > the same variable you passed into realloc,> > then you've overwritten the variable and possibly leaked the memory that the> > variable pointed to.> > A bunch of these warnings were accurate and a bunch were not. There> were 2 reasons for the false positives. 1) The pointer was aliased,> thus not lost, 2) On failure, we exited (Parser/*.c)> > > 4) if ((X!=0) || (X!=1))> > These 2 cases occurred in binascii. I have no idea if the warning is> wright or the code is.> > > 6) XX;;> > Just being anal here. Two semicolons in a row. Second one is extraneous.> > I already checked in a fix for these on HEAD. Hard for even me to> screw up those fixes. 
:-)> > > 7) extraneous test for non-NULL ptr> > Several memory calls that free memory accept NULL ptrs.> > So testing for NULL before calling them is redundant and wastes code space.> > Now some codepaths may be time-critical, but probably not all, and smaller> > code usually helps.> > I ignored these as I'm not certain all the platforms we run on accept> free(NULL).> > Below is my categorization of the warnings except #7. Hopefully> someone will fix all the real problems in the first batch.> > Thanks again!> > n> --> > # Problems> Objects\fileobject.c (338): realloc overwrite src if NULL; 17:> file->f_setbuf=(char*)PyMem_Realloc(file->f_setbuf,bufsize)> Objects\fileobject.c (342): using PyMem_Realloc result w/no check> 30: setvbuf(file->f_fp, file->f_setbuf, type, bufsize);> [file->f_setbuf]> Objects\listobject.c (2619): using PyMem_MALLOC result w/no check> 30: garbage[i] = selfitems[cur]; [garbage]> Parser\myreadline.c (144): realloc overwrite src if NULL; 17:> p=(char*)PyMem_REALLOC(p,n+incr)> Modules\_csv.c (564): realloc overwrite src if NULL; 17:> self->field=PyMem_Realloc(self->field,self->field_size)> Modules\_localemodule.c (366): realloc overwrite src if NULL; 17:> buf=PyMem_Realloc(buf,n2)> Modules\_randommodule.c (290): realloc overwrite src if NULL; 17:> key=(unsigned#long*)PyMem_Realloc(key,bigger*sizeof(*key))> Modules\arraymodule.c (1675): realloc overwrite src if NULL; 17:> self->ob_item=(char*)PyMem_REALLOC(self->ob_item,itemsize*self->ob_size)> Modules\cPickle.c (536): realloc overwrite src if NULL; 17:> self->buf=(char*)realloc(self->buf,n)> Modules\cPickle.c (592): realloc overwrite src if NULL; 17:> self->buf=(char*)realloc(self->buf,bigger)> Modules\cPickle.c (4369): realloc overwrite src if NULL; 17:> self->marks=(int*)realloc(self->marks,s*sizeof(int))> Modules\cStringIO.c (344): realloc overwrite src if NULL; 17:> self->buf=(char*)realloc(self->buf,self->buf_size)> Modules\cStringIO.c (380): realloc overwrite src if NULL; 17:> 
oself->buf=(char*)realloc(oself->buf,oself->buf_size)> Modules\_ctypes\_ctypes.c (2209): using PyMem_Malloc result w/no> check 30: memset(obj->b_ptr, 0, dict->size); [obj->b_ptr]> Modules\_ctypes\callproc.c (1472): using PyMem_Malloc result w/no> check 30: strcpy(conversion_mode_encoding, coding);> [conversion_mode_encoding]> Modules\_ctypes\callproc.c (1478): using PyMem_Malloc result w/no> check 30: strcpy(conversion_mode_errors, mode);> [conversion_mode_errors]> Modules\_ctypes\stgdict.c (362): using PyMem_Malloc result w/no> check 30: memset(stgdict->ffi_type_pointer.elements, 0,> [stgdict->ffi_type_pointer.elements]> Modules\_ctypes\stgdict.c (376): using PyMem_Malloc result w/no> check 30: memset(stgdict->ffi_type_pointer.elements, 0,> [stgdict->ffi_type_pointer.elements]> > # No idea if the code or tool is right.> Modules\binascii.c (1161)> Modules\binascii.c (1231)> > # Platform specific files. I didn't review and won't fix without testing.> Python\thread_lwp.h (107): using malloc result w/no check 30:> lock->lock_locked = 0; [lock]> Python\thread_os2.h (141): using malloc result w/no check 30:> (long)sem)); [sem]> Python\thread_os2.h (155): using malloc result w/no check 30:> lock->is_set = 0; [lock]> Python\thread_pth.h (133): using malloc result w/no check 30:> memset((void *)lock, '\0', sizeof(pth_lock)); [lock]> Python\thread_solaris.h (48): using malloc result w/no check 30:> funcarg->func = func; [funcarg]> Python\thread_solaris.h (133): using malloc result w/no check 30:> if(mutex_init(lock,USYNC_THREAD,0)) [lock]> > # Who cares about these modules.> Modules\almodule.c:182> Modules\svmodule.c:547> > # Not a problem.> Parser\firstsets.c (76)> Parser\grammar.c (40)> Parser\grammar.c (59)> Parser\grammar.c (83)> Parser\grammar.c (102)> Parser\node.c (95)> Parser\pgen.c (52)> Parser\pgen.c (69)> Parser\pgen.c (126)> Parser\pgen.c (438)> Parser\pgen.c (462)> Parser\tokenizer.c (797)> Parser\tokenizer.c (869)> Modules\_bsddb.c (2633)> Modules\_csv.c 
(1069)> Modules\arraymodule.c (1871)> Modules\gcmodule.c (1363)> Modules\zlib\trees.c (375) -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061028/46c37e2d/attachment.htm From ncoghlan at gmail.com Sat Oct 28 06:31:29 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 28 Oct 2006 14:31:29 +1000 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <4542A2FE.9070409@canterbury.ac.nz> References: <4542A2FE.9070409@canterbury.ac.nz> Message-ID: <4542DD21.6000100@gmail.com> Greg Ewing wrote: > Travis E. Oliphant wrote: >> PEP: >> Title: Adding data-type objects to the standard library I've used 'datatype' below for consistency, but can we please call them something other than data types? Data layouts? Data formats? Binary layouts? Binary formats? 'type' is already a meaningful term in Python, and having to check whether 'data type' meant a type definition or a data format definition could get annoying. > Not sure about having 3 different ways to specify > the structure -- it smacks of Too Many Ways To Do > It to me. There are actually 5 ways, but the different mechanisms all have different use cases (and I'm going to suggest getting rid of the dictionary form). Type-object: Simple conversion of the builtin types (would be good for instances to be able to hook this as with other type conversion functions). 2-tuple: Makes it easy to specify a contiguous C-style array of a given data type. However, rather than doing type-based dispatch here, I would prefer to see this version handled via an optional 'shape' argument, so that all sequences can be handled consistently (more on that below).
>>> datatype(int, 5) # short for datatype([(int, 5)]) datatype('int32', (5,)) # describes a 5*4=20-byte block of memory laid out as # a[0], a[1], a[2], a[3], a[4] String-object: The basic formatting definition (I'd be interested in the differences between this definition scheme and the struct definition scheme - one definite goal for an implementation would be an update to the struct module to accept datatype objects, or at least a conversion mechanism for creating a struct layout description from a datatype definition) List object: As for string object, but permits naming of each of the fields. I don't like treating tuples differently from lists, so I'd prefer for this handling to be applied to all iterables that don't meet one of the other special cases (direct conversion, string, dictionary). I'd also prefer the meta-information to come *after* the name, and for the name to be completely optional (leaving the corresponding field unnamed). So the possible sequence entries would be: datatype (name, datatype) (name, datatype, shape) where name must be a string or 2-tuple, datatype must be acceptable as a constructor argument, and the shape must be an integer or tuple. For example: datatype([(('coords', [1,2]), 'f4'), ('address', 'S30'), ]) datatype([('simple', 'i4'), ('nested', [('name', 'S30'), ('addr', 'S45'), ('amount', 'i4') ] ), ]) >>> datatype(['V8', ('var2', 'i1'), 'V3', ('var3', 'f8')]) datatype([('', '|V8'), ('var2', '|i1'), ('', '|V3'), ('var3', '<f8')]) > Also, what if I want to refer to fields by name > but don't want to have to work out all the offsets > (which is tedious, error-prone and hostile to > modification)? Use the list definition form. In the current PEP, you would need to define names for all of the uninteresting fields. With the changes I've suggested above, you wouldn't even have to name the fields you don't care about - just describe them. Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From ncoghlan at gmail.com Sat Oct 28 06:37:26 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 28 Oct 2006 14:37:26 +1000 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <4542DD21.6000100@gmail.com> References: <4542A2FE.9070409@canterbury.ac.nz> <4542DD21.6000100@gmail.com> Message-ID: <4542DE86.6000706@gmail.com> Nick Coghlan wrote: > There are actually 5 ways, but the different mechanisms all have different use > case (and I'm going to suggest getting rid of the dictionary form). D'oh, I thought I deleted that parenthetical comment... obviously, I changed my mind on this point :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From ncoghlan at gmail.com Sat Oct 28 06:40:17 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 28 Oct 2006 14:40:17 +1000 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <454299D5.4090804@v.loewis.de> References: <454299D5.4090804@v.loewis.de> Message-ID: <4542DF31.5090408@gmail.com> Martin v. Löwis wrote: > Travis E. Oliphant schrieb: >> The datatype is an object that specifies how a certain block of >> memory should be interpreted as a basic data-type. >> >> >>> datatype(float) >> datatype('float64') > > I can't speak on the specific merits of this proposal, or whether this > kind of functionality is desirable. However, I'm -1 on the addition of > a builtin for this functionality (the PEP doesn't actually say that > there is another builtin, but the examples suggest so). Instead, putting > it into the sys, array, struct, or ctypes modules might be more > appropriate, as might be the introduction of another module.
I'd say the answer to where we put it will be dependent on what happens to the idea of adding a NumArray style fixed dimension array type to the standard library. If that gets exposed through the array module as array.dimarray, then it would make sense to expose the associated data layout descriptors as array.datatype. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From oliphant.travis at ieee.org Sat Oct 28 08:45:55 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Sat, 28 Oct 2006 00:45:55 -0600 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <454299D5.4090804@v.loewis.de> References: <454299D5.4090804@v.loewis.de> Message-ID: Martin v. L?wis wrote: > Travis E. Oliphant schrieb: >> The datatype is an object that specifies how a certain block of >> memory should be interpreted as a basic data-type. >> >> >>> datatype(float) >> datatype('float64') > > I can't speak on the specific merits of this proposal, or whether this > kind of functionality is desirable. However, I'm -1 on the addition of > a builtin for this functionality (the PEP doesn't actually say that > there is another builtin, but the examples suggest so). I was intentionally vague. I don't see a need for it to be a built-in, but didn't know where exactly to "put it," I should have made it a question for discussion. -Travis From oliphant.travis at ieee.org Sat Oct 28 08:49:41 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Sat, 28 Oct 2006 00:49:41 -0600 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <4542A2FE.9070409@canterbury.ac.nz> References: <4542A2FE.9070409@canterbury.ac.nz> Message-ID: Greg Ewing wrote: > Travis E. 
Oliphant wrote: >> PEP: >> Title: Adding data-type objects to the standard library > > Not sure about having 3 different ways to specify > the structure -- it smacks of Too Many Ways To Do > It to me. You might be right, but they all have use-cases. I've actually removed most of the multiple ways that NumPy allows for creating data-types. > > Also, what if I want to refer to fields by name > but don't want to have to work out all the offsets I don't know what you mean. You just use the list-style to define a data-format with fields. The offsets are worked out for you. The only use for offsets was the dictionary form. The dictionary form stems from a desire to use the fields dictionary of a data-type as a data-type specification (which it is essentially is). -Travis From ronaldoussoren at mac.com Sat Oct 28 09:50:42 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Sat, 28 Oct 2006 09:50:42 +0200 Subject: [Python-Dev] DRAFT: python-dev summary for 2006-09-16 to 2006-09-30 In-Reply-To: <45429B2B.6070404@v.loewis.de> References: <45429B2B.6070404@v.loewis.de> Message-ID: <9C891890-24DA-43B0-AB0E-A04D3063306E@mac.com> On Oct 28, 2006, at 1:50 AM, Martin v. L?wis wrote: > Steven Bethard schrieb: >> Jack Howarth asked about creating universal binaries for OS X that >> would support 32-bit or 64-bit on both PPC and x86. Ronald Oussoren >> pointed out that the 32-bit part of this was already supported, but >> indicated that adding 64-bit support simultaneously might be more >> difficult. Ronald seemed to think that modifications to pyconfig.h.in >> might solve the problem, though he was worried that this might cause >> distutils to detect some architecture features incorrectly. > > Ronald can surely speak for himself, but I think the problem is > slightly > different. There were different strategies discussed for changing > pyconfig.h (with an include, or with #ifdefs), and in all cases, > distutils would fail to detect the architecture properly. 
> That's not really a problem of pyconfig.h, but of the way that distutils uses > to detect bitsizes - which inherently cannot work for universal > binaries (i.e. you should look at the running interpreter, not > at pyconfig.h). That depends on what you want to do. If you want to use the information about byteorder and bitsizes to drive the build of an extension you're better off using pyconfig.h instead of using the values of the currently running interpreter. If you want to use the information to generate raw data files in the platform byteorder and bitsizes you're better off using the struct module, so there's really no good reason to look at WORDS_BIGENDIAN and the various SIZEOF_ macros through distutils. An example of this was the build of expat: before I merged the universal binary patches setup.py looked at sys.byteorder and then added a define to the build flags for expat. With the universal patches I changed this to an include-file that looks at the value in pyconfig.h and sets the define that expat expects. This is needed because with universal binaries the byteorder and bitsizes are no longer configure-time constants but are compile-time constants. Note that adding support for universal builds for 32-bit architectures was relatively easy because only one variable in pyconfig.h needed to be patched and GCC has explicit support for getting the required information. The patch for 32-bit/64-bit builds will probably require sniffing the current architecture (e.g. "#ifdef __i386__") and setting values that way. The cleanest way to do that is the introduction of an additional include file. It also requires changes to setup.py because all mac-specific modules won't build in 64-bit code in released versions of the OS (because OSX only has a 64-bit unix-layer in 10.4, 10.5 will be 64-bit throughout). Ronald -------------- next part -------------- A non-text attachment was scrubbed...
Name: smime.p7s Type: application/pkcs7-signature Size: 3562 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061028/96560ce2/attachment.bin From mal at egenix.com Sat Oct 28 10:40:01 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Sat, 28 Oct 2006 10:40:01 +0200 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: Message-ID: <45431761.1020401@egenix.com> Travis E. Oliphant wrote: > > > ------------------------------------------------------------------------ > > PEP: > Title: Adding data-type objects to the standard library > Attributes > > kind -- returns the basic "kind" of the data-type. The basic kinds > are: > 't' - bit, > 'b' - bool, > 'i' - signed integer, > 'u' - unsigned integer, > 'f' - floating point, > 'c' - complex floating point, > 'S' - string (fixed-length sequence of char), > 'U' - fixed length sequence of UCS4, Shouldn't this read "fixed length sequence of Unicode" ?! The underlying code unit format (UCS2 and UCS4) depends on the Python version. > 'O' - pointer to PyObject, > 'V' - Void (anything else). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 28 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From oliphant.travis at ieee.org Sat Oct 28 11:32:09 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Sat, 28 Oct 2006 03:32:09 -0600 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <45431761.1020401@egenix.com> References: <45431761.1020401@egenix.com> Message-ID: M.-A. Lemburg wrote: > Travis E. 
Oliphant wrote: >> >> ------------------------------------------------------------------------ >> >> PEP: >> Title: Adding data-type objects to the standard library >> Attributes >> >> kind -- returns the basic "kind" of the data-type. The basic kinds >> are: >> 't' - bit, >> 'b' - bool, >> 'i' - signed integer, >> 'u' - unsigned integer, >> 'f' - floating point, >> 'c' - complex floating point, >> 'S' - string (fixed-length sequence of char), >> 'U' - fixed length sequence of UCS4, > > Shouldn't this read "fixed length sequence of Unicode" ?! > The underlying code unit format (UCS2 and UCS4) depends on the > Python version. Well, in NumPy 'U' always means UCS4. So, I just copied that over. See my questions at the bottom which talk about how to handle this. A data-format does not necessarily have to correspond to something Python represents with an Object. -Travis From g.brandl at gmx.net Sat Oct 28 15:39:09 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 28 Oct 2006 15:39:09 +0200 Subject: [Python-Dev] build bots, log output Message-ID: Hi, I wonder if it's possible that the build bot notification mails that go to python-checkins include the last 10-15 lines from the log. This would make it much easier to decide whether a buildbot failure is an old, esoteric one (e.g. test_wait4 sem_post: Success make: *** [buildbottest] Killed on the hppa one) or a new one, really caused by one's checkin. The alternative would be to fix the tests/buildbots not to have these esoteric failures anymore Georg From arigo at tunes.org Sat Oct 28 15:54:15 2006 From: arigo at tunes.org (Armin Rigo) Date: Sat, 28 Oct 2006 15:54:15 +0200 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: Message-ID: <20061028135415.GA13049@code0.codespeak.net> Hi Travis, On Fri, Oct 27, 2006 at 02:05:31PM -0600, Travis E. 
Oliphant wrote: > This PEP proposes adapting the data-type objects from NumPy for > inclusion in standard Python, to provide a consistent and standard > way to discuss the format of binary data. How does this compare with ctypes? Do we really need yet another, incompatible way to describe C-like data structures in the standard library? A bientot, Armin From mal at egenix.com Sat Oct 28 20:10:31 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Sat, 28 Oct 2006 20:10:31 +0200 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <45431761.1020401@egenix.com> Message-ID: <45439D17.5010306@egenix.com> Travis E. Oliphant wrote: > M.-A. Lemburg wrote: >> Travis E. Oliphant wrote: >>> ------------------------------------------------------------------------ >>> >>> PEP: >>> Title: Adding data-type objects to the standard library >>> Attributes >>> >>> kind -- returns the basic "kind" of the data-type. The basic kinds >>> are: >>> 't' - bit, >>> 'b' - bool, >>> 'i' - signed integer, >>> 'u' - unsigned integer, >>> 'f' - floating point, >>> 'c' - complex floating point, >>> 'S' - string (fixed-length sequence of char), >>> 'U' - fixed length sequence of UCS4, >> Shouldn't this read "fixed length sequence of Unicode" ?! >> The underlying code unit format (UCS2 and UCS4) depends on the >> Python version. > > Well, in NumPy 'U' always means UCS4. So, I just copied that over. See > my questions at the bottom which talk about how to handle this. A > data-format does not necessarily have to correspond to something Python > represents with an Object. Ok, but why are you being specific about UCS4 (which is an internal storage format), while you are not specific about e.g. the internal bit size of the integers (which could be 32 or 64 bit) ? -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 28 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... 
http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From jcarlson at uci.edu Sat Oct 28 20:42:36 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Sat, 28 Oct 2006 11:42:36 -0700 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <45439D17.5010306@egenix.com> References: <45439D17.5010306@egenix.com> Message-ID: <20061028113844.0B08.JCARLSON@uci.edu> "M.-A. Lemburg" wrote: > > Travis E. Oliphant wrote: > > M.-A. Lemburg wrote: > >> Travis E. Oliphant wrote: > >>> ------------------------------------------------------------------------ > >>> > >>> PEP: > >>> Title: Adding data-type objects to the standard library > >>> Attributes > >>> > >>> kind -- returns the basic "kind" of the data-type. The basic kinds > >>> are: > >>> 't' - bit, > >>> 'b' - bool, > >>> 'i' - signed integer, > >>> 'u' - unsigned integer, > >>> 'f' - floating point, > >>> 'c' - complex floating point, > >>> 'S' - string (fixed-length sequence of char), > >>> 'U' - fixed length sequence of UCS4, > >> Shouldn't this read "fixed length sequence of Unicode" ?! > >> The underlying code unit format (UCS2 and UCS4) depends on the > >> Python version. > > > > Well, in NumPy 'U' always means UCS4. So, I just copied that over. See > > my questions at the bottom which talk about how to handle this. A > > data-format does not necessarily have to correspond to something Python > > represents with an Object. > > Ok, but why are you being specific about UCS4 (which is an internal > storage format), while you are not specific about e.g. the > internal bit size of the integers (which could be 32 or 64 bit) ? I think that even on 64 bit platforms, using 'int' or 'long' generally means 32 bit. In order to get 64 bit ints, one needs to use 'long long'. 
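[Editorial note: the claim about 'int' and 'long' sizes is easy to check from Python itself, since the struct module's native mode mirrors the platform C compiler; Fredrik's LP64 correction below applies here. A small sketch, with results that vary by platform:]

```python
import struct

# Native (default '@') sizes come from the platform C compiler, so
# they reflect the data model in use: ILP32, LP64 Unix, or LLP64 Windows.
sizes = {
    'int': struct.calcsize('i'),        # 4 on all common platforms
    'long': struct.calcsize('l'),       # 8 on LP64 Unix, 4 on ILP32 and LLP64
    'long long': struct.calcsize('q'),  # 8 in practice everywhere
    'pointer': struct.calcsize('P'),    # 8 on any 64-bit platform
}
print(sizes)
```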
Sharing some of the codes with the struct module, though arbitrary, doesn't seem like a bad idea to me. Of course offering specifically 32 and 64 bit ints would make sense to me. - Josiah From fredrik at pythonware.com Sat Oct 28 20:42:49 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat, 28 Oct 2006 20:42:49 +0200 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <20061028113844.0B08.JCARLSON@uci.edu> References: <45439D17.5010306@egenix.com> <20061028113844.0B08.JCARLSON@uci.edu> Message-ID: Josiah Carlson wrote: > I think that even on 64 bit platforms, using 'int' or 'long' generally > means 32 bit. In order to get 64 bit ints, one needs to use 'long long'. real 64-bit platforms use the LP64 standard, where long and pointers are both 64 bits: http://www.unix.org/version2/whatsnew/lp64_wp.html From oliphant.travis at ieee.org Sat Oct 28 21:10:49 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Sat, 28 Oct 2006 13:10:49 -0600 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <45439D17.5010306@egenix.com> References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com> Message-ID: M.-A. Lemburg wrote: > Travis E. Oliphant wrote: >> M.-A. Lemburg wrote: >>> Travis E. Oliphant wrote: >>>> ------------------------------------------------------------------------ >>>> >>>> PEP: >>>> Title: Adding data-type objects to the standard library >>>> Attributes >>>> >>>> kind -- returns the basic "kind" of the data-type. The basic kinds >>>> are: >>>> 't' - bit, >>>> 'b' - bool, >>>> 'i' - signed integer, >>>> 'u' - unsigned integer, >>>> 'f' - floating point, >>>> 'c' - complex floating point, >>>> 'S' - string (fixed-length sequence of char), >>>> 'U' - fixed length sequence of UCS4, >>> Shouldn't this read "fixed length sequence of Unicode" ?! >>> The underlying code unit format (UCS2 and UCS4) depends on the >>> Python version. >> Well, in NumPy 'U' always means UCS4. 
So, I just copied that over. See >> my questions at the bottom which talk about how to handle this. A >> data-format does not necessarily have to correspond to something Python >> represents with an Object. > > Ok, but why are you being specific about UCS4 (which is an internal > storage format), while you are not specific about e.g. the > internal bit size of the integers (which could be 32 or 64 bit) ? > The 'kind' does not specify how "big" the data-type (data-format) is. A number is needed to represent the number of bytes. In this case, the 'kind' does not specify how large the data-type is. You can have 'u1', 'u2', 'u4', etc. The same is true with Unicode. You can have 10-character unicode elements, 20-character, etc. But, we have to be clear about what a "character" is in the data-format. -Travis From oliphant.travis at ieee.org Sat Oct 28 21:21:35 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Sat, 28 Oct 2006 13:21:35 -0600 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <20061028135415.GA13049@code0.codespeak.net> References: <20061028135415.GA13049@code0.codespeak.net> Message-ID: Armin Rigo wrote: > Hi Travis, > > On Fri, Oct 27, 2006 at 02:05:31PM -0600, Travis E. Oliphant wrote: >> This PEP proposes adapting the data-type objects from NumPy for >> inclusion in standard Python, to provide a consistent and standard >> way to discuss the format of binary data. > > How does this compare with ctypes? Do we really need yet another, > incompatible way to describe C-like data structures in the standard > library? Part of what the data-type, data-format object is trying to do is bring together all the disparate ways to represent data that *already* exists in the standard library. What is needed is a definitive way to describe data and then have array struct ctypes all be compatible with that same method. That's why I'm proposing the PEP. It's a unification effort not yet-another-method. 
One of the big reasons for it is to move something like the array interface into Python. There are tens to hundreds of people mostly in the scientific computing community that want to see Python grow more support for NumPy-like things. I keep getting requests to "do something" to make Python more aware of arrays. This PEP is part of that effort. In particular, something like the array interface should be available in Python. The easiest way to do this is to extend the buffer protocol to allow objects to share information about shape, strides, and data-format of a block of memory. But, how do you represent data-format in Python? What will the objects pass back and forth to each other to do it? C-types has a solution which creates multiple objects to do it. This is an un-wieldy over-complicated solution for the array interface. The array objects have a solution using the a single object that carries the data-format information. The solution we have for arrays deserves consideration. It could be placed inside the array module if desired, but again, I'm really looking for something that would allow the extend buffer protocol (to be proposed soon) to share data-type information. That could be done with the array-interface objects (strings, lists, and tuples), but then every body who uses the interface will have to write their own "decoders" to process the data-format information. I actually think ctypes would benefit from this data-format specification too. Recognizing all these diverging ways to essentially talk about the same thing is part of what prompted this PEP. -Travis From martin at v.loewis.de Sat Oct 28 21:24:55 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 28 Oct 2006 21:24:55 +0200 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com> Message-ID: <4543AE87.7080909@v.loewis.de> Travis E. 
Oliphant schrieb: > In this case, the 'kind' does not specify how large the data-type is. > You can have 'u1', 'u2', 'u4', etc. > > The same is true with Unicode. You can have 10-character unicode > elements, 20-character, etc. But, we have to be clear about what a > "character" is in the data-format. That is certainly confusing. In u1, u2, u4, the digit seems to indicate the size of a single value (1 byte, 2 bytes, 4 bytes). Right? Yet, in U20, it does *not* indicate the size of a single value but of an array? And then, it's not the size, but the number of elements? Regards, Martin From martin at v.loewis.de Sat Oct 28 21:25:32 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 28 Oct 2006 21:25:32 +0200 Subject: [Python-Dev] build bots, log output In-Reply-To: References: Message-ID: <4543AEAC.7090005@v.loewis.de> Georg Brandl schrieb: > I wonder if it's possible that the build bot notification mails that go > to python-checkins include the last 10-15 lines from the log. This would > make it much easier to decide whether a buildbot failure is an old, > esoteric one (e.g. It should be possible to implement that. To do so, one would have to modify the source of the buildbot master. Regards, Martin From mal at egenix.com Sat Oct 28 21:31:34 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Sat, 28 Oct 2006 21:31:34 +0200 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com> Message-ID: <4543B016.7070002@egenix.com> Travis E. Oliphant wrote: > M.-A. Lemburg wrote: >> Travis E. Oliphant wrote: >>> M.-A. Lemburg wrote: >>>> Travis E. Oliphant wrote: >>>>> ------------------------------------------------------------------------ >>>>> >>>>> PEP: >>>>> Title: Adding data-type objects to the standard library >>>>> Attributes >>>>> >>>>> kind -- returns the basic "kind" of the data-type. 
The basic kinds >>>>> are: >>>>> 't' - bit, >>>>> 'b' - bool, >>>>> 'i' - signed integer, >>>>> 'u' - unsigned integer, >>>>> 'f' - floating point, >>>>> 'c' - complex floating point, >>>>> 'S' - string (fixed-length sequence of char), >>>>> 'U' - fixed length sequence of UCS4, >>>> Shouldn't this read "fixed length sequence of Unicode" ?! >>>> The underlying code unit format (UCS2 and UCS4) depends on the >>>> Python version. >>> Well, in NumPy 'U' always means UCS4. So, I just copied that over. See >>> my questions at the bottom which talk about how to handle this. A >>> data-format does not necessarily have to correspond to something Python >>> represents with an Object. >> Ok, but why are you being specific about UCS4 (which is an internal >> storage format), while you are not specific about e.g. the >> internal bit size of the integers (which could be 32 or 64 bit) ? >> > > The 'kind' does not specify how "big" the data-type (data-format) is. A > number is needed to represent the number of bytes. > > In this case, the 'kind' does not specify how large the data-type is. > You can have 'u1', 'u2', 'u4', etc. > > The same is true with Unicode. You can have 10-character unicode > elements, 20-character, etc. But, we have to be clear about what a > "character" is in the data-format. I understand and that's why I'm asking why you made the range explicit in the definition. The definition should talk about Unicode code points. The number of bytes then determines whether you can only represent the ASCII subset (1 byte), UCS2 (2 bytes, BMP only) or UCS4 (4 bytes, all currently assigned code points). This is similar to the range for integers (ie. ZZ_0), where the number of bytes determines the range of numbers that can be represented. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 28 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... 
http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From talin at acm.org Sat Oct 28 21:34:39 2006 From: talin at acm.org (Talin) Date: Sat, 28 Oct 2006 12:34:39 -0700 Subject: [Python-Dev] PEP 355 status In-Reply-To: <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> Message-ID: <4543B0CF.1000300@acm.org> Björn Lindqvist wrote: > I'd like to write a post mortem for PEP 355. But one important > question that haven't been answered is if there is a possibility for a > path-like PEP to succeed in the future? If so, does the path-object > implementation have to prove itself in the wild before it can be > included in Python? From earlier posts it seems like you don't like > the concept of path objects, which others have found very interesting. > If that is the case, then it would be nice to hear it explicitly. :) So...how's that post mortem coming along? Did you get a sufficient answer to your questions? And the more interesting question is, will the effort to reform Python's path functionality continue? From reading all the responses to your post, I feel that the community is on the whole supportive of the idea of refactoring os.path and friends, but they prefer a different approach, and several of the responses sketch out some suggestions for what that approach might be. So what happens next?
-- Talin From g.brandl at gmx.net Sat Oct 28 22:07:27 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 28 Oct 2006 22:07:27 +0200 Subject: [Python-Dev] build bots, log output In-Reply-To: <4543AEAC.7090005@v.loewis.de> References: <4543AEAC.7090005@v.loewis.de> Message-ID: Martin v. L?wis wrote: > Georg Brandl schrieb: >> I wonder if it's possible that the build bot notification mails that go >> to python-checkins include the last 10-15 lines from the log. This would >> make it much easier to decide whether a buildbot failure is an old, >> esoteric one (e.g. > > It should be possible to implement that. To do so, one would have to > modify the source of the buildbot master. I'd volunteer to do it if I knew where the source of the buildbot master can be found :) Georg From oliphant.travis at ieee.org Sun Oct 29 02:18:04 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Sat, 28 Oct 2006 18:18:04 -0600 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <4543AE87.7080909@v.loewis.de> References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com> <4543AE87.7080909@v.loewis.de> Message-ID: Martin v. L?wis wrote: > Travis E. Oliphant schrieb: >> In this case, the 'kind' does not specify how large the data-type is. >> You can have 'u1', 'u2', 'u4', etc. >> >> The same is true with Unicode. You can have 10-character unicode >> elements, 20-character, etc. But, we have to be clear about what a >> "character" is in the data-format. > > That is certainly confusing. In u1, u2, u4, the digit seems to indicate > the size of a single value (1 byte, 2 bytes, 4 bytes). Right? Yet, > in U20, it does *not* indicate the size of a single value but of an > array? And then, it's not the size, but the number of elements? > Good point. In NumPy, unicode support was added "in parallel" with string arrays where there is not the ambiguity. So, yes, it's true that the unicode case is a special-case. 
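[Editorial note: the alternative Travis goes on to describe — treating 'U1'/'U2'/'U4' as code-unit sizes, with the field length expressed separately — can be sketched with plain struct calls. The helpers below are illustrative only, showing a hypothetical 'U10' UCS-4 field stored as ten 4-byte code points, NUL-padded:]

```python
import struct

def pack_ucs4(text, length):
    # Store 'text' as a fixed-length field of 'length' UCS-4 code points.
    codepoints = [ord(c) for c in text[:length]]
    codepoints += [0] * (length - len(codepoints))  # NUL-pad the tail
    return struct.pack('<%dI' % length, *codepoints)

def unpack_ucs4(data, length):
    values = struct.unpack('<%dI' % length, data)
    return ''.join(chr(v) for v in values).rstrip('\x00')

blob = pack_ucs4('abc', 10)   # a 'U10' field occupies 10 * 4 = 40 bytes
print(len(blob))              # 40
print(unpack_ucs4(blob, 10))  # abc
```

Under this scheme the element size is always code-unit size times count, which avoids the ambiguity Martin pointed out between "size of a single value" and "number of elements".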
The other way to handle it would be to describe the 'code'-point size (i.e. 'U1', 'U2', 'U4' for UCS-1, UCS-2, UCS-4) and then have the length be encoded as an "array" of those types. This was not the direction we took with NumPy (which is what I'm using as a reference) because I wanted Unicode and string arrays to look the same and thought of strings differently. How to handle unicode data-formats could definitely be improved. Suggestions are welcome. -Travis From greg.ewing at canterbury.ac.nz Sun Oct 29 02:15:40 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 29 Oct 2006 13:15:40 +1300 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <4542DD21.6000100@gmail.com> References: <4542A2FE.9070409@canterbury.ac.nz> <4542DD21.6000100@gmail.com> Message-ID: <4543F2AC.4020909@canterbury.ac.nz> Nick Coghlan wrote: > Greg Ewing wrote: >> Also, what if I want to refer to fields by name >> but don't want to have to work out all the offsets > Use the list definition form. With the changes I've > suggested above, you wouldn't even have to name the fields you don't > care about - just describe them. That would be okay. I still don't see a strong justification for having a one-big-string form as well as a list/tuple/dict form, though. -- Greg From greg.ewing at canterbury.ac.nz Sun Oct 29 02:17:46 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 29 Oct 2006 13:17:46 +1300 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <4542DF31.5090408@gmail.com> References: <454299D5.4090804@v.loewis.de> <4542DF31.5090408@gmail.com> Message-ID: <4543F32A.3000705@canterbury.ac.nz> Nick Coghlan wrote: > I'd say the answer to where we put it will be dependent on what happens to the > idea of adding a NumArray style fixed dimension array type to the standard > library. 
If that gets exposed through the array module as array.dimarray, then > it would make sense to expose the associated data layout descriptors as > array.datatype. Seem to me that arrays are a sub-concept of binary data, not the other way around. So maybe both arrays and data types should be in a module called 'binary' or some such. -- Greg From greg.ewing at canterbury.ac.nz Sun Oct 29 02:10:58 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 29 Oct 2006 14:10:58 +1300 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com> Message-ID: <4543FFA2.30002@canterbury.ac.nz> Travis E. Oliphant wrote: > The 'kind' does not specify how "big" the data-type (data-format) is. What exactly does "bit" mean in that context? -- Greg From greg.ewing at canterbury.ac.nz Sun Oct 29 02:25:26 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 29 Oct 2006 14:25:26 +1300 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com> <4543AE87.7080909@v.loewis.de> Message-ID: <45440306.3050805@canterbury.ac.nz> Travis E. Oliphant wrote: > How to handle unicode data-formats could definitely be improved. > Suggestions are welcome. 'U4*10' string of 10 4-byte Unicode chars Then for consistency you'd want 'S*10' rather than just 'S10' (or at least allow it as an alternative). -- Greg From oliphant.travis at ieee.org Sun Oct 29 08:46:39 2006 From: oliphant.travis at ieee.org (Travis E. 
Oliphant) Date: Sun, 29 Oct 2006 01:46:39 -0600 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <4543F32A.3000705@canterbury.ac.nz> References: <454299D5.4090804@v.loewis.de> <4542DF31.5090408@gmail.com> <4543F32A.3000705@canterbury.ac.nz> Message-ID: Greg Ewing wrote: > Nick Coghlan wrote: >> I'd say the answer to where we put it will be dependent on what happens to the >> idea of adding a NumArray style fixed dimension array type to the standard >> library. If that gets exposed through the array module as array.dimarray, then >> it would make sense to expose the associated data layout descriptors as >> array.datatype. > > Seem to me that arrays are a sub-concept of binary data, > not the other way around. So maybe both arrays and data > types should be in a module called 'binary' or some such. Yes, very good point. That's probably one reason I'm proposing the data-type first before the array interface in the extended buffer protocol. -Travis From oliphant.travis at ieee.org Sun Oct 29 08:50:38 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Sun, 29 Oct 2006 01:50:38 -0600 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <4543FFA2.30002@canterbury.ac.nz> References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com> <4543FFA2.30002@canterbury.ac.nz> Message-ID: Greg Ewing wrote: > Travis E. Oliphant wrote: > >> The 'kind' does not specify how "big" the data-type (data-format) is. > > What exactly does "bit" mean in that context? Do you mean "big" ? It's how many bytes the kind is using. So, 'u4' is a 4-byte unsigned integer and 'u2' is a 2-byte unsigned integer. -Travis From oliphant.travis at ieee.org Sun Oct 29 08:48:23 2006 From: oliphant.travis at ieee.org (Travis E. 
Oliphant) Date: Sun, 29 Oct 2006 01:48:23 -0600 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <4543F2AC.4020909@canterbury.ac.nz> References: <4542A2FE.9070409@canterbury.ac.nz> <4542DD21.6000100@gmail.com> <4543F2AC.4020909@canterbury.ac.nz> Message-ID: Greg Ewing wrote: > Nick Coghlan wrote: > >> Greg Ewing wrote: > >>> Also, what if I want to refer to fields by name >>> but don't want to have to work out all the offsets > >> Use the list definition form. With the changes I've >> suggested above, you wouldn't even have to name the fields you don't >> care about - just describe them. > > That would be okay. > > I still don't see a strong justification for having a > one-big-string form as well as a list/tuple/dict form, > though. Compaction of representation is all. It's used quite a bit in numarray, which is where most of the 'kind' names came from as well. When you don't want to name fields it is a really nice feature (but it doesn't nest well). -Travis From martin at v.loewis.de Sun Oct 29 08:57:02 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 29 Oct 2006 08:57:02 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <20061028135415.GA13049@code0.codespeak.net> Message-ID: <45445ECE.9050504@v.loewis.de> Travis E. Oliphant schrieb: > What is needed is a definitive way to describe data and then have > > array > struct > ctypes > > all be compatible with that same method. That's why I'm proposing the > PEP. It's a unification effort not yet-another-method. As a unification mechanism, I think it is insufficient. I doubt it can express all the concepts that ctypes supports. Regards, Martin From oliphant.travis at ieee.org Sun Oct 29 08:49:18 2006 From: oliphant.travis at ieee.org (Travis E.
Oliphant) Date: Sun, 29 Oct 2006 01:49:18 -0600 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <45440306.3050805@canterbury.ac.nz> References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com> <4543AE87.7080909@v.loewis.de> <45440306.3050805@canterbury.ac.nz> Message-ID: Greg Ewing wrote: > Travis E. Oliphant wrote: > >> How to handle unicode data-formats could definitely be improved. >> Suggestions are welcome. > > 'U4*10' string of 10 4-byte Unicode chars > I like that. Thanks. -Travis From robert.kern at gmail.com Sun Oct 29 09:11:47 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 29 Oct 2006 02:11:47 -0600 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <45445ECE.9050504@v.loewis.de> References: <20061028135415.GA13049@code0.codespeak.net> <45445ECE.9050504@v.loewis.de> Message-ID: Martin v. L?wis wrote: > Travis E. Oliphant schrieb: >> What is needed is a definitive way to describe data and then have >> >> array >> struct >> ctypes >> >> all be compatible with that same method. That's why I'm proposing the >> PEP. It's a unification effort not yet-another-method. > > As I unification mechanism, I think it is insufficient. I doubt it > can express all the concepts that ctypes supports. What do you think is missing that can't be added? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From martin at v.loewis.de Sun Oct 29 09:18:12 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 29 Oct 2006 09:18:12 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com> <4543AE87.7080909@v.loewis.de> Message-ID: <454463C4.1080009@v.loewis.de> Travis E. 
Oliphant schrieb: > How to handle unicode data-formats could definitely be improved. As before, I'm doubtful what the actual needs are. For example, is it desired to support generation of ID3v2 tags with such a data format? The tag is specified here: http://www.id3.org/id3v2.4.0-structure.txt In ID3v1, text fields have a specified width, and are supposed to be encoded in Latin-1, and padded with zero bytes. In ID3v2, text fields start with an encoding declaration (say, \x03 for UTF-8), followed by a null-terminated sequence of UTF-8 bytes. Is it the intent of this PEP to support such data structures, and allow the user to fill in a Unicode object, and then the processing is automatic? (i.e. in ID3v1, the string gets automatically Latin-1-encoded and zero-padded, in ID3v2, it gets automatically UTF-8 encoded, and null-terminated) If that is not to be supported, what are the use cases? Regards, Martin From oliphant.travis at ieee.org Sun Oct 29 09:26:38 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Sun, 29 Oct 2006 01:26:38 -0700 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <45445ECE.9050504@v.loewis.de> References: <20061028135415.GA13049@code0.codespeak.net> <45445ECE.9050504@v.loewis.de> Message-ID: Martin v. Löwis wrote: > Travis E. Oliphant schrieb: >> What is needed is a definitive way to describe data and then have >> >> array >> struct >> ctypes >> >> all be compatible with that same method. That's why I'm proposing the >> PEP. It's a unification effort not yet-another-method. > > As a unification mechanism, I think it is insufficient. I doubt it > can express all the concepts that ctypes supports. > Please clarify what you mean. Are you saying that a single object can't carry all the information about binary data that ctypes allows with its multi-object approach? I don't agree with you, if that is the case. Sure, perhaps I've not included certain cases, so give an example.
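(A concrete illustration of the duplication under discussion, added here for clarity and not taken from the original thread: one and the same two-field record described twice with today's standard library, once as a struct format string and once as a ctypes Structure. Neither module can consume the other's description; a shared data-format object is what would bridge them.)

```python
# The same two-field record, described twice by two stdlib modules
# that cannot read each other's description.  (Illustrative sketch,
# not code from the thread or the PEP.)
import struct
import ctypes

fmt = "<id"                      # struct: little-endian int32 + float64
calc = struct.calcsize(fmt)      # 12 bytes, no padding

class Record(ctypes.LittleEndianStructure):   # ctypes: the same layout
    _pack_ = 1                   # suppress padding, matching "<" above
    _fields_ = [("simple", ctypes.c_int32),
                ("amount", ctypes.c_double)]

assert calc == ctypes.sizeof(Record) == 12

# Round-trip one value through both descriptions of the layout.
packed = struct.pack(fmt, 7, 2.5)
rec = Record.from_buffer_copy(packed)
assert rec.simple == 7 and rec.amount == 2.5
```

The two descriptions agree byte for byte, yet each had to be written by hand from the other, which is exactly the translation step the PEP wants to eliminate.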
Besides, I don't think this is the right view of "unification". I'm not saying that ctypes should get rid of its many objects used for interfacing with C-functions. I'm saying we should introduce a single-object mechanism for describing binary data so that the many-object approach of c-types does not become some kind of de-facto standard. C-types can "translate" this object-instance to its internals if and when it needs to. In the mean-time, how are other packages supposed to communicate binary information about data with each other? Remember the context that the data-format object is presented in. Two packages need to share a chunk of memory (the package authors do not know each other and only have Python as a common reference). They both want to describe that the memory they are sharing has some underlying binary structure. How do they do that? Please explain to me how the buffer protocol can be extended so that information about "what is in the memory" can be shared without a data-format object? -Travis From oliphant.travis at ieee.org Sun Oct 29 09:53:09 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Sun, 29 Oct 2006 01:53:09 -0700 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <454463C4.1080009@v.loewis.de> References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com> <4543AE87.7080909@v.loewis.de> <454463C4.1080009@v.loewis.de> Message-ID: Martin v. Löwis wrote: > Travis E. Oliphant schrieb: >> How to handle unicode data-formats could definitely be improved. > > As before, I'm doubtful what the actual needs are. For example, is > it desired to support generation of ID3v2 tags with such a data > format? The tag is specified here: > Perhaps I was not clear enough about what I'm trying to do. For a long time a lot of people have wanted something like Numeric in Python itself. There have been many hurdles to that goal.
After discussions at SciPy 2006 with Guido, we decided that the best way to proceed at this point was to extend the buffer protocol to allow packages to share array-like information with each other. There are several things missing from the buffer protocol that NumPy needs in order to be able to really understand the (fixed-size) memory another package has allocated and is sharing. The most important of these are 1) Shape information 2) Striding information 3) Data-format information (how is each element perceived). Shape and striding information can be shared with a C-array of integers. How is data-format information supposed to be shared? We've come up with a very flexible way to do this in NumPy using a single Python object. This Python object supports describing the layout of any fixed-size chunk of memory (right now in units of bytes --- bit fields could be added, though). I'm proposing to add this object to Python so that the buffer protocol has a fast and efficient way to share #3. That's really all I'm after. It also bothers me that so many ways to describe binary data are being used out there. This is a problem that deserves being solved. And, no, ctypes hasn't solved it (we can't directly use the ctypes solution). Perhaps this PEP doesn't hit all the corners, but a data-format object *is* a useful thing to consider. The array object in Python already has a PyArray_Descr * structure that is a watered-down version of what I'm talking about. In fact, this is what Numeric built from (or vice-versa actually). And NumPy has greatly enhanced this object for any conceivable structure. Guido seemed to think the data-type objects were nice when he saw them at SciPy 2006, and so I'm presenting a PEP. Without the data-format object, I don't know how to extend the buffer protocol to communicate data-format information. Do you have a better idea?
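(For illustration only, here is a toy Python-level sketch of the three pieces listed above, shape, strides, and an element format, travelling together with one raw memory block. All names in it are invented for this sketch; it is not the proposed buffer-protocol extension.)

```python
# Toy sketch: shape, strides, and a data-format string attached to a
# raw block of memory.  Invented names, illustrative only.

class SharedBlock:
    def __init__(self, data, shape, itemformat):
        self.data = bytearray(data)   # the raw, fixed-size memory
        self.shape = shape            # e.g. (rows, cols)
        self.itemformat = itemformat  # e.g. 'u1': one unsigned byte
        itemsize = 1                  # implied by 'u1' in this toy
        # C-contiguous strides, in bytes
        self.strides = (shape[1] * itemsize, itemsize)

    def item(self, i, j):
        # A consumer that knows shape, strides, and format can address
        # any element without copying the block.
        off = i * self.strides[0] + j * self.strides[1]
        return self.data[off]

block = SharedBlock(bytes(range(6)), shape=(2, 3), itemformat='u1')
assert block.item(1, 0) == 3   # row 1 starts 3 bytes into the block
```

The first two attributes are plain integers; the open question in the thread is what object the third attribute, the element description, should be.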
I have no trouble limiting the data-type object to the buffer protocol extension PEP, but I do think it could gain wider use. > > Is it the intent of this PEP to support such data structures, > and allow the user to fill in a Unicode object, and then the > processing is automatic? (i.e. in ID3v1, the string gets > automatically Latin-1-encoded and zero-padded, in ID3v2, it > gets automatically UTF-8 encoded, and null-terminated) > No, the point of the data-format object is to communicate information about data-formats not to encode or decode anything. Users of the data-format object could decide what they wanted to do with that information. We just need a standard way to communicate it through the buffer protocol. -Travis From martin at v.loewis.de Sun Oct 29 11:04:04 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 29 Oct 2006 11:04:04 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com> <4543AE87.7080909@v.loewis.de> <454463C4.1080009@v.loewis.de> Message-ID: <45447C94.2000309@v.loewis.de> Travis E. Oliphant schrieb: > I'm proposing to add this object to Python so that the buffer protcol > has a fast and efficient way to share #3. That's really all I'm after. I admit that I don't understand this objective. Why is it desirable to support such an extended buffer protocol? What specific application would be made possible if it was available and implemented in the relevant modules and data types? What are the relevant modules and data types that should implement it? > It also bothers me that so many ways to describe binary data are being > used out there. This is a problem that deserves being solved. And, no, > ctypes hasn't solved it (we can't directly use the ctypes solution). > Perhaps this PEP doesn't hit all the corners, but a data-format object > *is* a useful thing to consider. 
IMO, it is only useful if it realistically can support all the use cases that it intends to support. If this PEP is about defining the elements of arrays, I doubt it can realistically support everything you can express in ctypes. There is no support for pointers (except for PyObject*), no support for incomplete (recursive) types, no support for function pointers, etc. Vice versa: why exactly can't you use the data type system of ctypes? If I want to say "int[10]", I do py> ctypes.c_long * 10 To rewrite the examples from the PEP: datatype(float) => ctypes.c_double datatype(int) => ctypes.c_long datatype((int, 5)) => ctypes.c_long * 5 datatype((float, (3,2))) => (ctypes.c_double * 3) * 2 struct { int simple; struct nested { char name[30]; char addr[45]; int amount; } nested; } => py> from ctypes import * py> class nested(Structure): ... _fields_ = [("name", c_char*30), ("addr", c_char*45), ("amount", c_long)] ... py> class struct(Structure): ... _fields_ = [("simple", c_int), ("nested", nested)] ... > Guido seemed to think the data-type objects were nice when he saw them > at SciPy 2006, and so I'm presenting a PEP. I have no objection to including NumArray as-is into Python. I just wonder where the rationale for this PEP comes from, i.e. why do you need to exchange this information across different modules? > Without the data-format object, I don't know how to extend the buffer > protocol to communicate data-format information. Do you have a better > idea? See above: I can't understand where the need for an extended buffer protocol comes from. I can see why NumArray needs reflection, and needs to keep information to interpret the bytes in the array. But why is it important that the same information is exposed by other data types? >> Is it the intent of this PEP to support such data structures, >> and allow the user to fill in a Unicode object, and then the >> processing is automatic? (i.e.
in ID3v1, the string gets >> automatically Latin-1-encoded and zero-padded, in ID3v2, it >> gets automatically UTF-8 encoded, and null-terminated) >> > > No, the point of the data-format object is to communicate information > about data-formats not to encode or decode anything. Users of the > data-format object could decide what they wanted to do with that > information. We just need a standard way to communicate it through the > buffer protocol. This was actually a different sub-thread: why do you need to support the 'U' code (or the 'S' code, for that matter)? In what application do you have fixed size Unicode arrays, as opposed to Unicode strings? Regards, Martin From martin at v.loewis.de Sun Oct 29 11:10:22 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 29 Oct 2006 11:10:22 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <20061028135415.GA13049@code0.codespeak.net> <45445ECE.9050504@v.loewis.de> Message-ID: <45447E0E.6050106@v.loewis.de> Travis E. Oliphant schrieb: >> As I unification mechanism, I think it is insufficient. I doubt it >> can express all the concepts that ctypes supports. >> > > Please clarify what you mean. > > Are you saying that a single object can't carry all the information > about binary data that ctypes allows with it's multi-object approach? I'm not sure what you mean by "single object". If I use the tuple syntax, e.g. datatype((float, (3,2)) There are also multiple objects (the float, the 3, and the 2). You get a single "root" object back, but so do you in ctypes. But this isn't really what I meant. Instead, I think the PEP lacks various concepts from C data types, such as pointers, unions, function pointers, alignment/packing. > In the mean-time, how are other packages supposed to communicate binary > information about data with each other? This is my other question. Why should they? > Remember the context that the data-format object is presented in. 
Two > packages need to share a chunk of memory (the package authors do not > know each other and only have and Python as a common reference). They > both want to describe that the memory they are sharing has some > underlying binary structure. Can you please give an example of such two packages, and an application that needs them share data? Regards, Martin From martin at v.loewis.de Sun Oct 29 11:20:08 2006 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Sun, 29 Oct 2006 11:20:08 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <20061028135415.GA13049@code0.codespeak.net> <45445ECE.9050504@v.loewis.de> Message-ID: <45448058.5020700@v.loewis.de> Robert Kern schrieb: >> As I unification mechanism, I think it is insufficient. I doubt it >> can express all the concepts that ctypes supports. > > What do you think is missing that can't be added? I can factually only report what is missing. Whether it can be added, I don't know. As I just wrote in a few other messages: pointers, unions, functions pointers, packed structs, incomplete/recursive types. Also "flexible array members" (i.e. open-ended arrays). While it may be possible to come up with a string syntax to describe all these things (*), I wonder whether it should be done, and whether NumArray can then support this extended data model. Regards, Martin (*) perhaps with the exception of incomplete types: C needs forward references in its own syntax. From anthony at interlink.com.au Sun Oct 29 12:20:12 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Sun, 29 Oct 2006 21:20:12 +1000 Subject: [Python-Dev] build bots, log output In-Reply-To: References: Message-ID: <200610292220.14695.anthony@interlink.com.au> On Saturday 28 October 2006 23:39, Georg Brandl wrote: > Hi, > > I wonder if it's possible that the build bot notification mails that go > to python-checkins include the last 10-15 lines from the log. 
This would > make it much easier to decide whether a buildbot failure is an old, > esoteric one (e.g. A better solution (awaiting sufficient round-tuits) would be to add an option to regrtest that's used by the buildslaves that uses particularly markup around success/fail indications. The buildmaster can pick those up, and keep track of existing pass/fails. Then it could send an email only when one changes. We might also add a daily or every couple of days reminder saying "The following tests are failing on the following platforms, and have been for X days now". Buildmaster code is on dinsdale, in (I think) ~buildbot. It's also in SVN. This solution doesn't require changes to the buildslave code at all - only to the buildmaster and to regrtest. -- Anthony Baxter It's never too late to have a happy childhood. From amk at amk.ca Sun Oct 29 14:47:11 2006 From: amk at amk.ca (A.M. Kuchling) Date: Sun, 29 Oct 2006 08:47:11 -0500 Subject: [Python-Dev] PyCon: proposals due by Tuesday 10/31 Message-ID: <20061029134711.GA15254@rogue.amk.ca> Final reminder: if you want to submit a proposal to PyCon, you should do it by end of Tuesday, October 31st. for more info The deadline for tutorials is November 15th: http://us.pycon.org/TX2007/CallForTutorials PyCon is the Python community conference, held next February 23-25 near Dallas; a tutorial day will be on February 22. See http://us.pycon.org/ for more info. --amk From edcjones at comcast.net Sun Oct 29 17:05:00 2006 From: edcjones at comcast.net (Edward C. Jones) Date: Sun, 29 Oct 2006 11:05:00 -0500 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: Message-ID: <4544D12C.4020209@comcast.net> Travis E. Oliphant wrote: > It also bothers me that so many ways to describe binary data are > being used out there. This is a problem that deserves being solved. Is there a survey paper somewhere about binary formats? What formats are used in particle physics, bio-informatics, astronomy, etc? 
What software is used to read and write binary data? What descriptive languages are used for data (SQL, XML, etc)? From ndbecker2 at gmail.com Sun Oct 29 17:50:57 2006 From: ndbecker2 at gmail.com (Neal Becker) Date: Sun, 29 Oct 2006 11:50:57 -0500 Subject: [Python-Dev] PEP: Adding data-type objects to Python References: Message-ID: I have watched numpy with interest for a long time. My own interest is to possibly use the c-api to wrap c++ algorithms to use from python. One thing that has concerned me, and continues to concern me with this proposal, is that it seems to suffer from a very fat interface. I certainly have not studied the options in any depth, but my gut feeling is that the interface is too fat and too complex. I wonder if it's possible to avoid this. I wonder if this is an example of all the methods sinking to the base class. From p.f.moore at gmail.com Sun Oct 29 18:00:25 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 29 Oct 2006 17:00:25 +0000 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <45447E0E.6050106@v.loewis.de> References: <20061028135415.GA13049@code0.codespeak.net> <45445ECE.9050504@v.loewis.de> <45447E0E.6050106@v.loewis.de> Message-ID: <79990c6b0610290900y66f7756eqbc9000dfab1c010@mail.gmail.com> On 10/29/06, "Martin v. L?wis" wrote: > Travis E. Oliphant schrieb: > > Remember the context that the data-format object is presented in. Two > > packages need to share a chunk of memory (the package authors do not > > know each other and only have and Python as a common reference). They > > both want to describe that the memory they are sharing has some > > underlying binary structure. > > Can you please give an example of such two packages, and an application > that needs them share data? Here's an example. PIL handles images (in various formats) in memory, as blocks of binary image data. NumPy provides methods for manipulating in-memory blocks of data. 
Now, if I want to use NumPy to manipulate that data in place (for example, to cap the red component at 128, and equalise the range of the green component) my code needs to know the format of the memory block that PIL exposes. I am assuming that in-place manipulation is better, because there is no need for repeated copies of the data to be made (this would be true for large images). If PIL could expose a descriptor for its data structure, NumPy code could manipulate it in place without fear of corrupting it. Of course, this can be done by the end user reading the PIL documentation and transcribing the documented format into the NumPy code. But I would argue that it's better if the PIL block is self-describing in a way that avoids the need for a manual transcription of the format. To do this *without* needing the PIL and NumPy developers to co-operate needs an independent standard, which is what I assume this PEP is intended to provide. Paul. From jcarlson at uci.edu Sun Oct 29 18:35:57 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Sun, 29 Oct 2006 10:35:57 -0700 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <79990c6b0610290900y66f7756eqbc9000dfab1c010@mail.gmail.com> References: <45447E0E.6050106@v.loewis.de> <79990c6b0610290900y66f7756eqbc9000dfab1c010@mail.gmail.com> Message-ID: <20061029101933.0B0E.JCARLSON@uci.edu> "Paul Moore" wrote: > On 10/29/06, "Martin v. L?wis" wrote: > > Travis E. Oliphant schrieb: > > > Remember the context that the data-format object is presented in. Two > > > packages need to share a chunk of memory (the package authors do not > > > know each other and only have and Python as a common reference). They > > > both want to describe that the memory they are sharing has some > > > underlying binary structure. > > > > Can you please give an example of such two packages, and an application > > that needs them share data? 
> > To do this *without* needing the PIL and NumPy developers to > co-operate needs an independent standard, which is what I assume this > PEP is intended to provide. One could also toss wxPython, VTK, or any one of the other GUI libraries into the mix for visualizing those images, of which wxPython just acquired no-copy display of PIL images, and being able to manipulate them with numpy (of which some wxPython built in classes use numpy to speed up manipulation) would be very useful. Of all of the intended uses, I'd say that zero-copy sharing of information on the graphics/visualization front is the most immediate 'people will be using it tomorrow' feature. I personally don't have my pulse on the Scientific Python community, so I don't know about other uses, but in regards to Martin's list of missing features: "pointers, unions, function pointers, alignment/packing [, etc.]" I'm going to go out on a limb and say for the majority of those YAGNI, or really, NOHAFIAFACT (no one has asked for it, as far as I can tell). Someone who knows the scipy community, feel free to correct me. - Josiah From martin at v.loewis.de Sun Oct 29 19:48:46 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 29 Oct 2006 19:48:46 +0100 Subject: [Python-Dev] build bots, log output In-Reply-To: <200610292220.14695.anthony@interlink.com.au> References: <200610292220.14695.anthony@interlink.com.au> Message-ID: <4544F78E.8010101@v.loewis.de> Anthony Baxter schrieb: > A better solution (awaiting sufficient round-tuits) would be to add an option > to regrtest that's used by the buildslaves that uses particularly markup > around success/fail indications. The buildmaster can pick those up, and keep > track of existing pass/fails. Then it could send an email only when one > changes. We might also add a daily or every couple of days reminder > saying "The following tests are failing on the following platforms, and have > been for X days now". 
As yet another alternative, we could put the names of the builders on which builds are expected to fail (or the system names of these systems) into the test cases, and then report "expected failures"; regrtest would give a "success" status if all failures are expected. The consequence would be that these systems would appear "green" on the buildbot page, and you'd have to look into the log file to find out which of the expected failures actually happened. This all could work without changes to buildbot at all. Regards, Martin From martin at v.loewis.de Sun Oct 29 20:30:26 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 29 Oct 2006 20:30:26 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <79990c6b0610290900y66f7756eqbc9000dfab1c010@mail.gmail.com> References: <20061028135415.GA13049@code0.codespeak.net> <45445ECE.9050504@v.loewis.de> <45447E0E.6050106@v.loewis.de> <79990c6b0610290900y66f7756eqbc9000dfab1c010@mail.gmail.com> Message-ID: <45450152.5000706@v.loewis.de> Paul Moore schrieb: > Here's an example. PIL handles images (in various formats) in memory, > as blocks of binary image data. NumPy provides methods for > manipulating in-memory blocks of data. Now, if I want to use NumPy to > manipulate that data in place (for example, to cap the red component > at 128, and equalise the range of the green component) my code needs > to know the format of the memory block that PIL exposes. I am assuming > that in-place manipulation is better, because there is no need for > repeated copies of the data to be made (this would be true for large > images). Thanks, that looks like a good example. Is it possible to elaborate that? E.g. what specific image format would I use (could that work for jpeg, even though this format has compression in it), and what specific NumPy routines would I use to implement the capping and equalising? 
What would the datatype description look like that those tools need to exchange? Looking at this in more detail, PIL in-memory images (ImagingCore objects) either have the image8 UINT8**, or the image32 INT32**; they have separate fields for pixelsize and linesize. In the image8 case, there are three options: - each value is an 8-bit integer (IMAGING_TYPE_UINT8) (1) - each value is a 16-bit integer, either little (2) or big endian (3) (IMAGING_TYPE_SPECIAL, mode either I;16 or I;16B) In the image32 case, there are five options: - two 8-bit values per four bytes, namely byte 0 and byte 3 (4) - three 8-bit values (bytes 0, 1, 2) (5) - four 8-bit values (6) - a single 32-bit int (7) - a single 32-bit float (8) Now, what would be the algorithm in NumPy that I could use to implement capping and equalising? > If PIL could expose a descriptor for its data structure, NumPy code > could manipulate it in place without fear of corrupting it. Of course, > this can be done by the end user reading the PIL documentation and > transcribing the documented format into the NumPy code. But I would > argue that it's better if the PIL block is self-describing in a way > that avoids the need for a manual transcription of the format. Without digging further, I think some of the formats simply don't allow for the kind of manipulation you suggest, namely all palette formats (which are the single-valued ones, plus the two-band version with a palette number and an alpha value), and greyscale images. So in any case, the application has to look at the mode of the image to find out whether the operation is even meaningful. And then, the application has to tell NumPy somehow what fields to operate on. > To do this *without* needing the PIL and NumPy developers to > co-operate needs an independent standard, which is what I assume this > PEP is intended to provide. Ok, I now understand the goal, although I still like to understand this usecase better. 
Regards, Martin From bjourne at gmail.com Sun Oct 29 20:33:13 2006 From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=) Date: Sun, 29 Oct 2006 20:33:13 +0100 Subject: [Python-Dev] PEP 355 status In-Reply-To: <4543B0CF.1000300@acm.org> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <4543B0CF.1000300@acm.org> Message-ID: <740c3aec0610291133i225e9d8er2b8f7f8afac03bd5@mail.gmail.com> On 10/28/06, Talin wrote: > BJ?rn Lindqvist wrote: > > I'd like to write a post mortem for PEP 355. But one important > > question that haven't been answered is if there is a possibility for a > > path-like PEP to succeed in the future? If so, does the path-object > > implementation have to prove itself in the wild before it can be > > included in Python? From earlier posts it seems like you don't like > > the concept of path objects, which others have found very interesting. > > If that is the case, then it would be nice to hear it explicitly. :) > > So...how's that post mortem coming along? Did you get a sufficient > answer to your questions? Yes and no. All posts have very exhaustively explained why the implementation in PEP 355 is far from optimal. And I can see why it is. However, what I am uncertain of is Guido's opinion on the background and motivation of the PEP: "Many have felt that the API for manipulating file paths as offered in the os.path module is inadequate." "Currently, Python has a large number of different functions scattered over half a dozen modules for handling paths. This makes it hard for newbies and experienced developers to to choose the right method." IMHO, the current API is very messy. But when it comes to PEPs, it is mostly Guido's opinion that counts. :) Unless he sees a problem with the current situation, then there is no point in writing more PEPs. 
> And the more interesting question is, will the effort to reform Python's > path functionality continue? I certainly hope so. But maybe it is better to target Python 3000, or maybe the Python devs already have ideas for how they want the path APIs to look? > So what happens next? I really hope that Guido will give his input when he has more time. Mvh Björn From martin at v.loewis.de Sun Oct 29 20:37:09 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 29 Oct 2006 20:37:09 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <20061029101933.0B0E.JCARLSON@uci.edu> References: <45447E0E.6050106@v.loewis.de> <79990c6b0610290900y66f7756eqbc9000dfab1c010@mail.gmail.com> <20061029101933.0B0E.JCARLSON@uci.edu> Message-ID: <454502E5.80202@v.loewis.de> Josiah Carlson schrieb: > One could also toss wxPython, VTK, or any one of the other GUI libraries > into the mix for visualizing those images, of which wxPython just > acquired no-copy display of PIL images, and being able to manipulate > them with numpy (of which some wxPython built in classes use numpy to > speed up manipulation) would be very useful. I'm doubtful that this PEP alone would allow zero-copy sharing of images for display. Often, the libraries need the data in a different format. So they need to copy, even if they could understand the other format. However, the PEP won't allow "understanding" the format. If I know I have an array of 4-byte values: which of them is R, G, B, and A?
Regards, Martin From talin at acm.org Sun Oct 29 20:56:19 2006 From: talin at acm.org (Talin) Date: Sun, 29 Oct 2006 11:56:19 -0800 Subject: [Python-Dev] PEP 355 status In-Reply-To: <740c3aec0610291133i225e9d8er2b8f7f8afac03bd5@mail.gmail.com> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <4543B0CF.1000300@acm.org> <740c3aec0610291133i225e9d8er2b8f7f8afac03bd5@mail.gmail.com> Message-ID: <45450763.1030000@acm.org> Björn Lindqvist wrote: > On 10/28/06, Talin wrote: >> Björn Lindqvist wrote: >> > I'd like to write a post mortem for PEP 355. But one important >> > question that hasn't been answered is if there is a possibility for a >> > path-like PEP to succeed in the future? If so, does the path-object >> > implementation have to prove itself in the wild before it can be >> > included in Python? From earlier posts it seems like you don't like >> > the concept of path objects, which others have found very interesting. >> > If that is the case, then it would be nice to hear it explicitly. :) >> >> So...how's that post mortem coming along? Did you get a sufficient >> answer to your questions? > > Yes and no. All posts have very exhaustively explained why the > implementation in PEP 355 is far from optimal. And I can see why it > is. However, what I am uncertain of is Guido's opinion on the > background and motivation of the PEP: > > "Many have felt that the API for manipulating file paths as offered in > the os.path module is inadequate." > > "Currently, Python has a large number of different functions scattered > over half a dozen modules for handling paths. This makes it hard for > newbies and experienced developers to choose the right method." > > IMHO, the current API is very messy. But when it comes to PEPs, it is > mostly Guido's opinion that counts.
:) Unless he sees a problem with > the current situation, then there is no point in writing more PEPs. > >> And the more interesting question is, will the effort to reform Python's >> path functionality continue? > > I certainly hope so. But maybe it is better to target Python 3000, or > maybe the Python devs already have ideas for how they want the path > APIs to look? I think targeting Py3K is a good idea. The whole purpose of Py3K is to "clean up the messes" of past decisions, and to that end, a certain amount of backwards-compatibility breakage will be allowed (although if that can be avoided, so much the better.) And to the second point, having been following the Py3K list, I don't think anyone has expressed any preconceived notions of how they want things to look (well, except I know I do, but I'm not a core dev :) :). >> So what happens next? > > I really hope that Guido will give his input when he has more time. First bit of advice is, don't hold your breath. Second bit of advice is, if you really do want Guido's feedback (or the core python devs), start by creating a (short) list of the outstanding points of controversy to be resolved. Once those issues have been decided, then proceed to the next stage, building consensus by increments. Basically, anything that requires Guido to read more than a page of material isn't going to get done quickly. At least, in my experience :) > Mvh Björn From jcarlson at uci.edu Sun Oct 29 20:13:19 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Sun, 29 Oct 2006 12:13:19 -0700 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <454502E5.80202@v.loewis.de> References: <20061029101933.0B0E.JCARLSON@uci.edu> <454502E5.80202@v.loewis.de> Message-ID: <20061029120129.0B1E.JCARLSON@uci.edu> "Martin v.
Löwis" wrote: > Josiah Carlson schrieb: > > One could also toss wxPython, VTK, or any one of the other GUI libraries > > into the mix for visualizing those images, of which wxPython just > > acquired no-copy display of PIL images, and being able to manipulate > > them with numpy (of which some wxPython built in classes use numpy to > > speed up manipulation) would be very useful. > > I'm doubtful that this PEP alone would allow zero-copy sharing of images > for display. Often, the libraries need the data in a different format. > So they need to copy, even if they could understand the other format. > However, the PEP won't allow "understanding" the format. If I know I > have an array of 4-byte values: which of them is R, G, B, and A? ...in the cases I have seen, which includes BMP, TGA, uncompressed TIFF, a handful of platform-specific bitmap formats, etc., you _always_ get them in RGBA order. If the alpha channel is to be left out, then you get them as RGB. The trick with allowing zero-copy sharing is 1) to understand the format, and 2) to manipulate/display in-place. The former is necessary for the latter, which is what Travis is shooting for. Also, because wxPython has figured out how PIL images are structured, they can do #2, and so far no one has mentioned any examples where the standard RGB/RGBA format hasn't worked for them. In the case of jpegs (as you mentioned in another message), PIL uncompresses all images it understands into some kind of 'natural' format (from what I understand). For 24/32 bit images, that is RGB or RGBA. For palettized images (gif, 8-bit png, 8-bit bmp, etc.) maybe it is a palettized format, or maybe it is RGB/RGBA? I don't know, all of my images are 24/32 bit, but I can just about guarantee it's not an issue for the case that Paul mentioned.
- Josiah From brett at python.org Sun Oct 29 21:58:05 2006 From: brett at python.org (Brett Cannon) Date: Sun, 29 Oct 2006 12:58:05 -0800 Subject: [Python-Dev] Status of new issue tracker Message-ID: The initial admins for the Roundup installation have been chosen: Paul DuBois, Michael Twomey, Stefan Seefeld, and Erik Forsberg. The offer from Upfront Systems (http://www.upfrontsystems.co.za/) has been accepted for professional Roundup hosting. Discussion of how to handle the new tracker (including the design of it, handling the transition, etc.) will take place on the tracker-discuss mailing list (http://mail.python.org/mailman/listinfo/tracker-discuss). If you want to provide input on what you want the new tracker to do, please join the list. Input from members of python-dev will take precedence so please participate if you have any interest. I don't have a timeline on when all of this will happen (talks amongst the four admins have already started on the mailing list and Upfront has started the process of getting us our account). The first step is to get the admins situated with their new server. Then we start worrying about what info we want the tracker to store and how to transition off of SF. -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061029/460f52c4/attachment.html From greg.ewing at canterbury.ac.nz Sun Oct 29 23:58:20 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 30 Oct 2006 11:58:20 +1300 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com> <4543FFA2.30002@canterbury.ac.nz> Message-ID: <4545320C.603@canterbury.ac.nz> Travis E. Oliphant wrote: > Greg Ewing wrote: >>What exactly does "bit" mean in that context? > > Do you mean "big" ?
No, you've got a data type there called "bit", which seems to imply a size, in contradiction to the size-independent nature of the other types. I'm asking what size-independent information it's meant to convey. -- Greg From greg.ewing at canterbury.ac.nz Mon Oct 30 00:37:54 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 30 Oct 2006 12:37:54 +1300 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com> <4543AE87.7080909@v.loewis.de> <454463C4.1080009@v.loewis.de> Message-ID: <45453B52.5030802@canterbury.ac.nz> Travis E. Oliphant wrote: > Martin v. Löwis wrote: > >>Travis E. Oliphant schrieb: >>Is it the intent of this PEP to support such data structures, >>and allow the user to fill in a Unicode object, and then the >>processing is automatic? > No, the point of the data-format object is to communicate information > about data-formats not to encode or decode anything. Well, there's still the issue of how much detail you want to be able to convey, so I think the question is valid. Is the encoding of a Unicode string something we want to be able to communicate via this mechanism, or is that outside its scope? -- Greg From greg.ewing at canterbury.ac.nz Mon Oct 30 00:38:01 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 30 Oct 2006 12:38:01 +1300 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <20061029120129.0B1E.JCARLSON@uci.edu> References: <20061029101933.0B0E.JCARLSON@uci.edu> <454502E5.80202@v.loewis.de> <20061029120129.0B1E.JCARLSON@uci.edu> Message-ID: <45453B59.3070406@canterbury.ac.nz> Josiah Carlson wrote: > ...in the cases I have seen ... you _always_ get > them in RGBA order. Except when you don't. I've had cases where I've had to convert between RGBA and BGRA (for stuffing directly into a frame buffer on Linux, as far as I remember).
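[The conversion Greg describes is mechanically trivial — it is *knowing it is needed* that requires the format description. A sketch added here for illustration; the 4-byte pixel layout is an assumption.]

```python
# With only "an array of 4-byte pixels" you cannot tell RGBA from BGRA;
# converting between them is a byte swap the consumer must know to do.
def rgba_to_bgra(buf):
    """Swap the R and B bytes of every 4-byte pixel, in place."""
    for i in range(0, len(buf), 4):
        buf[i], buf[i + 2] = buf[i + 2], buf[i]

px = bytearray([1, 2, 3, 4])  # one pixel: R=1, G=2, B=3, A=4
rgba_to_bgra(px)              # now the B byte (3) comes first
```

The same swap converts back, so applying it twice is a no-op — which is also why getting the channel order wrong silently produces plausible-looking but wrong images.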
So it may be worth including some features in the standard for describing pixel formats. Pygame seems to have a very detailed and flexible system for doing this, so it might be a good idea to have a look at that. -- Greg From skip at pobox.com Mon Oct 30 01:45:42 2006 From: skip at pobox.com (skip at pobox.com) Date: Sun, 29 Oct 2006 18:45:42 -0600 Subject: [Python-Dev] test_codecs failures Message-ID: <17733.19254.732407.478451@montanaro.dyndns.org> I recently began running a Pybots buildslave for SQLAlchemy. I am still struggling to get that working correctly. Today, Python's test_codecs test began failing: test test_codecs failed -- Traceback (most recent call last): File "/Library/Buildbot/pybot/trunk.montanaro-g5/build/Lib/test/test_codecs.py", line 1165, in test_basics encoder = codecs.getincrementalencoder(encoding)("ignore") File "/Library/Buildbot/pybot/trunk.montanaro-g5/build/Lib/encodings/bz2_codec.py", line 56, in __init__ assert errors == 'strict' AssertionError This failure seems to coincide with some checkins by Georg. Full output here: http://www.python.org/dev/buildbot/community/all/?show=g5%20OSX%20trunk&show=g5%20OSX%202.5 Skip From nnorwitz at gmail.com Mon Oct 30 01:48:51 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Sun, 29 Oct 2006 16:48:51 -0800 Subject: [Python-Dev] test_codecs failures In-Reply-To: <17733.19254.732407.478451@montanaro.dyndns.org> References: <17733.19254.732407.478451@montanaro.dyndns.org> Message-ID: On 10/29/06, skip at pobox.com wrote: > I recently began running a Pybots buildslave for SQLAlchemy. I am still > struggling to get that working correctly. Today, Python's test_codecs test > began failing: I checked in a fix for this that hasn't quite completed yet. (Only finished successfully on one box so far.) So this should be taken care of. I *think* the fix was correct, but I'm not entirely positive. Also the refleak problem is fixed AFAIK. 
n From walter at livinglogic.de Mon Oct 30 01:51:38 2006 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Mon, 30 Oct 2006 01:51:38 +0100 Subject: [Python-Dev] test_codecs failures In-Reply-To: References: <17733.19254.732407.478451@montanaro.dyndns.org> Message-ID: <45454C9A.4040903@livinglogic.de> Neal Norwitz wrote: > On 10/29/06, skip at pobox.com wrote: >> I recently began running a Pybots buildslave for SQLAlchemy. I am still >> struggling to get that working correctly. Today, Python's test_codecs test >> began failing: > > I checked in a fix for this that hasn't quite completed yet. (Only > finished successfully on one box so far.) So this should be taken > care of. I *think* the fix was correct, but I'm not entirely > positive. The fix *is* indeed correct. bz2 didn't get built on my box, so I didn't see the failure. Servus, Walter From ncoghlan at gmail.com Mon Oct 30 11:00:42 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 30 Oct 2006 20:00:42 +1000 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: Message-ID: <4545CD4A.8040302@gmail.com> Neal Becker wrote: > I have watched numpy with interest for a long time. My own interest is to > possibly use the c-api to wrap c++ algorithms to use from python. > > One thing that has concerned me, and continues to concern me with this > proposal, is that it seems to suffer from a very fat interface. I > certainly have not studied the options in any depth, but my gut feeling is > that the interface is too fat and too complex. I wonder if it's possible > to avoid this. I wonder if this is an example of all the methods sinking > to the base class. You've just described my number #1 concern with incorporating NumPy wholesale, and the reason I believe it would be nice to cherry-pick a couple of key components for the standard library, rather than adopting the whole thing. 
Travis has done a lot of work towards that goal (the latest result of which is this pre-PEP for describing the individual array elements in a way that is more flexible than the single character codes of the current array module). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From Jack.Jansen at cwi.nl Mon Oct 30 16:40:19 2006 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Mon, 30 Oct 2006 16:40:19 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: Message-ID: Would it be possible to make the data-type objects subclassable, with the subclasses being able to override the equality test? The range of data types that you've specified in the PEP are good enough for most general use, and probably for NumPy as well, but someone already came up with the example of image formats, which have their whole own range of data formats. I could throw in audio formats (bits per sample, excess-N or signed or ulaw samples, mono/stereo/5.1/ etc, order of the channels), and there's probably a whole slew of other areas that have their own sets of formats. If the datatype objects are subclassable, modules could initially start by adding their own formats. So, the "jackaudio" and "jillaudio" modules would have distinct sets of formats. But then later on it should be fairly easy for them to recognize each others formats. So, jackaudio would recognize the jillaudio format "msdos linear pcm" as being identical to its own "16-bit excess-32768". Hopefully eventually all audio module writers would get together and define a set of standard audio formats. -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/python-dev/attachments/20061030/7ae95590/attachment.htm -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 2255 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061030/7ae95590/attachment.bin From deets at web.de Mon Oct 30 18:19:19 2006 From: deets at web.de (Diez B. Roggisch) Date: Mon, 30 Oct 2006 18:19:19 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <20061029120129.0B1E.JCARLSON@uci.edu> References: <20061029101933.0B0E.JCARLSON@uci.edu> <454502E5.80202@v.loewis.de> <20061029120129.0B1E.JCARLSON@uci.edu> Message-ID: <200610301819.20117.deets@web.de> > ...in the cases I have seen, which includes BMP, TGA, uncompressed TIFF, > a handful of platform-specific bitmap formats, etc., you _always_ get > them in RGBA order. If the alpha channel is to be left out, then you > get them as RGB. Mac OS X unfortunately uses ARGB. Writing some AltiVec code remedied that for passing it around to the OpenCV library. Just my $.02 Diez From jimjjewett at gmail.com Mon Oct 30 18:56:18 2006 From: jimjjewett at gmail.com (Jim Jewett) Date: Mon, 30 Oct 2006 12:56:18 -0500 Subject: [Python-Dev] PEP: Adding data-type objects to Python Message-ID: Travis E. Oliphant wrote: > Two packages need to share a chunk of memory (the package authors do not > know each other and only have and Python as a common reference). They > both want to describe that the memory they are sharing has some > underlying binary structure. As a quick sanity check, please tell me where I went off track. it sounds to me like you are assuming that: (1) The memory chunk represents a single object (probably an array of some sort) (2) That subchunks can themselves be described by a (single?) repeating C struct. (3) You can't just use the C header, since you want this at run-time.
(4) It would be enough if you could say This is an array of 500 elements that look like struct { int simple; struct nested { char name[30]; char addr[45]; int amount; } } (5) But is it not acceptable to use Martin's suggested ctypes equivalent of (building out from the inside): class nested(Structure): _fields_ = [("name", c_char*30), ("addr", c_char*45), ("amount", c_long)] class struct(Structure): _fields_ = [("simple", c_int), ("nested", nested)] struct * 500 If I misunderstood, could you show me where? If I did understand correctly, could you expand on why (5) is unacceptable, given that ctypes is now in the core? (New and unknown, I would understand -- but that is also true of any datatype proposal, for the people who haven't already used it. I suspect that any differences from Numpy would be a source of pain for those who *have* used Numpy, but following Numpy exactly is ... not much simpler than the above.) Or are you just saying that "anything with a buffer interface should also have a datatype object describing the layout in a standard way"? If so, that makes sense, but I'm inclined to prefer the ctypes way, so that most people won't ever have to worry about things like endianness/strides/Fortran layout. -jJ From oliphant.travis at ieee.org Mon Oct 30 22:26:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon, 30 Oct 2006 14:26:02 -0700 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <454502E5.80202@v.loewis.de> References: <45447E0E.6050106@v.loewis.de> <79990c6b0610290900y66f7756eqbc9000dfab1c010@mail.gmail.com> <20061029101933.0B0E.JCARLSON@uci.edu> <454502E5.80202@v.loewis.de> Message-ID: Martin v.
Löwis wrote: > Josiah Carlson schrieb: > >>One could also toss wxPython, VTK, or any one of the other GUI libraries >>into the mix for visualizing those images, of which wxPython just >>acquired no-copy display of PIL images, and being able to manipulate >>them with numpy (of which some wxPython built in classes use numpy to >>speed up manipulation) would be very useful. > > > I'm doubtful that this PEP alone would allow zero-copy sharing of images > for display. Often, the libraries need the data in a different format. > So they need to copy, even if they could understand the other format. > However, the PEP won't allow "understanding" the format. If I know I > have an array of 4-byte values: which of them is R, G, B, and A? > You give a name to the fields: 'R', 'G', 'B', and 'A'. -Travis From oliphant.travis at ieee.org Mon Oct 30 22:44:22 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon, 30 Oct 2006 14:44:22 -0700 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: Message-ID: Jim Jewett wrote: > Travis E. Oliphant wrote: > > >>Two packages need to share a chunk of memory (the package authors do not >>know each other and only have and Python as a common reference). They >>both want to describe that the memory they are sharing has some >>underlying binary structure. > > > As a quick sanity check, please tell me where I went off track. > > it sounds to me like you are assuming that: > > (1) The memory chunk represents a single object (probably an array of > some sort) > (2) That subchunks can themselves be described by a (single?) > repeating C struct. > (3) You can't just use the C header, since you want this at run-time. > (4) It would be enough if you could say > > This is an array of 500 elements that look like > > struct { > int simple; > struct nested { > char name[30]; > char addr[45]; > int amount; > } > Sure. I think that's pretty much it.
I assume you mean object in the general sense and not as in (Python object). > (5) But is it not acceptable to use Martin's suggested ctypes > equivalent of (building out from the inside): Part of the problem is that ctypes uses a lot of different Python types (that's what I mean by "multi-object" to accomplish its goal). What I'm looking for is a single Python type that can be passed around and explains binary data. Remember the buffer protocol is in compiled code. So, as a result, 1) It's harder to construct a class to pass through the protocol using the multiple-types approach of ctypes. 2) It's harder to interpret the object received through the buffer protocol. Sure, it would be *possible* to use ctypes, but I think it would be very difficult. Think about how you would write the get_data_format C function in the extended buffer protocol for NumPy if you had to import ctypes and then build a class just to describe your data. How would you interpret what you get back? The ctypes "format-description" approach is not as unified as a single Python type object that I'm proposing. In NumPy, we have a very nice, compact description of complicated data already available. Why not use what we've learned? I don't think we should just *use ctypes because it's there* when the way it describes binary data was not constructed with the extended buffer protocol in mind. The other option, of course, which would not introduce a new Python type is to use the array interface specification and pass a list of tuples. But, I think this is also un-necessarily wasteful because the sending object has to construct it and the receiving object has to de-construct it. The whole point of the (extended) buffer protocol is to communicate this information more quickly.
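[Travis's "multi-object" point can be seen directly in the interpreter — an aside added here, reusing the field names from Jim's example; nothing in it is part of any proposed API.]

```python
# In ctypes, every layout description is itself a new Python *type*,
# produced by a ctypes metaclass -- the "multi-object" shape Travis is
# contrasting with instances of one data-format type.
import ctypes

class Nested(ctypes.Structure):
    _fields_ = [("name", ctypes.c_char * 30),
                ("addr", ctypes.c_char * 45),
                ("amount", ctypes.c_long)]

class Record(ctypes.Structure):
    _fields_ = [("simple", ctypes.c_int),
                ("nested", Nested)]

ArrayOf500 = Record * 500  # "an array of 500 elements that look like ..."
```

Several new type objects (plus the char-array types) describe one layout; C code receiving such a description through a buffer-protocol call would have to walk Python type objects rather than read one uniform descriptor.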
-Travis From oliphant.travis at ieee.org Mon Oct 30 22:50:34 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon, 30 Oct 2006 14:50:34 -0700 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <4545320C.603@canterbury.ac.nz> References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com> <4543FFA2.30002@canterbury.ac.nz> <4545320C.603@canterbury.ac.nz> Message-ID: Greg Ewing wrote: > Travis E. Oliphant wrote: > > >>Greg Ewing wrote: > > >>>What exactly does "bit" mean in that context? >> >>Do you mean "big" ? > > No, you've got a data type there called "bit", > which seems to imply a size, in contradiction > to the size-independent nature of the other > types. I'm asking what size-independent > information it's meant to convey. Ah. I see what you were saying now. I guess the 'bit' type is different (we actually don't have that type in NumPy so my understanding of it is limited). The 'bit' type re-interprets the size information to be in units of "bits" and so implies a "bit-field" instead of another data-format. -Travis From oliphant.travis at ieee.org Mon Oct 30 23:00:49 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon, 30 Oct 2006 15:00:49 -0700 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <45448058.5020700@v.loewis.de> References: <20061028135415.GA13049@code0.codespeak.net> <45445ECE.9050504@v.loewis.de> <45448058.5020700@v.loewis.de> Message-ID: Martin v. Löwis wrote: > Robert Kern schrieb: > >>>As a unification mechanism, I think it is insufficient. I doubt it >>>can express all the concepts that ctypes supports. >> >>What do you think is missing that can't be added? > > I can factually only report what is missing. Whether it can be added, I don't know. As I just wrote in a few other messages: pointers, unions, function pointers, packed structs, incomplete/recursive types. Also "flexible array members" (i.e. open-ended arrays).
> I understand function pointers, pointers, and unions. Function pointers are "supported" with the void data-type and could be more specifically supported if it were important. People typically don't use the buffer protocol to send function-pointers around in a way that the void description wouldn't be enough. Pointers are also "supported" with the void data-type. If pointers to other data-types were an important feature to support, then this could be added in many ways (a flag on the data-type object for example is how this is done in NumPy). Unions are actually supported (just define two fields with the same offset). I don't know what you mean by "packed structs" (unless you are talking about alignment issues in which case there is support for it). I'm not sure I understand what you mean by "incomplete / recursive" types unless you are referring to something like a node where an element of the structure is a pointer to another structure of the same kind (like used in linked-lists or trees). If that is the case, then it's easily supported once support for pointers is added. I also don't know what you mean by "open-ended arrays." The data-format is meant to describe a fixed-size chunk of data. String syntax is not needed to support all of these things. What I'm asking for and proposing is a way to construct an instance of a single Python type that communicates this data-format information in a standardized way. -Travis From martin at v.loewis.de Tue Oct 31 00:25:08 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 31 Oct 2006 00:25:08 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <20061028135415.GA13049@code0.codespeak.net> <45445ECE.9050504@v.loewis.de> <45448058.5020700@v.loewis.de> Message-ID: <454689D4.9040109@v.loewis.de> Travis Oliphant schrieb: > Function pointers are "supported" with the void data-type and could be > more specifically supported if it were important.
People typically > don't use the buffer protocol to send function-pointers around in a way > that the void description wouldn't be enough. As I said before, I can't tell whether it's important, as I still don't know what the purpose of this PEP is. If it is to support a unification of memory layout specifications, and if that unification is also to include ctypes, then yes, it is important. If it is to describe array elements in NumArray arrays, then it might not be important. For the usage of ctypes, the PEP void type is insufficient to describe function pointers: you also need a specification of the signature of the function pointer (parameter types and return type), or else you can't use the function pointer (i.e. you can't call the function). > Pointers are also "supported" with the void data-type. If pointers to > other data-types were an important feature to support, then this could > be added in many ways (a flag on the data-type object for example is how > this is done in NumPy). For ctypes, (I think) you need "true" pointers to other layouts, or else you couldn't set up the memory correctly. I don't understand how this could work with some extended buffer protocol, though: would a buffer still have to be a contiguous piece of memory? If you have structures with pointers in them, they rarely point to contiguous memory. > Unions are actually supported (just define two fields with the same > offset). Ah, ok. What's the string syntax for it? > I don't know what you mean by "packed structs" (unless you are talking > about alignment issues in which case there is support for it). Yes, this is indeed about alignment; I missed it. What's the string syntax for it? > I'm not sure I understand what you mean by "incomplete / recursive" > types unless you are referring to something like a node where an element > of the structure is a pointer to another structure of the same kind > (like used in linked-lists or trees).
If that is the case, then it's > easily supported once support for pointers is added. That's what I mean, yes. I'm not sure how it can easily be added, though. Suppose you want to describe struct item{ int key; char* value; struct item *next; }; How would you do that? Something like item = datatype([('key', 'i4'), ('value', 'S*'), ('next', 'what_to_put_here*')]) can't work: item hasn't been assigned, yet, so you can't use it as the field type. > I also don't know what you mean by "open-ended arrays." The data-format > is meant to describe a fixed-size chunk of data. I see. In C (and thus in ctypes), you sometimes have what C99 calls "flexible array member": struct PyString{ Py_ssize_t ob_refcnt; PyObject *ob_type; Py_ssize_t ob_len; char ob_sval[]; }; where the ob_sval field can extend arbitrarily, as it is the last member of the struct. Of course, this will give you dynamically-sized objects (objects in C cannot really be "variable-sized", since the size of a memory block has to be defined at allocation time, and can't really change afterwards). > String syntax is not needed to support all of these things. Ok. That's confusing in the PEP: it's not clear whether all these forms are meant to be equivalent, and, if not, which one is the most generic one, and what aspects are missing in what forms. Also, if you have a datatype which cannot be expressed in the string syntax, what is its "str" attribute? Regards, Martin From greg.ewing at canterbury.ac.nz Tue Oct 31 00:36:46 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 31 Oct 2006 12:36:46 +1300 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: Message-ID: <45468C8E.1000203@canterbury.ac.nz> Travis Oliphant wrote: > Part of the problem is that ctypes uses a lot of different Python types > (that's what I mean by "multi-object" to accomplish its goal). What > I'm looking for is a single Python type that can be passed around and > explains binary data.
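[An aside on Martin's struct item question, added here for illustration: ctypes itself resolves such self-references by declaring the incomplete type first and filling in `_fields_` afterwards. A sketch mirroring his C struct:]

```python
import ctypes

class Item(ctypes.Structure):
    pass  # declare the incomplete type first ...

Item._fields_ = [("key", ctypes.c_int),
                 ("value", ctypes.c_char_p),
                 ("next", ctypes.POINTER(Item))]  # ... then refer to it
```

A single-object descriptor would need some analogous forward-reference spelling, which is exactly the gap Martin is pointing at.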
It's not clear that multi-object is a bad thing in and of itself. It makes sense conceptually -- if you have a datatype object representing a struct, and you ask for a description of one of its fields, which could be another struct or array, you would expect to get another datatype object describing that. Can you elaborate on what would be wrong with this? Also, can you clarify whether your objection is to multi-object or multi-type. They're not the same thing -- you could have a data structure built out of multiple objects that are all of the same Python type, with attributes distinguishing between struct, array, etc. That would be single-type but multi-object. -- Greg From greg.ewing at canterbury.ac.nz Tue Oct 31 00:43:11 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 31 Oct 2006 12:43:11 +1300 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com> <4543FFA2.30002@canterbury.ac.nz> <4545320C.603@canterbury.ac.nz> Message-ID: <45468E0F.80000@canterbury.ac.nz> Travis Oliphant wrote: > The 'bit' type re-interprets the size information to be in units of "bits" > and so implies a "bit-field" instead of another data-format. Hmmm, okay, but now you've got another orthogonality problem, because you can't distinguish between e.g. a 5-bit signed int field and a 5-bit unsigned int field. It might be better not to consider "bit" to be a type at all, and come up with another way of indicating that the size is in bits.
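[For comparison — an aside added here, not part of the thread: ctypes keeps exactly this orthogonal by putting signedness in the base type and the bit width in a separate count, so 5-bit signed and 5-bit unsigned fields are distinct.]

```python
import ctypes

class Bits(ctypes.Structure):
    _fields_ = [("s", ctypes.c_int,  5),   # 5-bit signed field
                ("u", ctypes.c_uint, 5)]   # 5-bit unsigned field

b = Bits()
b.s = -1   # sign-extended on read-back: still -1
b.u = 31   # maximum value of an unsigned 5-bit field
```

(The MSB-to-LSB packing question Greg jokes about is real too: ctypes simply inherits whatever the platform C compiler does.)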
Perhaps

  'i4'   # 4-byte signed int
  'i4b'  # 4-bit signed int
  'u4'   # 4-byte unsigned int
  'u4b'  # 4-bit unsigned int

(Next we can have an argument about whether bit fields should be
packed MSB-to-LSB or vice versa... :-)

-- Greg

From greg.ewing at canterbury.ac.nz  Tue Oct 31 00:49:12 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 31 Oct 2006 12:49:12 +1300
Subject: [Python-Dev] PEP: Adding data-type objects to Python
References: <20061028135415.GA13049@code0.codespeak.net>
	<45445ECE.9050504@v.loewis.de> <45448058.5020700@v.loewis.de>
Message-ID: <45468F78.7050707@canterbury.ac.nz>

Travis Oliphant wrote:

> I'm not sure I understand what you mean by "incomplete / recursive"
> types unless you are referring to something like a node where an element
> of the structure is a pointer to another structure of the same kind
> (like used in linked-lists or trees).

Yes, and more complex arrangements of types that reference each other.

> If that is the case, then it's
> easily supported once support for pointers is added.

But it doesn't fit easily into the single-object model.

-- Greg

From oliphant.travis at ieee.org  Tue Oct 31 02:58:28 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Mon, 30 Oct 2006 18:58:28 -0700
Subject: [Python-Dev] PEP: Adding data-type objects to Python
In-Reply-To: <20061028135415.GA13049@code0.codespeak.net>
References: <20061028135415.GA13049@code0.codespeak.net>

Armin Rigo wrote:

> Hi Travis,
>
> On Fri, Oct 27, 2006 at 02:05:31PM -0600, Travis E. Oliphant wrote:
>
>> This PEP proposes adapting the data-type objects from NumPy for
>> inclusion in standard Python, to provide a consistent and standard
>> way to discuss the format of binary data.
>
> How does this compare with ctypes? Do we really need yet another,
> incompatible way to describe C-like data structures in the standard
> library?
There is a lot of subtlety in the details that IMHO clouds the central
issue, which I will try to clarify here the way I see it.

First of all:

In order to make sense of the data-format object that I'm proposing,
you have to see the need to share information about data-format
through an extended buffer protocol (which I will be proposing soon).
I'm not going to try to argue that right now, because there are a lot
of people who can do that.

So, I'm going to assume that you see the need for it. If you don't,
then just suspend concern about that for the moment. There are a lot
of us who really see the need for it.

Now:

To describe data-formats, ctypes uses a Python type-object defined for
every data-format you might need. In my view this is an over-use of
the type-object and in fact, to be useful, requires the definition of
a meta-type that carries the relevant additions to the type-object
that are needed to describe data (like function pointers to get data
in and out of Python objects).

My view is that it is unnecessary to use a different type object to
describe each different data-type. The route I'm proposing is to
define (in C) a *single* new Python type (called a data-format type)
that carries the information needed to describe a chunk of memory. In
this way *instances* of this new type define data-formats. In ctypes,
*instances* of the "meta-type" (i.e. new types) define data-formats
(actually I'm not sure if all the new c-types are derived from the
same meta-type).

So, the big difference is that I think data-formats should be
*instances* of a single type. There is no need to define a Python
type-object for every single data-type. In fact, not only is there no
need, it makes the extended buffer protocol I'm proposing even more
difficult to use and explain.

Again, my real purpose is the extended buffer protocol. This
data-format type is a means to that end.
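The single-type model described here can be sketched in a few lines of
Python. This is purely illustrative -- the `datatype` name and the
string/list argument forms follow the PEP's examples, but the class
below is a toy stand-in, not the proposed C implementation:

```python
# Illustrative toy only: one Python type, 'datatype', whose *instances*
# describe binary layouts -- as opposed to the ctypes model, in which
# every layout is itself a new Python type.

class datatype:
    """A single type; each instance describes one data-format."""

    def __init__(self, spec):
        if isinstance(spec, str):
            # Simple format string, e.g. 'i4' = signed 4-byte integer.
            self.kind = spec[0]
            self.itemsize = int(spec[1:])
            self.fields = None
        else:
            # Record format, e.g. [('key', 'i4'), ('value', 'S8')].
            self.kind = 'V'  # "void"/record kind, NumPy-style
            self.fields = [(name, datatype(fmt)) for name, fmt in spec]
            self.itemsize = sum(dt.itemsize for _, dt in self.fields)

int32 = datatype('i4')
record = datatype([('key', 'i4'), ('value', 'S8')])

# Every data-format is an *instance* of the same Python type:
assert type(int32) is type(record) is datatype
assert record.itemsize == 12
```

The point of the sketch is only the shape of the design: one type,
many instances, nested records described by nested instances.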
If the consensus is that nobody sees a greater use of the data-format
type beyond the buffer protocol, then I will just write one PEP for
the extended buffer protocol.

-Travis

From oliphant.travis at ieee.org  Tue Oct 31 03:00:59 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Mon, 30 Oct 2006 19:00:59 -0700
Subject: [Python-Dev] PEP: Adding data-type objects to Python
In-Reply-To: <45468C8E.1000203@canterbury.ac.nz>
References: <45468C8E.1000203@canterbury.ac.nz>

Greg Ewing wrote:

> Travis Oliphant wrote:
>
>> Part of the problem is that ctypes uses a lot of different Python types
>> (that's what I mean by "multi-object") to accomplish its goal. What
>> I'm looking for is a single Python type that can be passed around and
>> explains binary data.
>
> It's not clear that multi-object is a bad thing in and of itself. It
> makes sense conceptually -- if you have a datatype object representing
> a struct, and you ask for a description of one of its fields, which
> could be another struct or array, you would expect to get another
> datatype object describing that.
>
> Can you elaborate on what would be wrong with this?
>
> Also, can you clarify whether your objection is to multi-object or
> multi-type? They're not the same thing -- you could have a data
> structure built out of multiple objects that are all of the same
> Python type, with attributes distinguishing between struct, array,
> etc. That would be single-type but multi-object.

I've tried to clarify this in another post. Basically, what I don't
like about the ctypes approach is that it is multi-type (every new
data-format is a Python type). In order to talk about all these Python
types together, they must all share some attribute (or else be derived
from a meta-type in C with a specific function-pointer entry). I think
it is simpler to think of a single Python type whose instances convey
information about data-format.
-Travis

From oliphant.travis at ieee.org  Tue Oct 31 06:10:17 2006
From: oliphant.travis at ieee.org (Travis E. Oliphant)
Date: Mon, 30 Oct 2006 22:10:17 -0700
Subject: [Python-Dev] PEP: Adding data-type objects to Python
References: <45468C8E.1000203@canterbury.ac.nz>

Travis Oliphant wrote:

> Greg Ewing wrote:
>> Travis Oliphant wrote:
>>
>>> Part of the problem is that ctypes uses a lot of different Python types
>>> (that's what I mean by "multi-object") to accomplish its goal. What
>>> I'm looking for is a single Python type that can be passed around and
>>> explains binary data.
>>
>> It's not clear that multi-object is a bad thing in and of itself. It
>> makes sense conceptually -- if you have a datatype object representing
>> a struct, and you ask for a description of one of its fields, which
>> could be another struct or array, you would expect to get another
>> datatype object describing that.

Yes, exactly. This is what the Python type I'm proposing does as well.
So, perhaps we are misunderstanding each other. The difference is that
data-types are instances of the data-type (data-format) object instead
of new Python types (as they are in ctypes).

> I've tried to clarify this in another post. Basically, what I don't
> like about the ctypes approach is that it is multi-type (every new
> data-format is a Python type).

I should clarify that I have no opinion about the ctypes approach for
what ctypes does with it. I like ctypes and have adapted NumPy to make
it easier to work with ctypes.

I'm saying that I don't like the idea of forcing this approach on
everybody else who wants to describe arbitrary binary data just
because ctypes is included. Now, if it is shown that it is indeed
better than the simpler instances-of-a-single-type approach that I'm
basically proposing, then I'll be persuaded.
However, the existence of an alternative strategy using a single
Python type and multiple instances of that type to describe binary
data (which is the NumPy approach and essentially the array module
approach) means that we can't just assume a priori that the way ctypes
did it is the only or best way.

The examples of "missing features" that Martin has exposed are not
show-stoppers. They can all be easily handled within the context of
what is being proposed. I can modify the PEP to show this. But, I
don't have the time to spend if it's just all going to be rejected in
the end. I need some encouragement in order to continue to invest
energy in pushing this forward.

-Travis

From oliphant.travis at ieee.org  Tue Oct 31 06:51:18 2006
From: oliphant.travis at ieee.org (Travis E. Oliphant)
Date: Mon, 30 Oct 2006 22:51:18 -0700
Subject: [Python-Dev] PEP: Adding data-type objects to Python
In-Reply-To: <45468E0F.80000@canterbury.ac.nz>
References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com>
	<4543FFA2.30002@canterbury.ac.nz> <4545320C.603@canterbury.ac.nz>
	<45468E0F.80000@canterbury.ac.nz>

Greg Ewing wrote:

> Travis Oliphant wrote:
>
>> The 'bit' type re-interprets the size information to be in units of
>> "bits" and so implies a "bit-field" instead of another data-format.
>
> Hmmm, okay, but now you've got another orthogonality problem, because
> you can't distinguish between e.g. a 5-bit signed int field and a
> 5-bit unsigned int field.

Good point.

> It might be better not to consider "bit" to be a type at all, and
> come up with another way of indicating that the size is in bits.
> Perhaps
>
>   'i4'   # 4-byte signed int
>   'i4b'  # 4-bit signed int
>   'u4'   # 4-byte unsigned int
>   'u4b'  # 4-bit unsigned int

I like this. Very nice. I think that's the right way to look at it.

> (Next we can have an argument about whether bit fields should be
> packed MSB-to-LSB or vice versa... :-)

I guess we need another flag / attribute to indicate that.
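Either suffix convention is trivial to parse mechanically. A rough
sketch of a parser for the 'i4' / 'i4b' form discussed above
(hypothetical -- this syntax was only a proposal at this point, and
the function name is invented):

```python
# Hypothetical parser for the proposed 'i4' / 'i4b' convention:
# a trailing 'b' reinterprets the size as bits rather than bytes.

def parse_format(code):
    kind = code[0]  # 'i' = signed int, 'u' = unsigned int, ...
    if code.endswith('b'):
        size, unit = int(code[1:-1]), 'bits'
    else:
        size, unit = int(code[1:]), 'bytes'
    return kind, size, unit

assert parse_format('i4') == ('i', 4, 'bytes')
assert parse_format('i4b') == ('i', 4, 'bits')
assert parse_format('u4b') == ('u', 4, 'bits')
```

Gareth McCaughan's later 'ib4' variant would move the 'b' next to the
kind letter, which avoids the lookahead on the final character.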
The other thing that needs to be discussed at some point may be a way
to indicate the floating-point format. I've basically punted on this
and just meant 'f' to mean "platform float". Thus, you can't use the
data-type object to pass information between two platforms that don't
share a common floating-point representation.

-Travis

From oliphant.travis at ieee.org  Tue Oct 31 07:12:48 2006
From: oliphant.travis at ieee.org (Travis E. Oliphant)
Date: Mon, 30 Oct 2006 23:12:48 -0700
Subject: [Python-Dev] PEP: Extending the buffer protocol to share array
	information.

Attached is my PEP for extending the buffer protocol to allow array
data to be shared.

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: pep_buffer.txt
Url: http://mail.python.org/pipermail/python-dev/attachments/20061030/90c68b35/attachment.txt

From oliphant.travis at ieee.org  Tue Oct 31 07:32:47 2006
From: oliphant.travis at ieee.org (Travis E. Oliphant)
Date: Mon, 30 Oct 2006 23:32:47 -0700
Subject: [Python-Dev] PEP: Adding data-type objects to Python
In-Reply-To: <4543B016.7070002@egenix.com>
References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com>
	<4543B016.7070002@egenix.com>
Message-ID: <4546EE0F.90000@ieee.org>

M.-A. Lemburg wrote:

> Travis E. Oliphant wrote:
>
> I understand and that's why I'm asking why you made the range
> explicit in the definition.

In the case of NumPy it was so that String and Unicode arrays would
both look like multi-length string "character" arrays and not arrays
of arrays of some character. But, this can change in the data-format
object. I can see that the Unicode description needs to be improved.

> The definition should talk about Unicode code points.
> The number of bytes then determines whether you can only
> represent the ASCII subset (1 byte), UCS2 (2 bytes, BMP only)
> or UCS4 (4 bytes, all currently assigned code points).

Yes, you are correct.
A string of unicode characters should really be represented in the
same way that an array of integers is represented for a data-format
object.

-Travis

From martin at v.loewis.de  Tue Oct 31 08:51:25 2006
From: martin at v.loewis.de (Martin v. Löwis)
Date: Tue, 31 Oct 2006 08:51:25 +0100
Subject: [Python-Dev] PEP: Adding data-type objects to Python
References: <20061028135415.GA13049@code0.codespeak.net>
Message-ID: <4547007D.30404@v.loewis.de>

Travis Oliphant schrieb:

> So, the big difference is that I think data-formats should be
> *instances* of a single type.

This is nearly the case for ctypes as well. All layout descriptions
are instances of the type type. Nearly, because they are instances of
subtypes of the type type:

py> type(ctypes.c_long)
<type '_ctypes.SimpleType'>
py> type(ctypes.c_double)
<type '_ctypes.SimpleType'>
py> type(ctypes.c_double).__bases__
(<type 'type'>,)
py> type(ctypes.Structure)
<type '_ctypes.StructType'>
py> type(ctypes.Array)
<type '_ctypes.ArrayType'>
py> type(ctypes.Structure).__bases__
(<type 'type'>,)
py> type(ctypes.Array).__bases__
(<type 'type'>,)

So if your requirement is "all layout descriptions ought to have the
same type", then this is (nearly) the case: they are instances of type
(rather than datatype, as in your PEP).

Regards,
Martin

From p.f.moore at gmail.com  Tue Oct 31 10:47:08 2006
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 31 Oct 2006 09:47:08 +0000
Subject: [Python-Dev] PEP: Adding data-type objects to Python
References: <20061028135415.GA13049@code0.codespeak.net>
Message-ID: <79990c6b0610310147q74851b19v55e7caab6f87c444@mail.gmail.com>

On 10/31/06, Travis Oliphant wrote:

> In order to make sense of the data-format object that I'm proposing you
> have to see the need to share information about data-format through an
> extended buffer protocol (which I will be proposing soon). I'm not
> going to try to argue that right now because there are a lot of people
> who can do that.
>
> So, I'm going to assume that you see the need for it. If you don't,
> then just suspend concern about that for the moment.
> There are a lot of
> us who really see the need for it.

[...]

> Again, my real purpose is the extended buffer protocol. This
> data-format type is a means to that end. If the consensus is that
> nobody sees a greater use of the data-format type beyond the buffer
> protocol, then I will just write one PEP for the extended buffer
> protocol.

While I don't personally use NumPy, I can see where an extended buffer
protocol like you describe could be advantageous, and so I'm happy to
concede that benefit. I can also vaguely see that a unified "block of
memory description" would be useful.

My interest would be in the area of the struct module (unpacking and
packing data for dumping to byte streams - whether this happens in
place or not is not too important to this use case). However, I cannot
see how your proposal would help here in practice - does it include
the functionality of the struct module (or should it)? If so, then I'd
like to see examples of equivalent constructs. If not, then isn't it
yet another variation on the theme, adding to the problem of multiple
approaches rather than helping?

I can also see the parallels with ctypes. Here I feel a little less
sure that keeping the two approaches is wrong. I don't know why I feel
like that - maybe nothing more than familiarity with ctypes - but I
don't have the same reluctance to have both the ctypes data definition
stuff and the new datatype proposal.

Enough of the abstract. As a concrete example, suppose I have a (byte)
string in my program containing some binary data - an ID3 header, or a
TCP packet, or whatever. It doesn't really matter. Does your proposal
offer anything to me in how I might manipulate that data (assuming I'm
not using NumPy)? (I'm not insisting that it should, I'm just trying
to understand the scope of the PEP).

Paul.
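For comparison, Paul's concrete use case is what the struct module
already handles today, so any datatype-based equivalent would need to
cover at least this much. A small sketch (the 10-byte ID3v2 header
layout is real; decoding the sync-safe size field is deliberately
elided, and the sample bytes are invented):

```python
import struct

# An ID3v2 tag header: 3-byte identifier, two version bytes, one flags
# byte, and a 4-byte big-endian size field (sync-safe decoding elided).
header = b'ID3\x03\x00\x00\x00\x00\x02\x01'

# '>' = big-endian, no padding; '3s' = 3 raw bytes; 'B' = unsigned
# byte; 'I' = unsigned 4-byte int.
tag, major, minor, flags, size = struct.unpack('>3sBBBI', header)
assert tag == b'ID3'
assert (major, minor, flags) == (3, 0, 0)
assert size == 513  # 0x00000201
```

A datatype-style spec such as `[('tag', 'S3'), ('major', 'u1'), ...]`
would carry the same information as the format string `'>3sBBBI'`,
with field names attached.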
From ncoghlan at gmail.com  Tue Oct 31 13:44:26 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 31 Oct 2006 22:44:26 +1000
Subject: [Python-Dev] PEP: Adding data-type objects to Python
References: <45468C8E.1000203@canterbury.ac.nz>
Message-ID: <4547452A.5040501@gmail.com>

Travis E. Oliphant wrote:

> However, the existence of an alternative strategy using a single Python
> type and multiple instances of that type to describe binary data (which
> is the NumPy approach and essentially the array module approach) means
> that we can't just assume a priori that the way ctypes did it is the
> only or best way.

As a hypothetical, what if there was a helper function that translated
a description of a data structure using basic strings and sequences
(along the lines of what you have in your PEP) into a ctypes data
structure?

> The examples of "missing features" that Martin has exposed are not
> show-stoppers. They can all be easily handled within the context of
> what is being proposed. I can modify the PEP to show this. But, I
> don't have the time to spend if it's just all going to be rejected in
> the end. I need some encouragement in order to continue to invest
> energy in pushing this forward.

I think the most important thing in your PEP is the formats for
describing structures in a way that is easy to construct in both C and
Python (specifically, by using strings and sequences), and it is worth
pursuing for that aspect alone.

Whether that datatype is then implemented as a class in its own right
or as a factory function that returns a ctypes data type object is, to
my mind, a relatively minor implementation issue (either way has
questions to be addressed - I'm not sure how you tell ctypes that you
have a 32-bit integer with a non-native endian format, for example).
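Such a helper is straightforward to sketch against the real ctypes
API. The format codes and the helper's name here are invented for
illustration, and endianness and bit fields are deliberately ignored:

```python
import ctypes

# Map a few PEP-style codes onto real ctypes types (illustrative
# subset only; endian handling, strings, bit fields, etc. elided).
_SIMPLE = {'i2': ctypes.c_int16, 'i4': ctypes.c_int32,
           'u4': ctypes.c_uint32, 'f8': ctypes.c_double}

def to_ctypes(spec, name='anon'):
    """Translate [('field', 'code'), ...] into a ctypes Structure."""
    if isinstance(spec, str):
        return _SIMPLE[spec]
    fields = [(fname, to_ctypes(fmt)) for fname, fmt in spec]
    # Build the Structure subclass dynamically with three-arg type().
    return type(name, (ctypes.Structure,), {'_fields_': fields})

Point = to_ctypes([('x', 'i4'), ('y', 'i4')], 'Point')
assert ctypes.sizeof(Point) == 8
p = Point(3, 4)
assert (p.x, p.y) == (3, 4)
```

This direction (strings/sequences in, ctypes types out) leaves the
exchange format itself free of any dependency on ctypes.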
In fact, it may make sense to just use the lists/strings directly as
the data exchange format definitions, and let the various libraries do
their own translation into their private format descriptions instead
of creating a new one-type-to-describe-them-all.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org

From gmccaughan at synaptics-uk.com  Tue Oct 31 11:56:50 2006
From: gmccaughan at synaptics-uk.com (Gareth McCaughan)
Date: Tue, 31 Oct 2006 11:56:50 +0100
Subject: [Python-Dev] PEP: Adding data-type objects to Python
References: <45468E0F.80000@canterbury.ac.nz>
Message-ID: <200610311056.51070.gmccaughan@synaptics-uk.com>

> > It might be better not to consider "bit" to be a type at all, and
> > come up with another way of indicating that the size is in bits.
> > Perhaps
> >
> >   'i4'   # 4-byte signed int
> >   'i4b'  # 4-bit signed int
> >   'u4'   # 4-byte unsigned int
> >   'u4b'  # 4-bit unsigned int
>
> I like this. Very nice. I think that's the right way to look at it.

I remark that 'ib4' and 'ub4' make for marginally easier parsing and
less danger of ambiguity.

-- g

From mcherm at mcherm.com  Tue Oct 31 14:26:35 2006
From: mcherm at mcherm.com (Michael Chermside)
Date: Tue, 31 Oct 2006 05:26:35 -0800
Subject: [Python-Dev] PEP: Adding data-type objects to Python
Message-ID: <20061031052635.315p3rnhb4cg4kws@login.werra.lunarpages.com>

In this email I'm responding to a series of emails from Travis pretty
much in the order I read them:

Travis Oliphant writes:

> I'm saying we should introduce a single-object mechanism for
> describing binary data so that the many-object approach of c-types
> does not become some kind of de-facto standard. C-types can
> "translate" this object-instance to its internals if and when it
> needs to.
> In the mean-time, how are other packages supposed to communicate
> binary information about data with each other?

Here we disagree.

I haven't used C-types. I have no idea whether it is well-designed or
horribly unusable. So if someone wanted to argue that C-types is a
mistake and should be thrown out, I'd be willing to listen. Until
someone tries to make that argument, I'm presuming it's good enough to
be part of the standard library for Python. Given that, I think that
it *SHOULD* become a de-facto standard. I think that the way different
packages should communicate binary information about data with each
other is using C-types. Not because it's wonderful (remember, I've
never used it), but because it's STANDARD. There should be one obvious
way to do things! When there is, it makes interoperability WAY easier,
and interoperability is the main objective when dealing with things
like binary data formats.

Propose using C-types. Or propose *improving* C-types. But don't
propose ignoring it.

In a different message, he writes:

> It also bothers me that so many ways to describe binary data are
> being used out there. This is a problem that deserves being solved.
> And, no, ctypes hasn't solved it (we can't directly use the ctypes
> solution).

Really? Why? Is this a failing in C-types? Can C-types be "fixed"?

Later he explains:

> Remember the buffer protocol is in compiled code. So, as a result,
>
> 1) It's harder to construct a class to pass through the protocol
> using the multiple-types approach of ctypes.
>
> 2) It's harder to interpret the object received through the buffer
> protocol.
>
> Sure, it would be *possible* to use ctypes, but I think it would be
> very difficult. Think about how you would write the get_data_format
> C function in the extended buffer protocol for NumPy if you had to
> import ctypes and then build a class just to describe your data.
> How would you interpret what you get back?

Aha!
So what you REALLY ought to be asking for is a C interface to the
ctypes module. That seems like a very sensible and reasonable request.

> I don't think we should just *use ctypes because it's there* when
> the way it describes binary data was not constructed with the
> extended buffer protocol in mind.

I just disagree. (1) I *DO* think we should "just use ctypes because
it's there". After all, the problem we're trying to solve is one of
COMPATIBILITY - you don't solve such problems by introducing competing
standards. (2) From what I understand of it, I think ctypes is quite
capable of describing data to be accessed via the buffer protocol.

In another email:

> In order to make sense of the data-format object that I'm proposing
> you have to see the need to share information about data-format
> through an extended buffer protocol (which I will be proposing
> soon). I'm not going to try to argue that right now because there
> are a lot of people who can do that.

Actually, no need to convince me... I am already convinced of the
wisdom of this approach.

> My view is that it is unnecessary to use a different type object to
> describe each different data-type.
[...]
> So, the big difference is that I think data-formats should be
> *instances* of a single type.

Why? Who cares? Seriously, if we were proposing to describe the
layouts with a collection of rubber bands and potato chips, I'd say it
was a crazy idea. But we're proposing using data structures in
computer memory. Why does it matter whether those data structures are
of the same "python type" or different "python types"? I care whether
the structure can be created, passed around, and interrogated. I don't
care what Python type they are.

> I'm saying that I don't like the idea of forcing this approach on
> everybody else who wants to describe arbitrary binary data just
> because ctypes is included.

And I'm saying that I *do*.
Hey, if someone proposed getting rid of the current syntax for the
array module (for Py3K) and replacing it with use of ctypes, I'd give
it serious consideration. There should be only one way to describe
binary structures. It should be powerful enough to describe almost any
structure, easy to use, and most of all it should be used consistently
everywhere.

> I need some encouragement in order to continue to invest energy in
> pushing this forward.

Please keep up the good work! Some day I'd like to see NumPy built
into the standard Python distribution. The incremental, PEP-by-PEP
approach you are taking is the best route to getting there. But there
may be some changes along the way -- convergence with ctypes may be
one of those.

-------------

Look, my advice is to try to make ctypes work for you. Not having any
easy way to construct or to interrogate ctypes objects from C is a
legitimate complaint... and if you can define your requirements, it
should be relatively easy to add a C interface to meet those needs.

-- Michael Chermside

From oliphant.travis at ieee.org  Tue Oct 31 16:32:39 2006
From: oliphant.travis at ieee.org (Travis E. Oliphant)
Date: Tue, 31 Oct 2006 08:32:39 -0700
Subject: [Python-Dev] PEP: Adding data-type objects to Python
In-Reply-To: <20061031052635.315p3rnhb4cg4kws@login.werra.lunarpages.com>
References: <20061031052635.315p3rnhb4cg4kws@login.werra.lunarpages.com>

Michael Chermside wrote:

> In this email I'm responding to a series of emails from Travis
> pretty much in the order I read them:
>
>> In the mean-time, how are other packages supposed to communicate
>> binary information about data with each other?
>
> Here we disagree.
>
> I haven't used C-types. I have no idea whether it is well-designed or
> horribly unusable. So if someone wanted to argue that C-types is a
> mistake and should be thrown out, I'd be willing to listen.
> Until > someone tries to make that argument, I'm presuming it's good enough to > be part of the standard library for Python. My problem with this argument is two fold: 1) I'm not sure you really know what your talking about since you apparently haven't used either ctypes or NumPy (I've used both and so forgive me if I claim to understand the strengths of the data-format representations that each uses a bit better). Therefore, it's hard for me to take your opinion seriously. I will try though. I understand you have a preference for not wildly expanding the ways to do similar things. I share that preference with you. 2) You are assuming that because it's good enough for the standard library means that the way they describe data-formats (using a separate Python type for each one) is the *one true way*. When was this discussed? Frankly it's a weak argument because the struct module has been around for a lot longer. Why didn't the ctypes module follow that standard? Or the standard that's in the array module for describing data-types. That's been there for a long time too. Why wasn't ctypes forced to use that approach? The reason it wasn't is because it made sense for ctypes to use a separate type for each data-format object so that you could call C-functions as if they were Python functions. If this is your goal, then it seems like a good idea (though not strictly necessary) to use a separate Python type for each data-format. But, there are distinct disadvantages to this approach compared to what I'm trying to allow. Martin claims that the ctypes approach is *basically* equivalent but this is just not true. It could be made more true if the ctypes objects inherited from a "meta-type" and if Python allowed meta-types to expand their C-structures. But, last I checked this is not possible. A Python type object is a very particular kind of Python-type. 
As far as I can tell, it's not as flexible in terms of the kinds of
things you can do with the "instances" of a type object (i.e. what
ctypes types are) on the C level.

The other disadvantage of what you are describing is: who is going to
write the code? I'm happy to have the data-format object live separate
from ctypes and leave it to the ctypes author(s) to support it if
desired. But, the claim that the extended buffer protocol should jump
through all kinds of hoops to conform to the "ctypes standard", when
that "standard" was designed with a different idea in mind, is not
acceptable.

Ctypes has only been in Python since 2.5, and the array interface was
around before that. Numeric has been around longer than ctypes. The
array module and the struct module in Python have also both been
around longer than ctypes. Where is the discussion that crowned the
ctypes way of doing things as "the one true way"?

> In a different message, he writes:
>
>> It also bothers me that so many ways to describe binary data are
>> being used out there. This is a problem that deserves being solved.
>> And, no, ctypes hasn't solved it (we can't directly use the ctypes
>> solution).
>
> Really? Why? Is this a failing in C-types? Can C-types be "fixed"?

You can't grow C-function pointers onto an existing type object. You
are also carrying around a lot of weight in the Python type object
that is unnecessary if all you are doing is describing data.

> I just disagree. (1) I *DO* think we should "just use ctypes because
> it's there". After all, the problem we're trying to solve is one of
> COMPATIBILITY - you don't solve such problems by introducing
> competing standards. (2) From what I understand of it, I think
> ctypes is quite capable of describing data to be accessed via the
> buffer protocol.

Capable, but not supporting all the things I'm talking about. The
ctypes objects don't have any of the methods or attributes (or
C function pointers) that I've described.
Nor should they necessarily grow them.

> Why? Who cares? Seriously, if we were proposing to describe the
> layouts with a collection of rubber bands and potato chips, I'd say
> it was a crazy idea. But we're proposing using data structures in
> computer memory. Why does it matter whether those data structures
> are of the same "python type" or different "python types"? I care
> whether the structure can be created, passed around, and
> interrogated. I don't care what Python type they are.

Sure, but the flexibility you have with an instance of a Python type
is different than when that instance must itself also be a Python
type. It *is* different. This is quite noticeable in C especially.

>> I'm saying that I don't like the idea of forcing this approach on
>> everybody else who wants to describe arbitrary binary data just
>> because ctypes is included.
>
> And I'm saying that I *do*. Hey, if someone proposed getting rid of
> the current syntax for the array module (for Py3K) and replacing it
> with use of ctypes, I'd give it serious consideration. There should
> be only one way to describe binary structures. It should be powerful
> enough to describe almost any structure, easy to use, and most of
> all it should be used consistently everywhere.

I'm not opposed to convergence, but ctypes must be willing to come to
us too. Its development of a "standard" was not done with the array
interface in mind, so why should it be surprising that it does not
fill the need for us?

-Travis

From oliphant.travis at ieee.org  Tue Oct 31 17:54:12 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Tue, 31 Oct 2006 09:54:12 -0700
Subject: [Python-Dev] PEP: Adding data-type objects to Python
In-Reply-To: <4547007D.30404@v.loewis.de>
References: <20061028135415.GA13049@code0.codespeak.net>
	<4547007D.30404@v.loewis.de>

Martin v.
Löwis wrote:

> Travis Oliphant schrieb:
>
>> So, the big difference is that I think data-formats should be
>> *instances* of a single type.
>
> This is nearly the case for ctypes as well. All layout descriptions
> are instances of the type type. Nearly, because they are instances
> of subtypes of the type type:
>
> py> type(ctypes.c_long)
> <type '_ctypes.SimpleType'>
> py> type(ctypes.c_double)
> <type '_ctypes.SimpleType'>
> py> type(ctypes.c_double).__bases__
> (<type 'type'>,)
> py> type(ctypes.Structure)
> <type '_ctypes.StructType'>
> py> type(ctypes.Array)
> <type '_ctypes.ArrayType'>
> py> type(ctypes.Structure).__bases__
> (<type 'type'>,)
> py> type(ctypes.Array).__bases__
> (<type 'type'>,)
>
> So if your requirement is "all layout descriptions ought to have
> the same type", then this is (nearly) the case: they are instances
> of type (rather than datatype, as in your PEP).

The big difference, however, is that by going this route you are
forced to use the "type object" as your data-format "instance". This
is fitting a square peg into a round hole, in my opinion. To really be
useful, you would need to add the attributes and (most importantly)
C-function pointers and C-structure members to these type objects. I
don't even think that is possible in Python (even if you do create a
meta-type that all the c-type type objects can use that carries the
same information).

There are a few people claiming I should use the ctypes type
hierarchy, but nobody has explained how that would be possible given
the attributes, C-structure members and C-function pointers that I'm
proposing.

In NumPy we also have a Python type for each basic data-format (we
call them array scalars). For a little while they carried the
data-format information on the Python side. This turned out to be not
flexible enough. So, we expanded the PyArray_Descr * structure, which
has always been a part of Numeric (and the array module array type),
into an actual Python type, and a lot of things became possible. It
was clear to me that we were "on to something".
Now, the biggest claim against the gist of what I'm proposing (details we can argue about) seems from my perspective to be a desire to "go backwards" and carry data-type information around with a Python type. The data-type object did not just appear out of thin air one day. It really can be seen as an evolution from the beginnings of Numeric (and the Python array module). So, this is what we came up with in the NumPy world. Ctypes came up with something a bit different. It is not "trivial" to "just use ctypes." I could say the same thing and tell ctypes to just use NumPy's data-type object. It could be done that way, but of course it would take a bit of work on the part of ctypes to make that happen. Having ctypes in the standard library does not mean that any other discussion of how data-format should be represented has been decided on. If I had known that was what it meant to put ctypes in the standard library, I would have been more vocal several months ago. -Travis From oliphant.travis at ieee.org Tue Oct 31 18:13:39 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue, 31 Oct 2006 10:13:39 -0700 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <4547452A.5040501@gmail.com> References: <45468C8E.1000203@canterbury.ac.nz> <4547452A.5040501@gmail.com> Message-ID: Nick Coghlan wrote: > Travis E. Oliphant wrote: > >>However, the existence of an alternative strategy using a single Python >>type and multiple instances of that type to describe binary data (which >>is the NumPy approach and essentially the array module approach) means >>that we can't just a-priori assume that the way ctypes did it is the >>only or best way. > > > As a hypothetical, what if there was a helper function that translated a > description of a data structure using basic strings and sequences (along the > lines of what you have in your PEP) into a ctypes data structure? > That would be fine and useful in fact.
I don't see how it helps the problem of "what to pass through the buffer protocol." I see passing ctypes type objects around on the C-level as an unnecessary and burdensome approach unless the ctypes objects were significantly enhanced. > > In fact, it may make sense to just use the lists/strings directly as the data > exchange format definitions, and let the various libraries do their own > translation into their private format descriptions instead of creating a new > one-type-to-describe-them-all. Yes, I'm open to this possibility. I basically want two things in the object passed through the extended buffer protocol: 1) It's fast on the C-level 2) It covers all the use-cases. If just a particular string or list structure were passed, then I would drop the data-format PEP and just have the dataformat argument of the extended buffer protocol be that thing. Then, something that converts ctypes objects to that special format would be very nice indeed. -Travis From martin at v.loewis.de Tue Oct 31 18:27:18 2006 From: martin at v.loewis.de ("Martin v. Löwis") Date: Tue, 31 Oct 2006 18:27:18 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <20061031052635.315p3rnhb4cg4kws@login.werra.lunarpages.com> Message-ID: <45478776.3040607@v.loewis.de> Travis E. Oliphant schrieb: > But, there are distinct disadvantages to this approach compared to what > I'm trying to allow. Martin claims that the ctypes approach is > *basically* equivalent but this is just not true. I may claim that, but primarily, my goal was to demonstrate that the proposed PEP cannot be used to describe ctypes object layouts (without checking, I can readily believe that the PEP covers everything in the array and struct modules). > It could be made more > true if the ctypes objects inherited from a "meta-type" and if Python > allowed meta-types to expand their C-structures. But, last I checked > this is not possible. That I don't understand.
a) what do you think is not possible? b) why is that an important difference between a datatype and a ctype? If you are suggesting that, given two Python types A and B, and B inheriting from A, that the memory layout of B cannot extend the memory layout of A, then: that is certainly possible in Python, and there are many examples for it. > A Python type object is a very particular kind of Python-type. As far > as I can tell, it's not as flexible in terms of the kinds of things you > can do with the "instances" of a type object (i.e. what ctypes types > are) on the C-level. Ah, you are worried that NumArray objects would have to be *instances* of ctypes types. That wouldn't be necessary at all. Instead, if each NumArray object had a method get_ctype(), which returned a ctypes type, then you would get the same descriptiveness that you get with the PEP's datatype. > I'm happy to have the data-format object live separate from ctypes and > leave it to the ctypes author(s) to support it if desired. But, the > claim that the extended buffer protocol must jump through all kinds of hoops > to conform to the "ctypes standard" when that "standard" was designed > with a different idea in mind is not acceptable. That, of course, is a reasoning I can understand. This is free software, contributors can choose to contribute whatever they want; you can't force anybody to do anything specific you want to get done. Acceptance of any PEP (not just this PEP) should always be contingent on the availability of a patch implementing it. > Where is the discussion that crowned the ctypes way of doing things as > "the one true way"? It hasn't been crowned this way. Me, personally, I just said two things about this PEP and ctypes: a) the PEP does not support all concepts that ctypes needs b) ctypes can express all examples in the PEP in response to your proposal that ctypes should adopt the PEP, and that ctypes is not good enough to be the one true way.
Regards, Martin From andorxor at gmx.de Tue Oct 31 18:31:30 2006 From: andorxor at gmx.de (Stephan Tolksdorf) Date: Tue, 31 Oct 2006 18:31:30 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <454689D4.9040109@v.loewis.de> References: <20061028135415.GA13049@code0.codespeak.net> <45445ECE.9050504@v.loewis.de> <45448058.5020700@v.loewis.de> <454689D4.9040109@v.loewis.de> Message-ID: <45478872.2010906@gmx.de> Martin v. Löwis wrote: > Travis Oliphant schrieb: >> Function pointers are "supported" with the void data-type and could be >> more specifically supported if it were important. People typically >> don't use the buffer protocol to send function-pointers around in a way >> that the void description wouldn't be enough. > > As I said before, I can't tell whether it's important, as I still don't > know what the purpose of this PEP is. If it is to support a unification > of memory layout specifications, and if that unification is also to > include ctypes, then yes, it is important. If it is to describe array > elements in NumArray arrays, then it might not be important. > > For the usage of ctypes, the PEP void type is insufficient to describe > function pointers: you also need a specification of the signature of > the function pointer (parameter types and return type), or else you > can't use the function pointer (i.e. you can't call the function). The buffer protocol is primarily meant for describing the format of (large) contiguous pieces of binary data. In most cases that will be all kinds of numerical data for scientific applications, image and other media data, simple databases and similar kinds of data. There is currently no adequate data format type which sufficiently supports these applications, otherwise Travis wouldn't make this proposal. While Travis' proposal encompasses the data format functionality within the struct module and overlaps with what ctypes has to offer, it does not aim to replace ctypes.
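Stephan's point about the limits of the existing tools can be illustrated with the struct module: a format string can describe a flat record, but it carries none of the field names, nesting, or array shape that the proposed data-format object is meant to convey. A small sketch:

```python
import struct

fmt = '<id'                       # little-endian: int32 followed by float64
buf = struct.pack(fmt, 42, 2.5)
assert struct.calcsize(fmt) == 12
assert struct.unpack(fmt, buf) == (42, 2.5)
# nothing in '<id' says which field is which, or whether the record
# repeats as an array -- that metadata lives outside the format string
```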
I don't think that a basic data format type necessarily should be able to encode all the information a foreign function interface needs to call a code library. From my point of view, that kind of information is one abstraction layer above a basic data format and should be implemented as an extension of or complementary to the basic data format. I also do not understand why the data format type should attempt to fully describe arbitrarily complex data formats, like fragmented (non-continuous) data structures in memory. You'd probably need a full programming language for that anyway. Regards, Stephan From theller at ctypes.org Tue Oct 31 18:38:01 2006 From: theller at ctypes.org (Thomas Heller) Date: Tue, 31 Oct 2006 18:38:01 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <45468C8E.1000203@canterbury.ac.nz> Message-ID: <454789F9.7050808@ctypes.org> Travis Oliphant schrieb: > Greg Ewing wrote: >> Travis Oliphant wrote: >> >> >>>Part of the problem is that ctypes uses a lot of different Python types >>>(that's what I mean by "multi-object" to accomplish it's goal). What >>>I'm looking for is a single Python type that can be passed around and >>>explains binary data. >> >> >> It's not clear that multi-object is a bad thing in and >> of itself. It makes sense conceptually -- if you have >> a datatype object representing a struct, and you ask >> for a description of one of its fields, which could >> be another struct or array, you would expect to get >> another datatype object describing that. >> >> Can you elaborate on what would be wrong with this? >> >> Also, can you clarify whether your objection is to >> multi-object or multi-type. They're not the same thing -- >> you could have a data structure built out of multiple >> objects that are all of the same Python type, with >> attributes distinguishing between struct, array, etc. >> That would be single-type but multi-object. > > I've tried to clarify this in another post. 
Basically, what I don't > like about the ctypes approach is that it is multi-type (every new > data-format is a Python type). > > In order to talk about all these Python types together, they must > all share some attribute (or else be derived from a meta-type in C with > a specific function-pointer entry). (I tried to read the whole thread again, but it is too large already.) There is a (badly named, probably) API to access information about ctypes types and instances of this type. The functions are PyObject_stgdict(obj) and PyType_stgdict(type). Both return a 'StgDictObject' instance or NULL if the function fails. This object is the ctypes type object's __dict__. StgDictObject is a subclass of PyDictObject and has fields that carry information about the C type (alignment requirements, size in bytes, plus some other stuff). Also it contains several pointers to functions that implement (in C) struct-like functionality (packing/unpacking). Of course several of these fields can only be used for ctypes-specific purposes, for example a pointer to the ffi_type which is used when calling foreign functions, or the restype, argtypes, and errcheck fields which are only used when the type describes a function pointer. This mechanism is probably a hack because it's not possible to add C-accessible fields to type objects; on the other hand it is extensible (in principle, at least). Just to describe the implementation. Thomas From martin at v.loewis.de Tue Oct 31 18:48:33 2006 From: martin at v.loewis.de ("Martin v. Löwis") Date: Tue, 31 Oct 2006 18:48:33 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <20061028135415.GA13049@code0.codespeak.net> <4547007D.30404@v.loewis.de> Message-ID: <45478C71.2010600@v.loewis.de> Travis Oliphant schrieb: > The big difference, however, is that by going this route you are forced > to use the "type object" as your data-format "instance".
Since everything is an object (an "instance") in Python, this is not such a big difference. > This is > fitting a square peg into a round hole in my opinion. To really be > useful, you would need to add the attributes and (most importantly) > C-function pointers and C-structure members to these type objects. Can you explain why that is? In the PEP, I see two C functions: setitem and getitem. I think they can be implemented readily with ctypes' GETFUNC and SETFUNC function pointers that it uses all over the place. I don't see a requirement to support C structure members or function pointers in the datatype object. > There are a few people claiming I should use the ctypes type-hierarchy > but nobody has explained how that would be possible given the > attributes, C-structure members and C-function pointers that I'm proposing. Ok, here you go. Remember, I'm still not claiming that this should be done: I'm just explaining how it could be done.

- byteorder/isnative: I think this could be derived from the presence of the _swappedbytes_ field
- itemsize: can be done with ctypes.sizeof
- kind: can be created through a mapping of the _type_ field (I think)
- fields: can be derived from the _fields_ member
- hasobject: compare, recursively, with py_object
- name: use __name__
- base: again, created from _type_ (if _length_ is present)
- shape: recursively look at _length_
- alignment: use ctypes.alignment

> It was clear to me that we were "on to something". Now, the biggest > claim against the gist of what I'm proposing (details we can argue > about), seems from my perspective to be a desire to "go backwards" and > carry data-type information around with a Python type. I, at least, have no such desire. I just explained that the ctypes model of memory layouts is just as expressive as the one in the PEP. Which of these is "better" for what the PEP wants to achieve, I can't say, because I still don't quite understand what the PEP wants to achieve.
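Most of the introspection hooks Martin lists are ordinary ctypes attributes and functions, which can be exercised directly; a small sketch (the concrete sizes assume the usual 1-byte `c_uint8`):

```python
import ctypes

class Pair(ctypes.Structure):
    _fields_ = [('lo', ctypes.c_uint8), ('hi', ctypes.c_uint8)]

Vec = Pair * 3                                   # shape via _length_, base via _type_

assert ctypes.sizeof(Pair) == 2                  # itemsize
assert ctypes.alignment(Pair) == 1               # alignment
assert [name for name, _ in Pair._fields_] == ['lo', 'hi']   # fields
assert (Vec._length_, Vec._type_) == (3, Pair)   # shape and base
assert Pair.__name__ == 'Pair'                   # name
```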
Regards, Martin From martin at v.loewis.de Tue Oct 31 18:58:01 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 31 Oct 2006 18:58:01 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <45478872.2010906@gmx.de> References: <20061028135415.GA13049@code0.codespeak.net> <45445ECE.9050504@v.loewis.de> <45448058.5020700@v.loewis.de> <454689D4.9040109@v.loewis.de> <45478872.2010906@gmx.de> Message-ID: <45478EA9.80302@v.loewis.de> Stephan Tolksdorf schrieb: > While Travis' proposal encompasses the data format functionality within > the struct module and overlaps with what ctypes has to offer, it does > not aim to replace ctypes. This discussion could have been a lot shorter if he had said so. Unfortunately (?) he stated that it was *precisely* a motivation of the PEP to provide a standard data description machinery that can then be adopted by the struct, array, and ctypes modules. > I also do not understand why the data format type should attempt to > fully describe arbitrarily complex data formats, like fragmented > (non-continuous) data structures in memory. You'd probably need a full > programming language for that anyway. For an FFI application, you need to be able to describe arbitrary in-memory formats, since that's what the foreign function will expect. For type safety and reuse, you better separate the description of the layout from the creation of the actual values. Otherwise (i.e. if you have to define the layout on each invocation), creating the parameters for a foreign function becomes very tedious and error-prone, with errors often being catastrophic (i.e. interpreter crashes). 
Regards, Martin From oliphant.travis at ieee.org Tue Oct 31 20:48:28 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue, 31 Oct 2006 12:48:28 -0700 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <45478C71.2010600@v.loewis.de> References: <20061028135415.GA13049@code0.codespeak.net> <4547007D.30404@v.loewis.de> <45478C71.2010600@v.loewis.de> Message-ID: Martin v. Löwis wrote: > Travis Oliphant schrieb: > >>The big difference, however, is that by going this route you are forced >>to use the "type object" as your data-format "instance". > > > Since everything is an object (an "instance") in Python, this is not > such a big difference. > I think it actually is. Perhaps I'm wrong, but a type object is still a special kind of an instance of a meta-type. I once tried to add function pointers to a type object by inheriting from it. But, I was told that Python is not set up to handle that. Maybe I misunderstood. Let me be very clear. The whole reason I make any statements about ctypes is because somebody else brought it up. I'm not trying to replace ctypes and the way it uses type objects to represent data internally. All I'm trying to do is come up with a way to describe data-types through a buffer protocol. The way ctypes does it is "too" bulky, by defining a new Python type for every data-format. While semantically you may talk about the equivalency of types being instances of a "meta-type" and regular objects being instances of a type, my understanding is still that there are practical differences when it comes to implementation --- and certain things that "can't be done". Here's what I mean by the difference. This is akin to what I'm proposing:

struct {
    PyObject_HEAD
    /* whatever you need to represent your instance.
       Quite a bit of flexibility.... */
} PyDataFormatObject;

A Python type object (what every ctypes data-format "type" inherits from) has this C-structure:

struct {
    PyObject_VAR_HEAD
    char *tp_name;
    int tp_basicsize, tp_itemsize;
    /* Methods to implement standard operations */
    destructor tp_dealloc;
    printfunc tp_print;
    getattrfunc tp_getattr;
    setattrfunc tp_setattr;
    cmpfunc tp_compare;
    reprfunc tp_repr;
    ...
    ...
    PyObject *tp_bases;
    PyObject *tp_mro;        /* method resolution order */
    PyObject *tp_cache;
    PyObject *tp_subclasses;
    PyObject *tp_weaklist;
    destructor tp_del;
    ...                      /* + more under certain conditions */
} PyTypeObject;

Why in the world do we need to carry all this extra baggage around in each data-format instance in order to just describe data? I can see why it's useful for ctypes to do it and that's fine. But, the argument that every exchange of data-format information should use this type-object instance is hard to swallow. So, I'm happy to let ctypes continue doing what it's doing, trusting its developers to have done something good. I'd be happy to drop any reference to ctypes. The only reason to have the data-type objects is something to pass as part of the extended buffer protocol. > > > Can you explain why that is? In the PEP, I see two C functions: > setitem and getitem. I think they can be implemented readily with > ctypes' GETFUNC and SETFUNC function pointers that it uses > all over the place. Sure, but where do these function pointers live and where are they stored? In ctypes it's in the CField_object. Now, this is closer to what I'm talking about. But, why is it not the same thing? Why yet another type object to talk about fields of a structure? These are rhetorical questions. I really don't expect or need an answer because I'm not questioning why ctypes did what it did for solving the problem it was solving. I am questioning anyone who claims that we should use this mechanism for describing data-formats in the extended buffer protocol.
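The contrast Travis draws can be shown in a few lines of Python: in ctypes every data format is a distinct type object, while the PEP wants formats to be plain instances of a single descriptor type. The `DataFormat` class below is purely illustrative — a sketch, not the PEP's actual API:

```python
import ctypes

# ctypes route: the format *is* a type, with all of PyTypeObject behind it
fmt_as_type = ctypes.c_double * 3
assert isinstance(fmt_as_type, type)

# PEP route (sketched): the format is an instance of one small class
class DataFormat:
    def __init__(self, kind, itemsize, shape=()):
        self.kind, self.itemsize, self.shape = kind, itemsize, shape

fmt_as_instance = DataFormat('f', 8, shape=(3,))
assert not isinstance(fmt_as_instance, type)
assert (fmt_as_instance.kind, fmt_as_instance.itemsize) == ('f', 8)
```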
> > I don't see a requirement to support C structure members or > function pointers in the datatype object. > > >>There are a few people claiming I should use the ctypes type-hierarchy >>but nobody has explained how that would be possible given the >>attributes, C-structure members and C-function pointers that I'm proposing. > > > Ok, here you go. Remember, I'm still not claiming that this should be > done: I'm just explaining how it could be done. O.K. Thanks for putting in the effort. It doesn't answer my real concerns, though. >>It was clear to me that we were "on to something". Now, the biggest >>claim against the gist of what I'm proposing (details we can argue >>about), seems from my perspective to be a desire to "go backwards" and >>carry data-type information around with a Python type. > > > I, at least, have no such desire. I just explained that the ctypes > model of memory layouts is just as expressive as the one in the > PEP. I agree with this. I'm very aware of what "can" be expressed. I just think it's too awkward and bulky to use in the extended buffer protocol. > Which of these is "better" for what the PEP wants to achieve, > I can't say, because I still don't quite understand what the PEP > wants to achieve. > Are you saying you still don't understand, even after having read the extended buffer protocol PEP? -Travis From oliphant.travis at ieee.org Tue Oct 31 21:04:53 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue, 31 Oct 2006 13:04:53 -0700 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <45478776.3040607@v.loewis.de> References: <20061031052635.315p3rnhb4cg4kws@login.werra.lunarpages.com> <45478776.3040607@v.loewis.de> Message-ID: Martin v. Löwis wrote: > Travis E. Oliphant schrieb: > >>But, there are distinct disadvantages to this approach compared to what >>I'm trying to allow. Martin claims that the ctypes approach is >>*basically* equivalent but this is just not true.
> > > I may claim that, but primarily, my goal was to demonstrate that the > proposed PEP cannot be used to describe ctypes object layouts (without > checking, I can readily believe that the PEP covers everything in > the array and struct modules). > That's a fine argument. You are right in terms of the PEP as it stands. However, I want to make clear that a single Python type object *could* be used to describe data, including all the cases you laid out. It would not be difficult to extend the PEP to cover all the cases you've described --- I'm not sure that's desirable. I'm not trying to replace what ctypes does. I'm just trying to get something that we can use to exchange data-format information through the extended buffer protocol. It really comes down to using Python type-objects as the instances describing data-formats (which ctypes does) or "normal" Python objects as the instances describing data-formats (what the PEP proposes). > >>It could be made more >>true if the ctypes objects inherited from a "meta-type" and if Python >>allowed meta-types to expand their C-structures. But, last I checked >>this is not possible. > > > That I don't understand. a) what do you think is not possible? Extending the C-structure of PyTypeObject and having Python types use that as their "type-object". > b) why is that an important difference between a datatype and a ctype? Because with instances of C-types you are stuck with the PyTypeObject structure. If you want to add anything you have to do it in the dictionary. Instances of a datatype allow adding anything after the PyObject_HEAD structure. > > If you are suggesting that, given two Python types A and B, and > B inheriting from A, that the memory layout of B cannot extend > the memory layout of A, then: that is certainly possible in Python, > and there are many examples for it. > I know this. I've done it for many different objects.
I'm saying it's not quite the same when what you are extending is the PyTypeObject and trying to use it as the type object for some other object. > >>A Python type object is a very particular kind of Python-type. As far >> as I can tell, it's not as flexible in terms of the kinds of things you >>can do with the "instances" of a type object (i.e. what ctypes types >>are) on the C-level. > > > Ah, you are worried that NumArray objects would have to be *instances* > of ctypes types. That wouldn't be necessary at all. Instead, if each > NumArray object had a method get_ctype(), which returned a ctypes type, > then you would get the same descriptiveness that you get with the > PEP's datatype. > No, I'm not worried about that (It's not NumArray by the way, it's NumPy. NumPy replaces both NumArray and Numeric). NumPy actually interfaces with ctypes quite well. This is how I learned anything I might know about ctypes. So, I'm well aware of this. What I am concerned about is using Python type objects (i.e. Python objects that can be cast in C to PyTypeObject *) outside of ctypes to describe data-formats when you don't need it and it just complicates dealing with the data-format description. > >>Where is the discussion that crowned the ctypes way of doing things as >>"the one true way"? > > > It hasn't been crowned this way. Me, personally, I just said two things about this PEP and ctypes: Thanks for clarifying, but I know you didn't say this. Others, however, basically did. > a) the PEP does not support all concepts that ctypes needs It could be extended, but I'm not sure it *needs* to be in its real context. I'm very sorry for contributing to the distraction that ctypes should adopt the PEP. My words were unclear. But, I'm not pushing for that. I really have no opinion how ctypes describes data. > b) ctypes can express all examples in the PEP in response to your proposal that ctypes should adopt the PEP, and that ctypes is not good enough to be the one true way.
> I think it is "good enough" in the semantic sense. But, I think using type objects in this fashion for general-purpose data-description is overkill and will be much harder to extend and deal with. -Travis From oliphant.travis at ieee.org Tue Oct 31 21:13:25 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue, 31 Oct 2006 13:13:25 -0700 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <454789F9.7050808@ctypes.org> References: <45468C8E.1000203@canterbury.ac.nz> <454789F9.7050808@ctypes.org> Message-ID: Thomas Heller wrote: > > (I tried to read the whole thread again, but it is too large already.) > > There is a (badly named, probably) API to access information > about ctypes types and instances of this type. The functions are > PyObject_stgdict(obj) and PyType_stgdict(type). Both return a > 'StgDictObject' instance or NULL if the function fails. This object > is the ctypes type object's __dict__. > > StgDictObject is a subclass of PyDictObject and has fields that > carry information about the C type (alignment requirements, size in bytes, > plus some other stuff). Also it contains several pointers to functions > that implement (in C) struct-like functionality (packing/unpacking). > > Of course several of these fields can only be used for ctypes-specific > purposes, for example a pointer to the ffi_type which is used when > calling foreign functions, or the restype, argtypes, and errcheck fields > which are only used when the type describes a function pointer. > > > This mechanism is probably a hack because it's not possible to add C-accessible > fields to type objects; on the other hand it is extensible (in principle, at least). > Thank you for the description. While I've studied the ctypes code, I still don't understand the purposes behind all the data-structures. Also, I really don't have an opinion about ctypes' implementation.
All my comparisons are simply being resistant to the "unexplained" idea that I'm supposed to use ctypes objects in a way they weren't really designed to be used. For example, I'm pretty sure you were the one who made me aware that you can't just extend the PyTypeObject. Instead you extended the tp_dict of the Python type object to store some of the extra information that is needed to describe a data-type like I'm proposing. So, if I'm just describing data-format information, why do I need all this complexity (that makes ctypes implementation easier/more natural/etc)? What if the StgDictObject is the Python data-format object I'm talking about? It actually looks closer. But, if all I want is the StgDictObject (or something like it), then why should I pass around the whole type object? This is all I'm saying to those that want me to use ctypes to describe data-formats in the extended buffer protocol. I'm not trying to change anything in ctypes. -Travis From theller at ctypes.org Tue Oct 31 21:46:15 2006 From: theller at ctypes.org (Thomas Heller) Date: Tue, 31 Oct 2006 21:46:15 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <45468C8E.1000203@canterbury.ac.nz> <454789F9.7050808@ctypes.org> Message-ID: <4547B617.3050400@ctypes.org> Travis Oliphant schrieb: > For example, I'm pretty sure you were the one who made me aware that you > can't just extend the PyTypeObject. Instead you extended the tp_dict of > the Python type object to store some of the extra information that is > needed to describe a data-type like I'm proposing. > > So, if I'm just describing data-format information, why do I need > all this complexity (that makes ctypes implementation easier/more > natural/etc)? What if the StgDictObject is the Python data-format > object I'm talking about? It actually looks closer. > > But, if all I want is the StgDictObject (or something like it), then why > should I pass around the whole type object?
Maybe you don't need it. ctypes certainly needs the type object because it is also used for constructing instances (while NumPy uses factory functions, IIUC), or for converting 'native' Python objects into foreign function arguments. I know that this doesn't interest you from the NumPy perspective (and I don't want to offend you by saying this). > This is all I'm saying to those that want me to use ctypes to describe > data-formats in the extended buffer protocol. I'm not trying to change > anything in ctypes. I don't want to change anything in NumPy, either, and was not the one who suggested using ctypes objects, although I had thought about whether it would be possible or not. What I like about ctypes, and dislike about Numeric/Numarray/NumPy, is the way C compatible types are defined in ctypes. I find the ctypes way more natural than the numxxx or array module way, but what else would anyone expect from me as the ctypes author... I hope that a useful interface is developed from your proposals, and will be happy to adapt ctypes to use it or interface ctypes with it if this makes sense. Thomas From oliphant.travis at ieee.org Tue Oct 31 21:56:30 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue, 31 Oct 2006 13:56:30 -0700 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <45478EA9.80302@v.loewis.de> References: <20061028135415.GA13049@code0.codespeak.net> <45445ECE.9050504@v.loewis.de> <45448058.5020700@v.loewis.de> <454689D4.9040109@v.loewis.de> <45478872.2010906@gmx.de> <45478EA9.80302@v.loewis.de> Message-ID: Martin v. Löwis wrote: > Stephan Tolksdorf schrieb: > >>While Travis' proposal encompasses the data format functionality within >>the struct module and overlaps with what ctypes has to offer, it does >>not aim to replace ctypes. > > > This discussion could have been a lot shorter if he had said so. > Unfortunately (?)
he stated that it was *precisely* a motivation > of the PEP to provide a standard data description machinery that > can then be adopted by the struct, array, and ctypes modules. Struct and array I was sure about. Ctypes less sure. I'm very sorry for the distraction I caused by mis-stating my objective. My objective is really the extended buffer protocol. The data-type object is a means to that end. I do think ctypes could make use of the data-type object and that there is a real difference between using Python type objects as data-format descriptions and using another Python type for those descriptions. I thought to go the ctypes route (before I even knew what ctypes did) but decided against it for a number of reasons. But, nonetheless those are side issues. The purpose of the PEP is to provide an object that the extended buffer protocol can use to share data-format information. It should be considered primarily in that context. -Travis From martin at v.loewis.de Tue Oct 31 22:12:11 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 31 Oct 2006 22:12:11 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <20061028135415.GA13049@code0.codespeak.net> <4547007D.30404@v.loewis.de> <45478C71.2010600@v.loewis.de> Message-ID: <4547BC2B.30406@v.loewis.de> Travis Oliphant schrieb: > I think it actually is. Perhaps I'm wrong, but a type-object is still a > special kind of an instance of a meta-type. I once tried to add > function pointers to a type object by inheriting from it. But, I was > told that Python is not set up to handle that. Maybe I misunderstood. I'm not quite sure what the problems are: one "obvious" problem is that the next Python version may also extend the size of type objects. But, AFAICT, even that should "work", in the sense that this new version should check for the presence of a flag to determine whether the additional fields are there. 
The only tricky question is how you can find out whether your own extension is there. If that is a common problem, I think a framework could be added to support extensible type objects (with some kind of registry for additional fields, and a per-type-object indicator whether a certain extension field is present). > Let me be very clear. The whole reason I make any statements about > ctypes is because somebody else brought it up. I'm not trying to > replace ctypes and the way it uses type objects to represent data > internally. Ok. I understood you differently earlier. Regards, Martin From p.f.moore at gmail.com Tue Oct 31 22:12:59 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 31 Oct 2006 21:12:59 +0000 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <20061028135415.GA13049@code0.codespeak.net> <4547007D.30404@v.loewis.de> <45478C71.2010600@v.loewis.de> Message-ID: <79990c6b0610311312y2a749b4bw617f0cf18ae9d660@mail.gmail.com> On 10/31/06, Travis Oliphant wrote: > Martin v. Löwis wrote: > > [...] because I still don't quite understand what the PEP > > wants to achieve. > > > > Are you saying you still don't understand after having read the extended > buffer protocol PEP, yet? I can't speak for Martin, but I don't understand how I, as a Python programmer, might use the data type objects specified in the PEP. I have skimmed the extended buffer protocol PEP, but I'm conscious that no objects I currently use support the extended buffer protocol (and the PEP doesn't mention adding support to existing objects), so I don't see that as too relevant to me. I have also installed numpy, and looked at the help for numpy.dtype, but that doesn't add much to the PEP. The freely available chapters of the numpy book explain how dtypes describe data structures, but not how to use them. The freely available Numeric documentation doesn't refer to dtypes, as far as I can tell.
Is there any documentation on how to use dtypes, independently of other features of numpy? If not, can you clarify where the benefit lies for a Python user of this proposal? (I understand the benefits of a common language for extensions to communicate datatype information, but why expose it to Python? How do Python users use it?) This is probably all self-evident to the numpy community, but I think that as the PEP is aimed at a wider audience it needs a little more background. Paul. From martin at v.loewis.de Tue Oct 31 22:26:30 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 31 Oct 2006 22:26:30 +0100 Subject: [Python-Dev] PEP: Extending the buffer protocol to share array information. In-Reply-To: References: Message-ID: <4547BF86.6070806@v.loewis.de> Travis E. Oliphant schrieb: > Several extensions to Python utilize the buffer protocol to share > the location of a data-buffer that is really an N-dimensional > array. However, there is no standard way to exchange the > additional N-dimensional array information so that the data-buffer > is interpreted correctly. The NumPy project introduced an array > interface (http://numpy.scipy.org/array_interface.shtml) through a > set of attributes on the object itself. While this approach > works, it requires attribute lookups which can be expensive when > sharing many small arrays. Can you please give examples for real-world applications of this interface, preferably examples involving multiple independently-developed libraries? ("this" being the current interface in NumPy - I understand that the PEP's interface isn't implemented, yet) Paul Moore (IIRC) gave the example of equalising the green values and maximizing the red values in a PIL image by passing it to NumPy: Is that a realistic (even though not-yet real-world) example? 
If so, what algorithms of NumPy would I use to perform this image manipulation (and why would I use NumPy for it if I could just write a for loop that does that in pure Python, given PIL's getpixel/setdata)? Regards, Martin From brett at python.org Tue Oct 31 23:40:11 2006 From: brett at python.org (Brett Cannon) Date: Tue, 31 Oct 2006 14:40:11 -0800 Subject: [Python-Dev] PEP: Extending the buffer protocol to share array information. In-Reply-To: References: Message-ID: On 10/30/06, Travis E. Oliphant wrote: > > Attached is my PEP for extending the buffer protocol to allow array data > to be shared. You might want to reference this thread ( http://mail.python.org/pipermail/python-3000/2006-August/003309.html), as Guido mentions that extending the buffer protocol to tell more about the data in the buffer "would offer the numarray folks their 'array interface'". -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061031/404b5f60/attachment.html From jcarlson at uci.edu Tue Oct 31 23:59:11 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue, 31 Oct 2006 14:59:11 -0800 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <79990c6b0610311312y2a749b4bw617f0cf18ae9d660@mail.gmail.com> References: <79990c6b0610311312y2a749b4bw617f0cf18ae9d660@mail.gmail.com> Message-ID: <20061031144447.C0D7.JCARLSON@uci.edu> "Paul Moore" wrote: > On 10/31/06, Travis Oliphant wrote: > > Martin v. Löwis wrote: > > > [...] because I still don't quite understand what the PEP > > > wants to achieve. > > > > > > > Are you saying you still don't understand after having read the extended > > buffer protocol PEP, yet? > > I can't speak for Martin, but I don't understand how I, as a Python > programmer, might use the data type objects specified in the PEP.
> I have skimmed the extended buffer protocol PEP, but I'm conscious that > no objects I currently use support the extended buffer protocol (and > the PEP doesn't mention adding support to existing objects), so I > don't see that as too relevant to me. Presumably str in 2.x and bytes in 3.x could be extended to support the 'S' specifier, unicode in 2.x and text in 3.x could be extended to support the 'U' specifier. The various array.array variants could be extended to support all relevant specifiers, etc. > This is probably all self-evident to the numpy community, but I think > that as the PEP is aimed at a wider audience it needs a little more > background. Someone correct me if I am wrong, but it allows things equivalent to the following, which is available in C, to be available in Python...

typedef struct {
    char R;
    char G;
    char B;
    char A;
} pixel_RGBA;

pixel_RGBA image[1024][768];

Or even...

typedef struct {
    long long numerator;
    unsigned long long denominator;
    double approximation;
} rational;

rational ratios[1024];

The real use is that after you have your array of (packed) objects, be it one of the above samples, or otherwise, you don't need to explicitly pass around specifiers (like in struct, or ctypes); numpy and others can talk to each other, pick up the specifier with the extended buffer protocol, and it just works. - Josiah From paul.chiusano at gmail.com Sun Oct 29 16:51:01 2006 From: paul.chiusano at gmail.com (Paul Chiusano) Date: Sun, 29 Oct 2006 10:51:01 -0500 Subject: [Python-Dev] Status of pairing_heap.py? Message-ID: I was looking for a good pairing_heap implementation and came across one that had apparently been checked in a couple years ago (!). Here is the full link: http://svn.python.org/view/sandbox/trunk/collections/pairing_heap.py?rev=40887&view=markup I was just wondering about the status of this implementation.
The api looks pretty good to me -- it's great that the author decided to have the insert method return a node reference which can then be passed to delete and adjust_key. It's a bit of a pain to implement that functionality, but it's extremely useful for a number of applications. If that project is still alive, I have a couple of api suggestions:

* Add a method which nondestructively yields the top K elements of the heap. This would work by popping the top k elements of the heap into a list, then reinserting those elements in reverse order. By reinserting the sorted elements in reverse order, the top of the heap is essentially a sorted linked list, so if the exact operation is repeated again, the removals take constant time rather than amortized logarithmic time.
* So, for example: if we have a min heap, the topK method would pop K elements from the heap, say they are {1, 3, 5, 7}, then do insert(7), followed by insert(5), ... insert(1).
* Even better might be if this operation avoided having to allocate new heap nodes, and just reused the old ones.
* I'm not sure if adjust_key should throw an exception if the key adjustment is in the wrong direction. Perhaps it should just fall back on deleting and reinserting that node?

Paul
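[Editor's note: the nondestructive topK idea described above can be sketched with the stdlib heapq module. This uses a plain list-based binary heap rather than the sandbox pairing heap, whose node-based API differs, and the `top_k` function name is hypothetical:]

```python
import heapq

def top_k(heap, k):
    """Return the k smallest items of `heap` without losing any of them.

    Sketch of the suggested topK operation: pop the k smallest items,
    then push them back in reverse (descending) order so the heap
    invariant is restored and the heap's contents are unchanged.
    """
    popped = [heapq.heappop(heap) for _ in range(min(k, len(heap)))]
    for item in reversed(popped):  # reinsert largest-first, as suggested
        heapq.heappush(heap, item)
    return popped
```

Note that the constant-time-on-repeat property described above is specific to pairing heaps, where reinserting in reverse order leaves a sorted chain at the root; with heapq's array-based binary heap each pop remains O(log n), so this sketch only illustrates the interface, not the performance claim.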