From bob at redivi.com Sun Oct 1 00:21:50 2006 From: bob at redivi.com (Bob Ippolito) Date: Sat, 30 Sep 2006 15:21:50 -0700 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: <20060929081402.GB19781@craig-wood.com> <451DC113.4040002@canterbury.ac.nz> <6a36e7290609291815i115b41b3o4ab6d196f404557f@mail.gmail.com> <451E2F32.9070405@v.loewis.de> <451E31ED.7030905@gmail.com> Message-ID: <6a36e7290609301521g44b56e59iecc3b0c448cd91c3@mail.gmail.com> On 9/30/06, Terry Reedy wrote: > > "Nick Coghlan" wrote in message > news:451E31ED.7030905 at gmail.com... > >I suspect the problem would typically stem from floating point values that > >are > >read in from a human-readable file rather than being the result of a > >'calculation' as such: > > For such situations, one could create a translation dict for both common > float values and for non-numeric missing value indicators. For instance, > flotran = {'*': None, '1.0':1.0, '2.0':2.0, '4.0':4.0} > The details, of course, depend on the specific case. But of course you have to know that common float values are never cached and that it may cause you problems. Some users may expect them to be because common strings and integers are cached. -bob From rasky at develer.com Sun Oct 1 00:19:22 2006 From: rasky at develer.com (Giovanni Bajo) Date: Sun, 1 Oct 2006 00:19:22 +0200 Subject: [Python-Dev] PEP 355 status References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm><2mk63lfu6j.fsf@starship.python.net> Message-ID: <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> Guido van Rossum wrote: > OK. Pronouncement: PEP 355 is dead. The authors (or the PEP editor) > can update the PEP. > > I'm looking forward to a new PEP. It would be terrific if you gave us some clue about what is wrong in PEP355, so that the next guy does not waste his time. 
For instance, I find PEP355 incredibly good for my own path manipulation (much cleaner and more concise than the awful os.path+os+shutil+stat mix), and I have trouble understanding what is *so* wrong with it. You said "it's an amalgam of unrelated functionality", but you didn't say what exactly is "unrelated" for you. Giovanni Bajo From skip at pobox.com Sun Oct 1 00:37:49 2006 From: skip at pobox.com (skip at pobox.com) Date: Sat, 30 Sep 2006 17:37:49 -0500 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: <20060929081402.GB19781@craig-wood.com> Message-ID: <17694.61885.527128.686743@montanaro.dyndns.org> 

    Steve> By these statistics I think the answer to the original question
    Steve> is clearly "no" in the general case.

As someone else (Guido?) pointed out, the literal case isn't all that interesting. I modified floatobject.c to track a few interesting floating point values:

    static unsigned int nfloats[5] = {
        0, /* -1.0 */
        0, /* 0.0 */
        0, /* +1.0 */
        0, /* everything else */
        0, /* whole numbers from -10.0 ... 10.0 */
    };

    PyObject *
    PyFloat_FromDouble(double fval)
    {
        register PyFloatObject *op;
        if (free_list == NULL) {
            if ((free_list = fill_free_list()) == NULL)
                return NULL;
        }

        if (fval == 0.0) nfloats[1]++;
        else if (fval == 1.0) nfloats[2]++;
        else if (fval == -1.0) nfloats[0]++;
        else nfloats[3]++;

        if (fval >= -10.0 && fval <= 10.0 && (int)fval == fval) {
            nfloats[4]++;
        }

        /* Inline PyObject_New */
        op = free_list;
        free_list = (PyFloatObject *)op->ob_type;
        PyObject_INIT(op, &PyFloat_Type);
        op->ob_fval = fval;
        return (PyObject *) op;
    }

    static void
    _count_float_allocations(void)
    {
        fprintf(stderr, "-1.0: %d\n", nfloats[0]);
        fprintf(stderr, " 0.0: %d\n", nfloats[1]);
        fprintf(stderr, "+1.0: %d\n", nfloats[2]);
        fprintf(stderr, "rest: %d\n", nfloats[3]);
        fprintf(stderr, "whole numbers -10.0 to 10.0: %d\n", nfloats[4]);
    }

then called atexit(_count_float_allocations) in _PyFloat_Init and ran "make test". The output was: ...
    ./python.exe -E -tt ../Lib/test/regrtest.py -l
    ...
    -1.0: 29048
     0.0: 524241
    +1.0: 91561
    rest: 1749807
    whole numbers -10.0 to 10.0: 1151442

So for a largely non-floating point "application", a fair number of floats are allocated, a bit over 25% of them are -1.0, 0.0 or +1.0, and nearly 50% of them are whole numbers between -10.0 and 10.0, inclusive. Seems like it at least deserves a serious look. It would be nice to have the numeric crowd contribute to this subject as well. Skip From bob at redivi.com Sun Oct 1 00:52:32 2006 From: bob at redivi.com (Bob Ippolito) Date: Sat, 30 Sep 2006 15:52:32 -0700 Subject: [Python-Dev] Tix not included in 2.5 for Windows In-Reply-To: References: Message-ID: <6a36e7290609301552s45435ce7l7a841d9673f59101@mail.gmail.com> On 9/30/06, Scott David Daniels wrote: > Christos Georgiou wrote: > > Does anyone know why this happens? I can't find any information pointing to > > this being deliberate. > > > > I just upgraded to 2.5 on Windows (after making sure I can build extensions > > with the freeware VC++ Toolkit 2003) and some of my programs stopped > > operating. I saw in a French forum that someone else had the same problem, > > and what they did was to copy the relevant files from a 2.4.3 installation. > > I did the same, and it seems it works, with only a console message appearing > > as soon as a root window is created: > > Also note: the Os/X universal seems to include a Tix runtime for the > non-Intel processor, but not for the Intel processor. This > makes me think there is a build problem. Are you sure about that? What file are you referring to specifically? 
-bob From Scott.Daniels at Acm.Org Sun Oct 1 01:28:24 2006 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Sat, 30 Sep 2006 16:28:24 -0700 Subject: [Python-Dev] Tix not included in 2.5 for Windows In-Reply-To: <6a36e7290609301552s45435ce7l7a841d9673f59101@mail.gmail.com> References: <6a36e7290609301552s45435ce7l7a841d9673f59101@mail.gmail.com> Message-ID: Bob Ippolito wrote: > On 9/30/06, Scott David Daniels wrote: >> Christos Georgiou wrote: >>> Does anyone know why this happens? I can't find any information pointing to >>> this being deliberate. >> Also note: the Os/X universal seems to include a Tix runtime for the >> non-Intel processor, but not for the Intel processor. This >> makes me think there is a build problem. > > Are you sure about that? What file are you referring to specifically? OK, from the 2.5 universal: (hand-typed, I e-mail from another machine)

=========== Using Idle ===========

    >>> import Tix
    >>> Tix.Tk()
    Traceback (most recent call last):
      File "(pyshell#8)", line 1, in (module)
        Tix.Tk()
      File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-tk/Tix.py", line 210 in __init__
        self.tk.eval('package require Tix')
    TclError: no suitable image found. Did find:
    /Library/Tcl/Tix8.4/libTix8.4.dylib: mach-o, but wrong architecture.

=========== From the command line ===========

    >>> import Tix
    >>> Tix.Tk()
    Traceback (most recent call last):
      File "<stdin>", line 1, in (module)
      File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-tk/Tix.py", line 210 in __init__
        self.tk.eval('package require Tix')
    _tkinter.TclError: no suitable image found. Did find:
    /Library/Tcl/Tix8.4/libTix8.4.dylib: mach-o, but wrong architecture.
-- Scott David Daniels Scott.Daniels at Acm.Org From bob at redivi.com Sun Oct 1 01:33:22 2006 From: bob at redivi.com (Bob Ippolito) Date: Sat, 30 Sep 2006 16:33:22 -0700 Subject: [Python-Dev] Tix not included in 2.5 for Windows In-Reply-To: References: <6a36e7290609301552s45435ce7l7a841d9673f59101@mail.gmail.com> Message-ID: <6a36e7290609301633h319cc7b7l8cbf796838a9af63@mail.gmail.com> On 9/30/06, Scott David Daniels wrote: > Bob Ippolito wrote: > > On 9/30/06, Scott David Daniels wrote: > >> Christos Georgiou wrote: > >>> Does anyone know why this happens? I can't find any information pointing to > >>> this being deliberate. > >> Also note: the Os/X universal seems to include a Tix runtime for the > >> non-Intel processor, but not for the Intel processor. This > >> makes me think there is a build problem. > > > > Are you sure about that? What file are you referring to specifically? > > OK, from the 2.5 universal: (hand-typed, I e-mail from another machine) > > > =========== Using Idle =========== > >>> import Tix > >>> Tix.Tk() > > Traceback (most recent call last): > File "(pyshell#8)", line 1, in (module) > Tix.Tk() > File "/Library/Frameworks/Python.framework/Versions/2.5/ > lib/python2.5/lib-tk/Tix.py", line 210 in __init__ > self.tk.eval('package require Tix') > TclError: no suitable image found. Did find: > /Library/Tcl/Tix8.4/libTix8.4.dylib: mach-o, but wrong architecture. > > =========== From the command line =========== > > >>> import Tix > >>> Tix.Tk() > > Traceback (most recent call last): > File "", line 1, in (module) > File "/Library/Frameworks/Python.framework/Versions/2.5/ > lib/python2.5/lib-tk/Tix.py", line 210 in __init__ > self.tk.eval('package require Tix') > _tkinter.TclError: no suitable image found. Did find: > /Library/Tcl/Tix8.4/libTix8.4.dylib: mach-o, but wrong architecture. Those files are not distributed with Python. -bob From kbk at shore.net Sun Oct 1 04:11:45 2006 From: kbk at shore.net (Kurt B. 
Kaiser) Date: Sat, 30 Sep 2006 22:11:45 -0400 (EDT) Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200610010211.k912BjNN001090@bayview.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 422 open ( +2) / 3415 closed ( +5) / 3837 total ( +7) Bugs : 933 open (+18) / 6212 closed (+26) / 7145 total (+44) RFE : 237 open ( +2) / 239 closed ( +1) / 476 total ( +3) New / Reopened Patches ______________________ platform.py support for IronPython (2006-09-23) http://python.org/sf/1563842 opened by Anthony Baxter pybench support for IronPython (2006-09-23) http://python.org/sf/1563844 opened by Anthony Baxter Py_signal_pipe (2006-09-24) http://python.org/sf/1564547 opened by Gustavo J. A. M. Carneiro tarfile depends on undocumented behaviour (2006-09-25) http://python.org/sf/1564981 opened by Seo Sanghyeon use LSB version information to detect a platform (2006-09-25) http://python.org/sf/1565037 opened by Matthias Klose doc changes for SMTP_SSL (2006-09-28) http://python.org/sf/1567274 opened by Monty Taylor super() and instancemethod() shouldn't accept keyword args (2006-09-29) CLOSED http://python.org/sf/1567691 opened by ?iga Seilnacht Patches Closed ______________ Python 2.5 fails with -Wl,--as-needed in LDFLAGS (2006-09-21) http://python.org/sf/1562825 closed by masterdriverz super() and instancemethod() shouldn't accept keyword args (2006-09-29) http://python.org/sf/1567691 closed by gbrandl Enable SSL for smtplib (2005-09-05) http://python.org/sf/1282340 closed by gbrandl pyclbr reports different module for Class and Function (2006-09-18) http://python.org/sf/1560617 closed by gbrandl datetime's strftime limits strings to 127 chars (2006-09-12) http://python.org/sf/1557390 closed by gbrandl New / Reopened Bugs ___________________ Quitter object masked (2006-05-01) http://python.org/sf/1479785 reopened by kbk ,msi fails for AMD Turion 64 mobile (2006-09-21) CLOSED http://python.org/sf/1563185 opened by Andy Harrington temporary file(s) 
(2006-09-22) CLOSED http://python.org/sf/1563236 opened by Grzegorz Makarewicz http//... test file (2006-09-22) CLOSED http://python.org/sf/1563238 opened by Grzegorz Makarewicz python_d python (2006-09-22) http://python.org/sf/1563243 opened by Grzegorz Makarewicz IDLE doesn't load - apparently without firewall problems (2006-09-22) http://python.org/sf/1563630 opened by dani struct.unpack doens't support buffer protocol objects (2006-09-23) http://python.org/sf/1563759 reopened by loewis struct.unpack doens't support buffer protocol objects (2006-09-23) http://python.org/sf/1563759 opened by Adal Chiriliuc Build of Python 2.5 on AIX 5.3 with GCC Fails (2006-09-22) http://python.org/sf/1563807 opened by Daniel Clark Typo in whatsnew/pep-342.html (2006-09-23) CLOSED http://python.org/sf/1563963 opened by Xavier Bassery IDLE invokes completion even when running code (2006-09-23) http://python.org/sf/1563981 opened by Martin v. L?wis 2.6 changes stomp on 2.5 docs (2006-09-23) http://python.org/sf/1564039 opened by ggpauly Fails to install on Fedora Core 5 (2006-09-20) CLOSED http://python.org/sf/1562171 reopened by mnsummerfield BaseCookie does not support "$Port" (2006-09-24) http://python.org/sf/1564508 opened by Anders Aagaard Unicode comparison change in 2.4 vs. 
2.5 (2006-09-24) CLOSED http://python.org/sf/1564763 opened by Joe Wreschnig update Lib/plat-linux2/IN.py (2006-09-25) http://python.org/sf/1565071 opened by Matthias Klose Misbehaviour in zipfile (2006-09-25) CLOSED http://python.org/sf/1565087 opened by Richard Philips make plistlib.py available in every install (2006-09-25) http://python.org/sf/1565129 opened by Matthias Klose os.stat() subsecond file mode time is incorrect on Windows (2006-09-25) http://python.org/sf/1565150 opened by Mike Glassford Repair or Change installation error (2006-09-26) http://python.org/sf/1565509 opened by Greg Hazel does not raise SystemError on too many nested blocks (2006-09-26) http://python.org/sf/1565514 opened by Greg Hazel gc allowing tracebacks to eat up memory (2006-09-26) http://python.org/sf/1565525 opened by Greg Hazel webbrowser on gnome runs wrong browser (2006-09-26) CLOSED http://python.org/sf/1565661 opened by kangabroo 'all' documentation missing online (2006-09-26) http://python.org/sf/1565797 opened by Alan sets missing from standard types list in ref (2006-09-26) http://python.org/sf/1565919 opened by Georg Brandl pyexpat produces fals parsing results in CharacterDataHandle (2006-09-26) CLOSED http://python.org/sf/1565967 opened by Michael Gebetsroither RE (regular expression) matching stuck in loop (2006-09-27) http://python.org/sf/1566086 opened by Fabien Devaux T_ULONG -> double rounding in PyMember_GetOne() (2006-09-27) http://python.org/sf/1566140 opened by Piet Delport Logging problem on Windows XP (2006-09-27) http://python.org/sf/1566280 opened by Pavel Krupets Bad behaviour in .obuf* (2006-09-27) http://python.org/sf/1566331 opened by Sam Dennis test_posixpath failure (2006-09-27) CLOSED http://python.org/sf/1566602 opened by WallsRSolid Idle 2.1 - Calltips Hotkey dies not work (2006-09-27) http://python.org/sf/1566611 opened by fladd Library Reference Section 5.1.8.1 is wrong. 
(2006-09-27) CLOSED http://python.org/sf/1566663 opened by Chris Connett site-packages isn't created before install_egg_info (2006-09-27) http://python.org/sf/1566719 opened by James Oakley urllib doesn't raise IOError correctly with new IOError (2006-09-28) CLOSED http://python.org/sf/1566800 opened by Arthibus Gissehel unchecked metaclass mro (2006-09-28) http://python.org/sf/1567234 opened by ganges master logging.RotatingFileHandler has no "infinite" backupCount (2006-09-28) http://python.org/sf/1567331 opened by Skip Montanaro False sentence about formatted print in tutorial section 7.1 (2006-09-28) CLOSED http://python.org/sf/1567375 opened by David Benbennick tabs missing in idle options configure (2006-09-28) http://python.org/sf/1567450 opened by jrgutierrez GetFileAttributesExA and Win95 (2006-09-29) http://python.org/sf/1567666 opened by giomach missing _typesmodule.c,Visual Studio 2005 pythoncore.vcproj (2006-09-29) http://python.org/sf/1567910 opened by everbruin http://docs.python.org/tut/node10.html typo (2006-09-29) CLOSED http://python.org/sf/1567976 opened by Simon Morgan GUI scripts always return to an interpreter (2006-09-29) http://python.org/sf/1568075 reopened by jejackson GUI scripts always return to an interpreter (2006-09-29) http://python.org/sf/1568075 opened by jjackson Encoding bug (2006-09-30) CLOSED http://python.org/sf/1568120 opened by ?er FADIL USTA Tix is not included in 2.5 for Windows (2006-09-30) http://python.org/sf/1568240 opened by Christos Georgiou init_types (2006-09-30) http://python.org/sf/1568243 opened by Bosko Vukov broken info files generation (2006-09-30) http://python.org/sf/1568429 opened by Arkadiusz Miskiewicz Bugs Closed ___________ ,msi fails for AMD Turion 64 mobile (2006-09-22) http://python.org/sf/1563185 closed by loewis temporary file(s) (2006-09-22) http://python.org/sf/1563236 closed by gbrandl http//... 
test file (2006-09-22) http://python.org/sf/1563238 closed by gbrandl Parser crash (2006-09-12) http://python.org/sf/1557232 closed by gbrandl struct.unpack doens't support buffer protocol objects (2006-09-23) http://python.org/sf/1563759 closed by adalx Typo in whatsnew/pep-342.html (2006-09-23) http://python.org/sf/1563963 closed by nnorwitz Fails to install on Fedora Core 5 (2006-09-20) http://python.org/sf/1562171 closed by mnsummerfield Unicode comparison change in 2.4 vs. 2.5 (2006-09-25) http://python.org/sf/1564763 closed by lemburg python 2.5 fails to build with --as-needed (2006-09-18) http://python.org/sf/1560984 closed by gbrandl Misbehaviour in zipfile (2006-09-25) http://python.org/sf/1565087 closed by gbrandl webbrowser on gnome runs wrong browser (2006-09-26) http://python.org/sf/1565661 closed by gbrandl pyexpat produces fals parsing results in CharacterDataHandle (2006-09-26) http://python.org/sf/1565967 closed by loewis test_posixpath failure (2006-09-27) http://python.org/sf/1566602 closed by gbrandl Library Reference Section 5.1.8.1 is wrong. 
(2006-09-27) http://python.org/sf/1566663 closed by gbrandl urllib doesn't raise IOError correctly with new IOError (2006-09-28) http://python.org/sf/1566800 closed by gbrandl False sentence about formatted print in tutorial section 7.1 (2006-09-28) http://python.org/sf/1567375 closed by gbrandl http://docs.python.org/tut/node10.html typo (2006-09-30) http://python.org/sf/1567976 closed by quiver Encoding bug (2006-09-30) http://python.org/sf/1568120 closed by gbrandl locale.format gives wrong exception on some erroneous input (2006-01-23) http://python.org/sf/1412580 closed by gbrandl cgi.FormContentDict constructor should support parse options (2006-03-24) http://python.org/sf/1457823 closed by gbrandl inspect.getargspec() is wrong for def foo((x)): (2006-03-27) http://python.org/sf/1459159 closed by gbrandl Calls from VBScript clobber passed args (2005-03-03) http://python.org/sf/1156179 closed by gbrandl datetime's strftime limits strings to 127 chars (2006-09-12) http://python.org/sf/1556784 closed by gbrandl unicode('foo', '.utf99') does not raise LookupError (2006-03-09) http://python.org/sf/1446043 closed by gbrandl struct.unpack problem with @, =, < specifiers (2006-05-08) http://python.org/sf/1483963 closed by gbrandl Incomplete info in 7.18.1 ZipFile Objects (2006-08-24) http://python.org/sf/1545836 closed by gbrandl PyString_FromString() clarification (2006-08-24) http://python.org/sf/1546052 closed by gbrandl New / Reopened RFE __________________ Better order in file type descriptions (2006-09-27) http://python.org/sf/1566260 opened by Daniele Varrazzo poplib.py list interface (2006-09-29) http://python.org/sf/1567948 opened by Hasan Diwan From ncoghlan at gmail.com Sun Oct 1 05:56:53 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 01 Oct 2006 13:56:53 +1000 Subject: [Python-Dev] PEP 355 status In-Reply-To: <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> References: 
<20060930045258.1717.223590987.divmod.quotient.63544@ohm><2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> Message-ID: <451F3C85.9050100@gmail.com> Giovanni Bajo wrote: > Guido van Rossum wrote: > >> OK. Pronouncement: PEP 355 is dead. The authors (or the PEP editor) >> can update the PEP. >> >> I'm looking forward to a new PEP. > > It would be terrific if you gave us some clue about what is wrong in PEP355, so > that the next guy does not waste his time. For instance, I find PEP355 > incredibly good for my own path manipulation (much cleaner and concise than the > awful os.path+os+shutil+stat mix), and I have trouble understanding what is > *so* wrong with it. > > You said "it's an amalgam of unrelated functionality", but you didn't say what > exactly is "unrelated" for you. Things the PEP 355 path object lumps together:

 - string manipulation operations
 - abstract path manipulation operations (work for non-existent filesystems)
 - read-only traversal of a concrete filesystem (dir, stat, glob, etc)
 - addition & removal of files/directories/links within a concrete filesystem

Dumping all of these into a single class is certainly practical from a utility point of view, but it's about as far away from beautiful as you can get, which creates problems from a learnability point of view, and from a capability-based security point of view. PEP 355 itself splits the methods up into 11 distinct categories when listing the interface. At the very least, I would want to split the interface into separate abstract and concrete interfaces. The abstract object wouldn't care whether or not the path actually existed on the current filesystem (and hence could be relied on to never raise IOError), whereas the concrete object would include the many operations that might need to touch the real IO device. 
(the PEP has already made a step in the right direction here by removing the methods that accessed a file's contents, leaving that job to the file object where it belongs). There's a case to be made for the abstract object inheriting from str or unicode for compatibility with existing code, but an alternative would be to enhance the standard library to better support the use of non-basestring objects to describe filesystem paths. A PEP should at least look into what would have to change at the Python API level and the C API level to go that route rather than the inheritance route. For the concrete interface, the behaviour is very dependent on whether the path refers to a file, directory or symlink on the current filesystem. For an OO filesystem interface, does it really make sense to leave them all lumped into the one class with a bunch of isdir() and islink() style methods? Or does it make more sense to have a method on the abstract object that will return the appropriate kind of filesystem info object? If the latter, then how would you deal with the issue of state coherency (i.e. it was a file when you last touched it on the filesystem, but someone else has since changed it to a link)? (that last question actually lends strong support to the idea of a *single* concrete interface that dynamically responds to changes in the underlying filesystem). Another key difference between the two is that the abstract objects would be hashable and serialisable, as their state is immutable and independent of the filesystem. For the concrete objects, the only immutable part of their state is the path name - the rest would reflect the state of the filesystem at the current point in time. Cheers, Nick. 
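[Editor's note: the abstract/concrete split described above can be sketched in a few lines. This is an illustrative sketch only — the class and method names are invented for the example and are not the PEP 355 API.]

```python
import os.path


class AbstractPath(object):
    """Pure path arithmetic: never touches the filesystem, so none of
    these methods can raise IOError."""

    def __init__(self, *parts):
        self._path = os.path.join(*parts) if parts else '.'

    def join(self, *parts):
        # Path manipulation always returns a new immutable object.
        return AbstractPath(self._path, *parts)

    def parent(self):
        return AbstractPath(os.path.dirname(self._path) or '.')

    def name(self):
        return os.path.basename(self._path)

    # Immutable and filesystem-independent, hence hashable/serialisable.
    def __eq__(self, other):
        return isinstance(other, AbstractPath) and self._path == other._path

    def __hash__(self):
        return hash(self._path)

    def __str__(self):
        return self._path


class ConcretePath(AbstractPath):
    """Adds the operations that may touch the real IO device."""

    def exists(self):
        return os.path.exists(self._path)

    def is_dir(self):
        # Queried dynamically on each call, so it tracks filesystem
        # changes rather than caching a possibly stale answer.
        return os.path.isdir(self._path)
```

Only ConcretePath can observe filesystem state, and it re-queries that state on every call — the "*single* concrete interface that dynamically responds to changes" option discussed above.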
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From ncoghlan at gmail.com Sun Oct 1 06:18:11 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 01 Oct 2006 14:18:11 +1000 Subject: [Python-Dev] PEP 351 - do while In-Reply-To: <001301c6e3b4$8de5bda0$1d2c440a@spain.capgemini.com> References: <001301c6e3b4$8de5bda0$1d2c440a@spain.capgemini.com> Message-ID: <451F4183.5050907@gmail.com> Hans Polak wrote:

> Hi,
>
> Just an opinion, but many uses of the 'while true loop' are instances of
> a 'do loop'. I appreciate the language layout question, so I'll give you
> an alternative:
>
> do:
>     <setup code>
>     <loop body>
> while <condition>

I believe you meant to write PEP 315 in the subject line :)

To fully account for loop else clauses, this suggestion would probably need to be modified to look something like this:

Basic while loop:

    while <condition>:
        <loop body>
    else:
        <loop completion code>

Using break to avoid code duplication:

    while True:
        <setup code>
        if not <condition>:
            break
        <loop body>

Current version of PEP 315:

    do:
        <setup code>
    while <condition>:
        <loop body>
    else:
        <loop completion code>

This suggestion:

    do:
        <setup code>
        while <condition>
        <loop body>
    else:
        <loop completion code>

I personally like that style, and if the compiler can dig through a function looking for yield statements to identify generators, it should be able to dig through a do-loop looking for the termination condition. As I recall, the main objection to this style was that it could hide the loop termination condition, but that isn't actually mentioned in the PEP (and in the typical do-while case, the loop condition will still be clearly visible at the end of the loop body). Regards, Nick. 
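[Editor's note: the break-based idiom Nick describes is how the loop-and-a-half is spelled in current Python. A runnable illustration — the chunk-splitting task is our own example, not from the thread:]

```python
def read_chunks(data, size):
    """Split `data` into successive slices of length `size` using the
    loop-and-a-half idiom: the setup step (computing the next chunk)
    must run before the termination test can be made."""
    chunks = []
    pos = 0
    while True:
        chunk = data[pos:pos + size]  # <setup code>
        if not chunk:                 # termination test in mid-loop
            break
        chunks.append(chunk)          # <loop body>
        pos += size
    return chunks


# read_chunks('abcdefg', 3) -> ['abc', 'def', 'g']
```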
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From glyph at divmod.com Sun Oct 1 08:09:02 2006 From: glyph at divmod.com (glyph at divmod.com) Date: Sun, 1 Oct 2006 02:09:02 -0400 Subject: [Python-Dev] PEP 355 status In-Reply-To: <451F3C85.9050100@gmail.com> Message-ID: <20061001060902.1717.565172190.divmod.quotient.64146@ohm> On Sun, 01 Oct 2006 13:56:53 +1000, Nick Coghlan wrote: >Things the PEP 355 path object lumps together: > - string manipulation operations > - abstract path manipulation operations (work for non-existent filesystems) > - read-only traversal of a concrete filesystem (dir, stat, glob, etc) > - addition & removal of files/directories/links within a concrete filesystem > >Dumping all of these into a single class is certainly practical from a utility >point of view, but it's about as far away from beautiful as you can get, which >creates problems from a learnability point of view, and from a >capability-based security point of view. PEP 355 itself splits the methods up >into 11 distinct categories when listing the interface. > >At the very least, I would want to split the interface into separate abstract >and concrete interfaces. The abstract object wouldn't care whether or not the >path actually existed on the current filesystem (and hence could be relied on >to never raise IOError), whereas the concrete object would include the many >operations that might need to touch the real IO device. (the PEP has already >made a step in the right direction here by removing the methods that accessed >a file's contents, leaving that job to the file object where it belongs). > >There's a case to be made for the abstract object inheriting from str or >unicode for compatiblity with existing code, I think that compatibility can be achieved by having a "pathname" string attribute or similar to convert to a string when appropriate. 
It's not like datetime inherits from str to facilitate formatting or anything like that. >but an alternative would be to >enhance the standard library to better support the use of non-basestring >objects to describe filesystem paths. A PEP should at least look into what >would have to change at the Python API level and the C API level to go that >route rather than the inheritance route. In C, this is going to be really difficult. Existing C APIs want to use C functions to deal with pathnames, and many libraries are not going to support arbitrary VFS I/O operations. For some libraries, like GNOME or KDE, you'd have to use the appropriate VFS object for their platform. >For the concrete interface, the behaviour is very dependent on whether the >path refers to a file, directory or symlink on the current filesystem. For an >OO filesystem interface, does it really make sense to leave them all lumped >into the one class with a bunch of isdir() and islink() style methods? Or does >it make more sense to have a method on the abstract object that will return >the appropriate kind of filesystem info object? I don't think returning different types of objects makes sense. This sort of typing is inherently prone to race conditions. If you get a "DirectoryPath" object in Python, and then the underlying filesystem changes so that the name that used to be a directory is now a file (or a device, or UNIX socket, or whatever), how do you change the underlying type? >If the latter, then how would >you deal with the issue of state coherency (i.e. it was a file when you last >touched it on the filesystem, but someone else has since changed it to a >link)? (that last question actually lends strong support to the idea of a >*single* concrete interface that dynamically responds to changes in the >underlying filesystem). 
In non-filesystem cases, for example the "zip path" case, there are inherent failure modes that you can't really do anything about (what if the zip file is removed while you're in the middle of manipulating it?) but there are actual applications which depend on the precise atomic semantics and error conditions associated with moving, renaming, and deleting directories and files, at least on POSIX systems. The way Twisted does this is that FilePath objects explicitly cache the results of "stat" and then have an explicit "restat" method for resynchronizing with the current state of the filesystem. None of their methods for *manipulating* the filesystem look at this state, since it is almost guaranteed to be out of date :). >Another key difference between the two is that the abstract objects would be >hashable and serialisable, as their state is immutable and independent of the >filesystem. For the concrete objects, the only immutable part of their state >is the path name - the rest would reflect the state of the filesystem at the >current point in time. It doesn't really make sense to separate these to me; whenever you're serializing or hashing that information, the "mutable" parts should just be discarded. From ronaldoussoren at mac.com Sun Oct 1 10:13:19 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Sun, 1 Oct 2006 10:13:19 +0200 Subject: [Python-Dev] Tix not included in 2.5 for Windows In-Reply-To: References: Message-ID: <097B109A-8969-4DBC-800B-513E18C82A9B@mac.com> On Sep 30, 2006, at 11:13 PM, Scott David Daniels wrote: > Christos Georgiou wrote: >> Does anyone know why this happens? I can't find any information >> pointing to >> this being deliberate. >> >> I just upgraded to 2.5 on Windows (after making sure I can build >> extensions >> with the freeware VC++ Toolkit 2003) and some of my programs stopped
I saw in a French forum that someone else had the same >> problem, >> and what they did was to copy the relevant files from a 2.4.3 >> installation. >> I did the same, and it seems it works, with only a console message >> appearing >> as soon as a root window is created: > > Also note: the Os/X universal seems to include a Tix runtime for the > non-Intel processor, but not for the Intel processor. > This > makes me think there is a build problem. The OSX universal binaries don't include Tcl/Tk at all but link to the system version of the Tcl/Tk frameworks. Ronald -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3562 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061001/b9bbae3c/attachment.bin From ronaldoussoren at mac.com Sun Oct 1 10:54:48 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Sun, 1 Oct 2006 10:54:48 +0200 Subject: [Python-Dev] HAVE_UINTPTR_T test in configure.in Message-ID: Hi, Someone reported on the pythonmac list that HAVE_UINTPTR_T wasn't defined in pyconfig.h while it should have been defined. I'm looking into this and am now wondering whether the configure snipped below is correct: AC_MSG_CHECKING(for uintptr_t support) have_uintptr_t=no AC_TRY_COMPILE([], [uintptr_t x; x = (uintptr_t)0;], [ AC_DEFINE(HAVE_UINTPTR_T, 1, [Define this if you have the type uintptr_t.]) have_uintptr_t=yes ]) AC_MSG_RESULT($have_uintptr_t) if test "$have_uintptr_t" = yes ; then AC_CHECK_SIZEOF(uintptr_t, 4) fi This seems to check for uintptr_t as a builtin type. Isn't one supposed to include to get this type? Chaning the AC_TRY_COMPILE line to the line below fixes the issue for me, but I've only tested on OSX and don't know if this is the right fix for all supported platforms. AC_TRY_COMPILE([#include ], [uintptr_t x; x = (uintptr_t) 0;], [ Ronald -------------- next part -------------- A non-text attachment was scrubbed... 
Name: smime.p7s Type: application/pkcs7-signature Size: 3562 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061001/e8397f2d/attachment-0001.bin From nick at craig-wood.com Sun Oct 1 11:38:46 2006 From: nick at craig-wood.com (Nick Craig-Wood) Date: Sun, 1 Oct 2006 10:38:46 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: Message-ID: <20061001093846.GA20938@craig-wood.com> On Fri, Sep 29, 2006 at 12:03:03PM -0700, Guido van Rossum wrote: > I see some confusion in this thread. > > If a *LITERAL* 0.0 (or any other float literal) is used, you only get > one object, no matter how many times it is used. For some reason that doesn't happen in the interpreter which has been confusing the issue slightly...

    $ python2.5
    Python 2.5c1 (r25c1:51305, Aug 19 2006, 18:23:29)
    [GCC 4.1.2 20060814 (prerelease) (Debian 4.1.1-11)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> a=0.0
    >>> b=0.0
    >>> id(a), id(b)
    (134737756, 134737772)
    >>>

    $ python2.5 -c 'a=0.0; b=0.0; print id(a),id(b)'
    134737796 134737796

> But if the result of a *COMPUTATION* returns 0.0, you get a new object > for each such result. If you have 70 MB worth of zeros, that's clearly > computation results, not literals. In my application I'm receiving all the zeros from a server over TCP as ASCII and these are being float()ed in python. 
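[Editor's note: since CPython never caches computed floats, the translation-dict idea from earlier in the thread can be packaged as a tiny interning helper for exactly this parse-from-the-wire case. The helper name and shape are our own sketch, not anything in the standard library:]

```python
_float_cache = {}

def intern_float(text):
    """Parse `text` as a float, returning a shared object for strings
    seen before, so millions of parsed '0.0' values share one object.

    Keying on the original string (rather than on the float value)
    keeps '0.0' and '-0.0' distinct and sidesteps NaN's self-inequality.
    """
    try:
        return _float_cache[text]
    except KeyError:
        return _float_cache.setdefault(text, float(text))
```

Each plain float('0.0') call allocates a fresh object; the cache trades a dict lookup for that allocation, which pays off when a few values dominate the input.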
-- Nick Craig-Wood -- http://www.craig-wood.com/nick From nick at craig-wood.com Sun Oct 1 11:43:38 2006 From: nick at craig-wood.com (Nick Craig-Wood) Date: Sun, 1 Oct 2006 10:43:38 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <6a36e7290609301521g44b56e59iecc3b0c448cd91c3@mail.gmail.com> References: <20060929081402.GB19781@craig-wood.com> <451DC113.4040002@canterbury.ac.nz> <6a36e7290609291815i115b41b3o4ab6d196f404557f@mail.gmail.com> <451E2F32.9070405@v.loewis.de> <451E31ED.7030905@gmail.com> <6a36e7290609301521g44b56e59iecc3b0c448cd91c3@mail.gmail.com> Message-ID: <20061001094338.GB20938@craig-wood.com> On Sat, Sep 30, 2006 at 03:21:50PM -0700, Bob Ippolito wrote: > On 9/30/06, Terry Reedy wrote: > > "Nick Coghlan" wrote in message news:451E31ED.7030905 at gmail.com... > > > I suspect the problem would typically stem from floating point > > > values that are read in from a human-readable file rather than > > > being the result of a 'calculation' as such: Over a TCP socket in ASCII format for my application > > For such situations, one could create a translation dict for both common > > float values and for non-numeric missing value indicators. For instance, > > flotran = {'*': None, '1.0':1.0, '2.0':2.0, '4.0':4.0} > > The details, of course, depend on the specific case. > > But of course you have to know that common float values are never > cached and that it may cause you problems. Some users may expect them > to be because common strings and integers are cached. I have to say I was surprised to find out how many copies of 0.0 there were in my code and I guess I was subconsciously expecting the immutable 0.0s to be cached even though I know consciously I've never seen anything but int and str mentioned in the docs. 
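That expectation is easy to check; the following shows what CPython actually does here (an implementation detail, not a language guarantee):

```python
# Small ints are cached by CPython, so equal values share one object:
a, b = int("7"), int("7")
print(a is b)          # True on CPython (small-int cache)

# Floats are not cached: every float() call builds a new object,
# even for a value as common as 0.0:
x, y = float("0.0"), float("0.0")
print(x is y)          # False: two distinct live objects
```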
-- Nick Craig-Wood -- http://www.craig-wood.com/nick

From rrr at ronadam.com Sun Oct 1 12:20:07 2006 From: rrr at ronadam.com (Ron Adam) Date: Sun, 01 Oct 2006 05:20:07 -0500 Subject: [Python-Dev] PEP 351 - do while In-Reply-To: <451F4183.5050907@gmail.com> References: <001301c6e3b4$8de5bda0$1d2c440a@spain.capgemini.com> <451F4183.5050907@gmail.com> Message-ID: <451F9657.3010808@ronadam.com>

Nick Coghlan wrote:
> Hans Polak wrote:
>> Hi,
>>
>> Just an opinion, but many uses of the 'while true loop' are instances of
>> a 'do loop'. I appreciate the language layout question, so I'll give you
>> an alternative:
>>
>> do:
>>     <loop body>
>> while <condition>

(I don't think this has been suggested yet.)

    while <entry_condition>, <exit_condition>:

This would be a do-loop.

    while 1, <exit_condition>:

In situations where you want to enter a loop on one condition and exit on a second condition:

    if value1:
        value2 = True
        while value2:
            ...

Would be ...

    while value1, value2:

I've used that pattern on more than a few occasions.

A single condition while would be the same as...

    while <condition>, <condition>:    # same entry and exit condition

So do just as we do now...

    while <condition>:    # same entry and exit condition

> As I recall, the main objection to this style was that it could hide the loop
> termination condition, but that isn't actually mentioned in the PEP (and in
> the typical do-while case, the loop condition will still be clearly visible at
> the end of the loop body).

Putting both the entry and exit conditions at the top is easier to read. The end of the first loop is also the beginning of all the following loops, so having the exit_condition at the top doesn't really put anything out of order. If the exit_condition is not evaluated until the top of the second loop, the names it uses do not need to be predefined, they can just be assigned in the loop.
Ron From murman at gmail.com Sun Oct 1 16:14:14 2006 From: murman at gmail.com (Michael Urman) Date: Sun, 1 Oct 2006 09:14:14 -0500 Subject: [Python-Dev] PEP 351 - do while In-Reply-To: <451F9657.3010808@ronadam.com> References: <001301c6e3b4$8de5bda0$1d2c440a@spain.capgemini.com> <451F4183.5050907@gmail.com> <451F9657.3010808@ronadam.com> Message-ID: On 10/1/06, Ron Adam wrote: > (I don't think this has been suggested yet.) > > while , : > [snip] > Putting both the entry and exit conditions at the top is easier to read. I agree in principle, but I thought the proposed syntax already has meaning today (as it turns out, parentheses are required to make a tuple in a while condition, at least in 2.4 and 2.5). To help stave off similar confusion I'd rather see a pseudo-keyword added. However my first candidate "until" seems to apply a negation to the exit condition. while True until False: # run once? run forever? while True until True: # run forever? run once? It's still very different from any syntactical syntax I can think of in python. I'm not sure I like the idea. Michael -- Michael Urman http://www.tortall.net/mu/blog From ark at acm.org Sun Oct 1 18:58:41 2006 From: ark at acm.org (Andrew Koenig) Date: Sun, 1 Oct 2006 12:58:41 -0400 Subject: [Python-Dev] PEP 351 - do while In-Reply-To: <451F9657.3010808@ronadam.com> Message-ID: <002501c6e57a$dbf69920$6402a8c0@arkdesktop> > (I don't think this has been suggested yet.) > > while , : > This usage makes me uneasy, not the least because I don't understand why the comma isn't creating a tuple. That is, why whould while x, y: be any different from while (x, y): ? My other concern is that is evaluated out of sequence. 
From tjreedy at udel.edu Sun Oct 1 19:54:31 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 1 Oct 2006 13:54:31 -0400 Subject: [Python-Dev] Caching float(0.0) References: <20061001093846.GA20938@craig-wood.com> Message-ID:

"Nick Craig-Wood" wrote in message news:20061001093846.GA20938 at craig-wood.com...
> On Fri, Sep 29, 2006 at 12:03:03PM -0700, Guido van Rossum wrote:
>> I see some confusion in this thread.
>>
>> If a *LITERAL* 0.0 (or any other float literal) is used, you only get
>> one object, no matter how many times it is used.
>
> For some reason that doesn't happen in the interpreter which has been
> confusing the issue slightly...
>
> $ python2.5
>>>> a=0.0
>>>> b=0.0
>>>> id(a), id(b)
> (134737756, 134737772)

Guido said *a* literal (emphasis shifted), reused as in a loop or function recalled, while you used *a* literal, then *another* literal, without reuse. Try a=b=0.0 instead.

Terry Jan Reedy

From pje at telecommunity.com Sun Oct 1 19:55:06 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun, 01 Oct 2006 13:55:06 -0400 Subject: [Python-Dev] PEP 351 - do while In-Reply-To: <002501c6e57a$dbf69920$6402a8c0@arkdesktop> References: <451F9657.3010808@ronadam.com> Message-ID: <5.1.1.6.0.20061001135107.02f49e68@sparrow.telecommunity.com>

At 12:58 PM 10/1/2006 -0400, Andrew Koenig wrote:
> > (I don't think this has been suggested yet.)
> >
> >     while <entry_condition>, <exit_condition>:
> >
>This usage makes me uneasy, not the least because I don't understand why the
>comma isn't creating a tuple. That is, why would
>
>     while x, y:
>
>be any different from
>
>     while (x, y):
>
>?
>
>My other concern is that <exit_condition> is evaluated out of sequence.

This pattern:

    while entry_cond:
        ...
    and while not exit_cond:
        ...

has been suggested before, and I believe that at least one of the times it was suggested, it had some support from Guido. Essentially, the "and while not exit" is equivalent to an "if exit: break" that's more visible due to not being indented.
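For illustration, the shape of that pattern spelled with today's syntax; the function and its names are made up for the example:

```python
def drain(queue, limit):
    """Take items while the queue is non-empty, stopping early at `limit`."""
    taken = []
    while queue:                  # entry condition, checked on each pass
        taken.append(queue.pop(0))
        if len(taken) >= limit:   # the "and while not exit_cond" part,
            break                 # written as an explicit break today
    return taken

print(drain([1, 2, 3, 4], 2))     # [1, 2]
print(drain([9], 5))              # [9]
```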
I'm not sure I like it, myself, but out of all the things that get suggested for this issue, I think it's the best. The fact that it's still not very good despite being the best, is probably the reason we don't have it yet. :) From exarkun at divmod.com Sun Oct 1 20:01:51 2006 From: exarkun at divmod.com (Jean-Paul Calderone) Date: Sun, 1 Oct 2006 14:01:51 -0400 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: Message-ID: <20061001180151.1717.1491936593.divmod.quotient.64438@ohm> On Sun, 1 Oct 2006 13:54:31 -0400, Terry Reedy wrote: > >"Nick Craig-Wood" wrote in message >news:20061001093846.GA20938 at craig-wood.com... >> On Fri, Sep 29, 2006 at 12:03:03PM -0700, Guido van Rossum wrote: >>> I see some confusion in this thread. >>> >>> If a *LITERAL* 0.0 (or any other float literal) is used, you only get >>> one object, no matter how many times it is used. >> >> For some reason that doesn't happen in the interpreter which has been >> confusing the issue slightly... >> >> $ python2.5 >>>>> a=0.0 >>>>> b=0.0 >>>>> id(a), id(b) >> (134737756, 134737772) > >Guido said *a* literal (emphasis shifted), reused as in a loop or function >recalled, while you used *a* literal, then *another* literal, without >reuse. Try a=b=0.0 instead. Actually this just has to do with, um, "compilation units", for lack of a better term: exarkun at kunai:~$ python Python 2.4.3 (#2, Apr 27 2006, 14:43:58) [GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> a = 0.0 >>> b = 0.0 >>> print a is b False >>> ^D exarkun at kunai:~$ cat > test.py a = 0.0 b = 0.0 print a is b ^D exarkun at kunai:~$ python test.py True exarkun at kunai:~$ cat > test_a.py a = 0.0 ^D exarkun at kunai:~$ cat > test_b.py b = 0.0 ^D exarkun at kunai:~$ cat > test.py from test_a import a from test_b import b print a is b ^D exarkun at kunai:~$ python test.py False exarkun at kunai:~$ python Python 2.4.3 (#2, Apr 27 2006, 14:43:58) [GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> a = 0.0; b = 0.0 >>> print a is b True >>> exarkun at kunai:~$ Each line in an interactive session is compiled separately, like modules are compiled separately. With the current implementation, literals in a single compilation unit have a chance to be "cached" like this. Literals in different compilation units, even for the same value, don't. Jean-Paul From ronaldoussoren at mac.com Sun Oct 1 20:11:12 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Sun, 1 Oct 2006 20:11:12 +0200 Subject: [Python-Dev] HAVE_UINTPTR_T test in configure.in In-Reply-To: References: Message-ID: <27707AD9-69FE-41A1-A158-B08B0B791FFE@mac.com> On Oct 1, 2006, at 10:54 AM, Ronald Oussoren wrote: > Hi, > > Someone reported on the pythonmac list that HAVE_UINTPTR_T wasn't > defined in pyconfig.h while it should have been defined. I'm > looking into this and am now wondering whether the configure > snipped below is correct: > > AC_MSG_CHECKING(for uintptr_t support) > have_uintptr_t=no > AC_TRY_COMPILE([], [uintptr_t x; x = (uintptr_t)0;], [ > AC_DEFINE(HAVE_UINTPTR_T, 1, [Define this if you have the type > uintptr_t.]) > have_uintptr_t=yes > ]) > AC_MSG_RESULT($have_uintptr_t) > if test "$have_uintptr_t" = yes ; then > AC_CHECK_SIZEOF(uintptr_t, 4) > fi > > This seems to check for uintptr_t as a builtin type. Isn't one > supposed to include to get this type? 
> > Changing the AC_TRY_COMPILE line to the line below fixes the issue
> > for me, but I've only tested on OSX and don't know if this is the
> > right fix for all supported platforms.
> >
> > AC_TRY_COMPILE([#include <stdint.h>], [uintptr_t x; x = (uintptr_t) 0;], [

The same problem exists on Linux, and is fixed by the same change.

BTW. Python 2.4 suffers from the same problem and I've filed a bugreport for this (http://www.python.org/sf/1568842).

Ronald
-------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3562 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061001/800bb0d0/attachment.bin

From ark at acm.org Sun Oct 1 20:44:55 2006 From: ark at acm.org (Andrew Koenig) Date: Sun, 1 Oct 2006 14:44:55 -0400 Subject: [Python-Dev] PEP 351 - do while In-Reply-To: <5.1.1.6.0.20061001135107.02f49e68@sparrow.telecommunity.com> Message-ID: <002e01c6e589$b324ee20$6402a8c0@arkdesktop>

> This pattern:
>
>     while entry_cond:
>         ...
>     and while not exit_cond:
>         ...
>
> has been suggested before, and I believe that at least one of the times it
> was suggested, it had some support from Guido. Essentially, the "and
> while not exit" is equivalent to an "if exit: break" that's more visible
> due to not being indented.

I like this suggestion. In fact it is possible that at one time I suggested something similar.

It reminds me of something that Dijkstra suggested in his 1976 book "A Discipline of Programming." His idea looked somewhat like this:

    do condition 1 -> action 1
       ...
    [] condition n -> action n
    od

Here, the [] should be thought of as a delimiter; it was typeset as a tall narrow rectangle. The semantics are as follows: If all of the conditions are false, the statement does nothing. Otherwise, the implementation picks one of the true conditions, executes the corresponding action, and does it all again.
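A toy Python rendering of that semantics; the `do_od` helper and its names are invented for illustration, and the gcd example uses the subtraction form of Euclid's algorithm, which terminates when x == y (a mod-based form can reach a `mod 0` step):

```python
import random

def do_od(*guarded_actions):
    """Dijkstra's do...od: while any guard holds, run one true guard's action."""
    while True:
        enabled = [action for guard, action in guarded_actions if guard()]
        if not enabled:
            break                    # every guard false: the loop terminates
        random.choice(enabled)()     # arbitrary choice among the true guards

# Euclid's gcd in guarded-command style; state kept in a dict so the
# action lambdas can rebind it:
state = {"x": 12, "y": 18}
do_od(
    (lambda: state["x"] > state["y"],
     lambda: state.update(x=state["x"] - state["y"])),
    (lambda: state["y"] > state["x"],
     lambda: state.update(y=state["y"] - state["x"])),
)
print(state)    # {'x': 6, 'y': 6} -- both equal gcd(12, 18)
```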
There is no guarantee about which action is executed if more than one of the conditions is true. The general idea, then, is that each action should falsify its corresponding condition while bring the loop closer to termination; when all of the conditions are false, the loop is done. For example, he might write Euclid's algorithm this way: do x < y -> y := y mod x [] y < x -> x := x mod y od If we were to adopt "while ... and while" in Python, then Dijkstra's construct could be rendered this way: while x < y: y %= x or while y < x: x %= y I'm not suggesting this seriously as I don't have enough realistic use cases. Still, it's interesting to see that someone else has grappled with a similar problem. From rrr at ronadam.com Sun Oct 1 21:08:45 2006 From: rrr at ronadam.com (Ron Adam) Date: Sun, 01 Oct 2006 14:08:45 -0500 Subject: [Python-Dev] PEP 351 - do while In-Reply-To: References: <001301c6e3b4$8de5bda0$1d2c440a@spain.capgemini.com> <451F4183.5050907@gmail.com> <451F9657.3010808@ronadam.com> Message-ID: <4520123D.90303@ronadam.com> Michael Urman wrote: > On 10/1/06, Ron Adam wrote: >> (I don't think this has been suggested yet.) >> >> while , : >> > > [snip] > >> Putting both the entry and exit conditions at the top is easier to read. > > I agree in principle, but I thought the proposed syntax already has > meaning today (as it turns out, parentheses are required to make a > tuple in a while condition, at least in 2.4 and 2.5). To help stave > off similar confusion I'd rather see a pseudo-keyword added. However > my first candidate "until" seems to apply a negation to the exit > condition. > > while True until False: # run once? run forever? > while True until True: # run forever? run once? > > It's still very different from any syntactical syntax I can think of > in python. I'm not sure I like the idea. > > Michael I thought the comma might be a sticking point. 
My first thought was to have a series of conditions evaluated on loops with the last condition repeated. while loop1_cond, loop2_cond, loop3_cond, ..., rest_condition: But I couldn't think of good uses past the first two that are obvious so I trimmed it down to just enter_condition and exit_condition which keeps it simple. But from this example you can see they are all really just top of the loop tests done in sequence. A do loop is just a matter of having the first one evaluate as True. The current while condition is an entry condition the first time it's evaluated and an exit condition on the rest. So by splitting it in two, we can specify an enter and exit test more explicitly. There's a certain consistency I like about this also. Is it just getting around or finding a nice alternative to the comma that is the biggest problem with this? Maybe just using "then" would work? while cond1 then cond2: Cheers, Ron From guido at python.org Sun Oct 1 22:35:56 2006 From: guido at python.org (Guido van Rossum) Date: Sun, 1 Oct 2006 13:35:56 -0700 Subject: [Python-Dev] PEP 355 status In-Reply-To: <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> Message-ID: On 9/30/06, Giovanni Bajo wrote: > It would be terrific if you gave us some clue about what is wrong in PEP355, so > that the next guy does not waste his time. For instance, I find PEP355 > incredibly good for my own path manipulation (much cleaner and concise than the > awful os.path+os+shutil+stat mix), and I have trouble understanding what is > *so* wrong with it. > > You said "it's an amalgam of unrelated functionality", but you didn't say what > exactly is "unrelated" for you. Sorry, no time. But others in this thread clearly agreed with me, so they can guide you. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From arigo at tunes.org Sun Oct 1 23:09:24 2006 From: arigo at tunes.org (Armin Rigo) Date: Sun, 1 Oct 2006 23:09:24 +0200 Subject: [Python-Dev] difficulty of implementing phase 2 of PEP 302 in Python source In-Reply-To: References: Message-ID: <20061001210923.GA31682@code0.codespeak.net> Hi Brett, On Wed, Sep 27, 2006 at 02:11:30PM -0700, Brett Cannon wrote: > is so bad that it is worth trying to re-implement the import semantics in > pure Python or if in the name of time to just work with the C code. In the name of time, sanity and usefulness, rewriting the expected semantics in Python would be a major good idea IMHO. I can cite many projects that have reimplemented half of the semantics in Python (runpy.py, the 'py' lib, PyPy...), but none that completed them. Having such a complete implementation available in the first place would be helpful. A bientot, Armin From Jack.Jansen at cwi.nl Sun Oct 1 23:04:53 2006 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Sun, 1 Oct 2006 23:04:53 +0200 Subject: [Python-Dev] OT: How many other people got this spam? References: <0F6EC883$0A010F152D3B$9C41388@E457FDF720CE414> Message-ID: <5B8AC307-658E-4ED1-BBBD-DE56DFEB3357@cwi.nl> I was wondering: how many other people who maintain websites (well: "maintain" might be a bit of a misnomer in my case:-) related to Python have also got this spam? Begin forwarded message: > From: "Snake Tracks" > Date: October 1, 2006 21:21:45 GMT+02:00 > To: Cwi > Subject: Special Invitation for cwi.nl from Snake Tracks > > Fellow Website Owner/Operator; > > As of September 29th, 2006 we will be launching what is soon to be the > worlds largest snake enthusiast website. The website contains > valuable > information for all those interested in snakes including care sheets, > species information and identification, breeding information, and an > extensive list of snake specific forums. 
> > We welcome you to visit our website and join our community of snake > enthusiasts worldwide. Currently we are browsing through Google and > other major search engines looking for websites we feel would make > good > link partners. I have personally come across your site and think that > exchanging links could benefit both of our businesses. By linking > to us > you will receive a reciprocal link and be showcased in front of all > our > visitors. > > If you are interested in this partnership please add one of the > following text links or banners to a high traffic area on your > website: > > 1) Snake Tracks - The Worlds Largest Snake Enthusiast Website. Visit > our site for care sheets, species information, field herping > information, breeding, captive care, and our extensive list of snake > enthusiast forums. > > 2) Snake Tracks Forums - Visit the Worlds Largest Collection of Snake > Enthusiast forums including our field herping, captive care, habitat > design, and regional forums. > > 3) Snake Care Sheets - Visit the Worlds Largest Snake Enthusiast > Website. Forums, Care Sheets, Field Herping, Species information and > more. > > You may also visit our link page to choose from several banner images > and text links. Once you have linked to our website, fill out the > form > and we will add your site to our directory. > > http://www.snaketracks.com/linktous.html > > I look forward to hearing from you in regards to this email. Please > allow up to 24 hours for a response as we are currently receiving > extremely large amounts of email. > > Sincerely; > Blair Russell - Snaketracks.com > -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From greg.ewing at canterbury.ac.nz Mon Oct 2 03:03:44 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 02 Oct 2006 14:03:44 +1300 Subject: [Python-Dev] OT: How many other people got this spam? 
In-Reply-To: <5B8AC307-658E-4ED1-BBBD-DE56DFEB3357@cwi.nl> References: <0F6EC883$0A010F152D3B$9C41388@E457FDF720CE414> <5B8AC307-658E-4ED1-BBBD-DE56DFEB3357@cwi.nl> Message-ID: <45206570.9020802@canterbury.ac.nz> Jack Jansen wrote: > I was wondering: how many other people who maintain websites (well: > "maintain" might be a bit of a misnomer in my case:-) related to > Python have also got this spam? I got it. I was rather amused that they claim to have been "looking for sites that would make good link partners" when obviously no human eye of theirs has actually seen my site. Addressing me as "Canterbury" in the To: line wasn't a good sign either. :-) I'm tempted to take them up on the offer and see whether they actually make a link to my site from theirs... -- Greg From fredrik at pythonware.com Mon Oct 2 07:58:15 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 02 Oct 2006 07:58:15 +0200 Subject: [Python-Dev] OT: How many other people got this spam? In-Reply-To: <5B8AC307-658E-4ED1-BBBD-DE56DFEB3357@cwi.nl> References: <0F6EC883$0A010F152D3B$9C41388@E457FDF720CE414> <5B8AC307-658E-4ED1-BBBD-DE56DFEB3357@cwi.nl> Message-ID: Jack Jansen wrote: > I was wondering: how many other people who maintain websites (well: > "maintain" might be a bit of a misnomer in my case:-) related to > Python have also got this spam? probably everyone. I've gotten two copies, this far. From nick at craig-wood.com Mon Oct 2 09:54:35 2006 From: nick at craig-wood.com (Nick Craig-Wood) Date: Mon, 2 Oct 2006 08:54:35 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <20061001180151.1717.1491936593.divmod.quotient.64438@ohm> References: <20061001180151.1717.1491936593.divmod.quotient.64438@ohm> Message-ID: <20061002075434.GA18278@craig-wood.com> On Sun, Oct 01, 2006 at 02:01:51PM -0400, Jean-Paul Calderone wrote: > Each line in an interactive session is compiled separately, like modules > are compiled separately. 
With the current implementation, literals in a > single compilation unit have a chance to be "cached" like this. Literals > in different compilation units, even for the same value, don't. That makes sense - thanks for the explanation! -- Nick Craig-Wood -- http://www.craig-wood.com/nick From jason.orendorff at gmail.com Mon Oct 2 12:28:28 2006 From: jason.orendorff at gmail.com (Jason Orendorff) Date: Mon, 2 Oct 2006 06:28:28 -0400 Subject: [Python-Dev] PEP 355 status In-Reply-To: <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> Message-ID: On 9/30/06, Giovanni Bajo wrote: > Guido van Rossum wrote: > > OK. Pronouncement: PEP 355 is dead. [...] > > It would be terrific if you gave us some clue about what is > wrong in PEP355, [...] Here are my guesses. I believe Guido rejected this PEP for a lot of reasons. By the way, what I'm about to do is known as "channeling Guido (badly)" and I'm pretty sure it annoys him. Sorry, Guido. Please don't treat the following as authoritative; I have never met Guido and obviously I cannot speak for him. - I don't think Guido ever saw much benefit from "path objects". That is, the Motivation was not compelling. I think the main motivation is to eliminate some clutter and add a handful of useful methods to the stdlib, so it's easy to see how this could be the case. - Guido just flat-out didn't like the looks of the PEP. Too much weirdness. (path.py contains more weirdness, including some stuff Guido particularly disliked, and I think it's fair to say that PEP355 suffered somewhat by association.) - Any proposal to add a Second Way To Do It has to meet a very high standard. PEP355 was too big to be considered an incremental change. Yet it didn't even attempt to fix all the perceived problems with the existing APIs. A more thorough job would have had a better chance. 
- Nobody liked the API design--too many methods.

- Now we're hearing rumors of better ideas out there, which comes as a relief.

I suspect any one of these could have scuttled the proposal.

-j

From ncoghlan at gmail.com Mon Oct 2 12:48:05 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 02 Oct 2006 20:48:05 +1000 Subject: [Python-Dev] PEP 315 - do while In-Reply-To: <000601c6e5ed$d99ad290$1d2c440a@spain.capgemini.com> References: <000601c6e5ed$d99ad290$1d2c440a@spain.capgemini.com> Message-ID: <4520EE65.50507@gmail.com>

Hans Polak wrote:
> Hi Nick,
>
> Yep, PEP 315. Sorry about that.
>
> Now, about your suggestion
>
>     do:
>         <setup code>
>     while <condition>:
>         <loop body>
>     else:
>         <loop else>
>
> This is pythonic, but not logical. The 'do' will execute at least once, so
> the else clause is not needed, nor is the <loop else>. The <loop body>
> should go before the while terminator.

This objection is based on a misunderstanding of what the else clause is for in a Python loop. The else clause is only executed if the loop terminated naturally (the exit condition became false) rather than being explicitly terminated using a break statement.

This behaviour is most commonly useful when using a for loop to search through an iterable (breaking when the object is found, and using the else clause to handle the 'not found' case), but it is also defined for while loops.

Regards,
Nick.
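The search idiom described above, written out; the function and names are invented for the example:

```python
def classify(items, target):
    """Search `items` for `target`, using the loop's else clause."""
    for i, item in enumerate(items):
        if item == target:
            result = ("found", i)
            break                    # explicit termination: else is skipped
    else:                            # runs only if the loop ended naturally
        result = ("not found", None)
    return result

print(classify(["a", "b", "c"], "b"))   # ('found', 1)
print(classify(["a", "b", "c"], "z"))   # ('not found', None)
```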
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From martin at v.loewis.de Mon Oct 2 15:43:48 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 02 Oct 2006 15:43:48 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <451E31ED.7030905@gmail.com> References: <20060929081402.GB19781@craig-wood.com> <451DC113.4040002@canterbury.ac.nz> <6a36e7290609291815i115b41b3o4ab6d196f404557f@mail.gmail.com> <451E2F32.9070405@v.loewis.de> <451E31ED.7030905@gmail.com> Message-ID: <45211794.7020503@v.loewis.de> Nick Coghlan schrieb: >> Right. Although I do wonder what kind of software people write to run >> into this problem. As Guido points out, the numbers must be the result >> from some computation, or created by an extension module by different >> means. If people have many *simultaneous* copies of 0.0, I would expect >> there is something else really wrong with the data structures or >> algorithms they use. > > I suspect the problem would typically stem from floating point values > that are read in from a human-readable file rather than being the result > of a 'calculation' as such: That's how you can end up with 100 different copies of 0.0. But apparently, people are creating millions of them, and keep them in memory simultaneously. Unless the text file *only* consists of floating point numbers, I would expect they have bigger problems than that. Regards, Martin From martin at v.loewis.de Mon Oct 2 15:49:50 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 02 Oct 2006 15:49:50 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <129CEF95A523704B9D46959C922A28000451FED3@nemesis.central.ccp.cc> References: <129CEF95A523704B9D46959C922A28000451FED3@nemesis.central.ccp.cc> Message-ID: <452118FE.6040704@v.loewis.de> Kristj?n V. 
Jónsson schrieb:
> Well, a lot of extension code, like ours, uses PyFloat_FromDouble(foo);
> This can be from vectors and stuff.

Hmm. If you get a lot of 0.0 values from vectors and stuff, I would expect that memory usage is already high.

In any case, a module that creates a lot of copies of 0.0 that way could do its own caching, right?

> Very often these are values from a database. Integral float values
> are very common in such cases and it didn't occur to me that they
> weren't being reused, at least for small values.

Sure - but why are people keeping them in memory all the time? Also, isn't it a mis-design of the database if you have many float values in it that represent natural numbers? Shouldn't you use a more appropriate data type, then?

> Also, a lot of arithmetic involving floats is expected to end in
> integers, like computing some index from a float value. Integers get
> promoted to floats when touched by them, as you know.

Again, sounds like a programming error to me.

> Anyway, I now precreate integral values from -10 to 10 with great
> effect. The cost is minimal, the benefit great.

In an extension module, the knowledge about the application domain is larger, so it may be reasonable to do the caching there. I would still expect that in the typical application where this is an issue, there is some kind of larger design bug.

Regards,
Martin

From kristjan at ccpgames.com Mon Oct 2 16:19:18 2006 From: kristjan at ccpgames.com (Kristján V. Jónsson) Date: Mon, 2 Oct 2006 14:19:18 -0000 Subject: [Python-Dev] Caching float(0.0) Message-ID: <129CEF95A523704B9D46959C922A280002FE99A7@nemesis.central.ccp.cc>

Well, Skip made the argument when analyzing the test suite: "So for a largely non-floating point "application", a fair number of floats are allocated, a bit over 25% of them are -1.0, 0.0 or +1.0, and nearly 50% of them are whole numbers between -10.0 and 10.0, inclusive.
" In C, there is no need to treat 0.0 any different from any other value, since they are literals. You will find axis aligned unit vectors to be very common in any 3D app. I can't say exactly where all those integral floats are coming from. I could investigate further, but it seems to me that they are simply quite common in real-world applications. Experience shows that 0.0 is _very_ common, even, and the test suite test skip made should make this abundantly clear. I can't see how this situation is any different from the re-use of low ints. There is no fundamental law that says that ints below 100 are more common than other, yet experience shows that this is so, and so they are reused. Rather than to view this as a programming error, why not simply accept that this is a recurring pattern and adjust python to be more efficient when faced by it? Surely a lot of karma lies that way? Cheers, Kristj?n > -----Original Message----- > From: "Martin v. L?wis" [mailto:martin at v.loewis.de] > Sent: 2. okt?ber 2006 13:50 > To: Kristj?n V. J?nsson > Cc: Bob Ippolito; python-dev at python.org > Subject: Re: [Python-Dev] Caching float(0.0) > > Kristj?n V. J?nsson schrieb: > > Well, a lot of extension code, like ours use > PyFloat_FromDouble(foo); > > This can be from vectors and stuff. > > Hmm. If you get a lot of 0.0 values from vectors and stuff, I > would expect that memory usage is already high. > > In any case, a module that creates a lot of copies of 0.0 > that way could do its own caching, right? > > > Very often these are values from a database. Integral float values > > are very common in such case and id didn't occur to me that they > > weren't being reused, at least for small values. > > Sure - but why are keeping people them in memory all the time? > Also, isn't it a mis-design of the database if you have many > float values in it that represent natural numbers? Shouldn't > you use a more appropriate data type, then? 
> > > Also, a lot of arithmetic involving floats is expected to end in > > integers, like computing some index from a float value. > Integers get > > promoted to floats when touched by them, as you know. > > Again, sounds like a programming error to me. > > > Anyway, I now precreate integral values from -10 to 10 with great > > effect. The cost is minimal, the benefit great. > > In an extension module, the knowledge about the application > domain is larger, so it may be reasonable to do the caching > there. I would still expect that in the typical application > where this is an issue, there is some kind of larger design bug. > > Regards, > Martin > From martin at v.loewis.de Mon Oct 2 16:37:09 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 02 Oct 2006 16:37:09 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <129CEF95A523704B9D46959C922A280002FE99A7@nemesis.central.ccp.cc> References: <129CEF95A523704B9D46959C922A280002FE99A7@nemesis.central.ccp.cc> Message-ID: <45212415.5000104@v.loewis.de> Kristj?n V. J?nsson schrieb: > I can't see how this situation is any different from the re-use of > low ints. There is no fundamental law that says that ints below 100 > are more common than other, yet experience shows that this is so, > and so they are reused. There are two important differences: 1. it is possible to determine whether the value is "special" in constant time, and also fetch the singleton value in constant time for ints; the same isn't possible for floats. 2. it may be that there is a loss of precision in reusing an existing value (although I'm not certain that this could really happen). For example, could it be that two values compare successful in ==, yet are different values? I know this can't happen for integers, so I feel much more comfortable with that cache. 
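Martin's precision worry is well founded for exactly one pair of values: 0.0 and -0.0 compare equal yet behave differently, so a cache keyed by == would merge them. A quick check:

```python
import math

# 0.0 and -0.0 are equal under ==, so a value-keyed cache would
# hand back one object for both:
print(0.0 == -0.0)                     # True

# ...but they are observably different values:
print(math.copysign(1.0, 0.0))         # 1.0
print(math.copysign(1.0, -0.0))        # -1.0
print(math.atan2(0.0, -1.0))           # pi
print(math.atan2(-0.0, -1.0))          # -pi
```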
> Rather than view this as a programming error, why not simply > accept that this is a recurring pattern and adjust python to be more > efficient when faced by it? Surely a lot of karma lies that way? I'm worried about the penalty that this causes in terms of run-time cost. Also, how do you choose what values to cache? Regards, Martin From kristjan at ccpgames.com Mon Oct 2 17:08:08 2006 From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_V=2E_J=F3nsson?=) Date: Mon, 2 Oct 2006 15:08:08 -0000 Subject: [Python-Dev] Caching float(0.0) Message-ID: <129CEF95A523704B9D46959C922A280002FE99A9@nemesis.central.ccp.cc> I see, you are thinking of the general fractional case. My point was that whole numbers seem to pop up often and to reuse those is easy. I did a test of tracking actual floating point numbers and the majority of heavy usage comes from integral values. It would indeed be strange if some fractional number were heavily used but it can be argued that integral ones are "special" in many ways. Anyway, Skip noted that 50% of all floats are whole numbers between -10 and 10 inclusive, and this is the code that I employ in our python build today: PyObject * PyFloat_FromDouble(double fval) { register PyFloatObject *op; int ival; if (free_list == NULL) { if ((free_list = fill_free_list()) == NULL) return NULL; /* CCP addition, cache common values */ if (!f_reuse[0]) { int i; for(i = 0; i<21; i++) f_reuse[i] = PyFloat_FromDouble((double)(i-10)); } } /* CCP addition, check for recycling */ ival = (int)fval; if ((double)ival == fval && ival>=-10 && ival <= 10) { ival+=10; if (f_reuse[ival]) { Py_INCREF(f_reuse[ival]); return f_reuse[ival]; } } ... Cheers, Kristján > -----Original Message----- > From: "Martin v. Löwis" [mailto:martin at v.loewis.de] > Sent: 2. október 2006 14:37 > To: Kristján V. Jónsson > Cc: Bob Ippolito; python-dev at python.org > Subject: Re: [Python-Dev] Caching float(0.0) > > Kristján V.
Jónsson schrieb: > > I can't see how this situation is any different from the > re-use of low > > ints. There is no fundamental law that says that ints > below 100 are > > more common than others, yet experience shows that this is > so, and so > > they are reused. > > There are two important differences: > 1. it is possible to determine whether the value is "special" in > constant time, and also fetch the singleton value in constant > time for ints; the same isn't possible for floats. > 2. it may be that there is a loss of precision in reusing an existing > value (although I'm not certain that this could really happen). > For example, could it be that two values compare successful in > ==, yet are different values? I know this can't happen for > integers, so I feel much more comfortable with that cache. > > > Rather than view this as a programming error, why not > simply accept > > that this is a recurring pattern and adjust python to be more > > efficient when faced by it? Surely a lot of karma lies that way? > > I'm worried about the penalty that this causes in terms of > run-time cost. Also, how do you choose what values to cache? > > Regards, > Martin > From mwh at python.net Mon Oct 2 17:22:14 2006 From: mwh at python.net (Michael Hudson) Date: Mon, 02 Oct 2006 16:22:14 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <45212415.5000104@v.loewis.de> ( =?iso-8859-1?q?Martin_v._L=F6wis's_message_of?= "Mon, 02 Oct 2006 16:37:09 +0200") References: <129CEF95A523704B9D46959C922A280002FE99A7@nemesis.central.ccp.cc> <45212415.5000104@v.loewis.de> Message-ID: <2m4pumg2u1.fsf@starship.python.net> "Martin v. Löwis" writes: > Kristján V. Jónsson schrieb: >> I can't see how this situation is any different from the re-use of >> low ints. There is no fundamental law that says that ints below 100 >> are more common than others, yet experience shows that this is so, >> and so they are reused. > > There are two important differences: > 1.
it is possible to determine whether the value is "special" in > constant time, and also fetch the singleton value in constant > time for ints; the same isn't possible for floats. I don't think you mean "constant time" here do you? I think most of the code posted so far has been constant time, at least in terms of instruction count, though some might indeed be fairly slow on some processors -- conversion from double to integer on the PowerPC involves a trip off to memory for example. Even so, everything should be fairly efficient compared to allocation, even with PyMalloc. > 2. it may be that there is a loss of precision in reusing an existing > value (although I'm not certain that this could really happen). > For example, could it be that two values compare successful in > ==, yet are different values? I know this can't happen for > integers, so I feel much more comfortable with that cache. I think the only case is that the two zeros compare equal, which is unfortunate given that it's the most compelling value to cache... I don't know a reliable and fast way to distinguish +0.0 and -0.0. Cheers, mwh -- The bottom tier is what a certain class of wanker would call "business objects" ... -- Greg Ward, 9 Dec 1999 From martin at v.loewis.de Mon Oct 2 17:34:39 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 02 Oct 2006 17:34:39 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <2m4pumg2u1.fsf@starship.python.net> References: <129CEF95A523704B9D46959C922A280002FE99A7@nemesis.central.ccp.cc> <45212415.5000104@v.loewis.de> <2m4pumg2u1.fsf@starship.python.net> Message-ID: <4521318F.6080002@v.loewis.de> Michael Hudson schrieb: >> 1. it is possible to determine whether the value is "special" in >> constant time, and also fetch the singleton value in constant >> time for ints; the same isn't possible for floats. > > I don't think you mean "constant time" here do you? 
Right; I really wondered whether the code was dependent or independent of the number of special-case numbers. > I think most of > the code posted so far has been constant time, at least in terms of > instruction count, though some might indeed be fairly slow on some > processors -- conversion from double to integer on the PowerPC > involves a trip off to memory for example. Kristján's code testing only for integers in a range would be of that kind. Code that tests for a list of literals determined at compile time typically needs time "linear" with the number of special-cased constants (of course, as there is a fixed number of constants, this is O(1)). >> 2. it may be that there is a loss of precision in reusing an existing >> value (although I'm not certain that this could really happen). >> For example, could it be that two values compare successful in >> ==, yet are different values? I know this can't happen for >> integers, so I feel much more comfortable with that cache. > > I think the only case is that the two zeros compare equal, which is > unfortunate given that it's the most compelling value to cache... Thanks for pointing that out. I can believe this is the only case in IEEE-754; I also wonder whether alternative implementations could cause problems (although I don't really worry too much about VMS). Regards, Martin From aahz at pythoncraft.com Mon Oct 2 18:51:30 2006 From: aahz at pythoncraft.com (Aahz) Date: Mon, 2 Oct 2006 09:51:30 -0700 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <4521318F.6080002@v.loewis.de> References: <129CEF95A523704B9D46959C922A280002FE99A7@nemesis.central.ccp.cc> <45212415.5000104@v.loewis.de> <2m4pumg2u1.fsf@starship.python.net> <4521318F.6080002@v.loewis.de> Message-ID: <20061002165130.GA1166@panix.com> On Mon, Oct 02, 2006, "Martin v.
L?wis" wrote: > Michael Hudson schrieb: >> >> I think most of >> the code posted so far has been constant time, at least in terms of >> instruction count, though some might indeed be fairly slow on some >> processors -- conversion from double to integer on the PowerPC >> involves a trip off to memory for example. > > Kristian's code testing only for integers in a range would be of > that kind. Code that tests for a list of literals determined > at compile time typically needs time "linear" with the number of > special-cased constants (of course, as that there is a fixed > number of constants, this is O(1)). What if we do this work only on float()? -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "LL YR VWL R BLNG T S" -- www.nancybuttons.com From jcarlson at uci.edu Mon Oct 2 19:33:03 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Mon, 02 Oct 2006 10:33:03 -0700 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <2m4pumg2u1.fsf@starship.python.net> References: <45212415.5000104@v.loewis.de> <2m4pumg2u1.fsf@starship.python.net> Message-ID: <20061002101308.08FC.JCARLSON@uci.edu> Michael Hudson wrote: > "Martin v. L?wis" writes: > > Kristj?n V. J?nsson schrieb: > >> I can't see how this situation is any different from the re-use of > >> low ints. There is no fundamental law that says that ints below 100 > >> are more common than other, yet experience shows that this is so, > >> and so they are reused. > > > > There are two important differences: > > 1. it is possible to determine whether the value is "special" in > > constant time, and also fetch the singleton value in constant > > time for ints; the same isn't possible for floats. > > I don't think you mean "constant time" here do you? 
I think most of > the code posted so far has been constant time, at least in terms of > instruction count, though some might indeed be fairly slow on some > processors -- conversion from double to integer on the PowerPC > involves a trip off to memory for example. Even so, everything should > be fairly efficient compared to allocation, even with PyMalloc. > > > 2. it may be that there is a loss of precision in reusing an existing > > value (although I'm not certain that this could really happen). > > For example, could it be that two values compare successful in > > ==, yet are different values? I know this can't happen for > > integers, so I feel much more comfortable with that cache. > > I think the only case is that the two zeros compare equal, which is > unfortunate given that it's the most compelling value to cache... > > I don't know a reliable and fast way to distinguish +0.0 and -0.0. The same way one could handle the lookups quickly; cast the pointer to a uint64 and dereference it. For all non-extended floats (I don't know the proper terminology, but their >64 bit precision is stored on the processor, not in memory), this will disambiguate *which* value it is. It may cause problems with NaNs and infinities, but we aren't caching them, so we don't care. The result of all this is that we can do the following on Intel x86 platforms (replace with hex if desired)...
switch (*(uint64 *)&fval) { case 13845191154443747328ULL: case 13844628204490326016ULL: case 13844065254536904704ULL: case 13842939354630062080ULL: case 13841813454723219456ULL: case 13840687554816376832ULL: case 13839561654909534208ULL: case 13837309855095848960ULL: case 13835058055282163712ULL: case 13830554455654793216ULL: case 0ULL: case 4607182418800017408ULL: case 4611686018427387904ULL: case 4613937818241073152ULL: case 4616189618054758400ULL: case 4617315517961601024ULL: case 4618441417868443648ULL: case 4619567317775286272ULL: case 4620693217682128896ULL: case 4621256167635550208ULL: /* lookup in the table */ default: break; } Each platform would need a new block depending on their endianness mixing of float/uint64 (if any), as well as depending on their double representations (as long as it conforms to IEEE-754 fp doubles for these 21 values, they don't need a new one). - Josiah From tim.hochberg at ieee.org Mon Oct 2 19:43:51 2006 From: tim.hochberg at ieee.org (Tim Hochberg) Date: Mon, 02 Oct 2006 10:43:51 -0700 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <17694.61885.527128.686743@montanaro.dyndns.org> References: <20060929081402.GB19781@craig-wood.com> <17694.61885.527128.686743@montanaro.dyndns.org> Message-ID: skip at pobox.com wrote: > Steve> By these statistics I think the answer to the original question > Steve> is clearly "no" in the general case. > > As someone else (Guido?) pointed out, the literal case isn't all that > interesting. I modified floatobject.c to track a few interesting > floating point values: > [...code...] > > So for a largely non-floating point "application", a fair number of floats > are allocated, a bit over 25% of them are -1.0, 0.0 or +1.0, and nearly 50% > of them are whole numbers between -10.0 and 10.0, inclusive. > > Seems like it at least deserves a serious look. It would be nice to have > the numeric crowd contribute to this subject as well.
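Josiah's switch above can be cross-checked from Python: the case constants are exactly the little-endian IEEE-754 bit patterns of small whole doubles. A verification sketch (the `double_bits` helper is invented here, not part of any patch); note that the twenty constants quoted actually run from -10.0 up to 9.0, so the pattern for 10.0 (4621819117588971520) is absent from the list:

```python
import struct

def double_bits(x):
    """Bit pattern of a C double, as an unsigned 64-bit integer."""
    return struct.unpack('<Q', struct.pack('<d', x))[0]

# +0.0 and -0.0 compare equal but have distinct bit patterns --
# the distinction mwh was after earlier in the thread.
assert 0.0 == -0.0
assert double_bits(0.0) != double_bits(-0.0)
assert double_bits(-0.0) == 0x8000000000000000

# The switch's constants really are small whole numbers:
assert double_bits(1.0) == 4607182418800017408    # case 4607182418800017408ULL
assert double_bits(-10.0) == 13845191154443747328  # first negative case
assert double_bits(10.0) == 4621819117588971520    # not among the quoted cases
```

Comparing bit patterns rather than values is what lets a cache keep +0.0 without silently aliasing -0.0 onto it.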
As a representative of the numeric crowd, I'll say that I've never noticed this to be a problem. I suspect that it's a non issue since we generally store our numbers in arrays, not big piles of Python floats, so there's no opportunity for identical floats to pile up. -tim From brett at python.org Mon Oct 2 22:01:28 2006 From: brett at python.org (Brett Cannon) Date: Mon, 2 Oct 2006 13:01:28 -0700 Subject: [Python-Dev] Created branch for PEP 302 phase 2 work (in C) Message-ID: In the interest of time I have decided to go ahead and do the PEP 302 phase 2 work in C. I fully expect to tackle rewriting import in Python in my spare time after I finish this work since I will be much more familiar with how the whole import machinery works and it sounds like a fun challenge. The branch for the work is in pep302_phase2 . Any help would be appreciated in this work. I plan on keeping a BRANCH_PLANS file that outlines the what/why/how of the whole thing. -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061002/97f7a7b8/attachment.htm From pje at telecommunity.com Mon Oct 2 22:59:43 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 02 Oct 2006 16:59:43 -0400 Subject: [Python-Dev] Created branch for PEP 302 phase 2 work (in C) In-Reply-To: Message-ID: <5.1.1.6.0.20061002164622.028067d8@sparrow.telecommunity.com> At 01:01 PM 10/2/2006 -0700, Brett Cannon wrote: >In the interest of time I have decided to go ahead and do the PEP 302 >phase 2 work in C. Just FYI, it's not possible (so far as I know) to implement phase 2 while maintaining backward compatibility with existing 2.x code. So this work shouldn't go back to the 2.x trunk without discussion of those issues. Essentially, I abandoned trying to do the phase 2 work for Python 2.5 because there's too much code in the field that depends on the current order of when special/built-in imports are processed vs. 
when PEP 302 imports are processed. Thus, instead of adding new PEP 302 APIs (like get_loader) to 'imp', I added them to 'pkgutil'. There are, I believe, some notes in that module's source regarding what the ordering issues are w/meta_path vs. the way import works now. That having been said, we could possibly have a transition for 2.6, but everybody who's written any PEP 302 emulation code (outside of pkgutil itself) would have to adapt their code somewhat. I'm surprised, however, that you think working on this in C is going to be *less* time than it would take to simply replace __import__ with a Python function that reimplements PEP 302... especially since pkgutil contains a whole lot of the code you'd need, e.g.: def __import__(...): ... loader = pkgutil.find_loader(fullname) if loader is not None: module = loader.load_module(fullname) ... And much of the rest of the above can probably be filled out by swiping code from ihooks, imputil, or other Python __import__ implementations. From p.f.moore at gmail.com Tue Oct 3 00:27:07 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 2 Oct 2006 23:27:07 +0100 Subject: [Python-Dev] Created branch for PEP 302 phase 2 work (in C) In-Reply-To: <5.1.1.6.0.20061002164622.028067d8@sparrow.telecommunity.com> References: <5.1.1.6.0.20061002164622.028067d8@sparrow.telecommunity.com> Message-ID: <79990c6b0610021527s1e822f8dj26fbe429cf2c686c@mail.gmail.com> On 10/2/06, Phillip J. Eby wrote: > Just FYI, it's not possible (so far as I know) to implement phase 2 while > maintaining backward compatibility with existing 2.x code. So this work > shouldn't go back to the 2.x trunk without discussion of those issues. While that's a fair point, we need to be clear what compatibility issues there are. The built in import mechanisms aren't well documented, so it's not a black-and-white situation. An unqualified statement "there are issues" isn't much help on its own... 
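Phillip's `pkgutil.find_loader`/`load_module` sketch above relies on the PEP 302 finder/loader protocol. A toy pair, exercised by hand rather than installed on sys.meta_path (the module name `virtual_demo` and the `answer` attribute are invented for illustration):

```python
import sys
import types

class NullFinder:
    """Toy PEP 302 meta_path-style hook that 'finds' one virtual module."""
    def find_module(self, fullname, path=None):
        # A finder returns a loader (here, itself) or None.
        return self if fullname == "virtual_demo" else None

    def load_module(self, fullname):
        # PEP 302 protocol: reuse an existing sys.modules entry if present.
        mod = sys.modules.setdefault(fullname, types.ModuleType(fullname))
        mod.__loader__ = self
        mod.answer = 42
        return mod

finder = NullFinder()
loader = finder.find_module("virtual_demo")
assert loader is not None
mod = loader.load_module("virtual_demo")
assert mod.answer == 42
assert sys.modules["virtual_demo"] is mod
```

Phase 2 would hook such objects in via sys.meta_path so they are consulted before (or instead of) the built-in machinery, which is exactly where the ordering questions discussed here come from.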
> Essentially, I abandoned trying to do the phase 2 work for Python 2.5 > because there's too much code in the field that depends on the current > order of when special/built-in imports are processed vs. when PEP 302 > imports are processed. Can you say what that code is, and who we should be talking to to understand their issues? If not, how do we find such code? Presumably, you've got a lot of feedback through your work on setuptools/eggs - do you have a record of who might participate in a discussion? > Thus, instead of adding new PEP 302 APIs (like > get_loader) to 'imp', I added them to 'pkgutil'. How does that help? Where the code goes doesn't seem likely to make much difference... > There are, I believe, > some notes in that module's source regarding what the ordering issues are > w/meta_path vs. the way import works now. The only notes I could see in pkgutil.py refer to special locations like the Windows registry, and refer to the fact that they will be searched after path entries, not before (for reasons I couldn't quite follow, but that's likely because I only read the comments fairly quickly). But if the whole mechanism is moved to sys.meta_path (which is what Phase 2 is about) surely it's possible to choose the ordering just by the order the importers go on sys.meta_path? > That having been said, we could possibly have a transition for 2.6, but > everybody who's written any PEP 302 emulation code (outside of pkgutil > itself) would have to adapt their code somewhat. I don't really see how we're going to address that other than by implementing it, and waiting for people with issues to speak up. Highlighting the changes early is good, as it avoids a mid-beta rush of people "suddenly" finding issues, but I doubt we'll do much better than that. > I'm surprised, however, that you think working on this in C is going to be > *less* time than it would take to simply replace __import__ with a Python > function that reimplements PEP 302... That I do agree with. 
There's a bootstrapping issue (you can't import the Python module that does all this without using a C-coded import mechanism) but that should be resolvable. This is why I asked for input from people on which would take less time. Almost all the answers I got were that the C code was delicate but that it was workable. Several people said they wished for a Python implementation, but hardly anyone said flat-out, "don't waste your time, the Python version will be faster to do". As for the bootstrapping, I am sure it is resolvable as well. There are several ways to go about it that are all tractable. -Brett -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20061002/2f47cbd4/attachment.html From tdelaney at avaya.com Tue Oct 3 01:47:03 2006 From: tdelaney at avaya.com (Delaney, Timothy (Tim)) Date: Tue, 3 Oct 2006 09:47:03 +1000 Subject: [Python-Dev] Caching float(0.0) Message-ID: <2773CAC687FD5F4689F526998C7E4E5FF1EA85@au3010avexu1.global.avaya.com> skip at pobox.com wrote: > Steve> By these statistics I think the answer to the original > question Steve> is clearly "no" in the general case. > > As someone else (Guido?) pointed out, the literal case isn't all that > interesting. I modified floatobject.c to track a few interesting > floating point values: > > static unsigned int nfloats[5] = { > 0, /* -1.0 */ > 0, /* 0.0 */ > 0, /* +1.0 */ > 0, /* everything else */ > 0, /* whole numbers from -10.0 ... 10.0 */ > }; > > PyObject * > PyFloat_FromDouble(double fval) > { > register PyFloatObject *op; > if (free_list == NULL) { > if ((free_list = fill_free_list()) == NULL) > return NULL; > } > > if (fval == 0.0) nfloats[1]++; > else if (fval == 1.0) nfloats[2]++; > else if (fval == -1.0) nfloats[0]++; > else nfloats[3]++; > > if (fval >= -10.0 && fval <= 10.0 && (int)fval == fval) { > nfloats[4]++; > } This doesn't actually give us a very useful indication of potential memory savings. What I think would be more useful is tracking the maximum simultaneous count of each value i.e. what the maximum refcount would have been if they were shared. Tim Delaney From pje at telecommunity.com Tue Oct 3 01:52:09 2006 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Mon, 02 Oct 2006 19:52:09 -0400 Subject: [Python-Dev] Created branch for PEP 302 phase 2 work (in C) In-Reply-To: References: <79990c6b0610021527s1e822f8dj26fbe429cf2c686c@mail.gmail.com> <5.1.1.6.0.20061002164622.028067d8@sparrow.telecommunity.com> <79990c6b0610021527s1e822f8dj26fbe429cf2c686c@mail.gmail.com> Message-ID: <5.1.1.6.0.20061002194406.03c28ac0@sparrow.telecommunity.com> At 03:48 PM 10/2/2006 -0700, Brett Cannon wrote: >On 10/2/06, Paul Moore <p.f.moore at gmail.com> >wrote: >>On 10/2/06, Phillip J. Eby >><pje at telecommunity.com> wrote: >>[SNIP] >> > I'm surprised, however, that you think working on this in C is going to be >> > *less* time than it would take to simply replace __import__ with a Python >> > function that reimplements PEP 302... >> >>That I do agree with. There's a bootstrapping issue (you can't import >>the Python module that does all this without using a C-coded import >>mechanism) but that should be resolvable. > >This is why I asked for input from people on which would take less >time. Almost all the answers I got were that the C code was delicate >but that it was workable. Several people said they wished for a Python >implementation, but hardly anyone said flat-out, "don't waste your time, >the Python version will be faster to do". > >As for the bootstrapping, I am sure it is resolvable as well. There are >several ways to go about it that are all tractable. When I implemented the PEP 302 fix for the import speedups, I basically prototyped it using Python code that got loaded prior to 'site.py'. Once I had the Python version solid, I converted it to a C type via straightforward code transcription.
That's pretty much the route I would follow for this too, although of course "freezing" the Python version into C code is also an option, since there's not much performance benefit to be had from a C translation, except for two parts of __import__: the part that checks sys.modules to shortcut the process, and the part that runs after the target module has been loaded or found. Aside from this "fast path" part of __import__, any additional interpretation overhead will probably be dwarfed by I/O considerations. From brett at python.org Tue Oct 3 01:52:46 2006 From: brett at python.org (Brett Cannon) Date: Mon, 2 Oct 2006 16:52:46 -0700 Subject: [Python-Dev] PSF Infrastructure Committee's recommendation for a new issue tracker Message-ID: On behalf of the PSF Infrastructure committee, I am happy to report that we have reached a recommendation for a new issue tracker for Python! But first, I want to extend our thanks to all who stepped forward to provide the committee with a test installation of an issue tracker to use as a basis of our evaluations. Having several trackers to compare may have made this more time-consuming, but it helped to realize what people did and did not like about the various issue trackers and solidify what we thought python-dev would want. Thank you! The Infrastructure committee (Andrew Kuchling, Thomas Wouters, Barry Warsaw, Martin v. Loewis, and myself; Richard Jones excused himself from the discussion because of personal bias) met and discussed the four trackers being considered to replace SourceForge: Launchpad, JIRA, Roundup, and Trac. After evaluating the trackers on several points (issue creation, querying, etc.), we reached a tie between JIRA and Roundup in terms of pure tracker features. For JIRA, members found it to be a very powerful, polished issue tracker. But some found it to be a little more complex than they would like in an issue tracker. Roundup, on the other hand, had the exact opposite points. 
While not as polished as JIRA, it is the simpler tracker which some committee members preferred. As for Trac and Launchpad, both had fundamental issues that led to them not being chosen in the end. Most of the considerations had to do with customization or UI problems. With JIRA and Roundup being considered equal overall in terms of the trackers themselves, there is the tie-breaking issue of hosting. Atlassian, the company that created JIRA, has offered us free hosting of a JIRA installation. This cannot be overlooked as keeping an issue tracker running is not easy and requires supervision at various hours of the day to make sure possible downtime is minimized. There is also always the issue of upgrading, etc., that comes with any major software installation. Details on the hosting are pasted in at the end of this email as provided by Jonathan Nolen of Atlassian. He has also been cc:ed on this email so as to allow him to answer any questions directly. In order for Roundup to be considered equivalent in terms of an overall tracker package there needs to be a sufficient number of volunteer admins (roughly 6 - 10 people) who can help set up and maintain the Roundup installation. If enough people can be gathered, then Roundup will become the recommendation of the committee based on the fact that the trackers are roughly equal but that Roundup is implemented in Python and is FLOSS. If not enough support can be gathered, the committee's recommendation of going with JIRA will stand. If people want Roundup to be considered the tracker we go with by volunteering to be an admin, please email infrastructure at python.org and state your time commitment, the timezone you would be working from, and your level of Roundup knowledge. Please email the committee by October 16. If enough people step forward we will notify python-dev that Roundup should be considered the recommendation of the committee and graciously turn down Atlassian's offer.
-Brett Cannon Chairman, PSF Infrastructure Committee ----------------------------------------------------------- [email from Jonathan, unedited, with details about hosting] Hosting is with http://contegix.com. They host all of our servers, as well as those of Cenqua, Codehaus, Jive (I think), and a bunch of other folks in the Java community. They have engineers online 24x7x365. I've contacted them at all hours of the night and weekend and never failed to get a response within 5 minutes, though they guarantee 30 minutes. The engineers I've worked with have been universally top-notch. They've been able to help with every kind of question I've thrown at them. It's hard to describe how great they are, but it's like having a full-time sysadmin on staff who knows everything about your systems, who never goes to sleep, and who always seems chipper at the very thought of making any change you might ask. Ideally, we'd set it up so that the appropriate members of the Python team could contact Contegix directly for any requests you may have. You'll also have direct access yourself if you need to do any work on your own. As far as the export, they will set it up any way you like. The two obvious ways that come to mind are copying the XML backup or a database dump each night (or whatever frequency you specify). Either option would allow you to fully restore a JIRA instance to the point of the backup with full history. They will pro-actively keep your apps up to date as well. They usually know as soon as we release new versions and will contact you to arrange upgrades almost immediately. They also perform things like OS upgrades and patches on a regular basis without having to be prompted. Contegix will set up monitoring on your server(s) to watch things like disk-space, memory, CPU and networking usage. If any of those resources starts to get maxed out, they'll let us know and offer advice on how to fix it.
Right now, we have the Python stuff and the Mailman stuff on one server. There should be enough capacity for both, but if your usage grows to the point where we need more hardware, we can arrange that (within reason). If you ever needed to make your own arrangements with Contegix, their rates are reasonable, and you can either buy or lease hardware as you choose. I'm also sure that they would be flexible for an active, popular, open-source project such as Python. When Barry and I spoke, he told me that you had four or five other servers scattered around the world running things like SVN, mail and web. If you would ever be interested in consolidating those services with Contegix, it is likely that we could help you out with those as well. SVN would be a particular benefit, as the Fisheye Plugin for JIRA is really useful, and will perform better over the local network. It can still be used from your current host, it'll just be a little slower to get new information. I should also mention that Atlassian will soon be introducing two new products: Crowd, a user-management/single sign-on solution and Bamboo, a build server. If you guys are interested in trying either of those, you're welcome to them. I can imagine both might be useful to a project like Python. I'm happy to help out, and we continue to be very interested in seeing the project happen. If there's anything further we can do, don't hesitate to ask. Cheers, Jonathan P.S. Here is Contegix's material about their service: Data Center Contegix's data center is located in the Bandwidth Exchange Building on Walnut Avenue, which is the premier, carrier building for this region. Security is very important to our clients and us. As a result, access beyond the lobby requires a code access for the elevators. Once someone reaches our floor, all of our perimeter doors require both a card key access combined with a matching biometric palm scan to access our facility.
Once someone has been admitted to our suite, they are then required to log in and IDs checked against our Access Log for customers. Once authenticated, a Customer Badge will be issued. Visitors are only allowed escorted access to the data center and NOC on an as needed basis. In addition to all exterior doors being controlled access, all internal doors leading to the data center also require an additional card scan for access. Within the data center, all customer equipment is located in locked cabinets or cages. In addition to restricted access, the facility is monitored with digital cameras 24x7 recording all movement within the data center. Technical Support Our facility is staffed with Tier 3 Support Engineers 24x7x365. Our engineers are available to assist you with any needs you may have at any time. Because the highest level of support available is key to both of our businesses, Contegix engineers focus upon keeping your application and data available at all times. Therefore, we guarantee all support requests will be responded to within thirty minutes and the average response is three minutes. In addition, through our custom monitoring system, we are capable of actively monitoring almost anything you would like monitored. Many of our engineers are Dell certified technicians. In addition, we maintain ample stock of spare parts for Dell servers including hard drives, memory, etc. Rest assured, every precaution and measure is taken to ensure your equipment will be up and running should a hardware failure occur. Contegix Network Contegix offers one of the strongest networks available in the industry. Our network infrastructure is fully meshed, running redundant Juniper routers and Foundry BigIron core switches. We have five Tier 1 providers including Sprint, Level (3), MCI, XO and WilTel running BGP4. 
Because our data center is located in the Bandwidth Exchange Building, which is the Internet hub for this region, all of our connections to our providers in our managed network are "On Net," meaning we connect directly to the Internet, avoiding local loops and local connections. Our core switching infrastructure provides the ability to deliver load balancing without a significant investment in equipment. When your needs grow, Contegix will be able to deliver. Our network is enhanced by our Intelligent Routing Solution and DDoS Mitigation/Protection system, which drastically improves the quality of our network performance and reliability. One of the benefits of our redundant, intelligently routed network is our 100% Network Uptime guarantee, delivered in writing. Power Infrastructure Our power infrastructure was built with redundancy in mind. All power supplied to the data center is clean and constant, coming from the redundant UPSes (AC) or battery plants (DC). The PowerWare UPS systems run in a redundant configuration to maximize reliability. The UPS and battery plants are constantly charged by our dual grid connection to Ameren. There is an Automatic Transfer Switch between the two grids. In addition, if power is interrupted to the UPS/battery plants, another Automatic Transfer Switch automatically starts the diesel generator farm. All of this occurs instantaneously, without human intervention, to eliminate potential mistakes or errors and maximize performance. Environmental Controls and Protection Our Environmental Systems run in a redundant configuration. Each Environmental Control Unit/CRAC has a redundant "twin" on stand-by to take over in the event of a failure or service-affecting health issue. These units maintain constant temperature (72°F) and humidity (45%) in the data center. Contegix has configured our data center with hot and cold aisles for maximum cooling performance. Fire Detection / Suppression is configured with three independent systems. 
The first two monitor for temperature and smoke. The third system is a VESDA system that inspects air samples with a laser to detect any potential fire hazards prior to an actual fire event. Our sprinkler system is dry pipe / pre-action, which means the sprinkler lines are filled with compressed air, not water. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061002/362491f0/attachment-0001.htm From tjreedy at udel.edu Tue Oct 3 02:05:28 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 2 Oct 2006 20:05:28 -0400 Subject: [Python-Dev] Caching float(0.0) References: <129CEF95A523704B9D46959C922A280002FE99A9@nemesis.central.ccp.cc> Message-ID: "Kristján V. Jónsson" wrote in message news:129CEF95A523704B9D46959C922A280002FE99A9 at nemesis.central.ccp.cc... >Anyway, Skip noted that 50% of all floats are whole numbers between -10 >and 10 inclusive, Please, no. He said something like this about *non-floating-point applications* (evidence unspecified, that I remember). But such applications, by definition, usually don't have enough floats for caching (or conversion time) to matter too much. For true floating point measurements (of temperature, for instance), 'integral' measurements (which are an artifact of the scale used (degrees F versus C versus K)) should generally be no more common than other realized measurements. Thirty years ago, a major stat package written in Fortran (BMDP) required that all data be stored as (Fortran 4-byte) floats for analysis. So a column of yes/no or male/female data would be stored as 0.0/1.0 or perhaps 1.0/2.0. That skewed the distribution of floats. But Python and, I hope, Python apps, are more modern than that. >and this is the code that I employ in our python build today: [snip] For the analysis of typical floating point data, this is all pointless and a complete waste of time. 
After a billion conversions or so, I expect the extra time might add up to something noticeable. > From: "Martin v. Löwis" [mailto:martin at v.loewis.de] >> I'm worried about the penalty that this causes in terms of >> run-time cost. Me too. >> Also, how do you choose what values to cache? At one time (don't know about today), it was mandatory in some Fortran circles to name the small float constants used in a particular program with the equivalent of C #defines. In Python, zero = 0.0, half = 0.5, one = 1.0, twopi = 6.28..., eee = 2.7..., phi = .618..., etc. (Note that naming is not restricted to integral or otherwise 'nice' values.) The purpose then was to allow easy conversion from float to double to extended double. And in some cases, it also made the code clearer. With Python, the same procedure would guarantee only one copy (caching) of the same floats for constructed data structures. Float caching strikes me as a good subject for cookbook recipes, but not, without real data and a willingness to slightly screw some users, for the default core code. Terry Jan Reedy From amk at amk.ca Tue Oct 3 02:21:10 2006 From: amk at amk.ca (A.M. Kuchling) Date: Mon, 2 Oct 2006 20:21:10 -0400 Subject: [Python-Dev] Created branch for PEP 302 phase 2 work (in C) In-Reply-To: <79990c6b0610021527s1e822f8dj26fbe429cf2c686c@mail.gmail.com> References: <5.1.1.6.0.20061002164622.028067d8@sparrow.telecommunity.com> <79990c6b0610021527s1e822f8dj26fbe429cf2c686c@mail.gmail.com> Message-ID: <20061003002110.GA20505@rogue.amk.ca> On Mon, Oct 02, 2006 at 11:27:07PM +0100, Paul Moore wrote: > Yes, I'm quite surprised at how much has appeared in pkgutil. The > "what's new" entry is very terse, and the module documentation itself > hasn't been updated to mention the new stuff. These two things are related, of course; I couldn't figure out which bits of pkgutil.py are intended to be publicly used and which weren't. 
There's an __all__ in the module, but some things such as read_code() don't look like they're intended for external use. --amk From skip at pobox.com Tue Oct 3 02:50:44 2006 From: skip at pobox.com (skip at pobox.com) Date: Mon, 2 Oct 2006 19:50:44 -0500 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5FF1EA85@au3010avexu1.global.avaya.com> References: <2773CAC687FD5F4689F526998C7E4E5FF1EA85@au3010avexu1.global.avaya.com> Message-ID: <17697.46052.714229.687538@montanaro.dyndns.org> Tim> This doesn't actually give us a very useful indication of potential Tim> memory savings. What I think would be more useful is tracking the Tim> maximum simultaneous count of each value i.e. what the maximum Tim> refcount would have been if they were shared. Most definitely. I just posted what I came up with in about two minutes. I'll add some code to track the high water mark as well and report back. Skip From skip at pobox.com Tue Oct 3 02:53:34 2006 From: skip at pobox.com (skip at pobox.com) Date: Mon, 2 Oct 2006 19:53:34 -0500 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: <129CEF95A523704B9D46959C922A280002FE99A9@nemesis.central.ccp.cc> Message-ID: <17697.46222.513799.299606@montanaro.dyndns.org> Terry> "Kristj?n V. J?nsson" wrote: >> Anyway, Skip noted that 50% of all floats are whole numbers between >> -10 and 10 inclusive, Terry> Please, no. He said something like this about Terry> *non-floating-point applications* (evidence unspecified, that I Terry> remember). But such applications, by definition, usually don't Terry> have enough floats for caching (or conversion time) to matter too Terry> much. Correct. The non-floating-point application I chose was the one that was most immediately available, "make test". Note that I have no proof that regrtest.py isn't terribly floating point intensive. I just sort of guessed that it was. 
Skip From pje at telecommunity.com Tue Oct 3 03:04:31 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 02 Oct 2006 21:04:31 -0400 Subject: [Python-Dev] Created branch for PEP 302 phase 2 work (in C) In-Reply-To: <20061003002110.GA20505@rogue.amk.ca> References: <79990c6b0610021527s1e822f8dj26fbe429cf2c686c@mail.gmail.com> <5.1.1.6.0.20061002164622.028067d8@sparrow.telecommunity.com> <79990c6b0610021527s1e822f8dj26fbe429cf2c686c@mail.gmail.com> Message-ID: <5.1.1.6.0.20061002205715.03c331f0@sparrow.telecommunity.com> At 08:21 PM 10/2/2006 -0400, A.M. Kuchling wrote: >On Mon, Oct 02, 2006 at 11:27:07PM +0100, Paul Moore wrote: > > Yes, I'm quite surprised at how much has appeared in pkgutil. The > > "what's new" entry is very terse, and the module documentation itself > > hasn't been updated to mention the new stuff. > >These two things are related, of course; I couldn't figure out which >bits of pkgutil.py are intended to be publicly used and which weren't. >There's an __all__ in the module, but some things such as read_code() >don't look like they're intended for external use. The __all__ listing is correct; I intended to expose read_code() for the benefit of other importer implementations and Python utilities. Over the years, I've found myself writing the equivalent of read_code() several times, so it seemed to me to make sense to expose it as a utility function, since it already needed to be there for the ImpLoader class to work. In general, the idea behind the additions to pkgutil was to make life easier for people doing import-related operations, by being a Python reference implementation of commonly-reinvented parts of the import process. 
The '-m' machinery in 2.5 had a bunch of this stuff in it, and so did setuptools, so I yanked the code from both and refactored it to allow reuse by both, then fleshed it out to support all the optional PEP 302 loader protocols, and additional protocols needed to support tools like pydoc being able to run against arbitrary importers (esp. zip files). From skip at pobox.com Tue Oct 3 03:25:12 2006 From: skip at pobox.com (skip at pobox.com) Date: Mon, 2 Oct 2006 20:25:12 -0500 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <17697.46052.714229.687538@montanaro.dyndns.org> References: <2773CAC687FD5F4689F526998C7E4E5FF1EA85@au3010avexu1.global.avaya.com> <17697.46052.714229.687538@montanaro.dyndns.org> Message-ID: <17697.48120.767852.672495@montanaro.dyndns.org> skip> Most definitely. I just posted what I came up with in about two skip> minutes. I'll add some code to track the high water mark as well skip> and report back. Using the smallest change I could get away with, I came up with these allocation figures (same as before): -1.0: 29048 0.0: 524340 +1.0: 91560 rest: 1753479 whole numbers -10.0 to 10.0: 1151543 and these max ref counts: -1.0: 16 0.0: 136 +1.0: 161 rest: 1 whole numbers -10.0 to 10.0: 161 When I have a couple more minutes I'll just implement a cache for whole numbers between -10.0 and 10.0 and test that whole range of values right. Skip From nick at craig-wood.com Tue Oct 3 10:14:41 2006 From: nick at craig-wood.com (Nick Craig-Wood) Date: Tue, 3 Oct 2006 09:14:41 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5FF1EA85@au3010avexu1.global.avaya.com> References: <2773CAC687FD5F4689F526998C7E4E5FF1EA85@au3010avexu1.global.avaya.com> Message-ID: <20061003081441.GA12283@craig-wood.com> On Tue, Oct 03, 2006 at 09:47:03AM +1000, Delaney, Timothy (Tim) wrote: > This doesn't actually give us a very useful indication of potential > memory savings. 
What I think would be more useful is tracking the > maximum simultaneous count of each value i.e. what the maximum refcount > would have been if they were shared. It isn't just memory savings we are playing for. Even if 0.0 is allocated and de-allocated 10,000 times in a row, there would be no memory savings by caching its value. However there would be a) less allocator overhead - allocating objects is relatively expensive, b) better caching of the value, and c) less cache thrashing. I think you'll find that even in the no-memory-saving case a few cycles spent on comparison with 0.0 (or maybe a few other values) will speed up programs. -- Nick Craig-Wood -- http://www.craig-wood.com/nick From nick at craig-wood.com Tue Oct 3 10:17:26 2006 From: nick at craig-wood.com (Nick Craig-Wood) Date: Tue, 3 Oct 2006 09:17:26 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <17697.46222.513799.299606@montanaro.dyndns.org> References: <129CEF95A523704B9D46959C922A280002FE99A9@nemesis.central.ccp.cc> <17697.46222.513799.299606@montanaro.dyndns.org> Message-ID: <20061003081726.GB12283@craig-wood.com> On Mon, Oct 02, 2006 at 07:53:34PM -0500, skip at pobox.com wrote: > Terry> "Kristján V. Jónsson" wrote: > >> Anyway, Skip noted that 50% of all floats are whole numbers between > >> -10 and 10 inclusive, > > Terry> Please, no. He said something like this about > Terry> *non-floating-point applications* (evidence unspecified, that I > Terry> remember). But such applications, by definition, usually don't > Terry> have enough floats for caching (or conversion time) to matter too > Terry> much. > > Correct. The non-floating-point application I chose was the one that was > most immediately available, "make test". Note that I have no proof that > regrtest.py isn't terribly floating point intensive. I just sort of guessed > that it was. For my application caching 0.0 is by far the most important. 0.0 has ~200,000 references - the next highest reference count is only about ~200. 
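[Editor's note: the caching being discussed can be sketched at the Python level. This is illustrative only - the real change would be a C-level check inside PyFloat_FromDouble, and the [-10, 10] range is just the one under discussion in this thread; the function and cache names here are made up.]

```python
import math

# Shared objects for the small integral values discussed in the thread.
_float_cache = {v: float(v) for v in range(-10, 11)}

def float_from_double(value):
    """Return a float equal to value, reusing a shared object when possible."""
    if value == 0.0 and math.copysign(1.0, value) < 0.0:
        # Keep -0.0 out of the cache: it compares equal to 0.0 but is a
        # distinct value (the signed-zero objection raised elsewhere in
        # this thread).
        return value
    cached = _float_cache.get(value)
    return cached if cached is not None else value
```

Every hit on the cache returns the same object, the way CPython already reuses small ints and short strings; anything outside the cached range passes through untouched.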
-- Nick Craig-Wood -- http://www.craig-wood.com/nick From fredrik at pythonware.com Tue Oct 3 10:32:07 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 03 Oct 2006 10:32:07 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: <129CEF95A523704B9D46959C922A280002FE99A9@nemesis.central.ccp.cc> Message-ID: Terry Reedy wrote: > For true floating point measurements (of temperature, for instance), > 'integral' measurements (which are an artifact of the scale used (degrees F > versus C versus K)) should generally be no more common than other realized > measurements. a real-life sensor is of course where the 121.216 in my original post to this thread came from. (note that most real-life sensors involve A/D conversion at some point, which means that they provide a limited number of discrete values. but only the code dealing with the source data will be able to make any meaningful assumptions about those values.) I still think it might make sense to special-case float("0.0") (padding, default values, etc) inside PyFloat_FromDouble, and possibly also float("1.0") (scale factors, unit vectors, normalized max values, etc) but everything else is just generalizing from random observations. adding a few notes to the C API documentation won't hurt either, I suppose. (e.g. "note that each call to PyFloat_FromDouble may create a new floating point object; if you're converting data from some internal format to Python floats, it's often more efficient to map directly to preallocated shared PyFloat objects, instead of mapping first to float or double and then calling PyFloat_FromDouble on that value"). From nmm1 at cus.cam.ac.uk Tue Oct 3 11:12:04 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Tue, 03 Oct 2006 10:12:04 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: Your message of "Mon, 02 Oct 2006 20:05:28 EDT." 
Message-ID: "Terry Reedy" wrote: > > For true floating point measurements (of temperature, for instance), > 'integral' measurements (which are an artifact of the scale used (degrees F > versus C versus K)) should generally be no more common than other realized > measurements. Not quite, but close enough. A lot of algorithms use a conversion to integer, or some of the values are actually counts (e.g. in statistics), which makes them a bit more likely. Not enough to get excited about, in general. > Thirty years ago, a major stat package written in Fortran (BMDP) required > that all data be stored as (Fortran 4-byte) floats for analysis. So a > column of yes/no or male/female data would be stored as 0.0/1.0 or perhaps > 1.0/2.0. That skewed the distribution of floats. But Python and, I hope, > Python apps, are more modern than that. And SPSS and Genstat and others - now even Excel .... > Float caching strikes me as a good subject for cookbook recipes, but not, > without real data and a willingness to slightly screw some users, for the > default core code. Yes. It is trivial (if tedious) to add analysis code - the problem is finding suitable representative applications. That was always my difficulty when I was analysing this sort of thing - and still is when I need to do it! > Nick Craig-Wood wrote: > > For my application caching 0.0 is by far the most important. 0.0 has > ~200,000 references - the next highest reference count is only about ~200. Yes. All the experience I have ever seen over the past 4 decades confirms that is the normal case, with the exception of floating-point representations that have a missing value indicator. Even in IEEE 754, infinities and NaN are rare unless the application is up the spout. There are claims that a lot of important ones have a lot of NaNs and use them as missing values but, despite repeated requests, none of the people claiming that have ever provided an example. 
There are some pretty solid grounds for believing that those claims are not based in fact, but are polemic. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From ncoghlan at gmail.com Tue Oct 3 11:34:03 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 03 Oct 2006 19:34:03 +1000 Subject: [Python-Dev] PEP 315 - do while In-Reply-To: <001d01c6e617$0e980d40$1d2c440a@spain.capgemini.com> References: <001d01c6e617$0e980d40$1d2c440a@spain.capgemini.com> Message-ID: <45222E8B.30304@gmail.com> Hans Polak wrote:
> Ok, I see your point. Really, I've read more about Python than worked with
> it, so I'm out of my league here.
>
> Can I combine your suggestion with mine and come up with the following:
>
> do:
>     <setup code>
>     <loop body>
> while <condition>
> else:
>     <loop completion code>

In my example, the 3 sections (<setup code>, <loop body> and <loop completion code>) are all optional. A basic do-while loop would look like this:

do:
    <setup code>
while <condition>

(That is, <setup code> is still repeated each time around the loop - it's called that because it is run before the loop <condition> is evaluated) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From fuzzyman at voidspace.org.uk Tue Oct 3 12:00:18 2006 From: fuzzyman at voidspace.org.uk (Fuzzyman) Date: Tue, 03 Oct 2006 11:00:18 +0100 Subject: [Python-Dev] PEP 315 - do while In-Reply-To: <45222E8B.30304@gmail.com> References: <001d01c6e617$0e980d40$1d2c440a@spain.capgemini.com> <45222E8B.30304@gmail.com> Message-ID: <452234B2.8040103@voidspace.org.uk> Nick Coghlan wrote:
>Hans Polak wrote:
>
>>Ok, I see your point. Really, I've read more about Python than worked with
>>it, so I'm out of my league here.
>> >>Can I combine your suggestion with mine and come up with the following:
>>
>> do:
>>     <setup code>
>>     <loop body>
>> while <condition>
>> else:
>>     <loop completion code>
>
>In my example, the 3 sections (<setup code>, <loop body> and <loop completion code>) are all optional. A basic do-while loop would look like this:
>
> do:
>     <setup code>
> while <condition>
>
>(That is, <setup code> is still repeated each time around the loop - it's
>called that because it is run before the loop <condition> is evaluated)

+1 This looks good. The current idiom works fine, but looks unnatural:

while True:
    <loop body>
    if <condition>:
        break

Would a 'while' outside of a 'do' block (but without the colon) then be a syntax error? 'do:' would just be syntactic sugar for 'while True:' I guess. Michael Foord http://www.voidspace.org.uk >Cheers, >Nick. From kristjan at ccpgames.com Tue Oct 3 12:15:26 2006 From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_V=2E_J=F3nsson?=) Date: Tue, 3 Oct 2006 10:15:26 -0000 Subject: [Python-Dev] Caching float(0.0) Message-ID: <129CEF95A523704B9D46959C922A280002FE99B1@nemesis.central.ccp.cc> But that is precisely the point. A non-floating point application tends to use floating point values in a predictable way, with a lot of integral values floating around and lots of zeroes. As this constitutes the majority of Python applications (okay, daring assumption here) it seems to warrant some consideration. In one of my first messages on the subject I promised to report refcounts of -1.0, 0.0 and 1.0 for the EVE server. I didn't, but instead gave you the frequency of the values reported. Well, now I can provide you with refcounts for the [-10, 10] range plus the total float count, of a server that has just started up:

-10,0     589
 -9,0      56
 -8,0      65
 -7,0      63
 -6,0     243
 -5,0     731
 -4,0     550
 -3,0     246
 -2,0     246
 -1,0    1096
  0,0  195446
  1,0   79382
  2,0    9650
  3,0    6224
  4,0    5223
  5,0   14766
  6,0    2616
  7,0    1303
  8,0    3307
  9,0    1447
 10,0    8102

total: 331351

The total count of floating point numbers allocated at this point is 985794. Without the reuse, they would be 1317145, so this is a saving of 25%, and of 5Mb. 
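[Editor's note: those figures check out arithmetically. A quick sketch - the 16 bytes per float object is my assumption for a 2.x-era CPython float (refcount + type pointer + 8-byte double), not a figure from the thread:]

```python
total_with_reuse = 985794       # floats actually allocated with the cache
total_without_reuse = 1317145   # what it would have been with no reuse

reused = total_without_reuse - total_with_reuse
saving_pct = 100.0 * reused / total_without_reuse

# 16 bytes per float object is an assumption, not a measured figure
saved_mb = reused * 16 / (1024.0 * 1024.0)

print(reused)             # 331351 - matches the "total" row above
print(round(saving_pct))  # 25
```

With that object size, the reused allocations come to roughly 5.1 MB, in line with the quoted "5Mb".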
Kristján > -----Original Message----- > From: python-dev-bounces+kristjan=ccpgames.com at python.org > [mailto:python-dev-bounces+kristjan=ccpgames.com at python.org] > On Behalf Of skip at pobox.com > Sent: 3. október 2006 00:54 > To: Terry Reedy > Cc: python-dev at python.org > Subject: Re: [Python-Dev] Caching float(0.0) > > > Terry> "Kristján V. Jónsson" wrote: > >> Anyway, Skip noted that 50% of all floats are whole > numbers between > >> -10 and 10 inclusive, > > Terry> Please, no. He said something like this about > Terry> *non-floating-point applications* (evidence > unspecified, that I > Terry> remember). But such applications, by definition, > usually don't > Terry> have enough floats for caching (or conversion > time) to matter too > Terry> much. > > Correct. The non-floating-point application I chose was the > one that was most immediately available, "make test". Note > that I have no proof that regrtest.py isn't terribly floating > point intensive. I just sort of guessed > that it was. > > Skip From nmm1 at cus.cam.ac.uk Tue Oct 3 12:32:05 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Tue, 03 Oct 2006 11:32:05 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: Your message of "Tue, 03 Oct 2006 10:15:26 -0000." <129CEF95A523704B9D46959C922A280002FE99B1@nemesis.central.ccp.cc> Message-ID: =?iso-8859-1?Q?Kristj=E1n_V=2E_J=F3nsson?= wrote: > > The total count of floating point numbers allocated at this point is 985794. > Without the reuse, they would be 1317145, so this is a saving of 25%, and > of 5Mb. And, if you optimised just 0.0, you would get 60% of that saving at a small fraction of the cost and considerably greater generality. It isn't clear whether the effort justifies doing more. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. 
Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From skip at pobox.com Tue Oct 3 13:21:08 2006 From: skip at pobox.com (skip at pobox.com) Date: Tue, 3 Oct 2006 06:21:08 -0500 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: <129CEF95A523704B9D46959C922A280002FE99B1@nemesis.central.ccp.cc> Message-ID: <17698.18340.82069.83941@montanaro.dyndns.org> >> The total count of floating point numbers allocated at this point is >> 985794. Without the reuse, they would be 1317145, so this is a >> saving of 25%, and of 5Mb. Nick> And, if you optimised just 0.0, you would get 60% of that saving Nick> at a small fraction of the cost and considerably greater Nick> generality. It isn't clear whether the effort justifies doing Nick> more. Doesn't that presume that optimizing just 0.0 could be done easily? Suppose 0.0 is generated all over the place in EVE? Skip From nmm1 at cus.cam.ac.uk Tue Oct 3 13:38:35 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Tue, 03 Oct 2006 12:38:35 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: Your message of "Tue, 03 Oct 2006 06:21:08 CDT." <17698.18340.82069.83941@montanaro.dyndns.org> Message-ID: skip at pobox.com wrote: > > Doesn't that presume that optimizing just 0.0 could be done easily? Suppose > 0.0 is generated all over the place in EVE? Yes, and it isn't, respectively! The changes in floatobject.c would be trivial (if tedious), and my recollection of my scan is that floating values are not generated elsewhere. It would be equally easy to add a general caching algorithm, but that would be a LOT slower than a simple floating-point comparison. The problem (in Python) isn't hooking the checks into place, though it could be if Python were implemented differently. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. 
Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From martin at v.loewis.de Tue Oct 3 14:25:38 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 03 Oct 2006 14:25:38 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <20061003081441.GA12283@craig-wood.com> References: <2773CAC687FD5F4689F526998C7E4E5FF1EA85@au3010avexu1.global.avaya.com> <20061003081441.GA12283@craig-wood.com> Message-ID: <452256C2.60206@v.loewis.de> Nick Craig-Wood schrieb: > Even if 0.0 is allocated and de-allocated 10,000 times in a row, there > would be no memory savings by caching its value. > > However there would be > a) less allocator overhead - allocation objects is relatively expensive > b) better caching of the value > c) less cache thrashing > > I think you'll find that even in the no memory saving case a few > cycles spent on comparison with 0.0 (or maybe a few other values) will > speed up programs. Can you demonstrate that speedup? It is quite difficult to anticipate the performance impact of a change, in particular if there is no change in computational complexity. Various effects tend to balance out each other. Regards, Martin From martin at v.loewis.de Tue Oct 3 14:30:35 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 03 Oct 2006 14:30:35 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: Message-ID: <452257EB.6070601@v.loewis.de> Nick Maclaren schrieb: >> The total count of floating point numbers allocated at this point is 985794. >> Without the reuse, they would be 1317145, so this is a saving of 25%, and >> of 5Mb. > > And, if you optimised just 0.0, you would get 60% of that saving at > a small fraction of the cost and considerably greater generality. As Michael Hudson observed, this is difficult to implement, though: You can't distinguish between -0.0 and +0.0 easily, yet you should. 
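[Editor's note: the difficulty Martin mentions is easy to demonstrate. To ==, hash() and bool() the two zeros are identical, so any equality-keyed cache would silently fold -0.0 into +0.0; the sign only shows up through functions such as math.copysign or math.atan2:]

```python
import math

pos, neg = 0.0, -0.0

# Indistinguishable to the operations a cache would naturally use...
print(pos == neg)                   # True
print(hash(pos) == hash(neg) == 0)  # True
print(bool(pos) or bool(neg))       # False

# ...but the sign bit is really there and observable
print(math.copysign(1.0, pos))      # 1.0
print(math.copysign(1.0, neg))      # -1.0
print(math.atan2(0.0, neg))         # 3.141592653589793 (pi, not 0.0)
```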
Regards, Martin From fredrik at pythonware.com Tue Oct 3 14:56:54 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 3 Oct 2006 14:56:54 +0200 Subject: [Python-Dev] what's really new in python 2.5 ? Message-ID: just noticed that the first google hit for "what's new in python 2.5": http://docs.python.org/dev/whatsnew/whatsnew25.html points to a document that's a weird mix between that actual document, and a placeholder for "what's new in python 2.6". From nmm1 at cus.cam.ac.uk Tue Oct 3 15:11:27 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Tue, 03 Oct 2006 14:11:27 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: Your message of "Tue, 03 Oct 2006 14:30:35 +0200." <452257EB.6070601@v.loewis.de> Message-ID: =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= wrote: > > >> The total count of floating point numbers allocated at this point is 985794. > >> Without the reuse, they would be 1317145, so this is a saving of 25%, and > >> of 5Mb. > > > > And, if you optimised just 0.0, you would get 60% of that saving at > > a small fraction of the cost and considerably greater generality. > > As Michael Hudson observed, this is difficult to implement, though: > You can't distinguish between -0.0 and +0.0 easily, yet you should. That was the point of a previous posting of mine in this thread :-( You shouldn't, despite what IEEE 754 says, at least if you are allowing for either portability or numeric validation. There are a huge number of good reasons why IEEE 754 signed zeroes fit extremely badly into any normal programming language and are seriously incompatible with numeric validation, but Python adds more. Is there any other type where there are two values that are required to be different, but where both the hash is required to be zero and both are required to evaluate to False in truth value context? Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. 
Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From amk at amk.ca Tue Oct 3 15:40:51 2006 From: amk at amk.ca (A.M. Kuchling) Date: Tue, 3 Oct 2006 09:40:51 -0400 Subject: [Python-Dev] 2.4.4 fixes Message-ID: <20061003134051.GA21154@rogue.amk.ca> I've gone through the 'backport candidate' bugs listed on and applied most of them. Some I didn't apply because I don't understand them well enough to determine if they're correct for 2.4: * r47061 (recursionerror fix) * r46602 (tokenizer.c bug; patch doesn't apply cleanly) * r46589 (let dicts propagate eq errors; dictresize bug -- this led to a big long 2.5 discussion, so I won't backport. Maybe someone can extract just the dictresize bugfix.) * r39044 (A C threading API bug) There are also some other bugs listed on the wiki page that involve metaclasses; I'm not going to touch them. subprocess.py received a number of bugfixes in 2.5, but also some API additions. Can someone please look at these and apply the fixes? The wiki page now lists all the revisions stemming from valgrind and Klocwork errors. There are a lot of them; more volunteers will be necessary if they're all to get looked at and possibly backported. --amk From ncoghlan at gmail.com Tue Oct 3 15:51:22 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 03 Oct 2006 23:51:22 +1000 Subject: [Python-Dev] PEP 315 - do while In-Reply-To: <452234B2.8040103@voidspace.org.uk> References: <001d01c6e617$0e980d40$1d2c440a@spain.capgemini.com> <45222E8B.30304@gmail.com> <452234B2.8040103@voidspace.org.uk> Message-ID: <45226ADA.9080306@gmail.com> Fuzzyman wrote:
> Nick Coghlan wrote:
>> In my example, the 3 sections (<setup code>, <loop body> and <loop completion
>> code>) are all optional. A basic do-while loop would look like this:
>>
>> do:
>>     <setup code>
>> while <condition>
>>
>> (That is, <setup code> is still repeated each time around the loop - it's
>> called that because it is run before the loop <condition> is evaluated)
>
> +1
>
> This looks good.
I'm pretty sure it was proposed by someone else a long time ago - I was surprised to find it wasn't mentioned in PEP 315. That said, Guido's observation on PEP 315 from earlier this year holds for me too: "I kind of like it but it doesn't strike me as super important" [1]

> The current idiom works fine, but looks unnatural:
>
> while True:
>     <loop body>
>     if <condition>:
>         break

There's the rationale for the PEP in a whole 5 lines counting whitespace ;)

> Would a 'while' outside of a 'do' block (but without the colon) then be
> a syntax error?
>
> 'do:' would just be syntactic sugar for 'while True:' I guess.

That's the slight issue I still have with the idea - you could end up with multiple ways of spelling some of the basic loop forms, such as these 3 flavours of infinite loop:

do:
    pass  # Is there an implicit 'while True' at the end of the loop body?

do:
    while True

while True:
    pass

The other issue I have is that I'm not yet 100% certain it's implementable with Python's parser and grammar. I *think* changing the definition of the while statement from:

while_stmt ::= "while" expression ":" suite ["else" ":" suite]

to:

while_stmt ::= "while" expression [":" suite ["else" ":" suite]]

and adding a new AST node and a new type of compiler frame block "DO_LOOP" would do the trick (the compilation of a while statement without a trailing colon would then check that it was in a DO_LOOP block and raise an error if not). Cheers, Nick. 
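[Editor's note: for reference, the closest spelling of the basic do-while shape in today's Python. The body and condition here are placeholders; this sketch drains a list just to make the loop observable:]

```python
items = [3, 1, 4, 1, 5]
consumed = []

# do: <loop body> while <condition> -- the body runs at least once,
# with the test at the bottom of the loop instead of the top.
while True:
    consumed.append(items.pop())
    if not items:  # the negated <condition>
        break

print(consumed)  # [5, 1, 4, 1, 3]
```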
[1] http://mail.python.org/pipermail/python-dev/2006-February/060711.html -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From martin at v.loewis.de Tue Oct 3 16:10:31 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 03 Oct 2006 16:10:31 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: Message-ID: <45226F57.108@v.loewis.de> Nick Maclaren schrieb: > That was the point of a previous posting of mine in this thread :-( > > You shouldn't, despite what IEEE 754 says, at least if you are > allowing for either portability or numeric validation. > > There are a huge number of good reasons why IEEE 754 signed zeroes > fit extremely badly into any normal programming language and are > seriously incompatible with numeric validation, but Python adds more. > Is there any other type where there are two values that are required > to be different, but where both the hash is required to be zero and > both are required to evaluate to False in truth value context? Ah, you are proposing a semantic change, then: -0.0 will become unrepresentable, right? Regards, Martin From fdrake at acm.org Tue Oct 3 16:18:50 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Tue, 3 Oct 2006 10:18:50 -0400 Subject: [Python-Dev] what's really new in python 2.5 ? In-Reply-To: References: Message-ID: <200610031018.50930.fdrake@acm.org> On Tuesday 03 October 2006 08:56, Fredrik Lundh wrote: > just noticed that the first google hit for "what's new in python 2.5": > > http://docs.python.org/dev/whatsnew/whatsnew25.html > > points to a document that's a weird mix between that actual document, and > a placeholder for "what's new in python 2.6". I suspect Google (and all other search engines) should be warded off from docs.python.org/dev/. -Fred -- Fred L. Drake, Jr. 
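The do-while shape debated in the PEP 315 thread above can already be expressed in today's Python; here is a minimal sketch of the `while True` idiom the proposed syntax would replace (the `digits` example is illustrative, not from the thread):

```python
def digits(n):
    """Collect the decimal digits of n, least significant first.

    Even n == 0 must produce one digit, so the body has to run once
    before the exit test -- the classic do-while shape that PEP 315
    targets.  Today it is spelled with `while True` plus a guarded
    `break`.
    """
    n = abs(n)
    out = []
    while True:              # "do:"
        out.append(n % 10)
        n //= 10
        if n == 0:           # "while n != 0" in the proposed spelling
            break
    return out

print(digits(1203))  # [3, 0, 2, 1]
```

The `break` line is exactly the part the proposed `while <condition>` (or `until <condition>`) clause would absorb.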
From fuzzyman at voidspace.org.uk Tue Oct 3 16:28:31 2006 From: fuzzyman at voidspace.org.uk (Fuzzyman) Date: Tue, 03 Oct 2006 15:28:31 +0100 Subject: [Python-Dev] PEP 315 - do while In-Reply-To: <45226ADA.9080306@gmail.com> References: <001d01c6e617$0e980d40$1d2c440a@spain.capgemini.com> <45222E8B.30304@gmail.com> <452234B2.8040103@voidspace.org.uk> <45226ADA.9080306@gmail.com> Message-ID: <4522738F.80303@voidspace.org.uk> Nick Coghlan wrote: > [snip..]
>
>> The current idiom works fine, but looks unnatural:
>>
>> while True:
>>     if <condition>:
>>         break
>
> There's the rationale for the PEP in a whole 5 lines counting whitespace ;)
>
>> Would a 'while' outside of a 'do' block (but without the colon) then be a syntax error?
>>
>> 'do:' would just be syntactic sugar for 'while True:' I guess.
>
> That's the slight issue I still have with the idea - you could end up with multiple ways of spelling some of the basic loop forms, such as these 3 flavours of infinite loop:
>
> do:
>     pass  # Is there an implicit 'while True' at the end of the loop body?
>
> do:
>     while True
>
> while True:
>     pass

Following the current idiom, isn't it more natural to repeat the loop 'until' a condition is met? If we introduced two new keywords, it would avoid ambiguity in the use of 'while'.

do:
    <loop body>
    until <condition>

A do loop could require an 'until', meaning 'do' is not *just* a replacement for an infinite loop. (Assuming the parser can be coerced into co-operation.) It is obviously still a new construct in terms of Python syntax (not requiring a colon after '<condition>'.) I'm sure this has been suggested, but wonder if it has already been ruled out. An 'else' block could then retain its current meaning (execute if the loop is not terminated early by an explicit break.) Michael Foord http://www.voidspace.org.uk
In-Reply-To: References: Message-ID: <4522736F.9040101@gmail.com> Fredrik Lundh wrote: > just noticed that the first google hit for "what's new in python 2.5": > > http://docs.python.org/dev/whatsnew/whatsnew25.html > > points to a document that's a weird mix between that actual document, and > a placeholder for "what's new in python 2.6". D'oh. It's going to take a while for the stable docs to catch up to that one given the large number of external links to that page using that title :( Since the URL for the actual Python 2.6 What's New finishes with whatsnew26.html, perhaps this URL could be updated to redirect users to the stable version instead? Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From amk at amk.ca Tue Oct 3 16:30:15 2006 From: amk at amk.ca (A.M. Kuchling) Date: Tue, 3 Oct 2006 10:30:15 -0400 Subject: [Python-Dev] what's really new in python 2.5 ? In-Reply-To: References: Message-ID: <20061003143015.GA25511@localhost.localdomain> On Tue, Oct 03, 2006 at 02:56:54PM +0200, Fredrik Lundh wrote: > just noticed that the first google hit for "what's new in python 2.5": > > http://docs.python.org/dev/whatsnew/whatsnew25.html > > points to a document that's a weird mix between that actual document, and > a placeholder for "what's new in python 2.6". Thanks for pointing this out! I've added a redirect from /whatsnew25.html to the correct location, but am puzzled by the 2.6 document; it has section names like 'pep-308.html', which are set by a \label{pep-308} directive in the LaTeX, but no such \label exists in the 2.6 document. Neal, could you please delete all the temp files in whatever directory is used to build the documentation? I wonder if there's a *.aux file or something that still has labels from the 2.5 document. It might be easiest to just delete the whatsnew/ directory and then do an 'svn up' to get it back. 
--amk From amk at amk.ca Tue Oct 3 16:35:43 2006 From: amk at amk.ca (A.M. Kuchling) Date: Tue, 3 Oct 2006 10:35:43 -0400 Subject: [Python-Dev] 2.4.4 fixes In-Reply-To: <20061003134051.GA21154@rogue.amk.ca> References: <20061003134051.GA21154@rogue.amk.ca> Message-ID: <20061003143543.GB25511@localhost.localdomain> On Tue, Oct 03, 2006 at 09:40:51AM -0400, A.M. Kuchling wrote: > The wiki page now lists all the revisions stemming from valgrind and > Klocwork errors. There are a lot of them; more volunteers will be > necessary if they're all to get looked at and possibly backported. I've now looked at the Valgrind errors; most of them were already in 2.4 or don't matter (ctypes, sqlite3 fixes). One revision remains, changing the size of strings allocated in the confstr() wrapper in posixmodule.c. The patch doesn't apply cleanly -- can someone please look at this? --amk From fdrake at acm.org Tue Oct 3 16:39:52 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Tue, 3 Oct 2006 10:39:52 -0400 Subject: [Python-Dev] what's really new in python 2.5 ? In-Reply-To: <20061003143015.GA25511@localhost.localdomain> References: <20061003143015.GA25511@localhost.localdomain> Message-ID: <200610031039.52434.fdrake@acm.org> On Tuesday 03 October 2006 10:30, A.M. Kuchling wrote: > Neal, could you please delete all the temp files in whatever directory > is used to build the documentation? I wonder if there's a *.aux file > or something that still has labels from the 2.5 document. It might be > easiest to just delete the whatsnew/ directory and then do an 'svn up' > to get it back. I would guess this has everything to do with how the updated docs are deployed and little or nothing about the cleanliness of the working area. The mkhowto script should be cleaning out the old HTML before generating the new. I'm guessing the deployment simply unpacks the new on top of the old; the old should be removed first. For the /dev/ area, I don't think redirects are warranted. 
I'd rather see the crawlers just not bother with that, since those are more likely decoys than usable end-user docs. -Fred -- Fred L. Drake, Jr. From nmm1 at cus.cam.ac.uk Tue Oct 3 17:12:19 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Tue, 03 Oct 2006 16:12:19 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: Your message of "Tue, 03 Oct 2006 16:10:31 +0200." <45226F57.108@v.loewis.de> Message-ID: =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= wrote: > > Ah, you are proposing a semantic change, then: -0.0 will become > unrepresentable, right? Well, it is and it isn't. Python currently supports only some of IEEE 754, and that is more by accident than design - because that is exactly what C90 implementations do! There is code in floatobject.c that assumes IEEE 754, but Python does NOT attempt to support it in toto (it is not clear if it could), not least because it uses C90. And, as far as I know, none of that is in the specification, because Python is at least in theory portable to systems that use other arithmetics and there is no current way to distinguish -0.0 from 0.0 except by comparing their representations! And even THAT depends entirely on whether the C library distinguishes the cases, as far as I can see. So distinguishing -0.0 from 0.0 isn't really in Python's current semantics at all. And, for reasons that we could go into, I assert that it should not be - which is NOT the same as not supporting branch cuts in cmath. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. 
Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From martin at v.loewis.de Tue Oct 3 17:41:05 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 03 Oct 2006 17:41:05 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: Message-ID: <45228491.9010103@v.loewis.de> Nick Maclaren schrieb: > So distinguishing -0.0 from 0.0 isn't really in Python's current > semantics at all. And, for reasons that we could go into, I assert > that it should not be - which is NOT the same as not supporting > branch cuts in cmath. Are you talking about "Python the language specification" or "Python the implementation" here? It is not a change to the language specification, as this aspect of the behavior (as you point out) is unspecified. However, it is certainly a change to the observable behavior of the Python implementation, and no amount of arguing can change that. Regards, Martin P.S. For that matter, *any* kind of changes to the singleton nature of certain immutable values is a change in semantics. It's just that dropping -0.0 is an *additional* change (on top of the change that "1.0-1.0 is 0.0" would change from False to True). From nicko at nicko.org Tue Oct 3 17:45:16 2006 From: nicko at nicko.org (Nicko van Someren) Date: Tue, 3 Oct 2006 16:45:16 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <45226F57.108@v.loewis.de> References: <45226F57.108@v.loewis.de> Message-ID: <79858904-DDE8-46D8-AA28-EE9B285A6463@nicko.org> On 3 Oct 2006, at 15:10, Martin v. L?wis wrote: > Nick Maclaren schrieb: >> That was the point of a previous posting of mine in this thread :-( >> >> You shouldn't, despite what IEEE 754 says, at least if you are >> allowing for either portability or numeric validation. >> >> There are a huge number of good reasons why IEEE 754 signed zeroes >> fit extremely badly into any normal programming language and are >> seriously incompatible with numeric validation, but Python adds more. 
>> Is there any other type where there are two values that are required >> to be different, but where both the hash is required to be zero and >> both are required to evaluate to False in truth value context? > > Ah, you are proposing a semantic change, then: -0.0 will become > unrepresentable, right? It's only a semantic change on platforms that "happen to" use IEEE 754 float representations, or some other representation that exposes the sign of zero. The Python docs have for many years stated with regard to the float type: "All bets on their precision are off unless you happen to know the machine you are working with." and that "You are at the mercy of the underlying machine architecture...". Not all floating point representations support sign of zero, though in the modern world it's true that the vast majority do. It would be instructive to understand how much, if any, python code would break if we lost -0.0. I do not believe that there is any reliable way for python code to tell the difference between all of the different types of IEEE 754 zeros, and in the special case of -0.0 the best test I can come up with is repr(n)[0]=='-'. Is there a compelling case, to do with compatibility or otherwise, for exposing the sign of a zero? It seems like a numerical anomaly to me. Nicko From aahz at pythoncraft.com Tue Oct 3 18:59:04 2006 From: aahz at pythoncraft.com (Aahz) Date: Tue, 3 Oct 2006 09:59:04 -0700 Subject: [Python-Dev] PSF Infrastructure Committee's recommendation for a new issue tracker In-Reply-To: References: Message-ID: <20061003165903.GB12427@panix.com> If nothing else, Brett deserves a hearty round of applause for this work: Three cheers for Brett!
-- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "LL YR VWL R BLNG T S" -- www.nancybuttons.com From p.f.moore at gmail.com Tue Oct 3 19:04:39 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 3 Oct 2006 18:04:39 +0100 Subject: [Python-Dev] PSF Infrastructure Committee's recommendation for a new issue tracker In-Reply-To: <20061003165903.GB12427@panix.com> References: <20061003165903.GB12427@panix.com> Message-ID: <79990c6b0610031004t536cdb15h4d21526afc22f675@mail.gmail.com> On 10/3/06, Aahz wrote: > If nothing else, Brett deserves a hearty round of applause for this work: > > Three cheers for Brett! Definitely. Paul From foom at fuhm.net Tue Oct 3 18:47:02 2006 From: foom at fuhm.net (James Y Knight) Date: Tue, 3 Oct 2006 12:47:02 -0400 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <452257EB.6070601@v.loewis.de> References: <452257EB.6070601@v.loewis.de> Message-ID: <3161E092-940E-4364-910E-6D2973ECC48E@fuhm.net> On Oct 3, 2006, at 8:30 AM, Martin v. L?wis wrote: > As Michael Hudson observed, this is difficult to implement, though: > You can't distinguish between -0.0 and +0.0 easily, yet you should. Of course you can. It's absolutely trivial. The only part that's even *the least bit* sketchy in this is assuming that a double is 64 bits. Practically speaking, that is true on all architectures I know of, and if it's not guaranteed, it could easily be a 'configure' time check. 
#include <stdio.h>
#include <stdint.h>

typedef union {
    double d;
    uint64_t i;
} rawdouble;

int isposzero(double a) {
    rawdouble zero;
    zero.d = 0.0;
    rawdouble aa;
    aa.d = a;
    return aa.i == zero.i;
}

int main(void) {
    if (sizeof(double) != sizeof(uint64_t))
        return 1;

    printf("%d\n", isposzero(0.0));
    printf("%d\n", isposzero(-0.0));
    return 0;
}

James From martin at v.loewis.de Tue Oct 3 19:27:05 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 03 Oct 2006 19:27:05 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <79858904-DDE8-46D8-AA28-EE9B285A6463@nicko.org> References: <45226F57.108@v.loewis.de> <79858904-DDE8-46D8-AA28-EE9B285A6463@nicko.org> Message-ID: <45229D69.4020407@v.loewis.de> Nicko van Someren schrieb: > It's only a semantic change on platforms that "happen to" use IEEE > 754 float representations, or some other representation that exposes > the sign of zero. Right. Later, you admit that this is the vast majority of modern machines. > It would be instructive to understand how much, if any, python code > would break if we lost -0.0. I do not believe that there is any > reliable way for python code to tell the difference between all of > the different types of IEEE 754 zeros and in the special case of -0.0 > the best test I can come up with is repr(n)[0]=='-'. Is there a > compelling case, to do with compatibility or otherwise, for exposing > the sign of a zero? It seems like a numerical anomaly to me. I think it is reasonable to admit that
a) this change is a change in semantics for the majority of the machines
b) it is likely that this change won't affect a significant number of applications (I'm pretty sure someone will notice, though; someone always notices).
Regards, Martin From skip at pobox.com Tue Oct 3 19:37:49 2006 From: skip at pobox.com (skip at pobox.com) Date: Tue, 3 Oct 2006 12:37:49 -0500 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <45228491.9010103@v.loewis.de> References: <45228491.9010103@v.loewis.de> Message-ID: <17698.40941.317868.702398@montanaro.dyndns.org> Martin> However, it is certainly a change to the observable behavior of Martin> the Python implementation, and no amount of arguing can change Martin> that. If C90 doesn't distinguish -0.0 and +0.0, how can Python? Can you give a simple example where the difference between the two is apparent to the Python programmer? Skip From skip at pobox.com Tue Oct 3 19:40:59 2006 From: skip at pobox.com (skip at pobox.com) Date: Tue, 3 Oct 2006 12:40:59 -0500 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <45229D69.4020407@v.loewis.de> References: <45226F57.108@v.loewis.de> <79858904-DDE8-46D8-AA28-EE9B285A6463@nicko.org> <45229D69.4020407@v.loewis.de> Message-ID: <17698.41131.396198.330141@montanaro.dyndns.org> Martin> b) it is likely that this change won't affect a significant Martin> number of applications (I'm pretty sure someone will notice, Martin> though; someone always notices). +1 QOTF. Skip From Scott.Daniels at Acm.Org Tue Oct 3 19:45:50 2006 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Tue, 03 Oct 2006 10:45:50 -0700 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <3161E092-940E-4364-910E-6D2973ECC48E@fuhm.net> References: <452257EB.6070601@v.loewis.de> <3161E092-940E-4364-910E-6D2973ECC48E@fuhm.net> Message-ID: James Y Knight wrote: > On Oct 3, 2006, at 8:30 AM, Martin v. L?wis wrote: >> As Michael Hudson observed, this is difficult to implement, though: >> You can't distinguish between -0.0 and +0.0 easily, yet you should. > > Of course you can. It's absolutely trivial. The only part that's even > *the least bit* sketchy in this is assuming that a double is 64 bits. 
> Practically speaking, that is true on all architectures I know of,
> and if it's not guaranteed, it could easily be a 'configure' time check.
>
> typedef union {
>     double d;
>     uint64_t i;
> } rawdouble;
>
> int isposzero(double a) {
>     rawdouble zero;
>     zero.d = 0.0;
>     rawdouble aa;
>     aa.d = a;
>     return aa.i == zero.i;
> }
>
> int main() {
>     if (sizeof(double) != sizeof(uint64_t))
>         return 1;
>
>     printf("%d\n", isposzero(0.0));
>     printf("%d\n", isposzero(-0.0));
> }

And you should be able to cache the single positive zero with something vaguely like:

PyObject *
PyFloat_FromDouble(double fval)
{
    ...
    if (fval == 0.0 && raw_match(&fval, cached)) {
        Py_INCREF(cached);
        return cached;
    }
    ...

-- -- Scott David Daniels Scott.Daniels at Acm.Org From martin at v.loewis.de Tue Oct 3 19:55:43 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 03 Oct 2006 19:55:43 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <17698.40941.317868.702398@montanaro.dyndns.org> References: <45228491.9010103@v.loewis.de> <17698.40941.317868.702398@montanaro.dyndns.org> Message-ID: <4522A41F.6090704@v.loewis.de> skip at pobox.com schrieb: > If C90 doesn't distinguish -0.0 and +0.0, how can Python? Can you give a > simple example where the difference between the two is apparent to the > Python programmer?

Sure:

py> x=-0.0
py> y=0.0
py> x,y
(-0.0, 0.0)
py> hash(x),hash(y)
(0, 0)
py> x==y
True
py> str(x)==str(y)
False
py> str(x),str(y)
('-0.0', '0.0')
py> float(str(x)),float(str(y))
(-0.0, 0.0)

Imagine an application that reads floats from a text file, manipulates some of them, and then writes back the complete list of floats. Further assume that somehow, -0.0 got into the file. Currently, the sign "round-trips"; under the proposed change, it would stop doing so. Of course, there likely wouldn't be any "real" change in value, as the sign of 0 is likely of no significance to the application.
Regards, Martin From amk at amk.ca Tue Oct 3 20:08:48 2006 From: amk at amk.ca (A.M. Kuchling) Date: Tue, 3 Oct 2006 14:08:48 -0400 Subject: [Python-Dev] what's really new in python 2.5 ? In-Reply-To: <200610031039.52434.fdrake@acm.org> References: <20061003143015.GA25511@localhost.localdomain> <200610031039.52434.fdrake@acm.org> Message-ID: <20061003180848.GB31361@localhost.localdomain> On Tue, Oct 03, 2006 at 10:39:52AM -0400, Fred L. Drake, Jr. wrote: > and little or nothing about the cleanliness of the working area. The mkhowto > script should be cleaning out the old HTML before generating the new. I'm > guessing the deployment simply unpacks the new on top of the old; the old > should be removed first. That doesn't explain it, though; the contents of whatsnew26.html contain references to pep-308.html. It's not simply a matter of new files being untarred on top of old. I've added a robots.txt to keep crawlers out of /dev/. --amk From fdrake at acm.org Tue Oct 3 20:19:27 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Tue, 3 Oct 2006 14:19:27 -0400 Subject: [Python-Dev] what's really new in python 2.5 ? In-Reply-To: <20061003180848.GB31361@localhost.localdomain> References: <200610031039.52434.fdrake@acm.org> <20061003180848.GB31361@localhost.localdomain> Message-ID: <200610031419.28281.fdrake@acm.org> On Tuesday 03 October 2006 14:08, A.M. Kuchling wrote: > That doesn't explain it, though; the contents of whatsnew26.html > contain references to pep-308.html. It's not simply a matter of new > files being untarred on top of old. Ah; I missed that the new HTML file was referring to an old heading. That does sound like a .aux file got left around. I don't know what the build process is for the material in docs.python.org/dev/; I think the right thing would be to start each build with a fresh checkout/export. -Fred -- Fred L. Drake, Jr. 
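Martin's round-trip scenario above can be exercised directly in Python; a sketch (it uses `math.copysign` to read the sign bit, a function that only arrived later, in Python 2.6):

```python
import math

def rewrite(text):
    # Parse a float from a text field and format it back, as the
    # file-rewriting application Martin describes would do.
    return str(float(text))

# The sign of zero currently survives the parse/format round trip...
assert rewrite('-0.0') == '-0.0'
assert rewrite('0.0') == '0.0'

# ...even though the two zeros compare equal and hash identically:
assert float('-0.0') == float('0.0')
assert hash(float('-0.0')) == hash(float('0.0'))
assert math.copysign(1.0, float('-0.0')) == -1.0  # sign bit is still set
```

Caching every zero as a single shared +0.0 object would silently break the first assertion, which is exactly the behavior change under discussion.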
From nmm1 at cus.cam.ac.uk Tue Oct 3 20:26:29 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Tue, 03 Oct 2006 19:26:29 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: Your message of "Tue, 03 Oct 2006 19:55:43 +0200." <4522A41F.6090704@v.loewis.de> Message-ID: =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= wrote: > > py> x=-0.0 > py> y=0.0 > py> x,y Nobody is denying that SOME C90 implementations distinguish them, but it is no part of the standard - indeed, a C90 implementation is permitted to use ANY criterion for deciding when to display -0.0 and 0.0. C99 is ambiguous to the point of internal inconsistency, except when __STDC_IEC_559__ is set to 1, though the intent is clear. And my reading of Python's code is that it relies on C's handling of such values. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From foom at fuhm.net Tue Oct 3 21:13:01 2006 From: foom at fuhm.net (James Y Knight) Date: Tue, 3 Oct 2006 15:13:01 -0400 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: Message-ID: <0D01F7BC-1EC4-42DB-8D70-31E767E98257@fuhm.net> On Oct 3, 2006, at 2:26 PM, Nick Maclaren wrote: > =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= wrote: >> >> py> x=-0.0 >> py> y=0.0 >> py> x,y > > Nobody is denying that SOME C90 implementations distinguish them, > but it is no part of the standard - indeed, a C90 implementation is > permitted to use ANY criterion for deciding when to display -0.0 and > 0.0. C99 is ambiguous to the point of internal inconsistency, except > when __STDC_IEC_559__ is set to 1, though the intent is clear. > > And my reading of Python's code is that it relies on C's handling > of such values. This is a really poor argument. Python should be moving *towards* proper '754 fp support, not away from it. 
On the platforms that are most important, the C implementations distinguish positive and negative 0. That the current python implementation may be defective when the underlying C implementation is defective doesn't excuse a change to intentionally break python on the common platforms. IEEE 754 is so widely implemented that IMO it would make sense to make Python's floating point specify it, and simply declare floating point operations on non-IEEE 754 machines as "use at own risk, may not conform to python language standard". (or if someone wants to use a software fp library for such machines, that's fine too). James From martin at v.loewis.de Tue Oct 3 21:37:53 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 03 Oct 2006 21:37:53 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: Message-ID: <4522BC11.4090901@v.loewis.de> Nick Maclaren schrieb: >> py> x=-0.0 >> py> y=0.0 >> py> x,y > > Nobody is denying that SOME C90 implementations distinguish them, > but it is no part of the standard - indeed, a C90 implementation is > permitted to use ANY criterion for deciding when to display -0.0 and > 0.0. C99 is ambiguous to the point of internal inconsistency, except > when __STDC_IEC_559__ is set to 1, though the intent is clear. > > And my reading of Python's code is that it relies on C's handling > of such values. So what is your conclusion? That applications will not break? People don't care that their code may break on a different platform, if they aren't using these platforms. They care if it breaks on their platform just because they use a new Python version. (Of course, they sometimes also complain that Python behaves differently on different platforms, and cannot really accept the explanation that the language didn't guarantee the same behavior on all systems. This explanation doesn't help them: they still need to modify the application). 
Regards, Martin From rrr at ronadam.com Tue Oct 3 21:34:59 2006 From: rrr at ronadam.com (Ron Adam) Date: Tue, 03 Oct 2006 14:34:59 -0500 Subject: [Python-Dev] PEP 315 - do while In-Reply-To: <45226ADA.9080306@gmail.com> References: <001d01c6e617$0e980d40$1d2c440a@spain.capgemini.com> <45222E8B.30304@gmail.com> <452234B2.8040103@voidspace.org.uk> <45226ADA.9080306@gmail.com> Message-ID: <4522BB63.800@ronadam.com> Nick Coghlan wrote: > Fuzzyman wrote: >> Nick Coghlan wrote: >>> In my example, the 3 sections (<setup code>, <condition> and <loop body>) are all optional. A basic do-while loop would look like this:
>>>
>>> do:
>>>     <setup code>
>>>     while <condition>
>>>
>>> (That is, <setup code> is still repeated each time around the loop - it's called that because it is run before the loop <condition> is evaluated)
>>
>> +1
>>
>> This looks good.
>
> I'm pretty sure it was proposed by someone else a long time ago - I was > surprised to find it wasn't mentioned in PEP 315.
>
> That said, Guido's observation on PEP 315 from earlier this year holds for me too:
>
> "I kind of like it but it doesn't strike me as super important" [1]

I looked through a few files in the library for different while usage patterns and there really weren't as many while loops that would fit this pattern as I expected. There are many more while loops with one or more exit conditions in the middle, as things in the loop are calculated or received. So it might be smart to find out just how many places in the library it would make a difference. Ron From greg at electricrain.com Tue Oct 3 21:47:06 2006 From: greg at electricrain.com (Gregory P. Smith) Date: Tue, 3 Oct 2006 12:47:06 -0700 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <45229D69.4020407@v.loewis.de> References: <45226F57.108@v.loewis.de> <79858904-DDE8-46D8-AA28-EE9B285A6463@nicko.org> <45229D69.4020407@v.loewis.de> Message-ID: <20061003194706.GE7484@zot.electricrain.com> > > It would be instructive to understand how much, if any, python code > > would break if we lost -0.0.
I do not believe that there is any > > reliable way for python code to tell the difference between all of > > the different types of IEEE 754 zeros and in the special case of -0.0 > > the best test I can come up with is repr(n)[0]=='-'. Is there a > > compelling case, to do with compatibility or otherwise, for exposing > > the sign of a zero? It seems like a numerical anomaly to me. > > I think it is reasonable to admit that > a) this change is a change in semantics for the majority of the > machines > b) it is likely that this change won't affect a significant number > of applications (I'm pretty sure someone will notice, though; > someone always notices). If you're really going to bother doing this rather than just adding a note in the docs about testing for and reusing the most common float values to save memory when instantiating them from external input: Just do a binary comparison of the float with predefined + and - 0.0 float values or any other special values that you wish to catch rather than a floating point comparison. -g From alastair at alastairs-place.net Wed Oct 4 01:40:26 2006 From: alastair at alastairs-place.net (Alastair Houghton) Date: Wed, 4 Oct 2006 00:40:26 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <3161E092-940E-4364-910E-6D2973ECC48E@fuhm.net> References: <452257EB.6070601@v.loewis.de> <3161E092-940E-4364-910E-6D2973ECC48E@fuhm.net> Message-ID: On 3 Oct 2006, at 17:47, James Y Knight wrote: > On Oct 3, 2006, at 8:30 AM, Martin v. L?wis wrote: >> As Michael Hudson observed, this is difficult to implement, though: >> You can't distinguish between -0.0 and +0.0 easily, yet you should. > > Of course you can. It's absolutely trivial. The only part that's even > *the least bit* sketchy in this is assuming that a double is 64 bits. > Practically speaking, that is true on all architectures I know of, How about doing 1.0 / x, where x is the number you want to test?
On systems with sane semantics, it should result in an infinity, the sign of which should depend on the sign of the zero. While I'm sure there are any number of places where it will break, on those platforms it seems to me that you're unlikely to care about the difference between +0.0 and -0.0 anyway, since it's hard to otherwise distinguish them. e.g.

double value_to_test;
...
if (value_to_test == 0.0) {
    double my_inf = 1.0 / value_to_test;
    if (my_inf < 0.0) {
        /* We have a -ve zero */
    } else if (my_inf > 0.0) {
        /* We have a +ve zero */
    } else {
        /* This platform might not support infinities (though we might get
           a signal or something rather than getting here in that case...) */
    }
}

(I should add that presently I've only tried it on a PowerPC, because it's late and that's what's in front of me. It seems to work OK here.) Kind regards, Alastair -- http://alastairs-place.net From jcarlson at uci.edu Wed Oct 4 03:38:43 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue, 03 Oct 2006 18:38:43 -0700 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: <3161E092-940E-4364-910E-6D2973ECC48E@fuhm.net> Message-ID: <20061003180809.092A.JCARLSON@uci.edu> Alastair Houghton wrote: > On 3 Oct 2006, at 17:47, James Y Knight wrote: > > > On Oct 3, 2006, at 8:30 AM, Martin v. L?wis wrote: > >> As Michael Hudson observed, this is difficult to implement, though: > >> You can't distinguish between -0.0 and +0.0 easily, yet you should. > > > > Of course you can. It's absolutely trivial. The only part that's even > > *the least bit* sketchy in this is assuming that a double is 64 bits. > > Practically speaking, that is true on all architectures I know of, > > How about doing 1.0 / x, where x is the number you want to test? On > systems with sane semantics, it should result in an infinity, the > sign of which should depend on the sign of the zero.
While I'm sure > there are any number of places where it will break, on those > platforms it seems to me that you're unlikely to care about the > difference between +0.0 and -0.0 anyway, since it's hard to otherwise > distinguish them.

There is, of course, the option of examining their representations in memory (I described the general technique in another posting on this thread). From what I understand of IEEE 754 FP doubles, -0.0 and +0.0 have different representations, and if we look at the underlying representation (perhaps by a "*((uint64*)(&float_input))"), we can easily distinguish all values we want to cache... We can observe it directly, for example on x86:

>>> import struct
>>> struct.pack('d', -0.0)
'\x00\x00\x00\x00\x00\x00\x00\x80'
>>> struct.pack('d', 0.0)
'\x00\x00\x00\x00\x00\x00\x00\x00'
>>>

And as I stated before, we can switch on those values. Alternatively, if we can't switch on the 64 bit values directly...

uint32* p = (uint32*)(&double_input);
if (!p[0]) {           /* p[1] on big-endian platforms */
    switch (p[1]) {    /* p[0] on big-endian platforms */
        ...
    }
}

- Josiah From tonynelson at georgeanelson.com Wed Oct 4 02:28:44 2006 From: tonynelson at georgeanelson.com (Tony Nelson) Date: Tue, 3 Oct 2006 20:28:44 -0400 Subject: [Python-Dev] 2.4.4 fix: Socketmodule Ctl-C patch Message-ID: I've put a patch for 2.4.4 of the Socketmodule Ctl-C patch for 2.5, at the old closed bug . It passes "make EXTRAOPS-=unetwork test". Should I try to put this into the wiki at Python24Fixes? I haven't used the wiki before. -- ____________________________________________________________________ TonyN.:' The Great Writ ' is no more.
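Josiah's raw-representation test above translates to pure Python via the `struct` module; a sketch that assumes the platform's doubles are IEEE 754, as the thread generally does:

```python
import struct

def is_positive_zero(x):
    # Compare raw bit patterns rather than using ==, which cannot
    # tell +0.0 from -0.0 (the two compare equal as floats).
    return struct.pack('<d', x) == struct.pack('<d', 0.0)

assert is_positive_zero(0.0)
assert not is_positive_zero(-0.0)   # only the sign bit differs
assert not is_positive_zero(5.0)
```

This is the Python analogue of the C union trick: a cache keyed on the bit pattern treats -0.0 and +0.0 as distinct values even though they compare equal.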
From steve at holdenweb.com Wed Oct 4 05:58:01 2006 From: steve at holdenweb.com (Steve Holden) Date: Wed, 04 Oct 2006 04:58:01 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <20061003180809.092A.JCARLSON@uci.edu> References: <3161E092-940E-4364-910E-6D2973ECC48E@fuhm.net> <20061003180809.092A.JCARLSON@uci.edu> Message-ID: Josiah Carlson wrote: [yet more on this topic] If the brainpower already expended on this issue were proportional to its significance then we'd be reading about it on CNN news. This thread has disappeared down a rat-hole, never to re-emerge with anything of significant benefit to users. C'mon, guys, implement a patch or leave it alone :-) regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From guido at python.org Wed Oct 4 06:06:54 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 3 Oct 2006 21:06:54 -0700 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: <3161E092-940E-4364-910E-6D2973ECC48E@fuhm.net> <20061003180809.092A.JCARLSON@uci.edu> Message-ID: On 10/3/06, Steve Holden wrote: > If the brainpower already expended on this issue were proportional to > its significance then we'd be reading about it on CNN news. > > This thread has disappeared down a rat-hole, never to re-emerge with > anything of significant benefit to users. C'mon, guys, implement a patch > or leave it alone :-) Hear, hear. My proposal: only cache positive 0.0. My prediction: biggest bang for the buck, nobody's code will break. On platforms that don't distinguish between +/- 0.0, of course this would cache all zeros. On platforms that do distinguish them, -0.0 is left alone, which is just fine. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From nnorwitz at gmail.com Wed Oct 4 06:32:43 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Tue, 3 Oct 2006 21:32:43 -0700 Subject: [Python-Dev] what's really new in python 2.5 ? In-Reply-To: <200610031419.28281.fdrake@acm.org> References: <200610031039.52434.fdrake@acm.org> <20061003180848.GB31361@localhost.localdomain> <200610031419.28281.fdrake@acm.org> Message-ID: On 10/3/06, Fred L. Drake, Jr. wrote: > On Tuesday 03 October 2006 14:08, A.M. Kuchling wrote: > > That doesn't explain it, though; the contents of whatsnew26.html > > contain references to pep-308.html. It's not simply a matter of new > > files being untarred on top of old. > > Ah; I missed that the new HTML file was referring to an old heading. That > does sound like a .aux file got left around. > > I don't know what the build process is for the material in > docs.python.org/dev/; I think the right thing would be to start each build > with a fresh checkout/export. I probably did not do that to begin with. I did rm -rf Doc && svn up Doc && cd Doc && make. Let me know if there's anything else I should do. I did this for both the 2.5 and 2.6 versions. Let me know if you see anything screwed up after an hour or so. The new versions should be up by then. n From tim.peters at gmail.com Wed Oct 4 06:42:04 2006 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 4 Oct 2006 00:42:04 -0400 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <17698.40941.317868.702398@montanaro.dyndns.org> References: <45228491.9010103@v.loewis.de> <17698.40941.317868.702398@montanaro.dyndns.org> Message-ID: <1f7befae0610032142n2a52e7c3r1ff258b4d04e4adf@mail.gmail.com> [skip at pobox.com] > If C90 doesn't distinguish -0.0 and +0.0, how can Python? With liberal applications of piss & vinegar ;-) > Can you give a simple example where the difference between the two is apparent > to the Python programmer? 
Perhaps surprisingly, many (well, comparatively many, compared to none ....) people have noticed that the platform atan2 cares a lot:

>>> from math import atan2 as a
>>> z = 0.0  # positive zero
>>> m = -z   # minus zero
>>> a(z, z)  # the result here is actually +0.0
0.0
>>> a(z, m)
3.1415926535897931
>>> a(m, z)  # the result here is actually -0.0
0.0
>>> a(m, m)
-3.1415926535897931

It works like that "even on Windows", and these are the results C99's 754-happy appendix mandates for atan2 applied to signed zeroes. I've even seen a /complaint/ on c.l.py that atan2 doesn't do the same when z = 0.0 is replaced by z = 0. That is, at least one person thought it was "a bug" that integer zeroes didn't deliver the same behaviors. Do people actually rely on this? I know I don't, but given that more than just 2 people have remarked on it and seem to like it, I expect that changing this would break /some/ code out there. BTW, on /some/ platforms all those examples trigger EDOM from the platform libm instead -- which is also fine by C99, for implementations ignoring C99's optional 754-happy appendix.
A bit more detail, because it's necessary to understand that even minimally. Python's grammar doesn't have negative numeric literals; e.g., according to the grammar, -1 and -1.1 are applications of the unary minus operator to the positive numeric literals 1 and 1.1. And for years Python generated code accordingly: LOAD_CONST followed by the unary minus opcode. Someone (Fred, I think) introduced a front-end optimization to collapse that to plain LOAD_CONST, doing the negation at compile time. The code object contains a vector of compile-time constants, and the optimized code initially didn't distinguish between +0.0 and -0.0. As a result, if the first float 0.0 in a code block "looked positive", /all/ float zeroes in the code block were in effect treated as positive; and similarly if the first float zero was -0.0, all float zeroes were in effect treated as negative. That did break code. IIRC, it was fixed by special-casing the snot out of "-0.0", leaving that single case as a LOAD_CONST followed by UNARY_NEGATIVE. From fdrake at acm.org Wed Oct 4 06:56:56 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed, 4 Oct 2006 00:56:56 -0400 Subject: [Python-Dev] what's really new in python 2.5 ? In-Reply-To: References: <200610031419.28281.fdrake@acm.org> Message-ID: <200610040056.56632.fdrake@acm.org> On Wednesday 04 October 2006 00:32, Neal Norwitz wrote: > I probably did not do that to begin with. I did rm -rf Doc && svn up > Doc && cd Doc && make. Let me know if there's anything else I should > do. I did this for both the 2.5 and 2.6 versions. That certainly sounds like it should be sufficient. The doc build should never write anywhere but within the Doc/ tree; it doesn't even use the tempfile module to pick up any other temporary scratch space. -Fred -- Fred L. Drake, Jr. From fdrake at acm.org Wed Oct 4 07:01:06 2006 From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 4 Oct 2006 01:01:06 -0400 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <1f7befae0610032153t25bd0503u27628436ce3b794f@mail.gmail.com> References: <1f7befae0610032142n2a52e7c3r1ff258b4d04e4adf@mail.gmail.com> <1f7befae0610032153t25bd0503u27628436ce3b794f@mail.gmail.com> Message-ID: <200610040101.06109.fdrake@acm.org> On Wednesday 04 October 2006 00:53, Tim Peters wrote: > Someone (Fred, I think) introduced a front-end optimization to > collapse that to plain LOAD_CONST, doing the negation at compile time. I did the original change to make negative integers use just LOAD_CONST, but I don't think I changed what was generated for float literals. That could be my memory going bad, though. The code changed several times as people with more numeric-fu than myself fixed all sorts of border cases. I've tried really hard to stay away from the code generator since then. :-) -Fred -- Fred L. Drake, Jr. From nnorwitz at gmail.com Wed Oct 4 07:12:50 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Tue, 3 Oct 2006 22:12:50 -0700 Subject: [Python-Dev] Created branch for PEP 302 phase 2 work (in C) In-Reply-To: References: <5.1.1.6.0.20061002164622.028067d8@sparrow.telecommunity.com> <79990c6b0610021527s1e822f8dj26fbe429cf2c686c@mail.gmail.com> Message-ID: On 10/2/06, Brett Cannon wrote: > > This is why I asked for input from people on which would take less time. > Almost all the answers I got were that the C code was delicate but that > it was workable. Several people said they wished for a Python > implementation, but hardly anyone said flat-out, "don't waste your time, the > Python version will be faster to do". I didn't respond mostly because I pushed this direction to begin with. That and I'm lazy. :-) There is a lot of string manipulation and some list manipulation that is a royal pain in C and trivial in Python. Caching will be much easier to experiment with in Python too. The Python version will be much smaller.
It will take far less time to code it in Python and recode in C, than to try to get it right in C the first time. If the code is fast enough, there's no reason to rewrite in C. It will probably be easier to subclass a Python based version than a C based version. > As for the bootstrapping, I am sure it is resolvable as well. There are > several ways to go about it that are all tractable. Right, I had bootstrapping issues with implementing xrange in Python, but it was pretty easy to resolve in the end. You might even want to use part of that patch (from pythonrun.c?). There was some re-org to make bootstrapping easier/possible (I don't remember exactly right now). n From tim.peters at gmail.com Wed Oct 4 07:29:58 2006 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 4 Oct 2006 01:29:58 -0400 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <200610040101.06109.fdrake@acm.org> References: <1f7befae0610032142n2a52e7c3r1ff258b4d04e4adf@mail.gmail.com> <1f7befae0610032153t25bd0503u27628436ce3b794f@mail.gmail.com> <200610040101.06109.fdrake@acm.org> Message-ID: <1f7befae0610032229i7af5c76eg6bd0c459fe7338de@mail.gmail.com> [Tim] >> Someone (Fred, I think) introduced a front-end optimization to >> collapse that to plain LOAD_CONST, doing the negation at compile time. > I did the original change to make negative integers use just LOAD_CONST, but I > don't think I changed what was generated for float literals. That could be > my memory going bad, though. It is ;-) Here under Python 2.2.3:

>>> from dis import dis
>>> def f(): return 0.0 + -0.0 + 1.0 + -1.0
...
>>> dis(f)
  0 SET_LINENO               1
  3 SET_LINENO               1
  6 LOAD_CONST               1 (0.0)
  9 LOAD_CONST               1 (0.0)
 12 UNARY_NEGATIVE
 13 BINARY_ADD
 14 LOAD_CONST               2 (1.0)
 17 BINARY_ADD
 18 LOAD_CONST               3 (-1.0)
 21 BINARY_ADD
 22 RETURN_VALUE
 23 LOAD_CONST               0 (None)
 26 RETURN_VALUE

Note there that "0.0", "1.0", and "-1.0" were all treated as literals, but that "-0.0" still triggered a UNARY_NEGATIVE opcode. That was after "the fix".
You don't remember this as well as I do since I probably had to fix it, /and/ I ate enormous quantities of chopped, pressed, smoked, preservative-laden bag o' ham at the time. You really need to do both to remember floating-point trivia. Indeed, since I gave up my bag o' ham habit, I hardly ever jump into threads about fp trivia anymore. Mostly it's because I'm too weak from not eating anything, though -- how about lunch tomorrow? > The code changed several times as people with more numeric-fu than myself > fixed all sorts of border cases. I've tried really hard to stay away from > the code generator since then. :-) Successfully, too! It's admirable. From jcarlson at uci.edu Wed Oct 4 07:35:43 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue, 03 Oct 2006 22:35:43 -0700 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: <20061003180809.092A.JCARLSON@uci.edu> Message-ID: <20061003222551.0933.JCARLSON@uci.edu> Steve Holden wrote: > Josiah Carlson wrote: > [yet more on this topic] > > If the brainpower already expended on this issue were proportional to > its significance then we'd be reading about it on CNN news. Goodness, I wasn't aware that pointer manipulation took that much brainpower. I presume you mean what others have spent time thinking about with regards to this topic. > This thread has disappeared down a rat-hole, never to re-emerge with > anything of significant benefit to users. C'mon, guys, implement a patch > or leave it alone :-) Heh. So be it. The following is untested (I lack a build system for the Python trunk). It adds a new global cache for floats, a new 'fill the global cache' function, and an updated PyFloat_FromDouble() function. All in all, it took about 10 minutes to generate, and understands the difference between fp +0.0 and -0.0 (assuming sane IEEE 754 fp double behavior on non-x86 platforms).
- Josiah

/* This should go into floatobject.c */

static PyObject **cached_list = NULL;

static PyObject **
fill_cached_list(void)
{
    PyObject **p;
    int i;
    /* Sentinel: disables the cache lookup below while the cache is
       being filled by the PyFloat_FromDouble calls in this loop. */
    cached_list = (PyObject **) 1;
    p = (PyObject **) PyMem_MALLOC(sizeof(PyObject *) * 22);
    if (p == NULL) {
        cached_list = NULL;
        return (PyObject **) PyErr_NoMemory();
    }
    for (i = 0; i <= 10; i++) {
        p[i] = PyFloat_FromDouble((double) i);        /* +0.0 ... +10.0 */
        p[21-i] = PyFloat_FromDouble(-((double) i));  /* -0.0 ... -10.0 */
    }
    cached_list = NULL;
    return p;
}

PyObject *
PyFloat_FromDouble(double fval)
{
    register PyFloatObject *op;
    register int *p = (int *)(&fval);
    if (free_list == NULL) {
        if ((free_list = fill_free_list()) == NULL)
            return NULL;
    }
#ifdef LITTLE_ENDIAN
    if (!p[0])
#else
    if (!p[1])
#endif
    {
        if (cached_list == NULL) {
            if ((cached_list = fill_cached_list()) == NULL)
                return NULL;
        }
        if (cached_list != (PyObject **) 1 && cached_list != NULL) {
#ifdef LITTLE_ENDIAN
            switch (p[1])
#else
            switch (p[0])
#endif
            {
            case 0:           Py_INCREF(cached_list[0]);  return cached_list[0];
            case 1072693248:  Py_INCREF(cached_list[1]);  return cached_list[1];
            case 1073741824:  Py_INCREF(cached_list[2]);  return cached_list[2];
            case 1074266112:  Py_INCREF(cached_list[3]);  return cached_list[3];
            case 1074790400:  Py_INCREF(cached_list[4]);  return cached_list[4];
            case 1075052544:  Py_INCREF(cached_list[5]);  return cached_list[5];
            case 1075314688:  Py_INCREF(cached_list[6]);  return cached_list[6];
            case 1075576832:  Py_INCREF(cached_list[7]);  return cached_list[7];
            case 1075838976:  Py_INCREF(cached_list[8]);  return cached_list[8];
            case 1075970048:  Py_INCREF(cached_list[9]);  return cached_list[9];
            case 1076101120:  Py_INCREF(cached_list[10]); return cached_list[10];
            case -1071382528: Py_INCREF(cached_list[11]); return cached_list[11];
            case -1071513600: Py_INCREF(cached_list[12]); return cached_list[12];
            case -1071644672: Py_INCREF(cached_list[13]); return cached_list[13];
            case -1071906816: Py_INCREF(cached_list[14]); return cached_list[14];
            case -1072168960: Py_INCREF(cached_list[15]); return cached_list[15];
            case -1072431104: Py_INCREF(cached_list[16]); return cached_list[16];
            case -1072693248: Py_INCREF(cached_list[17]); return cached_list[17];
            case -1073217536: Py_INCREF(cached_list[18]); return cached_list[18];
            case -1073741824: Py_INCREF(cached_list[19]); return cached_list[19];
            case -1074790400: Py_INCREF(cached_list[20]); return cached_list[20];
            case -2147483648: Py_INCREF(cached_list[21]); return cached_list[21];
            default:          break;
            }
        }
    }
    /* Inline PyObject_New */
    op = free_list;
    free_list = (PyFloatObject *)op->ob_type;
    PyObject_INIT(op, &PyFloat_Type);
    op->ob_fval = fval;
    return (PyObject *) op;
}

From martin at v.loewis.de Wed Oct 4 07:34:51 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 04 Oct 2006 07:34:51 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: <452257EB.6070601@v.loewis.de> <3161E092-940E-4364-910E-6D2973ECC48E@fuhm.net> Message-ID: <452347FB.3050804@v.loewis.de> Alastair Houghton schrieb: > On 3 Oct 2006, at 17:47, James Y Knight wrote: > >> On Oct 3, 2006, at 8:30 AM, Martin v. Löwis wrote: >>> As Michael Hudson observed, this is difficult to implement, though: >>> You can't distinguish between -0.0 and +0.0 easily, yet you should. >> >> Of course you can. It's absolutely trivial. The only part that's even >> *the least bit* sketchy in this is assuming that a double is 64 bits. >> Practically speaking, that is true on all architectures I know of, > > How about doing 1.0 / x, where x is the number you want to test? This is a bad idea. It may cause a trap, leading to program termination.
Regards, Martin From nnorwitz at gmail.com Wed Oct 4 08:16:12 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Tue, 3 Oct 2006 23:16:12 -0700 Subject: [Python-Dev] [Python-checkins] r51862 - python/branches/release25-maint/Tools/msi/msi.py In-Reply-To: <450711CE.8040201@v.loewis.de> References: <20060912091628.9CBFA1E400C@bag.python.org> <200609122116.01922.anthony@interlink.com.au> <450711CE.8040201@v.loewis.de> Message-ID: On 9/12/06, "Martin v. Löwis" wrote: > > If you wonder how this all happened: Neal added sgml_input.html after > c1, but didn't edit msi.py to make it included on Windows. I found out > after running the test suite on the installed version, edited msi.py, > and rebuilt the installer. Is there an easy way to fix this sort of problem so it doesn't happen in the future (other than revoke my checkin privileges :-) ? There are already so many things to remember for changes. If we can automate finding these sorts of problems (installation, fixing something for one platform, but not another, etc), the submitter can fix these things with a little prodding from the buildbots. Or is this too minor to worry about? It would also be great if we could automate complaint emails about missing NEWS entries, doc, and tests so I wouldn't have to do it. :-) Unless anyone has better ideas how to improve Python. n From martin at v.loewis.de Wed Oct 4 08:40:10 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 04 Oct 2006 08:40:10 +0200 Subject: [Python-Dev] [Python-checkins] r51862 - python/branches/release25-maint/Tools/msi/msi.py In-Reply-To: References: <20060912091628.9CBFA1E400C@bag.python.org> <200609122116.01922.anthony@interlink.com.au> <450711CE.8040201@v.loewis.de> Message-ID: <4523574A.2010702@v.loewis.de> Neal Norwitz schrieb: > Is there an easy way to fix this sort of problem so it doesn't happen in > the future (other than revoke my checkin privileges :-) ? Sure: Don't make changes after a release candidate.
That files are missing can only be detected by actually producing the installer and testing whether it works; the closer the release, the less testing recent changes get. It might be possible to improve msi.py to better guess what files are test files, but I'd rather package too little than too much. One thing it *should* do is to report files that it skipped - but that really just helps me, since you have to run msi.py to see these messages. > There are already so many things to remember for changes. If we can > automate finding these sorts of problems (installation, fixing > something for one platform, but not another, etc), the submitter can > fix these things with a little prodding from the buildbots. Or is > this too minor to worry about? This specific instance is not to worry about. I noticed before making the release, and fixed it; me changing the branch while it is frozen is not even a policy violation. It's unfortunate that you can't recreate the installer from the tag that had been made, but it's just a release candidate, so that's a really minor issue. > It would also be great if we could automate complaint emails about > missing NEWS entries, doc, and tests so I wouldn't have to do it. :-) > Unless anyone has better ideas how to improve Python. I don't think this can be automated in a reasonable way. People apparently have different views on what is good policy and what is overkill; in a free software project, you can only have so much policy enforcement. If there is a wide consensus on some issue, committers will pick up the consensus; if they don't, it typically means they disagree. 
Regards, Martin From alastair at alastairs-place.net Wed Oct 4 10:00:19 2006 From: alastair at alastairs-place.net (Alastair Houghton) Date: Wed, 4 Oct 2006 09:00:19 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <452347FB.3050804@v.loewis.de> References: <452257EB.6070601@v.loewis.de> <3161E092-940E-4364-910E-6D2973ECC48E@fuhm.net> <452347FB.3050804@v.loewis.de> Message-ID: On 4 Oct 2006, at 06:34, Martin v. Löwis wrote: > Alastair Houghton schrieb: >> On 3 Oct 2006, at 17:47, James Y Knight wrote: >> >>> On Oct 3, 2006, at 8:30 AM, Martin v. Löwis wrote: >>>> As Michael Hudson observed, this is difficult to implement, though: >>>> You can't distinguish between -0.0 and +0.0 easily, yet you should. >>> >>> Of course you can. It's absolutely trivial. The only part that's >>> even >>> *the least bit* sketchy in this is assuming that a double is 64 >>> bits. >>> Practically speaking, that is true on all architectures I know of, >> >> How about doing 1.0 / x, where x is the number you want to test? > > This is a bad idea. It may cause a trap, leading to program > termination. AFAIK few systems have floating point traps enabled by default (in fact, isn't that what IEEE 754 specifies?), because they often aren't very useful. And in the specific case of the Python interpreter, why would you ever want them turned on? Surely in order to get consistent floating point semantics, they need to be *off* and Python needs to handle any exceptional cases itself; even if they're on, by your argument Python must do that to avoid being terminated. (Not to mention the problem that floating point traps are typically delivered by a signal, the problems with which were discussed extensively in a recent thread on this list.) And it does have two advantages over the other methods proposed: 1. You don't have to write the value to memory; this test will work entirely in the machine's floating point registers. 2. It doesn't rely on the machine using IEEE floating point.
(Of course, neither does the binary comparison method, but it still involves a trip to memory, and assumes that the machine doesn't have multiple representations for +0.0 or -0.0.) Even if you're saying that there's a significant chance of a trap (which I don't believe, not on common platforms anyway), the configure script could test to see if this will happen and fall back to one of the other approaches, or see if it can't turn them off using the C99 APIs. (I think I'd agree with you that handling SIGFPE is undesirable, which is perhaps what you were driving at.) Anyway, it's only an idea, and I thought I'd point it out as nobody else had yet. If 0.0 is going to be cached, then I certainly think -0.0 and +0.0 should be two separate values if they exist on a given machine. I'm less concerned about exactly how that comes about. Kind regards, Alastair. -- http://alastairs-place.net From alastair at alastairs-place.net Wed Oct 4 10:05:56 2006 From: alastair at alastairs-place.net (Alastair Houghton) Date: Wed, 4 Oct 2006 09:05:56 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <20061003180809.092A.JCARLSON@uci.edu> References: <3161E092-940E-4364-910E-6D2973ECC48E@fuhm.net> <20061003180809.092A.JCARLSON@uci.edu> Message-ID: <6BB57169-781E-4397-9A5D-297308DBF3D9@alastairs-place.net> On 4 Oct 2006, at 02:38, Josiah Carlson wrote: > Alastair Houghton wrote: > > There is, of course, the option of examining their representations in > memory (I described the general technique in another posting on this > thread). From what I understand of IEEE 754 FP doubles, -0.0 and +0.0 > have different representations, and if we look at the underlying > representation (perhaps by a "*((uint64*)(&float_input))"), we can > easily distinguish all values we want to cache... Yes, though a trip via memory isn't necessarily cheap, and you're also assuming that the machine doesn't use an FP representation with multiple +0s or -0s.
Perhaps they should be different anyway though, I suppose. > And as I stated before, we can switch on those values. Alternatively, > if we can't switch on the 64 bit values directly...
>
> uint32* p = (uint32*)(&double_input);
> if (!p[0]) {        /* p[1] on big-endian platforms */
>     switch (p[1]) { /* p[0] on big-endian platforms */
>     ...
>     }
> }

That's worse, IMHO, because it assumes more about the representation. If you're going to look directly at the binary, I think all you can reasonably do is a straight binary comparison. I don't think you should poke at the bits without first knowing that the platform uses IEEE floating point. The reason I suggested 1.0/x is that it's one of the few ways (maybe the only way?) to distinguish -0.0 and +0.0 using arithmetic, which is what people that care about the difference between the two are going to care about. Kind regards, Alastair. -- http://alastairs-place.net From Hans.Polak at capgemini.com Mon Oct 2 08:41:53 2006 From: Hans.Polak at capgemini.com (Hans Polak) Date: Mon, 2 Oct 2006 08:41:53 +0200 Subject: [Python-Dev] PEP 315 - do while In-Reply-To: <451F4183.5050907@gmail.com> Message-ID: <000601c6e5ed$d99ad290$1d2c440a@spain.capgemini.com> Hi Nick, Yep, PEP 315. Sorry about that. Now, about your suggestion

do:
    <setup code>
    while <condition>
else:
    <loop completion code>

This is Pythonic, but not logical. The 'do' will execute at least once, so the else clause is not needed, nor is the <setup code>. The <loop body> should go before the while terminator. I'm bound to reiterate my proposal:

do:
    <loop body>
    while <condition>

Example (if you know there will be at least one val).

source.open()
do:
    val = source.read(1)
    process(val)
    while val != lastitem
source.close()

The C syntax is:

do {
    block of code
} while (condition is satisfied);

The VB syntax is:

do
    block
loop while <condition>

Cheers & thanks for your reply, Hans Polak.
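[Editor's note: Hans's example above can be written with today's Python, at the cost of the `while True`/`break` idiom the proposal is trying to avoid. This is a sketch; `FakeSource` is a hypothetical stand-in for the file-like `source` object in the example.]

```python
class FakeSource:
    """Minimal stand-in for the 'source' object in the example above."""
    def __init__(self, data):
        self.data = list(data)
    def read(self, n):
        # Return the next item; the example guarantees at least one val.
        return self.data.pop(0)

source = FakeSource(['a', 'b', 'c', '$'])
lastitem = '$'
processed = []

while True:                  # do:
    val = source.read(1)     #     val = source.read(1)
    processed.append(val)    #     process(val)
    if val == lastitem:      #     while val != lastitem
        break

print(processed)  # ['a', 'b', 'c', '$']
```

The body runs at least once before the condition is tested, which is exactly the do-while guarantee the proposal spells out directly.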
-----Original Message----- From: Nick Coghlan [mailto:ncoghlan at gmail.com] Sent: Sunday, 01 October 2006 6:18 To: Hans Polak Cc: python-dev at python.org Subject: Re: [Python-Dev] PEP 351 - do while Hans Polak wrote: > Hi, > > Just an opinion, but many uses of the 'while true loop' are instances of > a 'do loop'. I appreciate the language layout question, so I'll give you > an alternative: > > do: >     <loop body> >     while <condition> I believe you meant to write PEP 315 in the subject line :) To fully account for loop else clauses, this suggestion would probably need to be modified to look something like this:

Basic while loop:

    while <condition>:
        <loop body>
    else:
        <loop completion code>

Using break to avoid code duplication:

    while True:
        <setup code>
        if not <condition>: break
        <loop body>

Current version of PEP 315:

    do:
        <setup code>
    while <condition>:
        <loop body>
    else:
        <loop completion code>

This suggestion:

    do:
        <setup code>
        while <condition>
        <repeated code>
    else:
        <loop completion code>

I personally like that style, and if the compiler can dig through a function looking for yield statements to identify generators, it should be able to dig through a do-loop looking for the termination condition. As I recall, the main objection to this style was that it could hide the loop termination condition, but that isn't actually mentioned in the PEP (and in the typical do-while case, the loop condition will still be clearly visible at the end of the loop body). Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org This message contains information that may be privileged or confidential and is the property of the Capgemini Group. It is intended only for the person to whom it is addressed. If you are not the intended recipient, you are not authorized to read, print, retain, copy, disseminate, distribute, or use this message or any part thereof. If you receive this message in error, please notify the sender immediately and delete all copies of this message.
From Hans.Polak at capgemini.com Mon Oct 2 13:36:52 2006 From: Hans.Polak at capgemini.com (Hans Polak) Date: Mon, 2 Oct 2006 13:36:52 +0200 Subject: [Python-Dev] PEP 315 - do while In-Reply-To: <4520EE65.50507@gmail.com> Message-ID: <001d01c6e617$0e980d40$1d2c440a@spain.capgemini.com> Ok, I see your point. Really, I've read more about Python than worked with it, so I'm out of my league here. Can I combine your suggestion with mine and come up with the following:

do:
    <loop body>
    while <condition>
else:
    <loop completion code>

Cheers, Hans. -----Original Message----- From: Nick Coghlan [mailto:ncoghlan at gmail.com] Sent: Monday, 02 October 2006 12:48 To: Hans Polak Cc: python-dev at python.org Subject: Re: [Python-Dev] PEP 315 - do while Hans Polak wrote: > Hi Nick, > > Yep, PEP 315. Sorry about that. > > Now, about your suggestion > > do: >     <setup code> >     while <condition> > else: >     <loop completion code> > > This is Pythonic, but not logical. The 'do' will execute at least once, so > the else clause is not needed, nor is the <setup code>. The <loop body> > should go before the while terminator. This objection is based on a misunderstanding of what the else clause is for in a Python loop. The else clause is only executed if the loop terminated naturally (the exit condition became false) rather than being explicitly terminated using a break statement. This behaviour is most commonly useful when using a for loop to search through an iterable (breaking when the object is found, and using the else clause to handle the 'not found' case), but it is also defined for while loops. Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org
From Hans.Polak at capgemini.com Tue Oct 3 12:14:45 2006 From: Hans.Polak at capgemini.com (Hans Polak) Date: Tue, 3 Oct 2006 12:14:45 +0200 Subject: [Python-Dev] PEP 315 - do while In-Reply-To: <452234B2.8040103@voidspace.org.uk> Message-ID: <002901c6e6d4$c05b9340$1d2c440a@spain.capgemini.com> Thanks for your reply Nick, and your support Michael. I'll leave the PEP talk to you guys :) Cheers, Hans -----Original Message----- From: Michael Foord [mailto:fuzzyman at gmail.com] On Behalf Of Fuzzyman Sent: Tuesday, 03 October 2006 12:00 To: Nick Coghlan Cc: Hans Polak; python-dev at python.org Subject: Re: [Python-Dev] PEP 315 - do while Nick Coghlan wrote: >Hans Polak wrote: >>Ok, I see your point. Really, I've read more about Python than worked with >>it, so I'm out of my league here. >> >>Can I combine your suggestion with mine and come up with the following: >> >> do: >>     <loop body> >>     while <condition> >> else: >>     <loop completion code> > >In my example, the 3 sections (<setup code>, <condition> and <repeated >code> are all optional. A basic do-while loop would look like this: > > do: >     <setup code> >     while <condition> > >(That is, <setup code> is still repeated each time around the loop - it's >called that because it is run before the loop condition is evaluated) +1 This looks good. The current idiom works fine, but looks unnatural :

while True:
    <loop body>
    if <condition>:
        break

Would a 'while' outside of a 'do' block (but without the colon) then be a syntax error ? 'do:' would just be syntactic sugar for 'while True:' I guess. Michael Foord http://www.voidspace.org.uk >Cheers, >Nick.
If you receive this message in error, please notify the sender immediately and delete all copies of this message. From Hans.Polak at capgemini.com Tue Oct 3 16:14:12 2006 From: Hans.Polak at capgemini.com (Hans Polak) Date: Tue, 3 Oct 2006 16:14:12 +0200 Subject: [Python-Dev] PEP 315 - do while In-Reply-To: <45226ADA.9080306@gmail.com> Message-ID: <003c01c6e6f6$336f6700$1d2c440a@spain.capgemini.com> I'm against infinite loops -something religious :), which explains the call for the do loop. The issue about the parser is over my head, but the thought had occurred to me. Now, it would not affect while loops inside do loops, wouldn't it? Cheers, Hans. -----Original Message----- From: Nick Coghlan [mailto:ncoghlan at gmail.com] Sent: martes, 03 de octubre de 2006 15:51 To: Fuzzyman Cc: Hans Polak; python-dev at python.org Subject: Re: [Python-Dev] PEP 315 - do while Fuzzyman wrote: > Nick Coghlan wrote: >> In my example, the 3 sections (, and > code> are all optional. A basic do-while loop would look like this: >> >> do: >> >> while >> >> (That is, is still repeated each time around the loop - it's >> called that because it is run before the loop evaluated condition is evaluated) >> >> > > +1 > > This looks good. I'm pretty sure it was proposed by someone else a long time ago - I was surprised to find it wasn't mentioned in PEP 315. That said, Guido's observation on PEP 315 from earlier this year holds for me too: "I kind of like it but it doesn't strike me as super important" [1] > The current idiom works fine, but looks unnatural : > > while True: > if : > break There's the rationale for the PEP in a whole 5 lines counting whitespace ;) > Would a 'while' outside of a 'do' block (but without the colon) then be > a syntax error ? > > 'do:' would just be syntactic sugar for 'while True:' I guess. 
That's the slight issue I still have with the idea - you could end up with multiple ways of spelling some of the basic loop forms, such as these 3 flavours of infinite loop: do: pass # Is there an implicit 'while True' at the end of the loop body? do: while True while True: pass The other issue I have is that I'm not yet 100% certain it's implementable with Python's parser and grammar. I *think* changing the definition of the while statement from: while_stmt ::= "while" expression ":" suite ["else" ":" suite] to while_stmt ::= "while" expression [":" suite ["else" ":" suite]] And adding a new AST node and a new type of compiler frame block "DO_LOOP" would do the trick (the compilation of a while statement without a trailing colon would then check that it was in a DO_LOOP block and raise an error if not). Cheers, Nick. [1] http://mail.python.org/pipermail/python-dev/2006-February/060711.html -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From Hans.Polak at capgemini.com Tue Oct 3 17:17:36 2006 From: Hans.Polak at capgemini.com (Hans Polak) Date: Tue, 3 Oct 2006 17:17:36 +0200 Subject: [Python-Dev] PEP 315 - do while In-Reply-To: <4522738F.80303@voidspace.org.uk> Message-ID: <003d01c6e6ff$0ec7ec70$1d2c440a@spain.capgemini.com> Please note that until <==> while not. do: until count > 10 do: while count <= 10 Cheers, Hans.
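A quick sketch of these equivalences in today's Python may help; the proposed do/until spellings appear only as comments, since they are of course not real syntax:

```python
# 'do: <loop body> until count > 10'  ==  'do: <loop body> while count <= 10':
# 'until <condition>' is just 'while not <condition>', and the body always
# runs at least once because the test comes after it.
count = 0
runs = []
while True:
    count += 1                 # <loop body>: executes before any test
    runs.append(count)
    if count > 10:             # 'until count > 10' == 'while not (count > 10)'
        break
print(len(runs))               # 11 -- the body ran once more after count hit 10

# Nick's point about 'else' on loops: it runs only when the loop was
# *not* left via 'break' (the classic for/else search idiom):
def find(seq, target):
    for i, item in enumerate(seq):
        if item == target:
            break
    else:
        return None            # loop exhausted without a break: not found
    return i

print(find([3, 5, 7], 5))      # 1
print(find([3, 5, 7], 9))      # None
```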
-----Original Message----- From: Michael Foord [mailto:fuzzyman at gmail.com] On Behalf Of Fuzzyman Sent: martes, 03 de octubre de 2006 16:29 To: Nick Coghlan Cc: Hans Polak; python-dev at python.org Subject: Re: [Python-Dev] PEP 315 - do while Nick Coghlan wrote: > [snip..] > >> The current idiom works fine, but looks unnatural : >> >> while True: >> if <condition>: >> break > > > There's the rationale for the PEP in a whole 5 lines counting > whitespace ;) > >> Would a 'while' outside of a 'do' block (but without the colon) then be >> a syntax error ? >> >> 'do:' would just be syntactic sugar for 'while True:' I guess. > > > That's the slight issue I still have with the idea - you could end up > with multiple ways of spelling some of the basic loop forms, such as > these 3 flavours of infinite loop: > > do: > pass # Is there an implicit 'while True' at the end of the loop > body? > > do: > while True > > while True: > pass > Following the current idiom, isn't it more natural to repeat the loop 'until' a condition is met? If we introduced two new keywords, it would avoid ambiguity in the use of 'while'. do: <loop body> until <condition> A do loop could require an 'until', meaning 'do' is not *just* a replacement for an infinite loop. (Assuming the parser can be coerced into co-operation.) It is obviously still a new construct in terms of Python syntax (not requiring a colon after '<condition>'.) I'm sure this has been suggested, but wonder if it has already been ruled out. An 'else' block could then retain its current meaning (execute if the loop is not terminated early by an explicit break.) Michael Foord http://www.voidspace.org.uk From nmm1 at cus.cam.ac.uk Wed Oct 4 13:26:46 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Wed, 04 Oct 2006 12:26:46 +0100 Subject: [Python-Dev] Caching float(0.0) Message-ID: Alastair Houghton wrote: > > AFAIK few systems have floating point traps enabled by default (in > fact, isn't that what IEEE 754 specifies?), because they often aren't > very useful. The first two statements are true; the last isn't. They are extremely useful, not least because they are the only practical way to locate numeric errors in most 3GL programs (including C, Fortran etc.) > And in the specific case of the Python interpreter, why > would you ever want them turned on? Surely in order to get > consistent floating point semantics, they need to be *off* and Python > needs to handle any exceptional cases itself; even if they're on, by > your argument Python must do that to avoid being terminated. Grrk. Why are you assuming that turning them off means that the result is what you expect? That isn't always so - sometimes it merely means that you get wrong answers but no indication of that. > or see if it can't turn them off using the C99 APIs. That is a REALLY bad idea. You have no idea how broken that is, and what the impact would be on Python. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From nmm1 at cus.cam.ac.uk Wed Oct 4 13:39:07 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Wed, 04 Oct 2006 12:39:07 +0100 Subject: [Python-Dev] Caching float(0.0) Message-ID: James Y Knight wrote: > > This is a really poor argument. Python should be moving *towards* > proper '754 fp support, not away from it. On the platforms that are > most important, the C implementations distinguish positive and > negative 0.
That the current python implementation may be defective > when the underlying C implementation is defective doesn't excuse a > change to intentionally break python on the common platforms. Perhaps you might like to think why only IBM POWERx (and NOT the Cell or most embedded POWERs) is the ONLY mainstream system to have implemented all of IEEE 754 in hardware after 22 years? Or why NO programming language has provided support in those 22 years, and only Java and C have even claimed to? See Kahan's "How Java's Floating-Point Hurts Everyone Everywhere", note that C99 is much WORSE, and then note that Java and C99 are the only languages that have even attempted to include IEEE 754. You have also misunderstood the issue. The fact that a C implementation doesn't support it does NOT mean that the implementation is defective; quite the contrary. The issue always has been that IEEE 754's basic model is incompatible with the basic models of all programming languages that I am familiar with (which is a lot). And the specific problems with C99 are in the STANDARD, not the IMPLEMENTATIONS. > IEEE 754 is so widely implemented that IMO it would make sense to > make Python's floating point specify it, and simply declare floating > point operations on non-IEEE 754 machines as "use at own risk, may > not conform to python language standard". (or if someone wants to use > a software fp library for such machines, that's fine too). Firstly, see the above. Secondly, Python would need MAJOR semantic changes to conform to IEEE 754R. Thirdly, what would you say to the people who want reliable error detection on floating-point of the form that Python currently provides? Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From nick at craig-wood.com Wed Oct 4 13:52:32 2006 From: nick at craig-wood.com (Nick Craig-Wood) Date: Wed, 4 Oct 2006 12:52:32 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <1f7befae0610032142n2a52e7c3r1ff258b4d04e4adf@mail.gmail.com> References: <45228491.9010103@v.loewis.de> <17698.40941.317868.702398@montanaro.dyndns.org> <1f7befae0610032142n2a52e7c3r1ff258b4d04e4adf@mail.gmail.com> Message-ID: <20061004115232.GA10725@craig-wood.com> On Wed, Oct 04, 2006 at 12:42:04AM -0400, Tim Peters wrote: > [skip at pobox.com] > > If C90 doesn't distinguish -0.0 and +0.0, how can Python? > > With liberal applications of piss & vinegar ;-) > > > Can you give a simple example where the difference between the two is apparent > > to the Python programmer? > > Perhaps surprisingly, many (well, comparatively many, compared to none > ....) people have noticed that the platform atan2 cares a lot: > > >>> from math import atan2 as a > >>> z = 0.0 # positive zero > >>> m = -z # minus zero > >>> a(z, z) # the result here is actually +0.0 > 0.0 > >>> a(z, m) > 3.1415926535897931 > >>> a(m, z) # the result here is actually -0.0 > 0.0 This actually returns -0.0 under linux... > >>> a(m, m) > -3.1415926535897931 > > It works like that "even on Windows", and these are the results C99's > 754-happy appendix mandates for atan2 applied to signed zeroes. I've > even seen a /complaint/ on c.l.py that atan2 doesn't do the same when > > z = 0.0 > > is replaced by > > z = 0 > > That is, at least one person thought it was "a bug" that integer > zeroes didn't deliver the same behaviors. > > Do people actually rely on this? I know I don't, but given that more > than just 2 people have remarked on it seeming to like it, I expect > that changing this would break /some/ code out there. Probably! It surely isn't a big problem though, is it?
instead of writing if (result == 0.0) return cached_float_0; we just write something like if (memcmp(&result, &static_zero, sizeof(double)) == 0) return cached_float_0; E.g. the below prints (gcc/linux):

The memcmp() way
1: 0 == 0.0
2: -0 != 0.0
The == way
3: 0 == 0.0
4: -0 == 0.0

#include <stdio.h>
#include <string.h>

int main(void)
{
    static double zero_value = 0.0;
    double result;
    printf("The memcmp() way\n");
    result = 0.0;
    if (memcmp(&result, &zero_value, sizeof(double)) == 0)
        printf("1: %g == 0.0\n", result);
    else
        printf("1: %g != 0.0\n", result);
    result = -0.0;
    if (memcmp(&result, &zero_value, sizeof(double)) == 0)
        printf("2: %g == 0.0\n", result);
    else
        printf("2: %g != 0.0\n", result);
    printf("The == way\n");
    result = 0.0;
    if (result == 0.0)
        printf("3: %g == 0.0\n", result);
    else
        printf("3: %g != 0.0\n", result);
    result = -0.0;
    if (result == 0.0)
        printf("4: %g == 0.0\n", result);
    else
        printf("4: %g != 0.0\n", result);
    return 0;
}

-- Nick Craig-Wood -- http://www.craig-wood.com/nick From kristjan at ccpgames.com Wed Oct 4 13:56:49 2006 From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_V=2E_J=F3nsson?=) Date: Wed, 4 Oct 2006 11:56:49 -0000 Subject: [Python-Dev] Caching float(0.0) Message-ID: <129CEF95A523704B9D46959C922A280002FE99BD@nemesis.central.ccp.cc> Hm, doesn't seem to be so for my regular python. Python 2.3.3 Stackless 3.0 040407 (#51, Apr 7 2004, 19:28:46) [MSC v.1200 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> x = -0.0 >>> y = 0.0 >>> x,y (0.0, 0.0) >>> maybe it is 2.3.3, or maybe it is stackless from back then. K > -----Original Message----- > From: python-dev-bounces+kristjan=ccpgames.com at python.org > [mailto:python-dev-bounces+kristjan=ccpgames.com at python.org] > On Behalf Of "Martin v. Löwis" > Sent: 3.
október 2006 17:56 > To: skip at pobox.com > Cc: Nick Maclaren; python-dev at python.org > Subject: Re: [Python-Dev] Caching float(0.0) > > skip at pobox.com schrieb: > > If C90 doesn't distinguish -0.0 and +0.0, how can Python? Can you > > give a simple example where the difference between the two > is apparent > > to the Python programmer? > > Sure: > > py> x=-0.0 > py> y=0.0 > py> x,y From nmm1 at cus.cam.ac.uk Wed Oct 4 14:12:06 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Wed, 04 Oct 2006 13:12:06 +0100 Subject: [Python-Dev] Caching float(0.0) Message-ID: On Wed, Oct 04, 2006 at 12:42:04AM -0400, Tim Peters wrote: > > > If C90 doesn't distinguish -0.0 and +0.0, how can Python? > > > Can you give a simple example where the difference between the two > > is apparent to the Python programmer? > > Perhaps surprisingly, many (well, comparatively many, compared to none > ....) people have noticed that the platform atan2 cares a lot: Once upon a time, floating-point was used as an approximation to mathematical real numbers, and anything which was mathematically undefined in real arithmetic was regarded as an error in floating-point. This allowed a reasonable amount of numeric validation, because the main remaining discrepancy was that floating-point has only limited precision and range. Most of the numerical experts that I know of still favour that approach, and it is the one standardised by the ISO LIA-1, LIA-2 and LIA-3 standards for floating-point arithmetic. atan2(0.0,0.0) should be an error. But C99 differs. While words do not fail me, they are inappropriate for this mailing list :-( Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From amk at amk.ca Wed Oct 4 14:28:45 2006 From: amk at amk.ca (A.M. Kuchling) Date: Wed, 4 Oct 2006 08:28:45 -0400 Subject: [Python-Dev] what's really new in python 2.5 ?
In-Reply-To: References: <200610031039.52434.fdrake@acm.org> <20061003180848.GB31361@localhost.localdomain> <200610031419.28281.fdrake@acm.org> Message-ID: <20061004122845.GA22146@rogue.amk.ca> On Tue, Oct 03, 2006 at 09:32:43PM -0700, Neal Norwitz wrote: > Let me know if you see anything screwed up after an hour or so. The > new versions should be up by then. Thanks! That seems to have cleared things up -- the section names are now node2.html, node3.html, ..., which is what I'd expect for the 2.6 document. --amk From guido at python.org Wed Oct 4 16:30:57 2006 From: guido at python.org (Guido van Rossum) Date: Wed, 4 Oct 2006 07:30:57 -0700 Subject: [Python-Dev] PEP 315 - do while In-Reply-To: <003d01c6e6ff$0ec7ec70$1d2c440a@spain.capgemini.com> References: <4522738F.80303@voidspace.org.uk> <003d01c6e6ff$0ec7ec70$1d2c440a@spain.capgemini.com> Message-ID: You are all wasting your time on this. It won't go in. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy at alum.mit.edu Wed Oct 4 17:44:08 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Wed, 4 Oct 2006 11:44:08 -0400 Subject: [Python-Dev] PEP 315 - do while In-Reply-To: References: <4522738F.80303@voidspace.org.uk> <003d01c6e6ff$0ec7ec70$1d2c440a@spain.capgemini.com> Message-ID: On 10/4/06, Guido van Rossum wrote: > You are all wasting your time on this. It won't go in. +1 from me. Should you mark PEP 315 as rejected? 
Jeremy > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jeremy%40alum.mit.edu > From rhettinger at ewtllc.com Wed Oct 4 18:07:52 2006 From: rhettinger at ewtllc.com (Raymond Hettinger) Date: Wed, 4 Oct 2006 09:07:52 -0700 Subject: [Python-Dev] PEP 315 - do while Message-ID: <34FE2A7A34BC3544BC3127D023DF3D12128737@EWTEXCH.office.bhtrader.com> I'll mark it as withdrawn. Raymond -----Original Message----- From: python-dev-bounces+rhettinger=ewtllc.com at python.org [mailto:python-dev-bounces+rhettinger=ewtllc.com at python.org] On Behalf Of Jeremy Hylton Sent: Wednesday, October 04, 2006 8:44 AM To: Guido van Rossum Cc: Hans Polak; python-dev at python.org Subject: Re: [Python-Dev] PEP 315 - do while On 10/4/06, Guido van Rossum wrote: > You are all wasting your time on this. It won't go in. +1 from me. Should you mark PEP 315 as rejected? 
Jeremy > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jeremy%40alum.mit.edu > _______________________________________________ Python-Dev mailing list Python-Dev at python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/rhettinger%40ewtllc.com From brett at python.org Wed Oct 4 21:13:18 2006 From: brett at python.org (Brett Cannon) Date: Wed, 4 Oct 2006 12:13:18 -0700 Subject: [Python-Dev] Created branch for PEP 302 phase 2 work (in C) In-Reply-To: References: <5.1.1.6.0.20061002164622.028067d8@sparrow.telecommunity.com> <79990c6b0610021527s1e822f8dj26fbe429cf2c686c@mail.gmail.com> Message-ID: On 10/3/06, Neal Norwitz wrote: > > On 10/2/06, Brett Cannon wrote: > > > > This is why I asked for input from people on which would take less time. > > Almost all the answers I got were that the C code was delicate but > that > > it was workable. Several people said they wished for a Python > > implementation, but hardly anyone said flat-out, "don't waste your time, > the > > Python version will be faster to do". > > I didn't respond mostly because I pushed this direction to begin with. > That and I'm lazy. :-) But couldn't you be lazy in a timely fashion? > There is a lot of string manipulation and some list manipulation that > is a royal pain in C and trivial in python. Caching will be much > easier to experiment with in Python too. The Python version will be > much smaller. It will take far less time to code it in Python and > recode in C, than to try to get it right in C the first time. If the > code is fast enough, there's no reason to rewrite in C. It will > probably be easier to subclass a Python based version than a C based > version.
> > As for the bootstrapping, I am sure it is resolvable as well. There are > > several ways to go about it that are all tractable. > > Right, I had bootstrapping problems with implementing xrange in Python, but it > was pretty easy to resolve in the end. You might even want to use > part of that patch (from pythonrun.c?). There was some re-org to make > bootstrapping easier/possible (I don't remember exactly right now). OK, OK, I get the hint. I will rewrite import in Python and just make it my research work and personal project. Probably will do the initial pure Python stuff in the sandbox to really isolate it and then move it over to the pep302_phase2 branch when C code has to be changed. -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061004/7d84fdb0/attachment.htm From martin at v.loewis.de Wed Oct 4 21:14:46 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 04 Oct 2006 21:14:46 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: References: <452257EB.6070601@v.loewis.de> <3161E092-940E-4364-910E-6D2973ECC48E@fuhm.net> <452347FB.3050804@v.loewis.de> Message-ID: <45240826.6040600@v.loewis.de> Alastair Houghton schrieb: > AFAIK few systems have floating point traps enabled by default (in fact, > isn't that what IEEE 754 specifies?), because they often aren't very > useful. And in the specific case of the Python interpreter, why would > you ever want them turned on? That reasoning is irrelevant. If it breaks a few systems, that already is some systems too many. Python should never crash; and we have no control over the floating point exception handling in any portable manner.
Regards, Martin From martin at v.loewis.de Wed Oct 4 21:29:19 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 04 Oct 2006 21:29:19 +0200 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <129CEF95A523704B9D46959C922A280002FE99BD@nemesis.central.ccp.cc> References: <129CEF95A523704B9D46959C922A280002FE99BD@nemesis.central.ccp.cc> Message-ID: <45240B8F.6070106@v.loewis.de> Kristján V. Jónsson schrieb: > Hm, doesn't seem to be so for my regular python. > > maybe it is 2.3.3, or maybe it is stackless from back then. It's because you are using Windows. The way -0.0 gets rendered depends on the platform. As Tim points out, try math.atan2(0.0, -0.0) vs math.atan2(0.0, 0.0). Regards, Martin From alastair at alastairs-place.net Wed Oct 4 22:14:49 2006 From: alastair at alastairs-place.net (Alastair Houghton) Date: Wed, 4 Oct 2006 21:14:49 +0100 Subject: [Python-Dev] Caching float(0.0) In-Reply-To: <45240826.6040600@v.loewis.de> References: <452257EB.6070601@v.loewis.de> <3161E092-940E-4364-910E-6D2973ECC48E@fuhm.net> <452347FB.3050804@v.loewis.de> <45240826.6040600@v.loewis.de> Message-ID: On Oct 4, 2006, at 8:14 PM, Martin v. Löwis wrote: > If it breaks a few systems, that already is some systems too many. > Python should never crash; and we have no control over the floating > point exception handling in any portable manner. You're quite right, though there is already plenty of platform dependent code in Python for just that purpose (see fpectlmodule.c, for instance). Anyway, all I originally wanted was to point out that using division was one possible way to tell the difference that didn't involve relying on the representation being IEEE compliant. It's true that there are problems with FP exceptions. Kind regards, Alastair.
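Putting Martin's suggestion in runnable form; the atan2 calls are the test he proposes, while the struct-based bit-pattern comparison is my own addition for illustration (it is essentially what a memcmp() of the two doubles would see):

```python
import math
import struct

z = 0.0
m = -z                              # minus zero

print(m == z)                       # True: '==' cannot tell the zeroes apart
# repr/str may hide the sign too, depending on platform -- hence the
# (0.0, 0.0) seen on Windows above.

# atan2 distinguishes them, per the C99 754-friendly behaviour:
print(math.atan2(z, z))             # 0.0
print(math.atan2(z, m))             # 3.141592653589793 (pi)

# So does the raw IEEE 754 bit pattern: only the sign bit differs.
print(struct.pack('<d', z) == struct.pack('<d', m))   # False
```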
-- http://alastairs-place.net From hasan.diwan at gmail.com Wed Oct 4 22:39:50 2006 From: hasan.diwan at gmail.com (Hasan Diwan) Date: Wed, 4 Oct 2006 13:39:50 -0700 Subject: [Python-Dev] Fwd: [ python-Feature Requests-1567948 ] poplib.py list interface In-Reply-To: References: Message-ID: <2cda2fc90610041339saf5b6b9sa0ea97e3cf13cb74@mail.gmail.com> I've made some changes to poplib.py, submitted them to Sourceforge, and emailed Piers regarding taking over maintenance of the module. I have his support to do so, along with Guido's. However, I would like to ask one of the more senior developers to review the change and commit it. Many thanks for your kind assistance! ---------- Forwarded message ---------- From: SourceForge.net Date: 04-Oct-2006 13:29 Subject: [ python-Feature Requests-1567948 ] poplib.py list interface To: noreply at sourceforge.net Feature Requests item #1567948, was opened at 2006-09-29 11:51 Message generated for change (Comment added) made by hdiwan650 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=355470&aid=1567948&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Python Library Group: Python 2.6 Status: Open Resolution: None Priority: 5 Submitted By: Hasan Diwan (hdiwan650) Assigned to: Nobody/Anonymous (nobody) Summary: poplib.py list interface Initial Comment: Adds a list-like interface to poplib.py, poplib_as_list. ---------------------------------------------------------------------- >Comment By: Hasan Diwan (hdiwan650) Date: 2006-10-04 13:29 Message: Logged In: YES user_id=1185570 I changed it a little bit, added my name at the top of the file as the maintainer. 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=355470&aid=1567948&group_id=5470 -- Cheers, Hasan Diwan -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061004/a59230c2/attachment.html From larry at hastings.org Wed Oct 4 20:08:16 2006 From: larry at hastings.org (Larry Hastings) Date: Wed, 04 Oct 2006 11:08:16 -0700 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom Message-ID: <4523F890.9060804@hastings.org> I've never liked the "".join([]) idiom for string concatenation; in my opinion it violates the principles "Beautiful is better than ugly." and "There should be one-- and preferably only one --obvious way to do it.". (And perhaps several others.) To that end I've submitted patch #1569040 to SourceForge: http://sourceforge.net/tracker/index.php?func=detail&aid=1569040&group_id=5470&atid=305470 This patch speeds up using + for string concatenation. It's been in discussion on c.l.p for about a week, here: http://groups.google.com/group/comp.lang.python/browse_frm/thread/b8a8f20bc3c81bcf I'm not a Python guru, and my initial benchmark had many mistakes. With help from the community correct benchmarks emerged: + for string concatenation is now roughly as fast as the usual "".join() idiom when appending. (It appears to be *much* faster for prepending.) The patched Python passes all the tests in regrtest.py for which I have source; I didn't install external packages such as bsddb and sqlite3. My approach was to add a "string concatenation" object; I have since learned this is also called a "rope". Internally, a PyStringConcatationObject is exactly like a PyStringObject but with a few extra members taking an additional thirty-six bytes of storage. 
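The rope idea Larry describes can be sketched in pure Python. This toy class is only an illustration of the concept (the real patch does the equivalent in C inside the string object itself, with quite different details):

```python
class Rope:
    """Toy lazy concatenation: '+' records the operands in O(1); the real
    string is built only when the value is first needed, then cached."""

    def __init__(self, *parts):
        self._parts = list(parts)    # strings and/or other Ropes
        self._value = None           # cached rendered value (like ob_sval)

    def __add__(self, other):
        return Rope(self, other)     # no copying happens here

    def render(self):
        if self._value is None:
            self._value = "".join(
                p.render() if isinstance(p, Rope) else p
                for p in self._parts)
            self._parts = None       # the tree can be dropped once rendered
        return self._value

    def __str__(self):
        return self.render()

s = Rope("spam") + "and" + "eggs"    # builds a small tree, copies nothing
print(str(s))                        # spamandeggs -- one join, performed once
```

Note that render() here recurses, so a very long chain of '+' would hit the recursion limit; a real implementation flattens iteratively, which is part of why doing this robustly in C is delicate.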
When you add two PyStringObjects together, string_concat() returns a PyStringConcatationObject which contains references to the two strings. Concatenating any mixture of PyStringObjects and PyStringConcatationObjects works similarly, though there are some internal optimizations. These changes are almost entirely contained within Objects/stringobject.c and Include/stringobject.h. There is one major externally-visible change in this patch: PyStringObject.ob_sval is no longer a char[1] array, but a char *. Happily, this only requires a recompile, because the CPython source is *marvelously* consistent about using the macro PyString_AS_STRING(). (One hopes extension authors are as consistent.) I only had to touch two other files (Python/ceval.c and Objects/codeobject.c) and those were one-line changes. There is one remaining place that still needs fixing: the self-described "hack" in Mac/Modules/MacOS.c. Fixing that is beyond my pay grade. I changed the representation of ob_sval for two reasons: first, it is initially NULL for a string concatenation object, and second, because it may point to separately-allocated memory. That's where the speedup came from--it doesn't render the string until someone asks for the string's value. It is telling to see my new implementation of PyString_AS_STRING, as follows (casts and extra parentheses removed for legibility): #define PyString_AS_STRING(x) ( x->ob_sval ? x->ob_sval : PyString_AsString(x) ) This adds a layer of indirection for the string and a branch, adding a tiny (but measurable) slowdown to the general case. Again, because the changes to PyStringObject are hidden by this macro, external users of these objects don't notice the difference. The patch is posted, and I have donned the thickest skin I have handy. I look forward to your feedback. Cheers, /larry/ From greg at electricrain.com Thu Oct 5 21:28:58 2006 From: greg at electricrain.com (Gregory P. 
Smith) Date: Thu, 5 Oct 2006 12:28:58 -0700 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: <4523F890.9060804@hastings.org> References: <4523F890.9060804@hastings.org> Message-ID: <20061005192858.GA9435@zot.electricrain.com> > I've never liked the "".join([]) idiom for string concatenation; in my > opinion it violates the principles "Beautiful is better than ugly." and > "There should be one-- and preferably only one --obvious way to do it.". > (And perhaps several others.) To that end I've submitted patch #1569040 > to SourceForge: > > http://sourceforge.net/tracker/index.php?func=detail&aid=1569040&group_id=5470&atid=305470 > This patch speeds up using + for string concatenation. yay! i'm glad to see this. i hate the "".join syntax. i still write that as string.join() because thats at least readable). it also fixes the python idiom for fast string concatenation as intended; anyone whos ever written code that builds a large string value by pushing substrings into a list only to call join later should agree. mystr = "prefix" while bla: #... mystr += moredata is much nicer to read than mystr = "prefix" strParts = [mystr] while bla: #... strParts.append(moredata) mystr = "".join(strParts) have you run any generic benchmarks such as pystone to get a better idea of what the net effect on "typical" python code is? From jcarlson at uci.edu Thu Oct 5 22:05:09 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Thu, 05 Oct 2006 13:05:09 -0700 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: <20061005192858.GA9435@zot.electricrain.com> References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> Message-ID: <20061005130119.0951.JCARLSON@uci.edu> "Gregory P. 
Smith" wrote: > > > I've never liked the "".join([]) idiom for string concatenation; in my > > opinion it violates the principles "Beautiful is better than ugly." and > > "There should be one-- and preferably only one --obvious way to do it.". > > (And perhaps several others.) To that end I've submitted patch #1569040 > > to SourceForge: > > > > http://sourceforge.net/tracker/index.php?func=detail&aid=1569040&group_id=5470&atid=305470 > > This patch speeds up using + for string concatenation. > > yay! i'm glad to see this. i hate the "".join syntax. i still write > that as string.join() because thats at least readable). it also fixes > the python idiom for fast string concatenation as intended; anyone > whos ever written code that builds a large string value by pushing > substrings into a list only to call join later should agree. > > mystr = "prefix" > while bla: > #... > mystr += moredata Regardless of "nicer to read", I would just point out that Guido has stated that Python will not have strings implemented as trees. Also, Python 3.x will have a data type called 'bytes', which will be the default return of file.read() (when files are opened as binary), which uses an over-allocation strategy like lists to get relatively fast concatenation (on the order of lst1 += lst2). - Josiah From nicko at nicko.org Thu Oct 5 22:29:28 2006 From: nicko at nicko.org (Nicko van Someren) Date: Thu, 5 Oct 2006 21:29:28 +0100 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: <20061005192858.GA9435@zot.electricrain.com> References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> Message-ID: On 5 Oct 2006, at 20:28, Gregory P. Smith wrote: >> I've never liked the "".join([]) idiom for string concatenation; >> in my >> opinion it violates the principles "Beautiful is better than >> ugly." and >> "There should be one-- and preferably only one --obvious way to do >> it.". 
>> (And perhaps several others.) To that end I've submitted patch >> #1569040 >> to SourceForge: >> >> http://sourceforge.net/tracker/index.php? >> func=detail&aid=1569040&group_id=5470&atid=305470 >> This patch speeds up using + for string concatenation. > > yay! i'm glad to see this. i hate the "".join syntax. Hear, hear. Being able to write what you mean and have the language get decent performance nonetheless seems to me to be a "good thing". > have you run any generic benchmarks such as pystone to get a better > idea of what the net effect on "typical" python code is? Yeah, "real world" performance testing is always important with anything that uses lazy evaluation. If you get to control if and when the computation actually happens you have even more scope than usual for getting the benchmark answer you want to see! Cheers, Nicko From larry at hastings.org Thu Oct 5 23:23:08 2006 From: larry at hastings.org (Larry Hastings) Date: Thu, 05 Oct 2006 14:23:08 -0700 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> Message-ID: <452577BC.9010506@hastings.org> Gregory P. Smith wrote: > have you run any generic benchmarks such as pystone to get a better > idea of what the net effect on "typical" python code is? I hadn't, but I'm happy to. On my machine (a fire-breathing Athlon 64 x2 4400+), best of three runs: Python 2.5 release: Pystone(1.1) time for 50000 passes = 1.01757 This machine benchmarks at 49136.8 pystones/second Python 2.5 concat: Pystone(1.1) time for 50000 passes = 0.963191 This machine benchmarks at 51910.8 pystones/second I'm surprised by this; I had expected it to be slightly *slower*, not the other way 'round. I'm not sure why this is. A cursory glance at pystone.py doesn't reveal any string concatenation using +, so I doubt it's benefiting from my speedup.
And I didn't change the optimization flags when I compiled Python, so that should be the same. Josiah Carlson wrote: > Regardless of "nicer to read", I would just point out that Guido has > stated that Python will not have strings implemented as trees. > I suspect it was more a caution that Python wouldn't *permanently* store strings as "ropes". In my patch, the rope only exists until someone asks for the string's value, at which point the tree is rendered and dereferenced. From that point on the object is exactly like a normal PyStringObject to the external viewer. But you and I are, as I believe the saying goes, "channeling Guido (badly)". Perhaps some adult supervision will intervene soon and make its opinions known. For what it's worth, I've realized two things I want to change about my patch: * I left in a couple of /* lch */ comments I used during development as markers to find my own code. Whoops; I'll strip those out. * I realized that, because of struct packing, all PyStringObjects are currently wasting an average of two bytes apiece. (As in, that's something Python 2.5 does, not something added by my code.) I'll change my patch so strings are allocated more precisely. If my string concatenation patch is declined, I'll be sure to submit this patch separately. I'll try to submit an updated patch today. Cheers, /larry/ From steve at holdenweb.com Fri Oct 6 08:35:19 2006 From: steve at holdenweb.com (Steve Holden) Date: Fri, 06 Oct 2006 07:35:19 +0100 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: <20061005192858.GA9435@zot.electricrain.com> References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> Message-ID: Gregory P. Smith wrote: >>I've never liked the "".join([]) idiom for string concatenation; in my >>opinion it violates the principles "Beautiful is better than ugly." and >>"There should be one-- and preferably only one --obvious way to do it.". 
>>(And perhaps several others.) To that end I've submitted patch #1569040 >>to SourceForge: >> >>http://sourceforge.net/tracker/index.php?func=detail&aid=1569040&group_id=5470&atid=305470 >>This patch speeds up using + for string concatenation. > > > yay! i'm glad to see this. i hate the "".join syntax. i still write > that as string.join() [...] instance.method(*args) <==> type.method(instance, *args) You can nowadays spell this as str.join("", lst) - no need to import a whole module! regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From fredrik at pythonware.com Fri Oct 6 08:38:24 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 06 Oct 2006 08:38:24 +0200 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> Message-ID: Steve Holden wrote: > instance.method(*args) <==> type.method(instance, *args) > > You can nowadays spell this as str.join("", lst) - no need to import a > whole module! 
except that str.join isn't polymorphic: >>> str.join(u",", ["1", "2", "3"]) Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: descriptor 'join' requires a 'str' object but received a 'unicode' >>> string.join(["1", "2", "3"], u",") u'1,2,3' From skip at pobox.com Fri Oct 6 12:45:10 2006 From: skip at pobox.com (skip at pobox.com) Date: Fri, 6 Oct 2006 05:45:10 -0500 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: <20061005192858.GA9435@zot.electricrain.com> References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> Message-ID: <17702.13238.684094.6289@montanaro.dyndns.org> Greg> have you run any generic benchmarks such as pystone to get a Greg> better idea of what the net effect on "typical" python code is? MAL's pybench would probably be better for this presuming it does some addition with string operands. Skip From fredrik at pythonware.com Fri Oct 6 12:54:13 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 06 Oct 2006 12:54:13 +0200 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: <17702.13238.684094.6289@montanaro.dyndns.org> References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> <17702.13238.684094.6289@montanaro.dyndns.org> Message-ID: skip at pobox.com wrote: > Greg> have you run any generic benchmarks such as pystone to get a > Greg> better idea of what the net effect on "typical" python code is? > > MAL's pybench would probably be better for this presuming it does some > addition with string operands. or stringbench.
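[The two spellings being debated throughout this thread can be timed directly with `timeit`. Below is a minimal sketch in modern Python 3 syntax rather than the 2.5-era code under discussion; absolute numbers vary by machine and interpreter version.]

```python
import timeit

def concat_plus(n):
    # Build a string with repeated +=; historically this was O(n**2)
    # copying, though CPython can often resize the string in place
    # when it holds the only reference to it.
    s = ""
    for _ in range(n):
        s += "spam "
    return s

def concat_join(n):
    # The "".join idiom: collect the parts in a list, join once at the end.
    parts = []
    for _ in range(n):
        parts.append("spam ")
    return "".join(parts)

# Both spellings build the same string.
assert concat_plus(100) == concat_join(100)

t_plus = timeit.timeit(lambda: concat_plus(1000), number=100)
t_join = timeit.timeit(lambda: concat_join(1000), number=100)
print("+=   : %.4f s" % t_plus)
print("join : %.4f s" % t_join)
```

[On builds where the in-place `+=` optimization applies, the loop is often competitive with the join idiom; the join idiom remains the portable choice since the optimization is an implementation detail, not a language guarantee.]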
From rrr at ronadam.com Fri Oct 6 13:37:09 2006 From: rrr at ronadam.com (Ron Adam) Date: Fri, 06 Oct 2006 06:37:09 -0500 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: <20061005192858.GA9435@zot.electricrain.com> References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> Message-ID: <45263FE5.3070604@ronadam.com> Gregory P. Smith wrote: >> I've never liked the "".join([]) idiom for string concatenation; in my >> opinion it violates the principles "Beautiful is better than ugly." and >> "There should be one-- and preferably only one --obvious way to do it.". >> (And perhaps several others.) To that end I've submitted patch #1569040 >> to SourceForge: >> >> http://sourceforge.net/tracker/index.php?func=detail&aid=1569040&group_id=5470&atid=305470 >> This patch speeds up using + for string concatenation. > > yay! i'm glad to see this. i hate the "".join syntax. i still write > that as string.join() because thats at least readable). it also fixes > the python idiom for fast string concatenation as intended; anyone > whos ever written code that builds a large string value by pushing > substrings into a list only to call join later should agree. Well I always like things to run faster, but I disagree that this idiom is broken. I like using lists to store substrings and I think it's just a matter of changing your frame of reference in how you think about them. For example it doesn't bother me to have a numeric type with many digits, and to have lists of many, many digit numbers, and work with those. Working with lists of many character strings is not that different. I've even come to the conclusion (just my opinion) that mutable lists of strings probably would work better than a long mutable string of characters in most situations. What I've found is there seems to be an optimum string length depending on what you are doing.
Too long (hundreds or thousands of characters) and repeating some string operations (not just concatenations) can be slow (relative to short strings), and using many short (single character) strings would use more memory than is needed. So a list of medium length strings is actually a very nice compromise. I'm not sure what the optimal string length is, but lines of about 80 columns seem to work very well for most things. I think what may be missing is a larger set of higher level string functions that will work with lists of strings directly. Then lists of strings can be thought of as a mutable string type by its use, and then working with substrings in lists and using ''.join() will not seem as out of place. So maybe instead of splitting, modifying, then joining, (and again, etc ...), just pass the whole list around and have operations that work directly on the list of strings and return a list of strings as the result. Pretty much what the Patch does under the covers, but it only works with concatenation. Having more functions that work with lists of strings directly will reduce the need for concatenation as well. Some operations that could work well with whole lists of strings of lines may be indent_lines, dedent_lines, prepend_lines, wrap_lines, and of course join_lines as in '\n'.join(L), the inverse of s.splitlines(), and there are also readlines() and writelines(). Also possibly find_line or find_in_lines(). These really shouldn't seem any more out of place than numeric operations that work with lists such as sum, max, and min. So to me... "".join(L) as a string operation that works on a list of strings seems perfectly natural.
:-) Cheers, Ron From fredrik at pythonware.com Fri Oct 6 13:55:09 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 06 Oct 2006 13:55:09 +0200 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: <45263FE5.3070604@ronadam.com> References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> <45263FE5.3070604@ronadam.com> Message-ID: Ron Adam wrote: > I think what may be missing is a larger set of higher level string functions > that will work with lists of strings directly. Then lists of strings can be > thought of as a mutable string type by its use, and then working with substrings > in lists and using ''.join() will not seem as out of place. as important is the observation that you don't necessarily have to join string lists; if the data ends up being sent over a wire or written to disk, you might as well skip the join step, and work directly from the list. (it's no accident that ET has grown "tostringlist" and "fromstringlist" functions, for example ;-) From amk at amk.ca Fri Oct 6 15:40:58 2006 From: amk at amk.ca (A.M. Kuchling) Date: Fri, 6 Oct 2006 09:40:58 -0400 Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS? Message-ID: <20061006134058.GA16266@localhost.localdomain> I was looking at the logs for classobject.c and noticed this commit that adds Py_TPFLAGS_HAVE_WEAKREFS to the instance type. Should it be backported to 2.4? (It looks to me like it should, but I don't know anything about weakref implementation and want to get approval from someone who knows.) --amk r39038 | rhettinger | 2005-06-19 04:42:20 -0400 (Sun, 19 Jun 2005) | 2 lines Insert missing flag. 
------------------------------------------------------------------------ Index: classobject.c =================================================================== --- classobject.c (revision 39037) +++ classobject.c (revision 39038) @@ -2486,7 +2486,7 @@ (getattrofunc)instancemethod_getattro, /* tp_getattro */ PyObject_GenericSetAttr, /* tp_setattro */ 0, /* tp_as_buffer */ - Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,/* tp_flags */ + Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC | Py_TPFLAGS_HAVE_WEAKREFS, /* tp_flags */ instancemethod_doc, /* tp_doc */ (traverseproc)instancemethod_traverse, /* tp_traverse */ 0, /* tp_clear */ svn merge -r 39037:39038 svn+ssh://pythondev at svn.python.org/python/trunk From rhettinger at ewtllc.com Fri Oct 6 17:48:15 2006 From: rhettinger at ewtllc.com (Raymond Hettinger) Date: Fri, 6 Oct 2006 08:48:15 -0700 Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS? Message-ID: <34FE2A7A34BC3544BC3127D023DF3D1212873F@EWTEXCH.office.bhtrader.com> No need to backport. Py_TPFLAGS_DEFAULT implies Py_TPFLAGS_HAVE_WEAKREFS. The change was for clarity -- most things that have the weakref slots filled-in will also make the flag explicit -- that makes it easier on the brain when verifying code that checks the weakref flag. Raymond -----Original Message----- From: python-dev-bounces+rhettinger=ewtllc.com at python.org [mailto:python-dev-bounces+rhettinger=ewtllc.com at python.org] On Behalf Of A.M. Kuchling Sent: Friday, October 06, 2006 6:41 AM To: python-dev at python.org Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS? I was looking at the logs for classobject.c and noticed this commit that adds Py_TPFLAGS_HAVE_WEAKREFS to the instance type. Should it be backported to 2.4? (It looks to me like it should, but I don't know anything about weakref implementation and want to get approval from someone who knows.) --amk r39038 | rhettinger | 2005-06-19 04:42:20 -0400 (Sun, 19 Jun 2005) | 2 lines Insert missing flag. 
------------------------------------------------------------------------ Index: classobject.c =================================================================== --- classobject.c (revision 39037) +++ classobject.c (revision 39038) @@ -2486,7 +2486,7 @@ (getattrofunc)instancemethod_getattro, /* tp_getattro */ PyObject_GenericSetAttr, /* tp_setattro */ 0, /* tp_as_buffer */ - Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,/* tp_flags */ + Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC | Py_TPFLAGS_HAVE_WEAKREFS, /* tp_flags */ instancemethod_doc, /* tp_doc */ (traverseproc)instancemethod_traverse, /* tp_traverse */ 0, /* tp_clear */ svn merge -r 39037:39038 svn+ssh://pythondev at svn.python.org/python/trunk _______________________________________________ Python-Dev mailing list Python-Dev at python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/rhettinger%40ewtllc.com From jcarlson at uci.edu Fri Oct 6 18:03:29 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri, 06 Oct 2006 09:03:29 -0700 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: References: <45263FE5.3070604@ronadam.com> Message-ID: <20061006085913.0963.JCARLSON@uci.edu> Fredrik Lundh wrote: > > Ron Adam wrote: > > > I think what may be missing is a larger set of higher level string functions > > that will work with lists of strings directly. Then lists of strings can be > > thought of as a mutable string type by its use, and then working with substrings > > in lists and using ''.join() will not seem as out of place. > > as important is the observation that you don't necessarily have to join > string lists; if the data ends up being sent over a wire or written to > disk, you might as well skip the join step, and work directly from the list.
> > (it's no accident that ET has grown "tostringlist" and "fromstringlist" > functions, for example ;-) I've personally added a line-based abstraction with indent/dedent handling, etc., for the editor I use, which helps make macros and underlying editor functionality easier to write. - Josiah From amk at amk.ca Fri Oct 6 18:34:56 2006 From: amk at amk.ca (A.M. Kuchling) Date: Fri, 6 Oct 2006 12:34:56 -0400 Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS? In-Reply-To: <34FE2A7A34BC3544BC3127D023DF3D1212873F@EWTEXCH.office.bhtrader.com> References: <34FE2A7A34BC3544BC3127D023DF3D1212873F@EWTEXCH.office.bhtrader.com> Message-ID: <20061006163456.GA24036@rogue.amk.ca> On Fri, Oct 06, 2006 at 08:48:15AM -0700, Raymond Hettinger wrote: > The change was for clarity -- most things that have the weakref slots > filled-in will also make the flag explicit -- that makes it easier on > the brain when verifying code that checks the weakref flag. OK; I won't backport this. Thanks! --amk From bob at redivi.com Fri Oct 6 19:41:16 2006 From: bob at redivi.com (Bob Ippolito) Date: Fri, 6 Oct 2006 10:41:16 -0700 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> <45263FE5.3070604@ronadam.com> Message-ID: <6a36e7290610061041j6e501552kcb4f525668c96f55@mail.gmail.com> On 10/6/06, Fredrik Lundh wrote: > Ron Adam wrote: > > > I think what may be missing is a larger set of higher level string functions > > that will work with lists of strings directly. Then lists of strings can be > > thought of as a mutable string type by its use, and then working with substrings > > in lists and using ''.join() will not seem as out of place. 
> > as important is the observation that you don't necessarily have to join > string lists; if the data ends up being sent over a wire or written to > disk, you might as well skip the join step, and work directly from the list. > > (it's no accident that ET has grown "tostringlist" and "fromstringlist" > functions, for example ;-) The just make lists paradigm is used by Erlang too, it's called "iolist" there (it's not a type, just a convention). The lists can be nested though, so concatenating chunks of data for IO is always a constant time operation even if the chunks are already iolists. -bob From rrr at ronadam.com Fri Oct 6 21:53:01 2006 From: rrr at ronadam.com (Ron Adam) Date: Fri, 06 Oct 2006 14:53:01 -0500 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: <20061006085913.0963.JCARLSON@uci.edu> References: <45263FE5.3070604@ronadam.com> <20061006085913.0963.JCARLSON@uci.edu> Message-ID: <4526B41D.50805@ronadam.com> Josiah Carlson wrote: > Fredrik Lundh wrote: >> Ron Adam wrote: >> >>> I think what may be missing is a larger set of higher level string functions >>> that will work with lists of strings directly. Then lists of strings can be >>> thought of as a mutable string type by its use, and then working with substrings >>> in lists and using ''.join() will not seem as out of place. >> as important is the observation that you don't necessarily have to join >> string lists; if the data ends up being sent over a wire or written to >> disk, you might as well skip the join step, and work directly from the list. >> >> (it's no accident that ET has grown "tostringlist" and "fromstringlist" >> functions, for example ;-) > > I've personally added a line-based abstraction with indent/dedent > handling, etc., for the editor I use, which helps make macros and > underlying editor functionality easier to write. > > > - Josiah I've done the same thing just last week. 
I've started to collect them into a module called stringtools, but I see no reason why they can't reside in the string module. I think this may be just a case of collecting these types of routines together in one place so they can be reused easily because they already are scattered around Python's library in some form or another. Another tool I found tucked away is the console pager that is used in pydoc. I think it could easily be a separate module itself. And it benefits from the line-based abstraction as well. Cheers, Ron From nicko at nicko.org Sat Oct 7 04:21:08 2006 From: nicko at nicko.org (Nicko van Someren) Date: Sat, 7 Oct 2006 03:21:08 +0100 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: <45263FE5.3070604@ronadam.com> References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> <45263FE5.3070604@ronadam.com> Message-ID: On 6 Oct 2006, at 12:37, Ron Adam wrote: >>> I've never liked the "".join([]) idiom for string concatenation; >>> in my >>> opinion it violates the principles "Beautiful is better than >>> ugly." and >>> "There should be one-- and preferably only one --obvious way to >>> do it.". ... > Well I always like things to run faster, but I disagree that this > idiom is broken. > > I like using lists to store sub strings and I think it's just a > matter of > changing your frame of reference in how you think about them. I think that you've hit on exactly the reason why this patch is a good idea. You happen to like to store strings in lists, and in many situations this is a fine thing to do, but if one is forced to change one's frame of reference in order to get decent performance then as well as violating the maxims Larry originally cited you're also hitting both "readability counts" and "Correctness and clarity before speed."
The "".join(L) idiom is not "broken" in the sense that, to the fluent Python programmer, it does convey the intent as well as the action. That said, there are plenty of places that you'll see it not being used because it fails to convey the intent. It's pretty rare to see someone write: for k,v in d.items(): print " has value: ".join([k,v]) but, despite the utility of the % operator on strings it's pretty common to see: print k + " has value: " + v This patch _seems_ to be able to provide better performance for this sort of usage and provide a major speed-up for some other common usage forms without causing the programmer to resort making their code more complicated. The cost seems to be a small memory hit on the size of a string object, a tiny increase in code size and some well isolated, under-the-hood complexity. It's not like having this patch is going to force anyone to change the way they write their code. As far as I can tell it simply offers better performance if you choose to express your code in some common ways. If it speeds up pystone by 5.5% with such minimal down side I'm hard pressed to see a reason not to use it. Cheers, Nicko From kbk at shore.net Sat Oct 7 06:18:50 2006 From: kbk at shore.net (Kurt B. 
Kaiser) Date: Sat, 7 Oct 2006 00:18:50 -0400 (EDT) Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200610070418.k974IoNN008046@bayview.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 428 open ( +6) / 3417 closed ( +2) / 3845 total ( +8) Bugs : 939 open ( +6) / 6229 closed (+17) / 7168 total (+23) RFE : 240 open ( +3) / 239 closed ( +0) / 479 total ( +3) New / Reopened Patches ______________________ Speed up using + for string concatenation (2006-10-02) http://python.org/sf/1569040 opened by Larry Hastings Speed-up in array_repeat() (2006-10-02) http://python.org/sf/1569291 opened by Lars Skovlund Fix building the source within exec_prefix (2006-10-03) http://python.org/sf/1569798 opened by Matthias Klose distutils - python 2.5 vc8 - non working setup (2006-10-03) CLOSED http://python.org/sf/1570119 opened by Grzegorz Makarewicz Fix for compilation errors in the 2.4 branch (2006-10-03) CLOSED http://python.org/sf/1570253 opened by ?iga Seilnacht qtsupport.py mistake leads to bad _Qt module (2006-10-04) http://python.org/sf/1570672 opened by Jeff Senn Generate numeric/space/linebreak from Unicode database. 
(2006-10-05) http://python.org/sf/1571184 opened by Anders Chrigstr?m make trace.py --ignore-dir work (2006-10-05) http://python.org/sf/1571379 opened by Clinton Roy Patches Closed ______________ distutils - python 2.5 vc8 - non working setup (2006-10-03) http://python.org/sf/1570119 closed by loewis Fix for compilation errors in the 2.4 branch (2006-10-03) http://python.org/sf/1570253 closed by loewis New / Reopened Bugs ___________________ Test for uintptr_t seems to be incorrect (2006-10-01) CLOSED http://python.org/sf/1568842 opened by Ronald Oussoren http redirect does not pass 'post' data (2006-10-02) CLOSED http://python.org/sf/1568897 opened by hans_moleman 'all' documentation missing online (2006-09-26) CLOSED http://python.org/sf/1565797 reopened by aisaac0 Using .next() on file open in write mode writes junk to file (2006-10-01) http://python.org/sf/1569057 opened by andrei kulakov External codecs no longer usable (2006-10-02) CLOSED http://python.org/sf/1569084 opened by Ivan Vilata i Balaguer sys.settrace cause curried parms to show up as attributes (2006-10-02) http://python.org/sf/1569356 opened by applebucks sys.settrace cause curried parms to show up as attributes (2006-10-02) CLOSED http://python.org/sf/1569374 opened by applebucks PGIRelease linkage fails on pgodb80.dll (2006-10-02) http://python.org/sf/1569517 opened by Coatimundi Backward incompatibility in logging.py (2006-10-02) CLOSED http://python.org/sf/1569622 opened by Mike Klaas datetime.datetime subtraction bug (2006-10-02) CLOSED http://python.org/sf/1569623 opened by David Fugate mailbox.Maildir.get_folder() loses factory information (2006-10-03) http://python.org/sf/1569790 opened by Matthias Klose distutils don't respect standard env variables (2006-10-03) CLOSED http://python.org/sf/1569886 opened by Lukas Lalinsky 2.5 incorrectly permits break inside try statement (2006-10-04) CLOSED http://python.org/sf/1569998 opened by Nick Coghlan redirected cookies (2006-10-04) 
http://python.org/sf/1570255 opened by hans_moleman Launcher reset to factory button provides bad command-line (2006-10-03) http://python.org/sf/1570284 opened by jjackson 2.4 & 2.5 can't create win installer on linux (2006-10-04) http://python.org/sf/1570417 opened by Richard Jones _ssl module can't be built on windows (2006-10-05) CLOSED http://python.org/sf/1571023 opened by ?iga Seilnacht simple moves freeze IDLE (2006-10-04) http://python.org/sf/1571112 opened by Douglas W. Goodall Some numeric characters are still not recognized (2006-10-05) http://python.org/sf/1571170 opened by Anders Chrigstr?m round() producing -0.0 (2006-10-05) CLOSED http://python.org/sf/1571620 opened by Ron Frye Building using Sleepycat db 4.5.20 is broken (2006-10-05) http://python.org/sf/1571754 opened by Robert Scheck email module does not complay with RFC 2046: CRLF issue (2006-10-05) http://python.org/sf/1571841 opened by Andy Leszczynski .eml attachments in email (2006-10-06) http://python.org/sf/1572084 opened by rainwolf8472 parser stack overflow (2006-10-06) http://python.org/sf/1572320 opened by j?rgen urner csv "dialect = 'excel-tab'" to use excel_tab (2006-10-06) http://python.org/sf/1572471 opened by Dan Goldner Bugs Closed ___________ Test for uintptr_t seems to be incorrect (2006-10-01) http://python.org/sf/1568842 closed by loewis http redirect does not pass 'post' data (2006-10-01) http://python.org/sf/1568897 closed by loewis Spurious Tabnanny error (2006-09-21) http://python.org/sf/1562716 closed by kbk Spurious Tab/space error (2006-09-21) http://python.org/sf/1562719 closed by kbk plistlib should be moved out of plat-mac (2003-07-29) http://python.org/sf/779460 closed by gbrandl Pythonw doesn't get rebuilt if version number changes (2006-09-05) http://python.org/sf/1552935 closed by sf-robot 'all' documentation missing online (2006-09-26) http://python.org/sf/1565797 closed by loewis External codecs no longer usable (2006-10-02) http://python.org/sf/1569084 closed 
by lemburg sys.settrace cause curried parms to show up as attributes (2006-10-02) http://python.org/sf/1569374 closed by gbrandl Backward incompatibility in logging.py (2006-10-02) http://python.org/sf/1569622 closed by vsajip datetime.datetime subtraction bug (2006-10-02) http://python.org/sf/1569623 closed by tim_one Output of KlocWork on Python2.4.3 sources (2006-05-09) http://python.org/sf/1484556 closed by loewis possible bug in mystrtol.c with recent gcc (2006-07-13) http://python.org/sf/1521947 closed by arigo gcc trunk (4.2) exposes a signed integer overflows (2006-08-24) http://python.org/sf/1545668 closed by arigo distutils don't respect standard env variables (2006-10-03) http://python.org/sf/1569886 closed by loewis 2.5 incorrectly permits break inside try statement (2006-10-03) http://python.org/sf/1569998 closed by jhylton _ssl module can't be built on windows (2006-10-05) http://python.org/sf/1571023 closed by loewis round() producing -0.0 (2006-10-05) http://python.org/sf/1571620 closed by rhettinger New style classes and __hash__ (2002-12-30) http://python.org/sf/660098 closed by gvanrossum New / Reopened RFE __________________ Improvements to socket module exceptions (2006-10-06) http://python.org/sf/1571878 opened by GaryD help(x) for for keywords too (2006-10-06) http://python.org/sf/1572210 opened by Jim Jewett From rrr at ronadam.com Sat Oct 7 06:23:00 2006 From: rrr at ronadam.com (Ron Adam) Date: Fri, 06 Oct 2006 23:23:00 -0500 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> <45263FE5.3070604@ronadam.com> Message-ID: <45272BA4.2020208@ronadam.com> Nicko van Someren wrote: > On 6 Oct 2006, at 12:37, Ron Adam wrote: > >>>> I've never liked the "".join([]) idiom for string concatenation; in my >>>> opinion it violates the principles "Beautiful is better than ugly." 
and >>>> "There should be one-- and preferably only one --obvious way to do >>>> it.". > ... >> Well I always like things to run faster, but I disagree that this >> idiom is broken. >> >> I like using lists to store sub strings and I think it's just a matter of >> changing your frame of reference in how you think about them. > > I think that you've hit on exactly the reason why this patch is a good > idea. You happen to like to store strings in lists, and in many > situations this is a fine thing to do, but if one is forced to change > ones frame of reference in order to get decent performance then as well > as violating the maxims Larry originally cited you're also hitting both > "readability counts" and "Correctness and clarity before speed." The statement ".. if one is forced to change .." is a bit overstated I think. The situation is more a matter of increasing awareness so the frame of reference comes to mind more naturally and doesn't seem forced. And the suggestion of how to do that is by adding additional functions and methods that can use lists-of-strings instead of having to join or concatenate them first. Added examples and documentation can also do that as well. The two ideas are non-competing. They are related because they realize their benefits by reducing redundant underlying operations in a similar way. Cheers, Ron From jcarlson at uci.edu Sat Oct 7 09:51:23 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Sat, 07 Oct 2006 00:51:23 -0700 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: References: <45263FE5.3070604@ronadam.com> Message-ID: <20061007004620.0979.JCARLSON@uci.edu> Nicko van Someren wrote: > It's not like having this patch is going to force anyone to change > the way they write their code. As far as I can tell it simply offers > better performance if you choose to express your code in some common > ways. 
If it speeds up pystone by 5.5% with such minimal down side > I'm hard pressed to see a reason not to use it. This has to wait until Python 2.6 (which is anywhere from 14-24 months away, according to history); including it would destroy binary compatibility with modules compiled for 2.5, never mind that it is a nontrivial feature addition. I also think that the original author (or one of this patch's supporters) should write a PEP outlining the Python 2.5 and earlier drawbacks, what changes this implementation brings, its improvements, and any potential drawbacks. - Josiah From fredrik at pythonware.com Sat Oct 7 10:17:23 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat, 07 Oct 2006 10:17:23 +0200 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> <45263FE5.3070604@ronadam.com> Message-ID: Nicko van Someren wrote: > If it speeds up pystone by 5.5% with such minimal down side > I'm hard pressed to see a reason not to use it. can you tell me where exactly "pystone" does string concatenations? From skip at pobox.com Sat Oct 7 13:53:43 2006 From: skip at pobox.com (skip at pobox.com) Date: Sat, 7 Oct 2006 06:53:43 -0500 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> <45263FE5.3070604@ronadam.com> Message-ID: <17703.38215.196524.424167@montanaro.dyndns.org> Fredrik> Nicko van Someren wrote: >> If it speeds up pystone by 5.5% with such minimal down side I'm hard >> pressed to see a reason not to use it. Fredrik> can you tell me where exactly "pystone" does string Fredrik> concatenations? I wondered about that as well.
While I'm not prepared to assert without a doubt that pystone does no simpleminded string concatenation, a couple minutes scanning the pystone source didn't turn up any. If the pystone speedup isn't an artifact, the absence of string concatenation in pystone suggests it's happening somewhere in the interpreter. I applied the patch, ran the interpreter under gdb with a breakpoint set in string_concat where the PyStringConcatenationObject is created, then ran pystone. The first hit was in site.py -> distutils/util.py -> string.py All told, there were only 22 hits, none for very long strings, so that doesn't explain the performance improvement. BTW, on my Mac (OSX 10.4.8) max() is not defined. I had to add a macro definition to string_concat. Skip From g.brandl at gmx.net Sat Oct 7 14:01:50 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 07 Oct 2006 14:01:50 +0200 Subject: [Python-Dev] Security Advisory for unicode repr() bug? Message-ID: [ Bug http://python.org/sf/1541585 ] This seems to be handled like a security issue by linux distributors, it's also a news item on security related pages. Should a security advisory be written and official patches be provided? Georg From skip at pobox.com Sat Oct 7 14:16:37 2006 From: skip at pobox.com (skip at pobox.com) Date: Sat, 7 Oct 2006 07:16:37 -0500 Subject: [Python-Dev] Security Advisory for unicode repr() bug? In-Reply-To: References: Message-ID: <17703.39589.473518.217002@montanaro.dyndns.org> Georg> [ Bug http://python.org/sf/1541585 ] Georg> This seems to be handled like a security issue by linux Georg> distributors, it's also a news item on security related pages. Georg> Should a security advisory be written and official patches be Georg> provided? I asked about this a few weeks ago. I got no direct response. Secunia sent mail to webmaster and the SF project admins asking about how this could be exploited. (Isn't figuring that stuff out their job?)
This was corrected before 2.5 was released and the 2.4 source has (I think) already been patched, with 2.4.4 right around the corner. The bulk of the Python installations in the field are probably running on Windows (most of them provided by HP/Compaq), and it seems the Linux vendors are all over it. I don't know if Apple has picked up on it (or if the version they currently distribute is affected - 2.3.5 built Oct 5 2005). Would you provide a patch of some sort for Windows or just refer people to corrected installers? Given the apparently miserable results trying to get Windows users to install security fixes manually, I doubt a new 2.4.3 Windows installer would get much exercise. Skip From g.brandl at gmx.net Sat Oct 7 14:27:09 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 07 Oct 2006 14:27:09 +0200 Subject: [Python-Dev] Security Advisory for unicode repr() bug? In-Reply-To: <17703.39589.473518.217002@montanaro.dyndns.org> References: <17703.39589.473518.217002@montanaro.dyndns.org> Message-ID: <45279D1D.10102@gmx.net> skip at pobox.com wrote: > Georg> [ Bug http://python.org/sf/1541585 ] > > Georg> This seems to be handled like a security issue by linux > Georg> distributors, it's also a news item on security related pages. > > Georg> Should a security advisory be written and official patches be > Georg> provided? > > I asked about this a few weeks ago. I got no direct response. Secunia sent > mail to webmaster and the SF project admins asking about how this could be > exploited. (Isn't figuring that stuff out their job?) Perhaps, judging from the name :) > This was corrected before 2.5 was released and the 2.4 source has (I think) > already been patched, with 2.4.4 right around the corner. The bulk of the > Python installations in the field are probably running on Windows (most of > them provided by HP/Compaq), and it seems the Linux vendors are all over it. 
> I don't know if Apple has picked up on it (or if the version they currently > distribute is affected - 2.3.5 built Oct 5 2005). Would you provide a patch > of some sort for Windows or just refer people to corrected installers? > Given the apparently miserable results trying to get Windows users to > install security fixes manually, I doubt a new 2.4.3 Windows installer would > get much exercise. Even if the patch / corrected installer is used by only 1% of all installations, reacting quickly and providing it in the first place is going to make a much better impression than saying "well, nobody is going to apply it and the next release is due in a few weeks". [CC'ing security at python.org] Georg From mal at egenix.com Sat Oct 7 16:36:00 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Sat, 07 Oct 2006 16:36:00 +0200 Subject: [Python-Dev] Security Advisory for unicode repr() bug? In-Reply-To: <45279D1D.10102@gmx.net> References: <17703.39589.473518.217002@montanaro.dyndns.org> <45279D1D.10102@gmx.net> Message-ID: <4527BB50.3040503@egenix.com> Georg Brandl wrote: > skip at pobox.com wrote: >> Georg> [ Bug http://python.org/sf/1541585 ] >> >> Georg> This seems to be handled like a security issue by linux >> Georg> distributors, it's also a news item on security related pages. >> >> Georg> Should a security advisory be written and official patches be >> Georg> provided? >> >> I asked about this a few weeks ago. I got no direct response. Secunia sent >> mail to webmaster and the SF project admins asking about how this could be >> exploited. (Isn't figuring that stuff out their job?) > > Perhaps, judging from the name :) > >> This was corrected before 2.5 was released and the 2.4 source has (I think) >> already been patched, with 2.4.4 right around the corner. The bulk of the >> Python installations in the field are probably running on Windows (most of >> them provided by HP/Compaq), and it seems the Linux vendors are all over it. 
>> I don't know if Apple has picked up on it (or if the version they currently >> distribute is affected - 2.3.5 built Oct 5 2005). Would you provide a patch >> of some sort for Windows or just refer people to corrected installers? >> Given the apparently miserable results trying to get Windows users to >> install security fixes manually, I doubt a new 2.4.3 Windows installer would >> get much exercise. > > Even if the patch / corrected installer is used by only 1% of all installations, > reacting quickly and providing it in the first place is going to make a much > better impression than saying "well, nobody is going to apply it and the next > release is due in a few weeks". Note that the bug refers to a UCS4 Python build. Most Linux distros ship UCS4 builds nowadays, so they care. The Windows builds are UCS2 (except maybe the ones for Win64 - don't know) which doesn't seem to be affected. +1 on publishing the patch for 2.4. It's always better to react quickly in such cases, even if it just gives users a fuzzy warm feeling of being cared for :-) Whether such patches get installed or not is not really a question to ask, since it's not within our responsibility. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 07 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From nnorwitz at gmail.com Sat Oct 7 22:33:44 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Sat, 7 Oct 2006 13:33:44 -0700 Subject: [Python-Dev] Security Advisory for unicode repr() bug? 
In-Reply-To: <17703.39589.473518.217002@montanaro.dyndns.org> References: <17703.39589.473518.217002@montanaro.dyndns.org> Message-ID: On 10/7/06, skip at pobox.com wrote: > > Georg> [ Bug http://python.org/sf/1541585 ] > > Georg> This seems to be handled like a security issue by linux > Georg> distributors, it's also a news item on security related pages. > > Georg> Should a security advisory be written and official patches be > Georg> provided? > > I asked about this a few weeks ago. I got no direct response. Secunia sent > mail to webmaster and the SF project admins asking about how this could be > exploited. (Isn't figuring that stuff out their job?) FWIW, I responded to the original mail from Secunia with what little I know about the problem. Everyone on the original mail was copied. However, I got ~30 bounces for all the Source Forge addresses due to some issue between SF and Google mail. n From talin at acm.org Sun Oct 8 00:10:58 2006 From: talin at acm.org (Talin) Date: Sat, 07 Oct 2006 15:10:58 -0700 Subject: [Python-Dev] Python Doc problems In-Reply-To: <17693.6670.189595.646482@montanaro.dyndns.org> References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org> <20060928095951.08BF.JCARLSON@uci.edu> <17692.28063.224114.905464@uwakimon.sk.tsukuba.ac.jp> <20060929121035.GA4884@localhost.localdomain> <17693.6670.189595.646482@montanaro.dyndns.org> Message-ID: <452825F2.4000803@acm.org> skip at pobox.com wrote: > Andrew> In such autogenerated documentation, you wind up with a list of > Andrew> every single class and function, and both trivial and important > Andrew> classes are given exactly the same emphasis. > > I find this true where I work as well. Doxygen is used as a documentation > generation tool for our C++ class libraries. Too many people use that as a > crutch to often avoid writing documentation altogether. 
It's worse in many > ways than tools like epydoc, because you don't need to write any docstrings > (or specially formatted comments) to generate reams and reams of virtual > paper. This sort of documentation is all but useless for a Python > programmer like myself. I don't really need to know the five syntactic > constructor variants. I need to know how to use the classes which have been > exposed to me. As someone who has submitted patches to Doxygen (and actually had them accepted), I have to say that I agree as well. At my work, it used to be standard practice for each project to have a web site of "documentation" that was generated by Doxygen. Part of the reason for my patches (which added support for parsing of C# doctags) was in support of this effort. However, I gradually realized that there's no actual use-case for Doxygen-generated docs in our environment. Think about the work cycle of a typical C++ programmer. Generally when you need to look up something in the docs for a module, you either need specific information on the type of a variable or params of a function, or you need "overview" docs that explain the general theory of the module. Bear in mind also that the typical C++ programmer is working inside of an IDE or other smart editor. Most such editors have a simple one-keystroke method of navigating from a symbol to its definition. In other words, it is *far* easier for a programmer to jump directly to the actual declaration in a header file - and its accompanying documentation comments - than it is to switch over to a web browser, navigate to the documentation site, type in the name of the symbol, hit search...why would I *ever* use HTML reference documentation when I can just look at the source, which is much easier to get to? Especially since the source often tells me much more than the docs would. 
The only reason for generated reference docs is when you are working on a module where you don't have the source code - which, even in a proprietary environment, is something to be avoided whenever possible. (The source may not be 'open', but that doesn't mean that *you* can't have access to it.) If you have the source - and a good indexing system in your IDE - there's really no need for Doxygen. Of course, the web-based docs are useful when you need an overview - but Doxygen doesn't give you that. As a result, I have been trying to get people to stop using Doxygen as a "crutch" as you say - in other words, if a team has the responsibility to write docs for their code, they can't just run Doxygen over the source and call it done. (Too bad there's no way to automatically generate the overview! :) While I am in rant mode (sorry), I also want to mention that most Documentation markup systems also have a source readability cost - i.e having embedded tags like @param make the original source less readable; and given what I said above about the source being the primary reference doc, it doesn't make sense to clutter up the code with funny @#$ characters. If I was going to use any markup system in the future, the first thing I would insist is that the markup be "invisible" - in other words, the markup should look just like normal comments, and the markup scanner should be smart enough to pick out the structure without needing a lot of hand-holding. For example: /* Plot a point at position x, y. 'x' - The x-coordinate. 'y' - The y-coordinate. */ void Plot( int x, int y ); The scanner should note that: 'x' and 'y' are in single-quotes, so they probably refer to code identifiers. The scanner can see that they are both parameters to the function, so there's no need to tell it that 'x' is an @param. In other words, the programmer should never have to type anything that can be deduced from looking at the code itself. 
And the reader shouldn't have to read a bunch of redundant information which they can easily see for themselves. > I guess this is a long-winded way of saying, "me too". > > Skip ditto. -- Talin From nicko at nicko.org Sun Oct 8 01:01:11 2006 From: nicko at nicko.org (Nicko van Someren) Date: Sun, 8 Oct 2006 00:01:11 +0100 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> <45263FE5.3070604@ronadam.com> Message-ID: On 7 Oct 2006, at 09:17, Fredrik Lundh wrote: > Nicko van Someren wrote: > >> If it speeds up pystone by 5.5% with such minimal down side >> I'm hard pressed to see a reason not to use it. > > can you tell me where exactly "pystone" does string concatenations? No, not without more in depth examination, but it is a pretty common operation in all sorts of cases including inside the interpreter. Larry's message in reply to Gregory Smith's request for a pystone score showed a 5.5% improvement and as yet I have no reason to doubt it. If the patch provides a measurable performance improvement for code that merely happens to use strings as opposed to being explicitly heavy on string addition then all the better. It's clear that this needs to be more carefully measured before it goes in (which is why that quote above starts "If"). As I've mentioned before in this thread, getting good performance measures on code that does lazy evaluation is often tricky. pystone is a good place to start but I'm sure that there are use cases that it does not cover. As for counting up the downsides, Josiah Carlson rightly points out that it breaks binary compatibility for modules, so the change can not be taken lightly and clearly it will have to wait for a major release. Still, if the benefits outweigh the costs it seems worth doing. 
Cheers, Nicko From fredrik at pythonware.com Sun Oct 8 08:38:31 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sun, 08 Oct 2006 08:38:31 +0200 Subject: [Python-Dev] Python Doc problems In-Reply-To: <452825F2.4000803@acm.org> References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org> <20060928095951.08BF.JCARLSON@uci.edu> <17692.28063.224114.905464@uwakimon.sk.tsukuba.ac.jp> <20060929121035.GA4884@localhost.localdomain> <17693.6670.189595.646482@montanaro.dyndns.org> <452825F2.4000803@acm.org> Message-ID: Talin wrote: > /* > Plot a point at position x, y. > 'x' - The x-coordinate. > 'y' - The y-coordinate. > */ > void Plot( int x, int y ); > > The scanner should note that: 'x' and 'y' are in single-quotes, so they > probably refer to code identifiers. or maybe they're string literals? > The scanner can see that they are > both parameters to the function, so there's no need to tell it that 'x' > is an @param. PythonDoc provides multiple parameter markers, so you can distinguish between positional parameters and keyword arguments. > In other words, the programmer should never have to type anything that > can be deduced from looking at the code itself. And the reader shouldn't > have to read a bunch of redundant information which they can easily see > for themselves. that's exactly why you need parameter markers in today's Python: Python's function definition syntax doesn't allow the programmer to fully communicate the intent behind the design. (what's this post doing on python-dev, btw? should this discussion take place on the doc-sig?) From alastair at alastairs-place.net Sun Oct 8 14:20:59 2006 From: alastair at alastairs-place.net (Alastair Houghton) Date: Sun, 8 Oct 2006 13:20:59 +0100 Subject: [Python-Dev] Security Advisory for unicode repr() bug? In-Reply-To: <4527BB50.3040503@egenix.com> References: <17703.39589.473518.217002@montanaro.dyndns.org> <45279D1D.10102@gmx.net> <4527BB50.3040503@egenix.com> Message-ID: On Oct 7, 2006, at 3:36 PM, M.-A. 
Lemburg wrote: > Georg Brandl wrote: >> skip at pobox.com wrote: >>> I don't know if Apple has picked up on it (or if the version they >>> currently >>> distribute is affected - 2.3.5 built Oct 5 2005). > Note that the bug refers to a UCS4 Python build. Most Linux > distros ship UCS4 builds nowadays, so they care. The Windows > builds are UCS2 (except maybe the ones for Win64 - don't know) > which doesn't seem to be affected. AFAIK the version Apple ship is a UCS2 build, therefore not affected. Kind regards, Alastair. -- http://alastairs-place.net From skip at pobox.com Sun Oct 8 18:07:16 2006 From: skip at pobox.com (skip at pobox.com) Date: Sun, 8 Oct 2006 11:07:16 -0500 Subject: [Python-Dev] Can't check in on release25-maint branch Message-ID: <17705.8756.16093.615833@montanaro.dyndns.org> (I sent a note to pydotorg yesterday but got no response. Trying here.) I checked in a change to Doc/lib/libcsv.tex on the trunk yesterday, then tried backporting it to the release25-maint branch but failed due to permission problems. Thinking it might be lock contention, I waited a few minutes and tried a couple more times. Same result. I just tried again: subversion/libsvn_client/commit.c:832: (apr_err=13) svn: Commit failed (details follow): subversion/libsvn_ra_dav/util.c:368: (apr_err=13) svn: Can't create directory '/data/repos/projects/db/transactions/52226-1.txn': Permission denied subversion/clients/cmdline/util.c:380: (apr_err=13) svn: Your commit message was left in a temporary file: subversion/clients/cmdline/util.c:380: (apr_err=13) svn: '/Users/skip/src/python-svn/release25-maint/Doc/lib/svn-commit.4.tmp' Here's my svn status output: Path: . 
URL: http://svn.python.org/projects/python/branches/release25-maint Repository UUID: 6015fed2-1504-0410-9fe1-9d1591cc4771 Revision: 52226 Node Kind: directory Schedule: normal Last Changed Author: hyeshik.chang Last Changed Rev: 52225 Last Changed Date: 2006-10-08 09:01:45 -0500 (Sun, 08 Oct 2006) Properties Last Updated: 2006-08-17 11:05:19 -0500 (Thu, 17 Aug 2006) I believe I've got the right thing checked out. Can someone look into this? Thanks, Skip From gerrit at nl.linux.org Fri Oct 6 14:35:21 2006 From: gerrit at nl.linux.org (Gerrit Holl) Date: Fri, 6 Oct 2006 14:35:21 +0200 Subject: [Python-Dev] what's really new in python 2.5 ? In-Reply-To: <20061003180848.GB31361@localhost.localdomain> References: <20061003143015.GA25511@localhost.localdomain> <200610031039.52434.fdrake@acm.org> <20061003180848.GB31361@localhost.localdomain> Message-ID: <20061006123521.GA30474@topjaklont.student.utwente.nl> On 2006-10-03 20:10:14 +0200, A.M. Kuchling wrote: > I've added a robots.txt to keep crawlers out of /dev/. Isn't there a lot of useful, search-engine worthy stuff in /dev? I search for peps with google, and I suppose the 'explanation' section, as well as the developer faq and subversion instructions, are good pages that deserve to be in the google index. Should /dev really be Disallow:'ed entirely in robots.txt? kind regards, Gerrit Holl. From okuda1 at llnl.gov Fri Oct 6 17:06:12 2006 From: okuda1 at llnl.gov (Chuzo Okuda) Date: Fri, 06 Oct 2006 08:06:12 -0700 Subject: [Python-Dev] PSF Infrastructure Committee's recommendation for anew issue tracker Message-ID: <452670E4.3050201@llnl.gov> I am willing to volunteer. I emailed previously, but it bounced back. Hope this time it reaches you. 
Chuzo From okuda1 at llnl.gov Sat Oct 7 00:58:26 2006 From: okuda1 at llnl.gov (Chuzo Okuda) Date: Fri, 06 Oct 2006 15:58:26 -0700 Subject: [Python-Dev] PSF Infrastructure Committee's recommendation for anew issue tracker Message-ID: <4526DF92.5080800@llnl.gov> I received the bounced email as follow. How do I become a member? Thank you Chuzo Your mail to 'Python-Dev' with the subject [Python-Dev] PSF Infrastructure Committee's recommendation for anew issue tracker Is being held until the list moderator can review it for approval. The reason it is being held: Post by non-member to a members-only list From bsittler at gmail.com Sun Oct 8 01:57:04 2006 From: bsittler at gmail.com (Benjamin C. Wiley Sittler) Date: Sat, 07 Oct 2006 16:57:04 -0700 Subject: [Python-Dev] Security Advisory for unicode repr() bug? Message-ID: <1160265424.5695.37.camel@localhost.localdomain> (i'm not on python-dev, so i dunno whether this will make it through...) basically, this bug does not affect the vast majority (mac and windows users with UTF-16 "narrow" unicode Python builds) because the unpatched code allocates sufficient memory in this case. only the minority treating this as a serious vulnerability (linux users with UTF-32 "wide" unicode Python builds, possibly some other Unix-like operating systems too) are affected by the buffer overrun. as for secunia, they need to do their own homework ;) i found this bug and wrote the patch that's been applied by the linux distros, so i thought i should clear up a couple of apparent misconceptions. please pardon me if i'm writing stuff you already know... the bug concerns allocation in repr() for unicode objects. 
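[An aside for readers of the archive: the narrow/wide build distinction that the explanation below turns on can be checked from Python itself via sys.maxunicode. A minimal sketch, written for a modern Python 3 so it runs today — note that since Python 3.3 (PEP 393) the distinction is gone and every build reports the full code point range:]

```python
import sys

# sys.maxunicode distinguishes the two build flavors discussed here:
# 65535 (0xFFFF) on UTF-16 "narrow" builds, 1114111 (0x10FFFF) on
# UCS-4 "wide" builds. Since Python 3.3 (PEP 393) all builds report
# 0x10FFFF, so on a current interpreter this always prints "wide".
if sys.maxunicode == 0xFFFF:
    flavor = "narrow (UTF-16; non-BMP chars stored as surrogate pairs)"
else:
    flavor = "wide (one code unit per code point)"

print("this interpreter is a %s build" % flavor)

# On a narrow build a non-BMP character occupied two string elements
# (a surrogate pair); on a wide build it occupies one.
print(len("\U00010000"))
```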
previously repr() always allocated 6 bytes in the output buffer per input unicode string element; this is enough for the six-byte "\uffff" notation and on UTF-16 python builds enough for the ten-byte "\U0010ffff" notation, since on UTF-16 python builds the input unicode string contains a surrogate pair (two consecutive elements) to represent unicode characters requiring this longer notation, meaning five bytes per element. however on UTF-32 builds ten bytes per unicode string element are needed, and this is what the patch accomplishes. the previous (incorrect) algorithm extended the buffer by 100 bytes in some cases when encountering such a character, however this fixed-size heuristic extension fails when the string contains many subsequent characters in the six-byte "\uffff" form, as demonstrated by this test which will fail in an unpatched non-debug wide python build: python2.4 -c 'assert(repr(u"\U00010000" * 39 + u"\uffff" * 4096)) == (repr(u"\U00010000" * 39 + u"\uffff" * 4096))' yes, a sufficiently motivated person could probably discover enough about the memory layout of a process to use this for data or code injection, but the more usual (and sometimes accidental) consequence is a crash. more background: python comes in two flavors, UTF-16 ("narrow") and UTF-32 ("wide"), depending on how the unicode chars are represented. This is generally configured to match the C library's wchar_t. UTF-16: Windows (at least 32-bit builds), Mac OS X (at least 32-bit builds), probably others too -- this uses a 16-bit variable-length encoding for Unicode characters: 1 16-bit word for U+0000 ... U+FFFF (identity mapped to 0x0000 ... 0xffff resp., a.k.a. the "UCS-2" range or Basic Multilingual Plane) and 2 16-bit words for U+00010000 ... U+0010FFFF (mapped as "surrogate pairs" to 0xd800; 0xdc00 ... 0xdbff; 0xdfff resp., corresponding to planes 1 through 16.) UTF-32/UCS-4: Linux, possibly others?
-- this uses 1 32-bit word per unicode character: 1 word for all codepoints allowed by Python U +0000 ... U+0010FFFF (identity mapped to 0x00000000L ... 0x0010ffffL resp.) > On 10/7/06, skip[at]pobox.com wrote: > > > > Georg> [ Bug http://python.org/sf/1541585 ] > > > > Georg> This seems to be handled like a security issue by linux > > Georg> distributors, it's also a news item on security related > pages. > > > > Georg> Should a security advisory be written and official patches > be > > Georg> provided? > > > > I asked about this a few weeks ago. I got no direct response. > Secunia sent > > mail to webmaster and the SF project admins asking about how this > could be > > exploited. (Isn't figuring that stuff out their job?) > > FWIW, I responded to the original mail from Secunia with what little > I > know about the problem. Everyone on the original mail was copied. > However, I got ~30 bounces for all the Source Forge addresses due to > some issue between SF and Google mail. > > n From g.brandl at gmx.net Sun Oct 8 18:16:43 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 08 Oct 2006 18:16:43 +0200 Subject: [Python-Dev] Can't check in on release25-maint branch In-Reply-To: <17705.8756.16093.615833@montanaro.dyndns.org> References: <17705.8756.16093.615833@montanaro.dyndns.org> Message-ID: skip at pobox.com wrote: > (I sent a note to pydotorg yesterday but got no response. Trying here.) > > I checked in a change to Doc/lib/libcsv.tex on the trunk yesterday, then > tried backporting it to the release25-maint branch but failed due to > permission problems. Thinking it might be lock contention, I waited a few > minutes and tried a couple more times. Same result. 
I just tried again: > > subversion/libsvn_client/commit.c:832: (apr_err=13) > svn: Commit failed (details follow): > subversion/libsvn_ra_dav/util.c:368: (apr_err=13) > svn: Can't create directory '/data/repos/projects/db/transactions/52226-1.txn': Permission denied > subversion/clients/cmdline/util.c:380: (apr_err=13) > svn: Your commit message was left in a temporary file: > subversion/clients/cmdline/util.c:380: (apr_err=13) > svn: '/Users/skip/src/python-svn/release25-maint/Doc/lib/svn-commit.4.tmp' > > Here's my svn status output: > > Path: . > URL: http://svn.python.org/projects/python/branches/release25-maint > Repository UUID: 6015fed2-1504-0410-9fe1-9d1591cc4771 > Revision: 52226 > Node Kind: directory > Schedule: normal > Last Changed Author: hyeshik.chang > Last Changed Rev: 52225 > Last Changed Date: 2006-10-08 09:01:45 -0500 (Sun, 08 Oct 2006) > Properties Last Updated: 2006-08-17 11:05:19 -0500 (Thu, 17 Aug 2006) > > I believe I've got the right thing checked out. It looks like you checked out from http://..., IIRC that's read-only. svn+ssh://pythondev at svn.python.org/python/... might work better. Georg From g.brandl at gmx.net Sun Oct 8 18:27:54 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 08 Oct 2006 18:27:54 +0200 Subject: [Python-Dev] what's really new in python 2.5 ? In-Reply-To: <20061006123521.GA30474@topjaklont.student.utwente.nl> References: <20061003143015.GA25511@localhost.localdomain> <200610031039.52434.fdrake@acm.org> <20061003180848.GB31361@localhost.localdomain> <20061006123521.GA30474@topjaklont.student.utwente.nl> Message-ID: Gerrit Holl wrote: > On 2006-10-03 20:10:14 +0200, A.M. Kuchling wrote: >> I've added a robots.txt to keep crawlers out of /dev/. > > Isn't there a lot of useful, search-engine worthy stuff in /dev? > I search for peps with google, and I suppose the 'explanation' section, > as well as the developer faq and subversion instructions, are good pages > that deserve to be in the google index. 
Should /dev really be > Disallow:'ed entirely in robots.txt? I think that refers to docs.python.org/dev. Georg From tim.peters at gmail.com Sun Oct 8 19:01:55 2006 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 8 Oct 2006 13:01:55 -0400 Subject: [Python-Dev] Can't check in on release25-maint branch In-Reply-To: References: <17705.8756.16093.615833@montanaro.dyndns.org> Message-ID: <1f7befae0610081001h7896d648n426dad287130cb44@mail.gmail.com> [Skip] > I checked in a change to Doc/lib/libcsv.tex on the trunk yesterday, then > tried backporting it to the release25-maint branch but failed due to > permission problems. Thinking it might be lock contention, I waited a few > minutes and tried a couple more times. Same result. I just tried again: ... > Here's my svn status output: > > Path: . > URL: http://svn.python.org/projects/python/branches/release25-maint As Georg said, looks like you did a read-only checkout. It /may/ (can't recall for sure, but think so) get you unstuck to do: svn switch --relocate \ http://svn.python.org/projects/python/branches/release25-maint \ svn+ssh://svn.python.org/python/branches/release25-maint from your checkout directory. If that works, it will go fast; if not, start over with an svn+ssh checkout. From fdrake at acm.org Sun Oct 8 19:34:08 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Sun, 8 Oct 2006 13:34:08 -0400 Subject: [Python-Dev] what's really new in python 2.5 ? In-Reply-To: <20061006123521.GA30474@topjaklont.student.utwente.nl> References: <20061003180848.GB31361@localhost.localdomain> <20061006123521.GA30474@topjaklont.student.utwente.nl> Message-ID: <200610081334.09333.fdrake@acm.org> On Friday 06 October 2006 08:35, Gerrit Holl wrote: > Isn't there a lot of useful, search-engine worthy stuff in /dev? > I search for peps with google, and I suppose the 'explanation' section, > as well as the developer faq and subversion instructions, are good pages > that deserve to be in the google index. 
Should /dev really be > Disallow:'ed entirely in robots.txt? As Georg noted, we've been discussing docs.python.org/dev/, which contains nightly builds of the documentation on a couple of branches. The material at www.python.org/dev/ is generally interesting, as you note, and remains open to crawlers. -Fred -- Fred L. Drake, Jr. From aahz at pythoncraft.com Sun Oct 8 19:48:10 2006 From: aahz at pythoncraft.com (Aahz) Date: Sun, 8 Oct 2006 10:48:10 -0700 Subject: [Python-Dev] PSF Infrastructure Committee's recommendation for anew issue tracker In-Reply-To: <4526DF92.5080800@llnl.gov> References: <4526DF92.5080800@llnl.gov> Message-ID: <20061008174810.GA16606@panix.com> On Fri, Oct 06, 2006, Chuzo Okuda wrote: > > I received the bounced email as follow. How do I become a member? Subscribe to the list. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "If you don't know what your program is supposed to do, you'd better not start writing it." --Dijkstra From skip at pobox.com Sun Oct 8 19:53:01 2006 From: skip at pobox.com (skip at pobox.com) Date: Sun, 8 Oct 2006 12:53:01 -0500 Subject: [Python-Dev] Can't check in on release25-maint branch In-Reply-To: <1f7befae0610081001h7896d648n426dad287130cb44@mail.gmail.com> References: <17705.8756.16093.615833@montanaro.dyndns.org> <1f7befae0610081001h7896d648n426dad287130cb44@mail.gmail.com> Message-ID: <17705.15101.350439.510383@montanaro.dyndns.org> Tim> As Georg said, looks like you did a read-only checkout. Thanks Georg & Tim. That was indeed the problem. I don't know why I've had such a hard time wrapping my head around Subversion. 
Skip From tim.peters at gmail.com Sun Oct 8 20:07:18 2006 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 8 Oct 2006 14:07:18 -0400 Subject: [Python-Dev] Can't check in on release25-maint branch In-Reply-To: <17705.15101.350439.510383@montanaro.dyndns.org> References: <17705.8756.16093.615833@montanaro.dyndns.org> <1f7befae0610081001h7896d648n426dad287130cb44@mail.gmail.com> <17705.15101.350439.510383@montanaro.dyndns.org> Message-ID: <1f7befae0610081107i56dfb731u529d699e257d409f@mail.gmail.com> [Skip] > Thanks Georg & Tim. That was indeed the problem. I don't know why I've had > such a hard time wrapping my head around Subversion. I have a theory about that: it's software <0.5 wink>. If it's any consolation, at the NFS sprint earlier this year, I totally blanked out on how to do a merge using SVN, despite that I've merged hundreds of times when working on ZODB's seemingly infinite collection of active branches. Luckily, I was only trying to help someone else do a merge at the time, so it frustrated them more than me ;-) From fredrik at pythonware.com Sun Oct 8 20:16:02 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sun, 08 Oct 2006 20:16:02 +0200 Subject: [Python-Dev] PSF Infrastructure Committee's recommendation for anew issue tracker In-Reply-To: <4526DF92.5080800@llnl.gov> References: <4526DF92.5080800@llnl.gov> Message-ID: Chuzo Okuda wrote: > I received the bounced email as follow. How do I become a member? the moderator has approved your message, and it has reached the right persons. I'm sure they'll get back to you soon. From brett at python.org Sun Oct 8 22:11:55 2006 From: brett at python.org (Brett Cannon) Date: Sun, 8 Oct 2006 13:11:55 -0700 Subject: [Python-Dev] PSF Infrastructure Committee's recommendation for anew issue tracker In-Reply-To: <452670E4.3050201@llnl.gov> References: <452670E4.3050201@llnl.gov> Message-ID: The email didn't bounce; it was just held for moderator approval (and it made it through). 
Just sit tight and we will be getting back to all of the volunteers in the near future (probably next week, no later than after this upcoming week). -Brett On 10/6/06, Chuzo Okuda wrote: > > I am willing to volunteer. I emailed previously, but it bounced back. > Hope this time it reaches you. > Chuzo > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061008/26213996/attachment.htm From larry at hastings.org Mon Oct 9 07:47:58 2006 From: larry at hastings.org (Larry Hastings) Date: Sun, 08 Oct 2006 22:47:58 -0700 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> <17702.13238.684094.6289@montanaro.dyndns.org> Message-ID: <4529E28E.3070800@hastings.org> Fredrik Lundh wrote: > skip at pobox.com wrote: > >> MAL's pybench would probably be better for this presuming it does some >> addition with string operands. >> > or stringbench. > I ran 'em, and they are strangely consistent with pystone. With concat, stringbench is ever-so-slightly faster overall. "172.82" vs "174.85" for the "ascii" column, I guess that's in seconds. I'm just happy it's not slower. (I only ran stringbench once; it seems to take *forever*). I ran pybench three times for each build. The slowest concat overall time was still 2.9% faster than the fastest release time. "ConcatStrings" is a big winner, at around 150% faster; since the test doesn't *do* anything with the concatenated values, it never renders the concatenation objects, so it does a lot less work. 
"CreateStringsWithConcat" is generally 18-19% faster, as expected. After that, the timings are all over the place, but some tests were consistently faster: "CompareInternedStrings" was 8-12% faster, "DictWithFloatKeys" was 9-11% faster, "SmallLists" was 8-15% faster, "CompareLongs" was 6-10% faster, and "PyMethodCalls" was 4-6% faster. (These are all comparing the "average run-time" results, though the "minimum run-time" results were similar.) I still couldn't tell you why my results are faster. I swear on my mother's eyes I didn't touch anything major involved in "DictWithFloatKeys", "SmallLists", or "CompareLongs". I didn't touch the compiler settings, so that shouldn't be it. I acknowledge not only that it could all be a mistake, but also that I don't know enough about it to speculate. The speedup mystery continues, *larry* From ironfroggy at gmail.com Mon Oct 9 09:24:28 2006 From: ironfroggy at gmail.com (Calvin Spealman) Date: Mon, 9 Oct 2006 03:24:28 -0400 Subject: [Python-Dev] if __debug__: except Exception, e: pdb.set_trace() Message-ID: <76fd5acf0610090024u1caa7868ka336f1456faee93e@mail.gmail.com> I know I can not do this, but what are the chances of changing the rules so that we can? Basically, since the if __debug__: lines are processed before runtime, would it be possible to allow them to be used to control the inclusion or omission of entire blocks (except, else, elif, etc.) with them being included as if they were at the same level as the 'if __debug__:' above them? I want to allow this: try: foo() if __debug__: except Exception, e: import pdb pdb.set_trace() So that when __debug__ is false, the except block doesn't even exist at all. -- Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://ironfroggy-code.blogspot.com/ From jcarlson at uci.edu Mon Oct 9 09:45:39 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Mon, 09 Oct 2006 00:45:39 -0700 Subject: [Python-Dev] if __debug__: except Exception, e: pdb.set_trace() In-Reply-To: <76fd5acf0610090024u1caa7868ka336f1456faee93e@mail.gmail.com> References: <76fd5acf0610090024u1caa7868ka336f1456faee93e@mail.gmail.com> Message-ID: <20061009003949.0982.JCARLSON@uci.edu> "Calvin Spealman" wrote: > > I know I can not do this, but what are the chances of changing the > rules so that we can? Basically, since the if __debug__: lines are > processed before runtime, would it be possible to allow them to be > used to control the inclusion or omission of entire blocks (except, > else, elif, etc.) with them being included as if they were at the same > level as the 'if __debug__:' above them? I would say very low. try/except/finally, if/elif/else, for/else, while/else, etc., pairings of statements historically have only been grouped together when they share indent levels. If one makes two statements that don't share indent levels paired in this way, then what is stopping us from doing the following monstrosity? if ...: ... if __debug__: elif ...: ... Remember, Special cases aren't special enough to break the rules. This would be a bad special case that doesn't generalize in a satisfactory manner. > I want to allow this: > > try: > foo() > if __debug__: > except Exception, e: > import pdb > pdb.set_trace() > > So that when __debug__ is false, the except block doesn't even exist at all. And if the except clause doesn't exist at all, then unless you are following it with the finally clause of a 2.5+ unified try/except/finally, it is a syntax error. Regardless, it would be easier to read to have the following... try: foo() except Exception, e: if __debug__: import pdb pdb.set_trace() else: raise - Josiah From mal at egenix.com Mon Oct 9 11:30:25 2006 From: mal at egenix.com (M.-A.
Lemburg) Date: Mon, 09 Oct 2006 11:30:25 +0200 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: <4529E28E.3070800@hastings.org> References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> <17702.13238.684094.6289@montanaro.dyndns.org> <4529E28E.3070800@hastings.org> Message-ID: <452A16B1.9070109@egenix.com> Larry Hastings wrote: > Fredrik Lundh wrote: >> skip at pobox.com wrote: >> >>> MAL's pybench would probably be better for this presuming it does some >>> addition with string operands. >>> >> or stringbench. >> > > I ran 'em, and they are strangely consistent with pystone. > > With concat, stringbench is ever-so-slightly faster overall. "172.82" > vs "174.85" for the "ascii" column, I guess that's in seconds. I'm just > happy it's not slower. (I only ran stringbench once; it seems to take > *forever*). > > I ran pybench three times for each build. The slowest concat overall > time was still 2.9% faster than the fastest release time. > "ConcatStrings" is a big winner, at around 150% faster; since the test > doesn't *do* anything with the concatenated values, it never renders the > concatenation objects, so it does a lot less work. > "CreateStringsWithConcat" is generally 18-19% faster, as expected. > After that, the timings are all over the place, but some tests were > consistently faster: "CompareInternedStrings" was 8-12% faster, > "DictWithFloatKeys" was 9-11% faster, "SmallLists" was 8-15% faster, > "CompareLongs" was 6-10% faster, and "PyMethodCalls" was 4-6% faster. > (These are all comparing the "average run-time" results, though the > "minimum run-time" results were similar.) When comparing results, please look at the minimum runtime. The average times are just given to indicate how much the mintime differs from the average of all runs. > I still couldn't tell you why my results are faster. 
I swear on my > mother's eyes I didn't touch anything major involved in > "DictWithFloatKeys", "SmallLists", or "CompareLongs". I didn't touch > the compiler settings, so that shouldn't be it. I acknowledge not only > that it could all be a mistake, and that I don't know enough about it to > speculate.// Depending on what you changed, it is possible that the layout of the code in memory better fits your CPU architecture. If however the speedups are not consistent across several runs of pybench, then it's likely that you have some background activity going on on the machine which causes a slowdown in the unmodified run you chose as basis for the comparison. Just to make sure: you are using pybench 2.0, right ? -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 09 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From kristjan at ccpgames.com Mon Oct 9 11:55:00 2006 From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_V=2E_J=F3nsson?=) Date: Mon, 9 Oct 2006 09:55:00 -0000 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom Message-ID: <129CEF95A523704B9D46959C922A280002FE99D7@nemesis.central.ccp.cc> This patch looks really nice to use here at CCP. Our code is full of string contcatenations so I will probably try to apply the patch soon and see what it gives us in a real life app. The floating point integer cache was also a big win. Soon, standard python won't be able to keep up with the patched versions out there :) Oh, and since I have fixed the pcbuild8 thingy in the 2.5 branch, why don't you give the PGO version a whirl too? 
Even the non-PGO dll, with link-time code generation, should be faster than your vanilla PCBuild one. Read the Readme.txt for details. Cheers, Kristj?n > -----Original Message----- > From: python-dev-bounces+kristjan=ccpgames.com at python.org > [mailto:python-dev-bounces+kristjan=ccpgames.com at python.org] > On Behalf Of M.-A. Lemburg > Sent: 9. okt?ber 2006 09:30 > To: Larry Hastings > Cc: python-dev at python.org > Subject: Re: [Python-Dev] PATCH submitted: Speed up + for > string concatenation, now as fast as "".join(x) idiom > > Larry Hastings wrote: > > Fredrik Lundh wrote: > >> skip at pobox.com wrote: > >> > >>> MAL's pybench would probably be better for this presuming it does > >>> some addition with string operands. > >>> > >> or stringbench. > >> > > > > I ran 'em, and they are strangely consistent with pystone. > > > > With concat, stringbench is ever-so-slightly faster > overall. "172.82" > > vs "174.85" for the "ascii" column, I guess that's in seconds. I'm > > just happy it's not slower. (I only ran stringbench once; > it seems to > > take *forever*). > > > > I ran pybench three times for each build. The slowest > concat overall > > time was still 2.9% faster than the fastest release time. > > "ConcatStrings" is a big winner, at around 150% faster; > since the test > > doesn't *do* anything with the concatenated values, it > never renders > > the concatenation objects, so it does a lot less work. > > "CreateStringsWithConcat" is generally 18-19% faster, as expected. > > After that, the timings are all over the place, but some tests were > > consistently faster: "CompareInternedStrings" was 8-12% faster, > > "DictWithFloatKeys" was 9-11% faster, "SmallLists" was > 8-15% faster, > > "CompareLongs" was 6-10% faster, and "PyMethodCalls" was > 4-6% faster. > > (These are all comparing the "average run-time" results, though the > > "minimum run-time" results were similar.) > > When comparing results, please look at the minimum runtime. 
> The average times are just given to indicate how much the > mintime differs from the average of all runs. > > > I still couldn't tell you why my results are faster. I swear on my > > mother's eyes I didn't touch anything major involved in > > "DictWithFloatKeys", "SmallLists", or "CompareLongs". I > didn't touch > > the compiler settings, so that shouldn't be it. I acknowledge not > > only that it could all be a mistake, and that I don't know enough > > about it to speculate.// > > Depending on what you changed, it is possible that the layout > of the code in memory better fits your CPU architecture. > > If however the speedups are not consistent across several > runs of pybench, then it's likely that you have some > background activity going on on the machine which causes a > slowdown in the unmodified run you chose as basis for the comparison. > > Just to make sure: you are using pybench 2.0, right ? > > -- > Marc-Andre Lemburg > eGenix.com > > Professional Python Services directly from the Source (#1, > Oct 09 2006) > >>> Python/Zope Consulting and Support ... > http://www.egenix.com/ > >>> mxODBC.Zope.Database.Adapter ... > http://zope.egenix.com/ > >>> mxODBC, mxDateTime, mxTextTools ... > http://python.egenix.com/ > ______________________________________________________________ > __________ > > ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for > free ! :::: > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/kristjan%40c cpgames.com > From kristjan at ccpgames.com Mon Oct 9 12:07:30 2006 From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_V=2E_J=F3nsson?=) Date: Mon, 9 Oct 2006 10:07:30 -0000 Subject: [Python-Dev] 2.5, 64 bit Message-ID: <129CEF95A523704B9D46959C922A280002FE99D8@nemesis.central.ccp.cc> the VisualStudio8 64 bit build of 2.5 doesn't compile clean. 
We have a number of warnings of truncation from 64 bit to 32: Often it is a question of doing an explicit cast, but sometimes we are using "int" for results from strlen and such. Is there any interest in fixing this up? Cheers, Kristján -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061009/ccded2cb/attachment.htm From g.brandl at gmx.net Mon Oct 9 12:27:31 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 09 Oct 2006 12:27:31 +0200 Subject: [Python-Dev] Security Advisory for unicode repr() bug? In-Reply-To: References: Message-ID: <452A2413.1060708@gmx.net> Georg Brandl wrote: > [ Bug http://python.org/sf/1541585 ] > > This seems to be handled like a security issue by linux distributors, > it's also a news item on security related pages. > > Should a security advisory be written and official patches be > provided? May I ask again whether this is handled by the PSRT at all? Georg From tlesher at gmail.com Mon Oct 9 16:52:44 2006 From: tlesher at gmail.com (Tim Lesher) Date: Mon, 9 Oct 2006 10:52:44 -0400 Subject: [Python-Dev] Iterating over marshal/pickle Message-ID: <9613db600610090752w1641d5a4o6881dd038befdd7@mail.gmail.com> Both marshal and pickle allow multiple objects to be serialized to the same file-like object. The pattern for deserializing an unknown number of serialized objects looks like this: objs = [] while True: try: objs.append(marshal.load(fobj)) # or objs.append(unpickler.load()) except EOFError: break This seems like a good use case for a generator: def some_name(fobj): while True: try: yield marshal.load(fobj) # or yield unpickler.load() except EOFError: raise StopIteration 1. Does this seem like a reasonable addition to the standard library? 2. Where should it go, and what should it be called?
From an end-user point of view, this "feels" right: import pickle u = pickle.Unpickler(open('picklefile')) for x in u: print x import marshal for x in marshal.unmarshalled(open('marshalfile')): print x But I'm not hung up on the actual names or the use of sequence semantics in the Unpickler case. Incidentally, I know that pickle is preferred over marshal, but some third-party tools (like the Perforce client) still use the marshal library for serialization, so I've included it in the discussion. -- Tim Lesher From fredrik at pythonware.com Mon Oct 9 17:28:24 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 09 Oct 2006 17:28:24 +0200 Subject: [Python-Dev] Iterating over marshal/pickle In-Reply-To: <9613db600610090752w1641d5a4o6881dd038befdd7@mail.gmail.com> References: <9613db600610090752w1641d5a4o6881dd038befdd7@mail.gmail.com> Message-ID: Tim Lesher wrote: > 1. Does this seem like a reasonable addition to the standard library? I cannot remember ever doing this, or seeing anyone except Perforce doing this, and it'll only save you a few lines of code every other year or so, so my answer is definitely no. (if you're serious about P4 integration, you probably don't want to use Python's marshal.load to deal with the P4 output either; the marshalling code has had a tendency to crash Python when it sees malformed or prematurely terminated output). > Incidentally, I know that pickle is preferred over marshal, but some > third-party tools (like the Perforce client) still use the marshal > library for serialization, so I've included it in the discussion Perforce is the only 3rd party component I'm aware of that uses a standard Python serialization format in this way. As the x windows people have observed, the only thing worse than generalizing from one example is generalizing from no examples at all..
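For reference, the generator pattern proposed earlier in the thread is only a few lines when written out in full. The sketch below is illustrative rather than a proposed stdlib API (the name iter_loads is invented here, and it uses modern bytes-based spellings); it relies on the fact that both pickle.load and marshal.load raise EOFError once the stream is exhausted:

```python
import io
import pickle

def iter_loads(fobj, load=pickle.load):
    """Yield successive objects from a stream containing multiple
    serialized objects; works with pickle.load or marshal.load."""
    while True:
        try:
            yield load(fobj)
        except EOFError:
            return  # end of stream ends the iteration

# Serialize a few objects back-to-back, then iterate over them.
buf = io.BytesIO()
for obj in (1, "two", [3.0]):
    pickle.dump(obj, buf)
buf.seek(0)
print(list(iter_loads(buf)))  # [1, 'two', [3.0]]
```

The same loop handles the marshal case by passing load=marshal.load, which covers the Perforce-style usage mentioned above.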
From barry at python.org Mon Oct 9 18:01:40 2006 From: barry at python.org (Barry Warsaw) Date: Mon, 9 Oct 2006 12:01:40 -0400 Subject: [Python-Dev] Iterating over marshal/pickle In-Reply-To: References: <9613db600610090752w1641d5a4o6881dd038befdd7@mail.gmail.com> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Oct 9, 2006, at 11:28 AM, Fredrik Lundh wrote: >> 1. Does this seem like a reasonable addition to the standard library? > > I cannot remember ever doing this, or seeing anyone except Perforce > doing this, and it'll only save you a few lines of code every other > year > or so, so my answer is definitely no. FWIW, Mailman uses pickle files with multiple pickled objects in them, to implement its queue files. We first dump the Message object, followed by a dictionary of metadata. OTOH, I know there's only two objects in the pickle, so I don't have to iterate over it; I just load the message and then load the dictionary. - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRSpyaHEjvBPtnXfVAQLqcgP/VqKqwZfReaQyRGP2DG61978CmbqLOvSY nsXP/AE88VvO+IHYajfNuJt/okmTIfHTl9Jcx77YzxZ9ErtpKWbmrX6zo7OkaZPv 5aYQ7zsYwJocL5u6nFqXAs+9zvUOXLvwhKFDc5K/rp4cb02QAYOgn5gpRirJNSAm ESMiMNRmdQ8= =3Ih4 -----END PGP SIGNATURE----- From faassen at infrae.com Mon Oct 9 18:57:41 2006 From: faassen at infrae.com (Martijn Faassen) Date: Mon, 09 Oct 2006 18:57:41 +0200 Subject: [Python-Dev] Security Advisory for unicode repr() bug? In-Reply-To: <452A2413.1060708@gmx.net> References: <452A2413.1060708@gmx.net> Message-ID: <452A7F85.30503@infrae.com> Georg Brandl wrote: > Georg Brandl wrote: >> [ Bug http://python.org/sf/1541585 ] >> >> This seems to be handled like a security issue by linux distributors, >> it's also a news item on security related pages. >> >> Should a security advisory be written and official patches be >> provided? > > May I ask again whether this is handled by the PSRT at all? 
I agree that having an official Python security advisory would be good to see. I was also assuming automatically that fixed versions of Python 2.4 and Python 2.3 would be released. It's a serious issue for web-facing Python applications that handle unicode strings. Regards, Martijn From martin at v.loewis.de Mon Oct 9 19:53:23 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 09 Oct 2006 19:53:23 +0200 Subject: [Python-Dev] 2.5, 64 bit In-Reply-To: <129CEF95A523704B9D46959C922A280002FE99D8@nemesis.central.ccp.cc> References: <129CEF95A523704B9D46959C922A280002FE99D8@nemesis.central.ccp.cc> Message-ID: <452A8C93.7080901@v.loewis.de> Kristj?n V. J?nsson schrieb: > the VisualStudio8 64 bit build of 2.5 doesn't compile clean. We have a > number of warnings of truncation from 64 bit to 32: > Often it is a question of doing an explicit cast, but sometimes we are > using "int" for results from strlen and such. > > Is there any interest in fixing this up? Yes; I had fixed many of them already for the Python 2.5 release (there were *way* more of these before I started). Notice that many of them are bogus. For example, if I do strlen on a buffer that is known to have MAXPATH bytes, the strlen result *can't* exceed an int. So care is necessary for each case: - if there is a conceivable case where it can overflow (i.e. if you could come up with a Python program that makes it overflow), fix the types appropriately - if it is certain through inspection that it can't overflow, add a cast (Py_SAFE_DOWNCAST, or, when it is really obvious, a plain cast), and a comment on why the cast is correct. Notice that Py_SAFE_DOWNCAST has an assertion in debug mode. Also notice that it evaluates it argument twice. - if it shouldn't overflow as long as extension modules play by the rules, it's your choice of either adding a runtime error, or just widening the representation. 
IIRC, the biggest chunk of "real" work left is SRE: this can realistically overflow when it operates on large strings. You have to really understand SRE before fixing it. For example, I believe that large strings might have impacts on compilation, too (e.g. if the regex itself is >2GiB, or some repetition count is >2**31). In these cases, it might be saner to guarantee an exception (and document the limitation) than to try expanding the SRE bytecode. Another set of remaining changes deals with limitations on byte code and reflection. For example, there is currently a limit on the number of local variables imposed by the Python bytecode. From this limit, it follows that certain casts are correct. One should document each limit first, and then refer to these limits when adding casts. Helping here would be definitely appreciated. Regards, Martin From mal at egenix.com Mon Oct 9 21:48:04 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 09 Oct 2006 21:48:04 +0200 Subject: [Python-Dev] Iterating over marshal/pickle In-Reply-To: References: <9613db600610090752w1641d5a4o6881dd038befdd7@mail.gmail.com> Message-ID: <452AA774.7040105@egenix.com> Fredrik Lundh wrote: > Tim Lesher wrote: > >> 1. Does this seem like a reasonable addition to the standard library? > > I cannot remember ever doing this, or seeing anyone except Perforce > doing this, and it'll only save you a few lines of code every other year > or so, so my answer is definitely no. > > (if you're serious about P4 integration, you probably don't want to use > Python's marshal.load to deal with the P4 output either; the marshalling > code has had a tendency to crash Python when it sees malformed or pre- > maturely terminated output). 
> >> Incidentally, I know that pickle is preferred over marshal, but some >> third-party tools (like the Perforce client) still use the marshal >> library for serialization, so I've included it in the discussion > > Perforce is the only 3rd party component I'm aware of that uses a > standard Python serialization format in this way. > > As the x windows people have observed, the only thing worse than > generalizing from one example is generalizing from no examples at > all.. FWIW, we've been and are using this quite a lot for dumping database content to a backup file. It's a simple and terse format, preserves full data precision and doesn't cause problems when moving between platforms. That makes two use cases and I'm sure there are more ;-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 09 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From tim.peters at gmail.com Mon Oct 9 22:44:38 2006 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 9 Oct 2006 16:44:38 -0400 Subject: [Python-Dev] 2.4 vs Windows vs bsddb Message-ID: <1f7befae0610091344j1639c21qc77b0ebf9c3df1b6@mail.gmail.com> I just noticed that the bsddb portion of Python fails to compile on the 2.4 Windows buildbots, but for some reason the buildbot machinery doesn't notice the failure: """ Compiling... _bsddb.c Linking... 
Creating library .\./_bsddb_d.lib and object .\./_bsddb_d.exp _bsddb.obj : warning LNK4217: locally defined symbol _malloc imported in function __db_associateCallback _bsddb.obj : warning LNK4217: locally defined symbol _free imported in function __DB_consume _bsddb.obj : warning LNK4217: locally defined symbol _fclose imported in function _DB_verify _bsddb.obj : warning LNK4217: locally defined symbol _fopen imported in function _DB_verify _bsddb.obj : warning LNK4217: locally defined symbol _strncpy imported in function _init_pybsddb _bsddb.obj : error LNK2019: unresolved external symbol __imp__strncat referenced in function _makeDBError _bsddb.obj : error LNK2019: unresolved external symbol __imp___assert referenced in function _makeDBError ./_bsddb_d.pyd : fatal error LNK1120: 2 unresolved externals ... _bsddb - 3 error(s), 5 warning(s) Build: 15 succeeded, 1 failed, 0 skipped """ The warnings there are old news, but no idea when the errors started. The test suite doesn't care that bsddb is missing either, just ending with: 1 skip unexpected on win32: test_bsddb Same kind of things when building from my 2.4 checkout. No clues. From docwhat+list.python-dev at gerf.org Mon Oct 9 22:44:39 2006 From: docwhat+list.python-dev at gerf.org (The Doctor What) Date: Mon, 09 Oct 2006 16:44:39 -0400 Subject: [Python-Dev] BUG (urllib2) Authentication request header is broken on long usernames and passwords Message-ID: <452AB4B7.4030103@gerf.org> I found a bug in urllib2's handling of basic HTTP authentication. urllib2 uses the base64.encodestring() method to encode the username:password. The problem is that base64.encodestring() adds newlines to wrap the encoded characters at the 76th column. 
This produces bogus request headers like this: ---------->8---------cut---------8<---------------- GET /some/url HTTP/1.1 Host: some.host Accept-Encoding: identity Authorization: Basic cmVhbGx5bG9uZ3VzZXJuYW1lOmFuZXZlbmxvbmdlcnBhc3N3b3JkdGhhdGdvZXNvbmFuZG9uYW5k b25hbmRvbmFuZG9u User-agent: some-agent ---------->8---------cut---------8<---------------- This can be worked around by forcing the base64.MAXBINSIZE to something huge, but really it should be something passed into base64.encodestring(). # echo example of it wrapping... # python -c 'import base64; print base64.encodestring("f"*58)' # echo example of forcing it not to wrap... # python -c 'import base64; base64.MAXBINSIZE=1000000; print base64.encodestring("f"*58)' Symptoms of this bug are receiving HTTP 400 responses from the remote server, spurious authentication errors, or various parts of the header "vanishing" (because of the double newline). Thanks! -- ** Ridiculous Quotes ** "I want to say this about my state: When Strom Thurmond ran for president, we voted for him. We're proud of it. And if the rest of the country had followed our lead, we wouldn't have had all these problems over all these years, either." -- Senate Minority Leader Trent Lott (R-MS), praising Strom Thurmond's segregationist presidential campaign [12/5/02] The Doctor What: Second Baseman http://docwhat.gerf.org/ docwhat *at* gerf *dot* org KF6VNC From aahz at pythoncraft.com Mon Oct 9 23:35:23 2006 From: aahz at pythoncraft.com (Aahz) Date: Mon, 9 Oct 2006 14:35:23 -0700 Subject: [Python-Dev] BUG (urllib2) Authentication request header is broken on long usernames and passwords In-Reply-To: <452AB4B7.4030103@gerf.org> References: <452AB4B7.4030103@gerf.org> Message-ID: <20061009213523.GA27418@panix.com> On Mon, Oct 09, 2006, The Doctor What wrote: > > I found a bug in urllib2's handling of basic HTTP authentication. Please submit your bug to SourceForge, then (optional) post the bug number back here.
See http://www.python.org/dev/faq/#bugs -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "If you don't know what your program is supposed to do, you'd better not start writing it." --Dijkstra From scott+python-dev at scottdial.com Mon Oct 9 23:44:51 2006 From: scott+python-dev at scottdial.com (Scott Dial) Date: Mon, 09 Oct 2006 17:44:51 -0400 Subject: [Python-Dev] BUG (urllib2) Authentication request header is broken on long usernames and passwords In-Reply-To: <452AB4B7.4030103@gerf.org> References: <452AB4B7.4030103@gerf.org> Message-ID: <452AC2D3.40200@scottdial.com> The Doctor What wrote: > The problem is that base64.encodestring() adds newlines to wrap the > encoded characters at the 76th column. > The encodestring is following RFC 1521, which specifies: The output stream (encoded bytes) must be represented in lines of no more than 76 characters each. All line breaks or other characters not found in Table 1 must be ignored by decoding software. In retrospect, perhaps "{de|en}codestring" was a poor name choice. urllib2 should be calling b64encode directly. I have submitted a patch to the tracker: [ 1574068 ] urllib2 - Fix line breaks in authorization headers.
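The wrapping behaviour at issue is easy to reproduce. The sketch below uses the modern spellings (the encodestring of this era survives as base64.encodebytes in Python 3); the 58-byte payload matches the earlier report, and is just long enough to push the encoded form past the 76-character MIME line limit:

```python
import base64

payload = b"f" * 58  # encodes to 80 base64 characters, past the 76-column limit

# The MIME-oriented encoder wraps at column 76 and appends a trailing
# newline, per RFC 1521 -- this is what corrupted the Authorization header.
wrapped = base64.encodebytes(payload)

# The plain encoder emits one unbroken line, which is what an HTTP
# header needs.
flat = base64.b64encode(payload)

print(b"\n" in wrapped)  # True
print(b"\n" in flat)     # False
```

Calling b64encode directly, as the tracker patch does, sidesteps the wrapping entirely while producing the same encoding.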
-- Scott Dial scott at scottdial.com scodial at cs.indiana.edu From martin at v.loewis.de Tue Oct 10 00:31:34 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 10 Oct 2006 00:31:34 +0200 Subject: [Python-Dev] 2.4 vs Windows vs bsddb In-Reply-To: <1f7befae0610091344j1639c21qc77b0ebf9c3df1b6@mail.gmail.com> References: <1f7befae0610091344j1639c21qc77b0ebf9c3df1b6@mail.gmail.com> Message-ID: <452ACDC6.90204@v.loewis.de> Tim Peters schrieb: > I just noticed that the bsddb portion of Python fails to compile on > the 2.4 Windows buildbots, but for some reason the buildbot machinery > doesn't notice the failure: It's been a while that a failure to build some extension modules doesn't cause the "compile" step to fail; this just happened with the _ssl.pyd module before. I'm not sure how build.bat communicates an error, or whether devenv.com fails in some way when some build step fails. Revision 43156 may contribute here, which adds additional commands into build.bat after devenv.com is invoked. Regards, Martin From tim.peters at gmail.com Tue Oct 10 01:26:41 2006 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 9 Oct 2006 19:26:41 -0400 Subject: [Python-Dev] 2.4 vs Windows vs bsddb In-Reply-To: <452ACDC6.90204@v.loewis.de> References: <1f7befae0610091344j1639c21qc77b0ebf9c3df1b6@mail.gmail.com> <452ACDC6.90204@v.loewis.de> Message-ID: <1f7befae0610091626l18ace335y98052cb9ee481843@mail.gmail.com> [Tim Peters] >> I just noticed that the bsddb portion of Python fails to compile on >> the 2.4 Windows buildbots, but for some reason the buildbot machinery >> doesn't notice the failure: [Martin v. L?wis] > It's been a while that a failure to build some extension modules doesn't > cause the "compile" step to fail; this just happened with the _ssl.pyd > module before. I'm guessing only on the release24-maint branch? > I'm not sure how build.bat communicates an error, or whether devenv.com > fails in some way when some build step fails. 
> > Revision 43156 may contribute here, which adds additional commands > into build.bat after devenv.com is invoked. More guessing: devenv gives a non-zero exit code when it fails, and a .bat script passes on the exit code of the last command it executes. True or false, after making changes based on those guesses, the 2.4 Windows buildbots now say they fail the compile step. It was my fault to begin with (boo! /bad/ Timmy), but should have been unique to the 24 branch (2.5 and trunk fetch Unicode test files all by themselves). From tim.peters at gmail.com Tue Oct 10 02:11:59 2006 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 9 Oct 2006 20:11:59 -0400 Subject: [Python-Dev] 2.4 vs Windows vs bsddb In-Reply-To: <1f7befae0610091344j1639c21qc77b0ebf9c3df1b6@mail.gmail.com> References: <1f7befae0610091344j1639c21qc77b0ebf9c3df1b6@mail.gmail.com> Message-ID: <1f7befae0610091711j5e03c527w7ee0119f7175cf78@mail.gmail.com> [Tim] > I just noticed that the bsddb portion of Python fails to compile on > the 2.4 Windows buildbots, but for some reason the buildbot machinery > doesn't notice the failure: But it does now. This is the revision that broke the Windows build: """ r52170 | andrew.kuchling | 2006-10-05 14:49:36 -0400 (Thu, 05 Oct 2006) | 12 lines [Backport r50783 | neal.norwitz. The bytes_left code is complicated, but looks correct on a casual inspection and hasn't been modified in the trunk. Does anyone want to review further?] Ensure we don't write beyond errText. I think I got this right, but it definitely could use some review to ensure I'm not off by one and there's no possible overflow/wrap-around of bytes_left. Reported by Klocwork #1. Fix a problem if there is a failure allocating self->db. Found with failmalloc. """ It introduces uses of assert() and strncat(), and the linker can't resolve them. 
I suppose that's because the Windows link step for the _bsddb subproject explicitly excludes msvcrt (in the release build) and msvcrtd (in the debug build), but I don't know why that's done. OTOH, we got a lot more errors (about duplicate code definitions) if the standard MS libraries aren't explicitly excluded, so that's no fix. From jjl at pobox.com Tue Oct 10 02:14:51 2006 From: jjl at pobox.com (John J Lee) Date: Tue, 10 Oct 2006 00:14:51 +0000 (UTC) Subject: [Python-Dev] BUG (urllib2) Authentication request header is broken on long usernames and passwords In-Reply-To: <452AC2D3.40200@scottdial.com> References: <452AB4B7.4030103@gerf.org> <452AC2D3.40200@scottdial.com> Message-ID: On Mon, 9 Oct 2006, Scott Dial wrote: [...] > In retrospect, perhaps "{de|en}codestring" was a poor name choice. > urllib2 should be calling b64encode directly. > > I have submitted a patch to the tracker: [ 1574068 ] urllib2 - Fix line > breaks in authorization headers. urllib should also be fixed in the same way (assuming your fix is correct), since urllib also uses base64.{de,en}codestring(). John From martin at v.loewis.de Tue Oct 10 08:31:20 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 10 Oct 2006 08:31:20 +0200 Subject: [Python-Dev] 2.4 vs Windows vs bsddb In-Reply-To: <1f7befae0610091626l18ace335y98052cb9ee481843@mail.gmail.com> References: <1f7befae0610091344j1639c21qc77b0ebf9c3df1b6@mail.gmail.com> <452ACDC6.90204@v.loewis.de> <1f7befae0610091626l18ace335y98052cb9ee481843@mail.gmail.com> Message-ID: <452B3E38.7020801@v.loewis.de> Tim Peters schrieb: > [Martin v. Löwis] >> It's been a while that a failure to build some extension modules doesn't >> cause the "compile" step to fail; this just happened with the _ssl.pyd >> module before. > > I'm guessing only on the release24-maint branch? Yes. I backported some change which broke the build (not so on my own installation for a strange reason), and the buildbot didn't complain, either. 
I was surprised to see a bug report on SF that it wouldn't build. > More guessing: devenv gives a non-zero exit code when it fails, and a > .bat script passes on the exit code of the last command it executes. That's my theory also. Thanks for fixing it, Martin From ncoghlan at gmail.com Tue Oct 10 11:32:43 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 10 Oct 2006 19:32:43 +1000 Subject: [Python-Dev] [Python-3000] Sky pie: a "var" keyword In-Reply-To: <452B66C8.6020703@gmail.com> References: <452A027B.8060009@cs.byu.edu> <452B66C8.6020703@gmail.com> Message-ID: <452B68BB.4070005@gmail.com> Nick Coghlan wrote: > Any proposal such as this also needs to addresses all of the *other* name > binding statements in Python: > > try/except > for loop > with statement > def statement > class statement I forgot the import statement (especially the * version) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From fredrik at pythonware.com Tue Oct 10 12:00:57 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 10 Oct 2006 12:00:57 +0200 Subject: [Python-Dev] [Python-3000] Sky pie: a "var" keyword References: <452A027B.8060009@cs.byu.edu> <452B66C8.6020703@gmail.com> <452B68BB.4070005@gmail.com> Message-ID: > I forgot the import statement (especially the * version) not only that, you also forgot what mailing list you were posting to... From ndbecker2 at gmail.com Tue Oct 10 13:53:16 2006 From: ndbecker2 at gmail.com (Neal Becker) Date: Tue, 10 Oct 2006 07:53:16 -0400 Subject: [Python-Dev] Proprietary code in python? 
Message-ID: http://www.google.com/codesearch?q=+doc:DxlBcBw4TXo+proprietary+confidential+show:DxlBcBw4TXo:BwgQSUaGDCc:1s0hP8rbIGE&sa=N&cd=1&ct=ri&cs_p=http://www.python.org/download/releases/binaries-1.3/python-IRIX-5.3-full.tar.gz&cs_f=lib/python/irix5/AWARE.py#a0 From ncoghlan at gmail.com Tue Oct 10 14:06:24 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 10 Oct 2006 22:06:24 +1000 Subject: [Python-Dev] Proprietary code in python? In-Reply-To: References: Message-ID: <452B8CC0.3030206@gmail.com> Neal Becker wrote: > http://www.google.com/codesearch?q=+doc:DxlBcBw4TXo+proprietary+confidential+show:DxlBcBw4TXo:BwgQSUaGDCc:1s0hP8rbIGE&sa=N&cd=1&ct=ri&cs_p=http://www.python.org/download/releases/binaries-1.3/python-IRIX-5.3-full.tar.gz&cs_f=lib/python/irix5/AWARE.py#a0 That file isn't there any more [1] The file appears to have been removed with the change in license for Python 2.0 (the last tag I can find containing that file is related 1.5.2). (Note that the linked version is Python 1.3) Cheers, Nick. [1] http://svn.python.org/view/python/trunk/Lib/plat-irix5/ -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From fredrik at pythonware.com Tue Oct 10 15:17:25 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 10 Oct 2006 15:17:25 +0200 Subject: [Python-Dev] Proprietary code in python? References: Message-ID: "Neal Becker" wrote: > http://www.google.com/codesearch?q=+doc:DxlBcBw4TXo+proprietary+confidential+show:DxlBcBw4TXo:BwgQSUaGDCc:1s0hP8rbIGE&sa=N&cd=1&ct=ri&cs_p=http://www.python.org/download/releases/binaries-1.3/python-IRIX-5.3-full.tar.gz&cs_f=lib/python/irix5/AWARE.py#a0 that's most likely code that's been automatically generated from corresponding header files in IRIX. in most jurisdictions, laws about corporate secrets doesn't apply to things that you've intentionally published. 
(their file is still copyrighted, but I'm not sure to what extent you can use copyright to protect a few integers). From r.m.oudkerk at googlemail.com Mon Oct 9 13:59:06 2006 From: r.m.oudkerk at googlemail.com (Richard Oudkerk) Date: Mon, 9 Oct 2006 12:59:06 +0100 Subject: [Python-Dev] Cloning threading.py using proccesses Message-ID: I am not sure how sensible the idea is, but I have had a first stab at writing a module processing.py which is a near clone of threading.py but uses processes and sockets for communication. (It is one way of avoiding the GIL.) I have tested it on unix and windows and it seem to work pretty well. (Getting round the lack of os.fork on windows is a bit awkward.) There is also another module dummy_processing.py which has the same api but is just a wrapper round threading.py. Queues, Locks, RLocks, Conditions, Semaphores and some other shared objects are implemented. People are welcome to try out the tests in test_processing.py contained in the zipfile. More information is included in the README file. As a quick example, the code . from processing import Process, Queue, ObjectManager . . def f(token): . q = proxy(token) . for i in range(10): . q.put(i*i) . q.put('STOP') . . if __name__ == '__main__': . manager = ObjectManager() . token = manager.new(Queue) . queue = proxy(token) . . t = Process(target=f, args=[token]) . t.start() . . result = None . while result != 'STOP': . result = queue.get() . print result . . t.join() is not very different from the normal threaded equivalent . from threading import Thread . from Queue import Queue . . def f(q): . for i in range(10): . q.put(i*i) . q.put('STOP') . . if __name__ == '__main__': . queue = Queue() . . t = Thread(target=f, args=[queue]) . t.start() . . result = None . while result != 'STOP': . result = queue.get() . print result . . t.join() Richard -------------- next part -------------- A non-text attachment was scrubbed... 
Name: processing.zip Type: application/zip Size: 16648 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061009/e707ab8d/attachment-0001.zip From docwhat+list.python-dev at gerf.org Tue Oct 10 16:18:39 2006 From: docwhat+list.python-dev at gerf.org (The Doctor What) Date: Tue, 10 Oct 2006 10:18:39 -0400 Subject: [Python-Dev] BUG (urllib2) Authentication request header is broken on long usernames and passwords In-Reply-To: <20061009213523.GA27418@panix.com> References: <452AB4B7.4030103@gerf.org> <20061009213523.GA27418@panix.com> Message-ID: <452BABBF.2050403@gerf.org> Aahz wrote: > On Mon, Oct 09, 2006, The Doctor What wrote: >> I found a bug in urllib2's handling of basic HTTP authentication. > > Please submit your bug to SourceForge, then (optional) post the bug > number back here. > > See http://www.python.org/dev/faq/#bugs Thank you! I couldn't find the bug system for python (never had to submit a bug before) and was looking all over the python.org site. I see someone else submitted the bug as 1574068. Ciao! -- I'd horsewhip you if I had a horse. -- Groucho Marx The Doctor What: Da Man http://docwhat.gerf.org/ docwhat *at* gerf *dot* org KF6VNC From steven.bethard at gmail.com Tue Oct 10 17:36:46 2006 From: steven.bethard at gmail.com (Steven Bethard) Date: Tue, 10 Oct 2006 09:36:46 -0600 Subject: [Python-Dev] DRAFT: python-dev summary for 2006-08-16 to 2006-08-31 Message-ID: Here's the draft summary for the second half of August. As always, comments and corrections are greatly appreciated! ============= Announcements ============= --------------------------- Python community buildbots --------------------------- Want to make sure your package works with the latest and greatest development and release versions of Python? Thanks to Grig Gheorghiu, you can add your test suite to the `Python community buildbots`_ and the results of these tests will show up on the `Python buildbot results page`_. .. 
_Python community buildbots: http://www.pybots.org/ .. _Python buildbot results page: http://www.python.org/dev/buildbot/ Contributing thread: - `link to community buildbot? `__ ========= Summaries ========= --------------------- Fast subclass testing --------------------- Neal Norwitz was playing around with a patch that would make subclass testing for certain builtin types faster by stealing some bits from tp_flags. Georg Brandl thought this could be useful for exception handling in Python 3000 when all exceptions must be subclasses of BaseException. Guido also liked the patch and suggested it be checked into the `Python 3000 branch`_. .. _Python 3000 branch: http://svn.python.org/view/python/branches/p3yk/ Contributing thread: - `Type of range object members `__ ----------------------------- gcc 4.2 and integer overflows ----------------------------- Jack Howarth pointed out that intobject.c was using the test ``x < 0 && x == -x`` to determine if the signed integer ``x`` was the most negative integer on the platform. However, the C standard says overflow is undefined, so despite this code actually working on pretty much all known hardware, `gcc 4.2 assumes that overflow won't happen`_ and so optimizes away the entire clause. David Hopwood and Tim Peters provided a patch that casts ``x`` to an unsigned long (the "unnecessary" ``0`` is to make the Microsoft compilers happy):: ``x < 0 && (unsigned long)x == 0-(unsigned long)x`` .. _gcc 4.2 assumes that overflow won't happen: http://bugs.python.org/1545668 Contributing thread: - `gcc 4.2 exposes signed integer overflows `__ -------------------------- Python and 64-bit machines -------------------------- Thomas Heller explained that the _ctypes extension module was still a fair ways from building on Win64 and had to be removed from the installer for that platform. There was some discussion about in general how "experimental" the Win64 build of Python was, but Martin v. 
Löwis explained that despite the compiler warnings, Python has been running mostly fine on Win64 since version 2.4. In fact, Python has been running in 64-bit machines since 1993 (when Tim Peters ported it to 64-bit Crays) though of course not with the support that Python 2.5 brought through the Py_ssize_t changes. Contributing thread: - `ctypes and win64 `__ ------------------------------------------ Guidelines for submitting bugs and patches ------------------------------------------ Brett Cannon put together a rewrite of the `bug and patch guidelines`_. The bug guidelines now include sections on how to: * Get a SourceForge account * Start a new bug * Specify the Python version * Specify special settings for your Python interpreter * Give sample code to reproduce bug * Submit! * Respond to requests from developers And the patch guidelines now include sections on how to: * Read the Developer Intro to understand the scope of your proposed change * Add the appropriate unit tests * Add the proper document changes * Make your code follow the style guidelines * Generate a patch * Create a tracker item on SourceForge * Reference the patch in proper bug reports * Wait for a developer to contact you At Chad Whitacre's suggestion, Brett also included a section on the 5-for-1 rule, where some python-devvers have agreed to review your one patch if you post reviews of five others. The updates had not been posted to python.org at the time of this summary. .. _bug and patch guidelines: http://www.python.org/dev/patches/ Contributing threads: - `draft for bug guidelines `__ - `draft of patch guidelines `__ --------------------------------- Corner cases for continue/finally --------------------------------- Dino Viehland pointed out an odd corner case with ``continue`` in a ``finally`` clause that was causing Python to crash:: for abc in range(10): try: pass finally: try: continue except: pass The bug was present at least all the way back to Python 2.3. 
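An aside for readers replaying this today: a bare ``continue`` directly inside ``finally`` was a compile-time SyntaxError in the interpreters of this era (which is why the crashing snippet above has to smuggle it inside a nested ``try``). CPython 3.8 later made it legal, with well-defined semantics that discard any in-flight exception. A sketch of the modern behaviour (the version notes are from memory, not from this thread):

```python
# On Python 3.8+ a bare 'continue' is legal inside 'finally'; it
# discards any active exception and moves to the next iteration.
seen = []
for abc in range(3):
    try:
        if abc == 1:
            raise ValueError("in-flight exception")
    finally:
        seen.append(abc)
        continue  # swallows the ValueError raised at abc == 1

print(seen)  # [0, 1, 2] -- the exception never propagates
```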
People tossed a few patches back and forth (and a few tests which broke various versions of the patches) before `Neal Norwitz posted a patch`_ that people seemed to like. .. _Neal Norwitz posted a patch: http://bugs.python.org/1542451 Contributing thread: - `2.4 & 2.5 beta 3 crash `__ --------------------------------------- PEP 343: decimal module context manager --------------------------------------- Raymond Hettinger pointed out that the updates to the decimal module to take advantage of the ``with``-statement differed dramatically from `PEP 343`_ and were misdocumented in a number of places. Nick Coghlan explained that the API was a result of the introduction and then later removal of the ``__context__`` method. After some discussion, Raymond convinced everyone to change the API from:: with decimal.getcontext().copy().get_manager() as ctx: ... to the simpler API originally introduced in `PEP 343`:: with decimal.localcontext() as ctx: ... As a result of the changes needed to fix this API, Anthony Baxter decided that another release candidate was necessary before Python 2.5 final could be released. .. _PEP 343: http://www.python.org/dev/peps/pep-0343/ Contributing thread: - `Py2.5 issue: decimal context manager misimplemented, misdesigned, and misdocumented `__ ---------------------------- Python 2.6 development goals ---------------------------- Guido suggested that since Python 3.0 is now being developed in parallel with the 2.X trunk, the major work for Python 2.6 should be in making the transition to Python 3.0 as smooth as possible. This meant: * Adding warnings (suppressed by default) for code incompatible with Python 3.0. * Making all Python 2.X library code as Python 3.0-compatible as possible. * Converting all unittests to unittest or doctest format. 
Brett Cannon suggested adding to this list: * Improving tests and classifying them better * Updating and improving the documentation In general, people seemed to think this was a pretty good approach, particularly as it would address some of the complaints about the speed of addition of new features to Python. The discussion then moved off to the `python-3000 list`_. .. _python-3000 list: http://mail.python.org/mailman/listinfo/python-3000 Contributing threads: - `What should the focus for 2.6 be? `__ - `[Python-3000] What should the focus for 2.6 be? `__ ----------------------- Python 2.5, VC8 and PGO ----------------------- Muguntharaj Subramanian asked about building Python 2.5 with the VC8 compiler. Christopher Baus had recently provided a few patches to get the VC8 build working better and Kristján V. Jónsson said that he's working on updating the PCBuild8 directory in the trunk in a number of ways, including better support for profile-guided optimization (PGO) builds. He said once he got everything working right, he'd backport to Python 2.5. Contributing threads: - `Failed building 2.5rc1 pythoncore on VC8 `__ - `patch to add socket module project to vc8 solution `__ - `Error while building 2.5rc1 pythoncore_pgo on VC8 `__ ---------------------------------------------- PEP 342: using return instead of GeneratorExit ---------------------------------------------- Igor Bukanov suggested that the GeneratorExit exception introduced by `PEP 342`_ could be eliminated by replacing it with the semantics of the ``return`` statement. This would allow code like the following, which under the GeneratorExit paradigm would execute the ``except`` clause, to only execute the ``finally`` clause:: def gen(): try: yield 0 except Exception: print "Unexpected exception!" finally: print "Finally" for i in gen(): print i break Phillip J. Eby and others liked the approach, but suggested that it was much too late in the release process to be making such a major language change. 
Guido was open to making a change like this, perhaps in Python 3.0, but wanted the new generator enhancements to have some time in the field to see what was really needed here. .. _PEP 342: http://www.python.org/dev/peps/pep-0342/ Contributing thread: - `GeneratorExit is unintuitive and uneccessary `__ ------------------------------------------ String formatting, __str__ and __unicode__ ------------------------------------------ John J Lee noticed that in Python 2.5, the ``%s`` format specifier calls ``__unicode__`` on objects if their ``__str__`` method returns a unicode object:: >>> class a(object): ... def __str__(self): ... print '__str__' ... return u'str' ... def __unicode__(self): ... print '__unicode__' ... return u'unicode' ... >>> '%s%s' % (a(), a()) __str__ __unicode__ __unicode__ u'unicodeunicode' Nick Coghlan explained that string formatting first tries to build and return a str object, but starts over if any of the objects to be formatted by the ``%s`` specifier are unicode. So if a ``__str__`` method is called during string formatting and it returns a unicode object, Python will decide that the string formatting operation needs to return a unicode object, and will therefore start over, calling the ``__unicode__`` methods. Nick promised to look into making the documentation for this a bit clearer. Contributing thread: - `String formatting / unicode 2.5 bug? `__ ------------------------- Optimizing global lookups ------------------------- K.S. Sreeram asked about replacing the current LOAD_GLOBAL dict lookup with an array indexing along the lines of what is done for local names. Brett Cannon explained that globals can be altered from the outside, e.g. ``import mod; mod.name = value``, and thus globals aren't necessarily known at compile time. Tim Peters pointed out that a number of PEPs have been written in this area of optimization, with `PEP 280`_ being a good place to start. 
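The distinction Brett and Tim are describing is visible directly in compiled code objects: a module-level name is fetched by ``LOAD_GLOBAL`` (a dict lookup every time), while a local lives in a fixed array slot fetched by ``LOAD_FAST``. A small illustrative sketch (the function names and the default-argument trick are invented here, not from the thread):

```python
g = 1.0

def use_global():
    # 'g' is resolved at run time through the module dict on each use
    total = 0.0
    for _ in range(1000):
        total += g
    return total

def use_local(g=g):
    # binding 'g' as a default argument turns it into a local,
    # accessed by array index instead of a dict lookup
    total = 0.0
    for _ in range(1000):
        total += g
    return total

print("g" in use_global.__code__.co_names)    # True  -> global reference
print("g" in use_local.__code__.co_varnames)  # True  -> local slot
print(use_global() == use_local() == 1000.0)  # same result either way
```

The default-argument idiom was a common hand optimization precisely because the compiler cannot assume globals stay unchanged, as Brett notes above.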
Most people were not opposed to the idea in general, but without an implementation to benchmark, there wasn't really much to discuss. .. _PEP 280: http://www.python.org/dev/peps/pep-0280/ Contributing thread: - `Can LOAD_GLOBAL be optimized to a simple array lookup? `__ --------------------- ElementTree and PEP 8 --------------------- Greg Ewing asked about changing the ElementTree names to be more `PEP 8`_ compliant. Being that Python was already in the release candidate stage for Python 2.5, this was not possible. Even had the issue been raised earlier, such a change would have been unlikely, as it would have discouraged people who needed some backward compatibility from using the version in the stdlib. .. _PEP 8: http://www.python.org/dev/peps/pep-0008/ Contributing thread: - `Doc suggestion for Elementtree (for 2.5? a bit late, I know...) `__ -------- rslice() -------- Nick Coghlan suggested that since reversing slices could be somewhat complicated, e.g. ``(stop - 1) % abs(step) : start - 1 : -step``, it would be helpful to introduce a ``rslice()`` builtin so that this could be written ``rslice(start, stop, step)``. Most people felt that this was unnecessary and didn't gain much over using ``reversed()`` on the sliced sequence. Contributing thread: - `Adding an rslice() builtin? `__ ---------------------------------- PEP 362: Function Signature Object ---------------------------------- Brett Cannon spent his time at the Google sprint working on `PEP 362`_, which introduces a signature object for functions to describe what arguments they take. He asked for some feedback on two points: * Should the signature object be an attribute on all functions or should it be requested through the inspect module? * Should the dict returned by ``Signature.bind()`` key by name or by a tuple of names for argument lists like ``def f((a, b)):``? He didn't get much of a response. .. 
_PEP 362: http://www.python.org/dev/peps/pep-0362/ Contributing threads: - `[Python-checkins] r51458 - peps/trunk/pep-0000.txt peps/trunk/pep-0362.txt `__ - `PEP 362 open issues `__ ---------------------------------------- Warn about mixing tabs and spaces in 2.6 ---------------------------------------- Thomas Wouters suggested making the ``-t`` flag the default in Python 2.6. This would make Python always issue warnings if users mixed tabs and spaces. People generally seemed in favor of the idea. Contributing thread: - `Making 'python -t' the default. `__ --------------------- xrange() and non-ints --------------------- Neal Norwitz was playing around with some patches that would allow ``xrange`` in Python 2.6 to accept longs or objects with an ``__index__`` method instead of just ints as it does now. He looked at two Python implementations, a Python-C hybrid implementation and a C implementation, and found that for his benchmark, the Python-C hybrid was as good as the C implementation. People suggested that the benchmark wasn't testing function call overhead well enough, and the pure C implementation was probably still the way to go. Contributing thread: - `xrange accepting non-ints `__ ================ Deferred Threads ================ - `Interest in a Python 2.3.6? `__ - `That library reference, yet again `__ ================== Previous Summaries ================== - `no remaining issues blocking 2.5 release `__ =============== Skipped Threads =============== - `IDLE patches - bugfix or not? 
`__ - `TRUNK FREEZE for 2.5c1, 00:00 UTC, Thursday 17th August `__ - `Weekly Python Patch/Bug Summary `__ - `Benchmarking the int allocator (Was: Type of range object members) `__ - `2.5: recently introduced sgmllib regexp bug hangs Python `__ - `[wwwsearch-general] 2.5: recently introduced sgmllib regexp bug hangs Python `__ - `recently introduced sgmllib regexp bughangs Python `__ - `RELEASED Python 2.5 (release candidate 1) `__ - `TRUNK IS UNFROZEN, available for 2.6 work if you are so inclined `__ - `[Python-checkins] TRUNK IS UNFROZEN, available for 2.6 work if you are so inclined `__ - `Fixing 2.5 windows buildbots `__ - `uuid tests failing on Windows `__ - `Sprints next week at Google `__ - `__del__ unexpectedly being called twice `__ - `How does this help? Re: [Python-checkins] r51366 - python/trunk/Lib/idlelib/NEWS.txt python/trunk/Lib/idlelib/idlever.py `__ - `One-line fix for urllib2 regression `__ - `os.spawnlp() missing on Windows in 2.4? `__ - `Questions on unittest behaviour `__ - `[Python-checkins] How does this help? Re: r51366 - python/trunk/Lib/idlelib/NEWS.txt python/trunk/Lib/idlelib/idlever.py `__ - `SSH Key Added `__ - `uuid module - byte order issue `__ - `A cast from Py_ssize_t to long `__ - `Python + Java Integration `__ - `[4suite] cDomlette deallocation bug? `__ - `[Python-checkins] r51525 - in python/trunk: Lib/test/test_float.py Objects/floatobject.c `__ - `for 2.5 issues `__ - `Need help with test_mutants.py `__ - `zip -> izip; is __length_hint__ required? 
`__ - `Removing anachronisms from logging module `__ - `distutils patch `__ - `32-bit and 64-bit python on Solaris `__ - `Small Py3k task: fix modulefinder.py `__ - `Windows build slave downtime `__ From jcarlson at uci.edu Tue Oct 10 17:58:47 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue, 10 Oct 2006 08:58:47 -0700 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: References: Message-ID: <20061010084306.09AE.JCARLSON@uci.edu> "Richard Oudkerk" wrote: > I am not sure how sensible the idea is, but I have had a first stab at > writing a module processing.py which is a near clone of threading.py > but uses processes and sockets for communication. (It is one way of > avoiding the GIL.) On non-windows platforms, you should check on unix domain sockets, I've found they can run a couple times faster than standard sockets on the local machine. And if you are using fork or a variant of subprocess to start processes on linux or Windows, you should consider using pipes, they can be competitive with sockets (though using a bunch on Windows can be a pain). > I have tested it on unix and windows and it seem to work pretty well. > (Getting round the lack of os.fork on windows is a bit awkward.) > There is also another module dummy_processing.py which has the same > api but is just a wrapper round threading.py. > > Queues, Locks, RLocks, Conditions, Semaphores and some other shared > objects are implemented. > > People are welcome to try out the tests in test_processing.py > contained in the zipfile. More information is included in the README > file. > > As a quick example, the code [snip] Looks interesting. Maybe it would become clearer with docs (I hope you've written some). Right now there is a difference, and it is basically that there are tokens and proxies, which could confuse some users. Presumably with this library you have created, you have also written a fast object encoder/decoder (like marshal or pickle). 
If it isn't any faster than cPickle or marshal, then users may bypass the module and opt for fork/etc. + XML-RPC; which works pretty well and gets them multi-machine calling, milti-language interoperability, and some other goodies, though it is a bit slow in terms of communication. - Josiah From fredrik at pythonware.com Tue Oct 10 18:03:32 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 10 Oct 2006 18:03:32 +0200 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: <20061010084306.09AE.JCARLSON@uci.edu> References: <20061010084306.09AE.JCARLSON@uci.edu> Message-ID: Josiah Carlson wrote: > Presumably with this library you have created, you have also written a > fast object encoder/decoder (like marshal or pickle). If it isn't any > faster than cPickle or marshal, then users may bypass the module and opt > for fork/etc. + XML-RPC XML-RPC isn't close to marshal and cPickle in performance, though, so that statement is a bit misleading. the really interesting thing here is a ready-made threading-style API, I think. reimplementing queues, locks, and semaphores can be a reasonable amount of work; might as well use an existing implementation. From anthony at interlink.com.au Tue Oct 10 18:48:51 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Wed, 11 Oct 2006 02:48:51 +1000 Subject: [Python-Dev] BRANCH FREEZE, release24-maint for 2.4.4c1. 00:00UTC, 11 October 2006 Message-ID: <200610110248.52613.anthony@interlink.com.au> The release24-maint branch is frozen for the 2.4.4c1 release from 00:00UTC on the 11th of October. That's about 7 hours from now. Anthony -- Anthony Baxter It's never too late to have a happy childhood. From greg at electricrain.com Tue Oct 10 21:11:39 2006 From: greg at electricrain.com (Gregory P. 
Smith) Date: Tue, 10 Oct 2006 12:11:39 -0700 Subject: [Python-Dev] 2.4 vs Windows vs bsddb In-Reply-To: <1f7befae0610091711j5e03c527w7ee0119f7175cf78@mail.gmail.com> References: <1f7befae0610091344j1639c21qc77b0ebf9c3df1b6@mail.gmail.com> <1f7befae0610091711j5e03c527w7ee0119f7175cf78@mail.gmail.com> Message-ID: <20061010191139.GA31184@zot.electricrain.com> On Mon, Oct 09, 2006 at 08:11:59PM -0400, Tim Peters wrote: > [Tim] > > I just noticed that the bsddb portion of Python fails to compile on > > the 2.4 Windows buildbots, but for some reason the buildbot machinery > > doesn't notice the failure: > > But it does now. This is the revision that broke the Windows build: > > """ > r52170 | andrew.kuchling | 2006-10-05 14:49:36 -0400 (Thu, 05 Oct > 2006) | 12 lines > > [Backport r50783 | neal.norwitz. The bytes_left code is complicated, > but looks correct on a casual inspection and hasn't been modified > in the trunk. Does anyone want to review further?] > > Ensure we don't write beyond errText. I think I got this right, but > it definitely could use some review to ensure I'm not off by one > and there's no possible overflow/wrap-around of bytes_left. > Reported by Klocwork #1. > > Fix a problem if there is a failure allocating self->db. > Found with failmalloc. > """ > > It introduces uses of assert() and strncat(), and the linker can't > resolve them. I suppose that's because the Windows link step for the > _bsddb subproject explicitly excludes msvcrt (in the release build) > and msvcrtd (in the debug build), but I don't know why that's done. > > OTOH, we got a lot more errors (about duplicate code definitions) if > the standard MS libraries aren't explicitly excluded, so that's no > fix. It seems bad form to C assert() within a python extension. crashing is bad. Just code it to not copy the string in that case. 
The exception type should convey enough info alone and if someone actually looks at the string description of the exception they're welcome to notice that its missing info and file a bug (it won't happen; the strings come from the BerkeleyDB or C library itself). As for the strncat instead of strcat that is good practice. The buffer is way more than large enough for any of the error messages defined in the berkeleydb common/db_err.c db_strerror() function but the C library could supply its own unreasonably long one in some unforseen circumstance. -greg From tim.peters at gmail.com Tue Oct 10 21:35:11 2006 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 10 Oct 2006 15:35:11 -0400 Subject: [Python-Dev] 2.4 vs Windows vs bsddb In-Reply-To: <20061010191139.GA31184@zot.electricrain.com> References: <1f7befae0610091344j1639c21qc77b0ebf9c3df1b6@mail.gmail.com> <1f7befae0610091711j5e03c527w7ee0119f7175cf78@mail.gmail.com> <20061010191139.GA31184@zot.electricrain.com> Message-ID: <1f7befae0610101235q407e563cxaf1acf6ef9a5d47e@mail.gmail.com> [Gregory P. Smith] > It seems bad form to C assert() within a python extension. crashing > is bad. Just code it to not copy the string in that case. The > exception type should convey enough info alone and if someone actually > looks at the string description of the exception they're welcome to > notice that its missing info and file a bug (it won't happen; the > strings come from the BerkeleyDB or C library itself). The proper use of C's assert() in Python (whether core or extension) is to strongly document a condition the author believes /must/ be true. It's a strong sanity-check on the programmer's beliefs about necessary invariants, things that must be true under all possible conditions. For example, it would always be wrong to assert that the result of calling malloc() with a non-zero argument is non-NULL; it would be correct (although trivially and unhelpfully so) to assert that the result is NULL or is not NULL. 
Given that, the assert() in question looks fine to me: if (_db_errmsg[0] && bytes_left < (sizeof(errTxt) - 4)) { bytes_left = sizeof(errTxt) - bytes_left - 4 - 1; assert(bytes_left >= 0); We can't get into the block unless bytes_left < sizeof(errTxt) - 4 is true. Subtracting bytes_left from both sides, then swapping LHS and RHS: sizeof(errTxt) - bytes_left - 4 > 0 which implies sizeof(errTxt) - bytes_left - 4 >= 1 Subtracting 1 from both sides: sizeof(errTxt) - bytes_left - 4 - 1 >= 0 And since the LHS of that is the new value of bytes_left, it must be true that bytes_left >= 0 Either that, or the original author (and me, just above) made an error in analyzing what must be true at this point. From bytes_left < sizeof(errTxt) - 4 it's not /instantly/ obvious that bytes_left >= 0 inside the block, so there's value in assert'ing that it's true. It's both documentation and an executable sanity check. In any case, assert() statements are thrown away in a release build, so can't be a cause of abnormal termination then. > As for the strncat instead of strcat that is good practice. The > buffer is way more than large enough for any of the error messages > defined in the berkeleydb common/db_err.c db_strerror() function but > the C library could supply its own unreasonably long one in some > unforseen circumstance. That's fine -- there "shouldn't have been" a problem with using any standard C function here. It was just the funky linker step on Windows on the 2.4 branch that was hosed. Martin figured out how to repair it, and there's no longer any problem here. In fact, even the been-there-forever linker warnings in 2.4 on Windows have gone away now. From arigo at tunes.org Tue Oct 10 23:10:37 2006 From: arigo at tunes.org (Armin Rigo) Date: Tue, 10 Oct 2006 23:10:37 +0200 Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS? 
In-Reply-To: <34FE2A7A34BC3544BC3127D023DF3D1212873F@EWTEXCH.office.bhtrader.com> References: <34FE2A7A34BC3544BC3127D023DF3D1212873F@EWTEXCH.office.bhtrader.com> Message-ID: <20061010211037.GA4271@code0.codespeak.net> Hi Raymond, On Fri, Oct 06, 2006 at 08:48:15AM -0700, Raymond Hettinger wrote: > No need to backport. Py_TPFLAGS_DEFAULT implies > Py_TPFLAGS_HAVE_WEAKREFS. > > > The change was for clarity -- most things that have the weakref slots > filled-in will also make the flag explicit -- that makes it easier on > the brain when verifying code that checks the weakref flag. I don't understand why you added this flag here; there are many other flags with a meaning very similar to Py_TPFLAGS_HAVE_WEAKREFS, which are also implied by Py_TPFLAGS_DEFAULT. Also, *all* other types in a CPython build use Py_TPFLAGS_DEFAULT as well, so have Py_TPFLAGS_HAVE_WEAKREFS set. Why would explicitly spelling just this flag, on just this type, help make the overall code clearer? It seems to only further confuse the matter -- the slightly obscure bit that requires some getting used to is that all these flags don't really mean "I have such and such feature" but just "I could have such and such feature, if the corresponding tp_xxx field were set". A bientot, Armin From jcarlson at uci.edu Tue Oct 10 23:49:50 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue, 10 Oct 2006 14:49:50 -0700 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: References: <20061010084306.09AE.JCARLSON@uci.edu> Message-ID: <20061010130901.09B1.JCARLSON@uci.edu> Fredrik Lundh wrote: > > Josiah Carlson wrote: > > > Presumably with this library you have created, you have also written a > > fast object encoder/decoder (like marshal or pickle). If it isn't any > > faster than cPickle or marshal, then users may bypass the module and opt > > for fork/etc. + XML-RPC > > XML-RPC isn't close to marshal and cPickle in performance, though, so > that statement is a bit misleading. 
You are correct, it is misleading, and relies on a few unstated assumptions. In my own personal delving into process splitting, RPC, etc., I usually end up with one of two cases; I need really fast call/return, or I need not slow call/return. The not slow call/return is (in my opinion) satisfactorily solved with XML-RPC. But I've personally not been satisfied with the speed of any remote 'fast call/return' packages, as they usually rely on cPickle or marshal, which are slow compared to even moderately fast 100mbit network connections. When we are talking about local connections, I have even seen cases where the cPickle/marshal calls can make it so that forking the process is faster than encoding the input to a called function. I've had an idea for a fast object encoder/decoder (with limited support for certain built-in Python objects), but I haven't gotten around to actually implementing it as of yet. > the really interesting thing here is a ready-made threading-style API, I > think. reimplementing queues, locks, and semaphores can be a reasonable > amount of work; might as well use an existing implementation. Really, it is a matter of asking what kind of API is desirable. Do we want to have threading plus other stuff be the style of API that we want to replicate? Do we want to have shared queue objects, or would an XML-RPC-esque remote.queue_put('queue_X', value) and remote.queue_get('queue_X', blocking=1) be better? - Josiah From rhettinger at ewtllc.com Tue Oct 10 23:47:26 2006 From: rhettinger at ewtllc.com (Raymond Hettinger) Date: Tue, 10 Oct 2006 14:47:26 -0700 Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS? Message-ID: <34FE2A7A34BC3544BC3127D023DF3D12128746@EWTEXCH.office.bhtrader.com> > The change was for clarity -- most things that have the weakref slots > filled-in will also make the flag explicit -- that makes it easier on > the brain when verifying code that checks the weakref flag.
> I don't understand why you added this flag here; Perhaps my other post wasn't clear. The change wasn't necessary, so if it bugs you, feel free to take it out. Essentially, it was a "note to self" so that I didn't have to keep looking up what was implied by Py_TPFLAGS_DEFAULT. > the slightly obscure bit that requires some getting used to is > that all these flags don't really mean "I have such and such > feature" but just "I could have such and such > feature, if the corresponding tp_xxx field were set". I would like to see that cleaned-up for Py3k. Ideally, the NULL or non_NULL status of a slot should serve as its flag. Raymond From barry at python.org Wed Oct 11 00:48:36 2006 From: barry at python.org (Barry Warsaw) Date: Tue, 10 Oct 2006 18:48:36 -0400 Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS? In-Reply-To: <34FE2A7A34BC3544BC3127D023DF3D12128746@EWTEXCH.office.bhtrader.com> References: <34FE2A7A34BC3544BC3127D023DF3D12128746@EWTEXCH.office.bhtrader.com> Message-ID: <5CBC9CDA-BBF9-4956-B0A3-7C7373C74EB4@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Oct 10, 2006, at 5:47 PM, Raymond Hettinger wrote: >> the slightly obscure bit that requires some getting used to is >> that all these flags don't really mean "I have such and such >> feature" but just "I could have such and such >> feature, if the corresponding tp_xxx field were set". > > I would like to see that cleaned-up for Py3k. Ideally, the NULL or > non_NULL status of a slot should serve as its flag. +1 TOOWTDI. 
- -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRSwjRHEjvBPtnXfVAQJ0sQQAllzbSdONhCBWc/Rt0PW6J5iANLcm99N4 MkkSEDZBo72SsviijRvTha1+1pvpzB6s4Rf7EOw/OKnQ+a3u37w3BB966ag8WIN1 RItKubCVS6kTpg53BBnIX7P0CGSFFY36pEQm4nNe3G6RH4F0FwmIdv0WyJhSnnDR KRT9PHI9QY8= =bH9r -----END PGP SIGNATURE----- From david.nospam.hopwood at blueyonder.co.uk Wed Oct 11 00:49:48 2006 From: david.nospam.hopwood at blueyonder.co.uk (David Hopwood) Date: Tue, 10 Oct 2006 23:49:48 +0100 Subject: [Python-Dev] 2.4 vs Windows vs bsddb In-Reply-To: <1f7befae0610101235q407e563cxaf1acf6ef9a5d47e@mail.gmail.com> References: <1f7befae0610091344j1639c21qc77b0ebf9c3df1b6@mail.gmail.com> <1f7befae0610091711j5e03c527w7ee0119f7175cf78@mail.gmail.com> <20061010191139.GA31184@zot.electricrain.com> <1f7befae0610101235q407e563cxaf1acf6ef9a5d47e@mail.gmail.com> Message-ID: <452C238C.6000708@blueyonder.co.uk> Tim Peters wrote: > Given that, the assert() in question looks fine to me: > > if (_db_errmsg[0] && bytes_left < (sizeof(errTxt) - 4)) { > bytes_left = sizeof(errTxt) - bytes_left - 4 - 1; > assert(bytes_left >= 0); > > We can't get into the block unless > > bytes_left < sizeof(errTxt) - 4 > > is true. Subtracting bytes_left from both sides, then swapping LHS and RHS: > > sizeof(errTxt) - bytes_left - 4 > 0 > > which implies > > sizeof(errTxt) - bytes_left - 4 >= 1 > > Subtracting 1 from both sides: > > sizeof(errTxt) - bytes_left - 4 - 1 >= 0 > > And since the LHS of that is the new value of bytes_left, it must be true that > > bytes_left >= 0 > > Either that, or the original author (and me, just above) made an error > in analyzing what must be true at this point. You omitted to state an assumption that sizeof(errTxt) >= 4, since size_t (and the constant 4) are unsigned. Also bytes_left must initially be nonnegative so that the subexpression 'sizeof(errTxt) - bytes_left' cannot overflow. 
-- David Hopwood From david.nospam.hopwood at blueyonder.co.uk Wed Oct 11 01:03:26 2006 From: david.nospam.hopwood at blueyonder.co.uk (David Hopwood) Date: Wed, 11 Oct 2006 00:03:26 +0100 Subject: [Python-Dev] 2.4 vs Windows vs bsddb [correction] In-Reply-To: <452C238C.6000708@blueyonder.co.uk> References: <1f7befae0610091344j1639c21qc77b0ebf9c3df1b6@mail.gmail.com> <1f7befae0610091711j5e03c527w7ee0119f7175cf78@mail.gmail.com> <20061010191139.GA31184@zot.electricrain.com> <1f7befae0610101235q407e563cxaf1acf6ef9a5d47e@mail.gmail.com> <452C238C.6000708@blueyonder.co.uk> Message-ID: <452C26BE.6090703@blueyonder.co.uk> I wrote: > You omitted to state an assumption that sizeof(errTxt) >= 4, since size_t > (and the constant 4) are unsigned. Sorry, the constant '4' is signed, but sizeof(errTxt) - 4 can nevertheless wrap around unless sizeof(errTxt) >= 4. -- David Hopwood From tim.peters at gmail.com Wed Oct 11 03:20:00 2006 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 10 Oct 2006 21:20:00 -0400 Subject: [Python-Dev] 2.4 vs Windows vs bsddb In-Reply-To: <452C238C.6000708@blueyonder.co.uk> References: <1f7befae0610091344j1639c21qc77b0ebf9c3df1b6@mail.gmail.com> <1f7befae0610091711j5e03c527w7ee0119f7175cf78@mail.gmail.com> <20061010191139.GA31184@zot.electricrain.com> <1f7befae0610101235q407e563cxaf1acf6ef9a5d47e@mail.gmail.com> <452C238C.6000708@blueyonder.co.uk> Message-ID: <1f7befae0610101820o16330e6frce1a33b39ac7b370@mail.gmail.com> [Tim] >> Given that, the assert() in question looks fine to me: >> ... |>> Either that, or the original author (and me, just above) made an error >> in analyzing what must be true at this point. | [David Hopwood] > You omitted to state an assumption that sizeof(errTxt) >= 4, since size_t > (and the constant 4) are unsigned. Also bytes_left must initially be nonnegative > so that the subexpression 'sizeof(errTxt) - bytes_left' cannot overflow. 
I don't care, but that's really the /point/: asserts are valuable precisely because any inference that's not utterly obvious at first glance at best stands a good chance of relying on hidden assumptions. assert() makes key assumptions and key inferences visible, and verifies them in a debug build of Python. From martin at v.loewis.de Wed Oct 11 06:15:20 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 11 Oct 2006 06:15:20 +0200 Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS? In-Reply-To: <34FE2A7A34BC3544BC3127D023DF3D12128746@EWTEXCH.office.bhtrader.com> References: <34FE2A7A34BC3544BC3127D023DF3D12128746@EWTEXCH.office.bhtrader.com> Message-ID: <452C6FD8.8070403@v.loewis.de> Raymond Hettinger schrieb: >> the slightly obscure bit that requires some getting used to is >> that all these flags don't really mean "I have such and such >> feature" but just "I could have such and such >> feature, if the corresponding tp_xxx field were set". > > I would like to see that cleaned-up for Py3k. Ideally, the NULL or > non_NULL status of a slot should serve as its flag. The flag indicates that the field is even present. If you have an extension module from an earlier Python release (in binary form), it won't *have* the field, so you can't test whether it's null. Accessing it will get to some other place in the data segment, and interpreting it as a function pointer will cause a crash. That's why the flags where initially introduced; presence of the flag indicates that the field was there at compile time. Of course, if everybody would always recompile all extension modules for a new Python feature release, those flags weren't necessary. Regards, Martin From mal at egenix.com Wed Oct 11 10:23:40 2006 From: mal at egenix.com (M.-A. 
Lemburg) Date: Wed, 11 Oct 2006 10:23:40 +0200 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: <20061010130901.09B1.JCARLSON@uci.edu> References: <20061010084306.09AE.JCARLSON@uci.edu> <20061010130901.09B1.JCARLSON@uci.edu> Message-ID: <452CAA0C.6030306@egenix.com> Josiah Carlson wrote: > Fredrik Lundh wrote: >> Josiah Carlson wrote: >> >>> Presumably with this library you have created, you have also written a >>> fast object encoder/decoder (like marshal or pickle). If it isn't any >>> faster than cPickle or marshal, then users may bypass the module and opt >>> for fork/etc. + XML-RPC >> XML-RPC isn't close to marshal and cPickle in performance, though, so >> that statement is a bit misleading. > > You are correct, it is misleading, and relies on a few unstated > assumptions. > > In my own personal delving into process splitting, RPC, etc., I usually > end up with one of two cases; I need really fast call/return, or I need > not slow call/return. The not slow call/return is (in my opinion) > satisfactorally solved with XML-RPC. But I've personally not been > satisfied with the speed of any remote 'fast call/return' packages, as > they usually rely on cPickle or marshal, which are slow compared to > even moderately fast 100mbit network connections. When we are talking > about local connections, I have even seen cases where the > cPickle/marshal calls can make it so that forking the process is faster > than encoding the input to a called function. This is hard to believe. I've been in that business for a few years and so far have not found an OS/hardware/network combination with the mentioned features. Usually the worst part in performance breakdown for RPC is network latency, ie. time to connect, waiting for the packets to come through, etc. and this parameter doesn't really depend on the OS or hardware you're running the application on, but is more a factor of which network hardware, architecture and structure is being used. 
It also depends a lot on what you send as arguments, of course, but I assume that you're not pickling a gazillion objects :-) > I've had an idea for a fast object encoder/decoder (with limited support > for certain built-in Python objects), but I haven't gotten around to > actually implementing it as of yet. Would be interesting to look at. BTW, did you know about http://sourceforge.net/projects/py-xmlrpc/ ? -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 11 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From fredrik at pythonware.com Wed Oct 11 12:35:23 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 11 Oct 2006 12:35:23 +0200 Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS? References: <34FE2A7A34BC3544BC3127D023DF3D12128746@EWTEXCH.office.bhtrader.com> <452C6FD8.8070403@v.loewis.de> Message-ID: Martin v. Löwis wrote: > Of course, if everybody would always recompile all extension modules > for a new Python feature release, those flags weren't necessary. a dynamic registration approach would be even better, with a single entry point used to register all methods and hooks your C extension has implemented, and code on the other side that builds a properly initialized type descriptor from that set, using fallback functions and error stubs where needed. e.g. the impossible-to-write-from-scratch NoddyType struct initialization in http://docs.python.org/ext/node24.html would collapse to static PyTypeObject NoddyType; ...
NoddyType = PyType_Setup("noddy.Noddy", sizeof(Noddy)); PyType_Register(NoddyType, PY_TP_DEALLOC, Noddy_dealloc); PyType_Register(NoddyType, PY_TP_DOC, "Noddy objects"); PyType_Register(NoddyType, PY_TP_TRAVERSE, Noddy_traverse); PyType_Register(NoddyType, PY_TP_CLEAR, Noddy_clear); PyType_Register(NoddyType, PY_TP_METHODS, Noddy_methods); PyType_Register(NoddyType, PY_TP_MEMBERS, Noddy_members); PyType_Register(NoddyType, PY_TP_INIT, Noddy_init); PyType_Register(NoddyType, PY_TP_NEW, Noddy_new); if (PyType_Ready(&NoddyType) < 0) return; (a preprocessor that generated this based on suitable "macro decorators" could be implemented in just over 8 lines of Python...) with this in place, we could simply remove all those silly NULL checks from the interpreter. From fredrik at pythonware.com Wed Oct 11 12:54:33 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 11 Oct 2006 12:54:33 +0200 Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS? References: <34FE2A7A34BC3544BC3127D023DF3D12128746@EWTEXCH.office.bhtrader.com><452C6FD8.8070403@v.loewis.de> Message-ID: I wrote: > PyType_Register(NoddyType, PY_TP_METHODS, Noddy_methods); methods and members could of course be registered too, so the implementation can choose how to store them (e.g. short lists for smaller method lists, dictionaries for others). From jcarlson at uci.edu Wed Oct 11 18:38:48 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 11 Oct 2006 09:38:48 -0700 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: <452CAA0C.6030306@egenix.com> References: <20061010130901.09B1.JCARLSON@uci.edu> <452CAA0C.6030306@egenix.com> Message-ID: <20061011090701.09CA.JCARLSON@uci.edu> "M.-A. Lemburg" wrote: > > Josiah Carlson wrote: > > Fredrik Lundh wrote: > >> Josiah Carlson wrote: > >> > >>> Presumably with this library you have created, you have also written a > >>> fast object encoder/decoder (like marshal or pickle).
If it isn't any > >>> faster than cPickle or marshal, then users may bypass the module and opt > >>> for fork/etc. + XML-RPC > >> XML-RPC isn't close to marshal and cPickle in performance, though, so > >> that statement is a bit misleading. > > > > You are correct, it is misleading, and relies on a few unstated > > assumptions. > > > > In my own personal delving into process splitting, RPC, etc., I usually > > end up with one of two cases; I need really fast call/return, or I need > > not slow call/return. The not slow call/return is (in my opinion) > > satisfactorally solved with XML-RPC. But I've personally not been > > satisfied with the speed of any remote 'fast call/return' packages, as > > they usually rely on cPickle or marshal, which are slow compared to > > even moderately fast 100mbit network connections. When we are talking > > about local connections, I have even seen cases where the > > cPickle/marshal calls can make it so that forking the process is faster > > than encoding the input to a called function. > > This is hard to believe. I've been in that business for a few > years and so far have not found an OS/hardware/network combination > with the mentioned features. > > Usually the worst part in performance breakdown for RPC is network > latency, ie. time to connect, waiting for the packets to come through, > etc. and this parameter doesn't really depend on the OS or hardware > you're running the application on, but is more a factor of which > network hardware, architecture and structure is being used. I agree, that is usually the case. But for pre-existing connections remote or local (whether via socket or unix domain socket), pickling slows things down significantly. What do I mean? Set up a daemon that reads and discards what is sent to it as fast as possible. Then start sending it plain strings (constructed via something like 32768*'\0'). Compare it to a somewhat equivalently sized pickle-as-you-go sender. 
Maybe I'm just not doing it right, but I always end up with a slowdown that makes me want to write my own fast encoder/decoder. > It also depends a lot on what you send as arguments, of course, > but I assume that you're not pickling a gazillion objects :-) According to tests on one of the few non-emulated linux machines I have my hands on, forking to a child process runs on the order of .0004-.00055 seconds. On that same machine, pickling... 128*['hello world', 18, {1:2}, 7.382] ...takes ~.0005 seconds. 512 somewhat mixed elements isn't a gazillion, though in my case, I believe it was originally a list of tuples or somesuch. > > I've had an idea for a fast object encoder/decoder (with limited support > > for certain built-in Python objects), but I haven't gotten around to > > actually implementing it as of yet. > > Would be interesting to look at. It would basically be something along the lines of cPickle, but would only support the basic types of: int, long, float, str, unicode, tuple, list, dictionary. > BTW, did you know about http://sourceforge.net/projects/py-xmlrpc/ ? I did not know about it. But it looks interesting. I'll have to compile it for my (ancient) 2.3 installation and see how it does. Thank you for the pointer. - Josiah From jcarlson at uci.edu Wed Oct 11 18:46:39 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 11 Oct 2006 09:46:39 -0700 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: References: <20061010130901.09B1.JCARLSON@uci.edu> Message-ID: <20061011084824.09C7.JCARLSON@uci.edu> "Richard Oudkerk" wrote: > On 10/10/06, Josiah Carlson wrote: > > > the really interesting thing here is a ready-made threading-style API, I > > > think. reimplementing queues, locks, and semaphores can be a reasonable > > > amount of work; might as well use an existing implementation. > > > > Really, it is a matter of asking what kind of API is desireable. 
Do we > > want to have threading plus other stuff be the style of API that we want > > to replicate? Do we want to have shared queue objects, or would an > > XML-RPC-esque remote.queue_put('queue_X', value) and > > remote.queue_get('queue_X', blocking=1) be better? > > Whatever the API is, I think it is useful if you can swap between > threads and processes just by changing the import line. That way you > can write applications without deciding upfront which to use. It would be convenient, yes, but the question isn't always 'threads or processes?' In my experience (not to say that it is more or better than anyone else's), when going multi-process, the expense on some platforms is significant enough to want to persist the process (this is counter to my previous forking statement, but it's all relative). And sometimes one *wants* multiple threads running in a single process handling multiple requests. There's a recipe hanging out in the Python cookbook that adds a threading mixin to the standard XML-RPC server in Python. For a set of processes (perhaps on different machines) that are cooperating and calling amongst each other, I've not seen a significantly better variant, especially when the remote procedure call can take a long time to complete. It does take a few tricks to make sure that sufficient connections are available from process A to process B when A calls B from multiple threads, but it's not bad.
- Josiah From fredrik at pythonware.com Wed Oct 11 18:41:52 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 11 Oct 2006 18:41:52 +0200 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: <20061011090701.09CA.JCARLSON@uci.edu> References: <20061010130901.09B1.JCARLSON@uci.edu> <452CAA0C.6030306@egenix.com> <20061011090701.09CA.JCARLSON@uci.edu> Message-ID: Josiah Carlson wrote: > It would basically be something along the lines of cPickle, but would > only support the basic types of: int, long, float, str, unicode, tuple, > list, dictionary. if you're aware of a way to do that faster than the current marshal implementation, maybe you could work on speeding up marshal instead? From skip at pobox.com Wed Oct 11 18:59:30 2006 From: skip at pobox.com (skip at pobox.com) Date: Wed, 11 Oct 2006 11:59:30 -0500 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: <20061011090701.09CA.JCARLSON@uci.edu> References: <20061010130901.09B1.JCARLSON@uci.edu> <452CAA0C.6030306@egenix.com> <20061011090701.09CA.JCARLSON@uci.edu> Message-ID: <17709.8946.379605.437664@montanaro.dyndns.org> Josiah> It would basically be something along the lines of cPickle, but Josiah> would only support the basic types of: int, long, float, str, Josiah> unicode, tuple, list, dictionary. Isn't that approximately marshal's territory? If you can write a faster encoder/decoder, it might well be possible to apply the speedup ideas to marshal. Skip From brett at python.org Wed Oct 11 20:01:50 2006 From: brett at python.org (Brett Cannon) Date: Wed, 11 Oct 2006 11:01:50 -0700 Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS? In-Reply-To: References: <34FE2A7A34BC3544BC3127D023DF3D12128746@EWTEXCH.office.bhtrader.com> <452C6FD8.8070403@v.loewis.de> Message-ID: On 10/11/06, Fredrik Lundh wrote: > > Martin v. 
Löwis wrote: > > Of course, if everybody would always recompile all extension modules > > for a new Python feature release, those flags weren't necessary. > > a dynamic registration approach would be even better, with a single entry > point > used to register all methods and hooks your C extension has implemented, > and > code on the other side that builds a properly initialized type descriptor > from that > set, using fallback functions and error stubs where needed. > > e.g. the impossible-to-write-from-scratch NoddyType struct initialization > in > > http://docs.python.org/ext/node24.html > > would collapse to > > static PyTypeObject NoddyType; > > ... > > NoddyType = PyType_Setup("noddy.Noddy", sizeof(Noddy)); > PyType_Register(NoddyType, PY_TP_DEALLOC, Noddy_dealloc); > PyType_Register(NoddyType, PY_TP_DOC, "Noddy objects"); > PyType_Register(NoddyType, PY_TP_TRAVERSE, Noddy_traverse); > PyType_Register(NoddyType, PY_TP_CLEAR, Noddy_clear); > PyType_Register(NoddyType, PY_TP_METHODS, Noddy_methods); > PyType_Register(NoddyType, PY_TP_MEMBERS, Noddy_members); > PyType_Register(NoddyType, PY_TP_INIT, Noddy_init); > PyType_Register(NoddyType, PY_TP_NEW, Noddy_new); > if (PyType_Ready(&NoddyType) < 0) > return; > > (a preprocessor that generated this based on suitable "macro decorators" > could > be implemented in just over 8 lines of Python...) > > with this in place, we could simply remove all those silly NULL checks > from the > interpreter. This also has the benefit of making it really easy to grep for the function used for the tp_init field since there is no guarantee someone will keep the traditional field comments in their file (I usually grep for PyTypeObject until I find the type I want). If we went with C99 this wouldn't be an issue, but since I don't think that is necessarily in the cards I am very happy to go with this solution.
It ends up feeling more like how Ruby does C extensions, and I have to admit I think they may have made it simpler than we have. And of course the change Raymond put in for checking the PyMethodDef slots can also easily allow people to name methods based on what the slots would be named had it been defined in Python (which we might want to do anyway with the C constants to make it more readable and less obtuse to new extension writers; e.g. change PY_TP_NEW to PY__NEW__). And lastly, this approach makes sure that the basic requirement of what a type must have defined can be enforced in the PyType_Setup() method /F is proposing. +1 from me. -Brett From jcarlson at uci.edu Wed Oct 11 20:34:01 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 11 Oct 2006 11:34:01 -0700 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: References: <20061011090701.09CA.JCARLSON@uci.edu> Message-ID: <20061011103733.09CD.JCARLSON@uci.edu> Fredrik Lundh wrote: > Josiah Carlson wrote: > > > It would basically be something along the lines of cPickle, but would > > only support the basic types of: int, long, float, str, unicode, tuple, > > list, dictionary. > > if you're aware of a way to do that faster than the current marshal > implementation, maybe you could work on speeding up marshal instead? The current implementation uses a fixed resize semantic (1024 bytes at a time) that makes large marshal operations slow. If we were to switch to a list resize-like or cStringIO semantic (overallocate by ~size>>3, or at least double, respectively), it would likely increase the speed for large resize operations. (see the w_more definition) This should make it significantly faster than cPickle in basically all cases.
w_object uses a giant if/else if block to handle all of the possible cases, both for identity checks against None, False, True, etc., as well as with the various Py*_Check(). This is necessary due to marshal supporting subclasses (the Py*_Check() calls) and the dynamic layout of memory during Python startup. The identity checks may be able to be replaced with a small array-like thing if we were to statically allocate them from a single array to guarantee that their addresses are a fixed distance apart... char base_objects[320]; PyObject* IDENTITY[8]; int cases[8]; /* 64 bytes per object is overkill, and we may want to allocate enough room for 15 objects, to make sure that IDENTITY[0] = NULL; */ p = 0 for obj_init in objs_to_init: init_object(base_objects+p, obj_init) x = ((base_objects+p)>>6)&7 IDENTITY[x] = (PyObject*)(base_objects+p) cases[x] = p//64 p += 64 Then we could use the following in w_object... x = (v>>6)&7 if v == IDENTITY[x] { switch (cases[x]) { case 0: /* should be null */ ... case 1: /* based on the order of objs_to_init */ } } The Py*_Check() stuff isn't so amenable to potential speedup, but in a custom no-subclasses only base types version, we could use a variant of the above mechanism to look directly at types, then use a second switch/case statement, which should be significantly faster than the if/else if tests that it currently uses. An identity check, then a fast type check, otherwise fail.
- Josiah From simonwittber at gmail.com Thu Oct 12 02:31:19 2006 From: simonwittber at gmail.com (Simon Wittber) Date: Thu, 12 Oct 2006 08:31:19 +0800 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: <20061011090701.09CA.JCARLSON@uci.edu> References: <20061010130901.09B1.JCARLSON@uci.edu> <452CAA0C.6030306@egenix.com> <20061011090701.09CA.JCARLSON@uci.edu> Message-ID: <4e4a11f80610111731n68275e4agd98abd0baf3ab54@mail.gmail.com> On 10/12/06, Josiah Carlson wrote: > > It would basically be something along the lines of cPickle, but would > only support the basic types of: int, long, float, str, unicode, tuple, > list, dictionary. > Great idea! Check this thread for past efforts: http://mail.python.org/pipermail/python-dev/2005-June/054313.html The 'gherkin' module discussed there now lives in the cheeseshop as part of the FibraNet package. http://cheeseshop.python.org/pypi/FibraNet I love benchmarks, especially when they come around for the second time. I wrote a silly script which compares dumps performance between different serialization modules for different simple objects using Python 2.4.3. All figures are 'dumps per second'. test: a tuple: ("a" * 1024,1.0,[1,2,3],{'1':2,'3':4}) gherkin: 10895.7762314 pickle: 6510.97245984 cPickle: 34218.5455317 marshal: 85562.2443672 xmlrpclib: 9468.0766772 test: a large string: 'a' * 10240 gherkin: 45955.4065455 pickle: 10209.0239868 cPickle: 13773.8138516 marshal: 24937.002069 xmlrpclib: Traceback test: a small string: 'a' * 128 gherkin: 73453.0960495 pickle: 28357.0210654 cPickle: 122997.592425 marshal: 202428.776201 xmlrpclib: Traceback test: a tuple of ints: tuple(range(64)) gherkin: 4522.06801154 pickle: 2273.12937965 cPickle: 23969.9306043 marshal: 143691.72582 xmlrpclib: 2781.3083894 Marshal is very quick for most cases, but still has this warning in the documentation. """Warning: The marshal module is not intended to be secure against erroneous or maliciously constructed data.
Never unmarshal data received from an untrusted or unauthenticated source.""" -Sw From greg.ewing at canterbury.ac.nz Thu Oct 12 01:30:26 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 12 Oct 2006 12:30:26 +1300 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: References: <20061010130901.09B1.JCARLSON@uci.edu> <452CAA0C.6030306@egenix.com> <20061011090701.09CA.JCARLSON@uci.edu> Message-ID: <452D7E92.4050206@canterbury.ac.nz>

Fredrik Lundh wrote:
> if you're aware of a way to do that faster than the current marshal
> implementation, maybe you could work on speeding up marshal instead?

Even if it weren't faster than marshal, it could still be useful to have something nearly as fast that used a python-version-independent protocol. -- Greg From fredrik at pythonware.com Thu Oct 12 07:22:00 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 12 Oct 2006 07:22:00 +0200 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: <452D7E92.4050206@canterbury.ac.nz> References: <20061010130901.09B1.JCARLSON@uci.edu> <452CAA0C.6030306@egenix.com> <20061011090701.09CA.JCARLSON@uci.edu> <452D7E92.4050206@canterbury.ac.nz> Message-ID:

Greg Ewing wrote:
>> if you're aware of a way to do that faster than the current marshal
>> implementation, maybe you could work on speeding up marshal instead?
>
> Even if it weren't faster than marshal, it could still
> be useful to have something nearly as fast that used
> a python-version-independent protocol.

marshal hasn't changed in many years:

$ python1.5
>>> x = 1, 2.0, "three", [4, 5, 6, "seven"], {8: 9}, None
>>> import marshal
>>> marshal.dump(x, open("x.dat", "w"))
>>>

$ python2.5
>>> import marshal
>>> marshal.load(open("x.dat"))
(1, 2.0, 'three', [4, 5, 6, 'seven'], {8: 9}, None)

which is a good thing, because there are external non-Python tools that generate marshalled data streams. maybe you were thinking about marshalled code objects?
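[Editor's illustration: Fredrik's two-interpreter session condenses to a self-contained round-trip check. Modern spelling is assumed here — today marshal.dump requires a binary-mode file object.]

```python
import marshal
import os
import tempfile

x = (1, 2.0, "three", [4, 5, 6, "seven"], {8: 9}, None)

# Dump to a file and reload it, as in the python1.5/python2.5 session above.
# For these basic container types the on-disk format has stayed compatible.
path = os.path.join(tempfile.mkdtemp(), "x.dat")
with open(path, "wb") as f:
    marshal.dump(x, f)
with open(path, "rb") as f:
    assert marshal.load(f) == x
```

The compatibility claim holds for the simple types listed above; marshalled code objects, by contrast, are tied to the bytecode format of a specific interpreter version.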
From dave at boost-consulting.com Thu Oct 12 09:00:16 2006 From: dave at boost-consulting.com (Dave Abrahams) Date: Thu, 12 Oct 2006 07:00:16 +0000 (UTC) Subject: [Python-Dev] Plea to distribute debugging lib References: <20051104202824.GA19678@discworld.dyndns.org> <20051202025557.GA22377@ActiveState.com> Message-ID: Trent Mick ActiveState.com> writes: > > [Thomas Heller wrote] > > Anyway, AFAIK, the activestate distribution contains Python debug dlls. > > [Er, a month late, but I was in flitting around Australia at the time. :)] > > Yes, as a separate download. > > ftp://ftp.activestate.com/ActivePython/etc/ > ActivePython--win32-ix86-debug.zip > > And those should be binary compatible with the equivalent python.org > installs as well. Note that the simple "install.py" script in those > packages bails if the Python installation isn't ActivePython, but I > could easily remove that if you think that would be useful for your > users. The only problem here is that there appears to be a lag in the release of ActivePython after Python itself is released. Is there any chance of putting up just the debugging libraries a little earlier? Thanks again, Dave From anthony at python.org Thu Oct 12 09:33:02 2006 From: anthony at python.org (Anthony Baxter) Date: Thu, 12 Oct 2006 17:33:02 +1000 Subject: [Python-Dev] RELEASED Python 2.4.4, release candidate 1 Message-ID: <200610121733.11507.anthony@python.org> On behalf of the Python development team and the Python community, I'm happy to announce the release of Python 2.4.4 (release candidate 1). Python 2.4.4 is a bug-fix release. While Python 2.5 is the latest version of Python, we're making this release for people who are still running Python 2.4. See the release notes at the website (also available as Misc/NEWS in the source distribution) for details of the more than 80 bugs squished in this release, including a number found by the Coverity and Klocwork static analysis tools. 
We'd like to offer our thanks to both these companies for making this available for open source projects.

* Python 2.4.4 contains a fix for PSF-2006-001, a buffer overrun *
* in repr() of unicode strings in wide unicode (UCS-4) builds. *
* See http://www.python.org/news/security/PSF-2006-001/ for more. *

Assuming no major problems crop up, a final release of Python 2.4.4 will follow in about a week's time. This will be the last planned release in the Python 2.4 series - future maintenance releases will be in the 2.5 line. For more information on Python 2.4.4, including download links for various platforms, release notes, and known issues, please see: http://www.python.org/2.4.4/

Highlights of this new release include:
- Bug fixes. According to the release notes, at least 80 have been fixed.
- A fix for PSF-2006-001, a bug in repr() for unicode strings on UCS-4 (wide unicode) builds.

Highlights of the previous major Python release (2.4) are available from the Python 2.4 page, at http://www.python.org/2.4/highlights.html Enjoy this release, Anthony Anthony Baxter anthony at python.org Python Release Manager (on behalf of the entire python-dev team) -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 191 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061012/c1778e7d/attachment.pgp From anthony at interlink.com.au Thu Oct 12 10:08:46 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Thu, 12 Oct 2006 18:08:46 +1000 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun Message-ID: <200610121808.47010.anthony@interlink.com.au> I've had a couple of queries about whether PSF-2006-001 merits a 2.3.6. Personally, I lean towards "no" - 2.4 was nearly two years ago now. But I'm open to other opinions - I guess people see the phrase "buffer overrun" and they get scared.
Plus once 2.4.4 final is out next week, I'll have cut 12 releases since March. Assuming a 2.5.1 before March (very likely) that'll be 14 releases in 12 months. 16 releases in 12 months would just about make me go crazy. From fredrik at pythonware.com Thu Oct 12 10:18:33 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 12 Oct 2006 10:18:33 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun References: <200610121808.47010.anthony@interlink.com.au> Message-ID: Anthony Baxter wrote: > 16 releases in 12 months would just about make me go crazy. is there any way we could further automate or otherwise streamline or distribute the release process ? ideally, releasing (earlier release + well-defined patch set) should be fairly trivial, compared to releasing (new release from trunk). what do we have to do to make it easier to handle that case? From nick at craig-wood.com Thu Oct 12 13:35:31 2006 From: nick at craig-wood.com (Nick Craig-Wood) Date: Thu, 12 Oct 2006 12:35:31 +0100 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <200610121808.47010.anthony@interlink.com.au> References: <200610121808.47010.anthony@interlink.com.au> Message-ID: <20061012113531.GA32477@craig-wood.com> On Thu, Oct 12, 2006 at 06:08:46PM +1000, Anthony Baxter wrote: > I've had a couple of queries about whether PSF-2006-001 merits a 2.3.6. > Personally, I lean towards "no" - 2.4 was nearly two years ago now. But I'm > open to other opinions - I guess people see the phrase "buffer overrun" and > they get scared. As a data point: python 2.3 is the shipped version of python in current stable Debian release (sarge). It is also vulnerable by default (sys.maxunicode == 1114111). I'm sure the debian maintainers are capable of picking up the patch and sending out a security update themselves, but by releasing a fixed 2.3 you'll send a stronger message to all the distributions hopefully! 
> Plus once 2.4.4 final is out next week, I'll have cut 12 releases > since March. Assuming a 2.5.1 before March (very likely) that'll be > 14 releases in 12 months. 16 releases in 12 months would just about > make me go crazy. I sympathise! I do releases for my current workplace and it is time-consuming and exacting work. -- Nick Craig-Wood -- http://www.craig-wood.com/nick From arigo at tunes.org Thu Oct 12 14:12:49 2006 From: arigo at tunes.org (Armin Rigo) Date: Thu, 12 Oct 2006 14:12:49 +0200 Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS? In-Reply-To: References: <34FE2A7A34BC3544BC3127D023DF3D12128746@EWTEXCH.office.bhtrader.com> <452C6FD8.8070403@v.loewis.de> Message-ID: <20061012121248.GA25659@code0.codespeak.net> Hi Fredrik, On Wed, Oct 11, 2006 at 12:35:23PM +0200, Fredrik Lundh wrote: > NoddyType = PyType_Setup("noddy.Noddy", sizeof(Noddy)); It doesn't address the problem Martin explained (you can put neither NULLs nor stubs in tp_xxx fields that are beyond the C extension module's sizeof(Nobby)). But I imagine it could with a bit more tweaking. A bientot, Armin From fredrik at pythonware.com Thu Oct 12 14:37:25 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 12 Oct 2006 14:37:25 +0200 Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS? References: <34FE2A7A34BC3544BC3127D023DF3D12128746@EWTEXCH.office.bhtrader.com><452C6FD8.8070403@v.loewis.de> <20061012121248.GA25659@code0.codespeak.net> Message-ID: Armin Rigo wrote: >> NoddyType = PyType_Setup("noddy.Noddy", sizeof(Noddy)); > > It doesn't address the problem Martin explained (you can put neither > NULLs nor stubs in tp_xxx fields that are beyond the C extension > module's sizeof(Nobby)). But I imagine it could with a bit more > tweaking. umm. last time I checked, the tp fields lived in the type object, not in the instance.
From nmm1 at cus.cam.ac.uk Thu Oct 12 15:22:30 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Thu, 12 Oct 2006 14:22:30 +0100 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: Your message of "Wed, 11 Oct 2006 10:23:40 +0200." <452CAA0C.6030306@egenix.com> Message-ID: "M.-A. Lemburg" wrote: > > This is hard to believe. I've been in that business for a few > years and so far have not found an OS/hardware/network combination > with the mentioned features. Surely you must have - unless there is another M.-A. Lemburg in IT! Some of the specialist systems, especially those used for communication, were like that, and it is very likely that many still are. But they aren't currently in Python's domain. I have never used any, but have colleagues who have. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From nmm1 at cus.cam.ac.uk Thu Oct 12 15:25:26 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Thu, 12 Oct 2006 14:25:26 +0100 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: Your message of "Wed, 11 Oct 2006 09:46:39 PDT." <20061011084824.09C7.JCARLSON@uci.edu> Message-ID: Josiah Carlson wrote: > > It would be convenient, yes, but the question isn't always 'threads or > processes?' In my experience (not to say that it is more or better than > anyone else's), when going multi-process, the expense on some platforms > is significant enough to want to persist the process (this is counter to > my previous forking statement, but its all relative). And sometimes one > *wants* multiple threads running in a single process handling multiple > requests. Yes, indeed. This is all confused by the way that POSIX (and Microsoft) threads have become essentially just processes with shared resources. If one had a system with real, lightweight threads, the same might well not be so. 
Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From nmm1 at cus.cam.ac.uk Thu Oct 12 16:10:27 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Thu, 12 Oct 2006 15:10:27 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: Your message of "Wed, 13 Sep 2006 05:36:34 +0200." <45077CC2.9070601@v.loewis.de> Message-ID: Sorry. I was on holiday, and then buried this when sorting out my thousands of Emails on my return, partly because I had to look up the information! "Martin v. Löwis" wrote: > > >> | afaik the kernel only sends signals to threads that don't have them blocked. > >> | If python doesn't want anyone but the main thread to get signals, it > >> should just > >> | block signals on all but the main thread and then by nature, all > >> signals will go > >> | to the main thread.... > > > > Well, THAT'S wrong, I am afraid! Things ain't that simple :-(> > > Yes, POSIX implies that things work that way, but there are so many > > get-out clauses and problems with trying to implement that specification > > that such behaviour can't be relied on. > > Can you please give one example for each (one get-out clause, and > one problem with trying to implement that). http://www.opengroup.org/onlinepubs/009695399/toc.htm 2.4.1 Signal Generation and Delivery It is extremely unclear what that means, but it talks about the generation and delivery of signals to both threads and processes. I can tell you (from speaking to system developers) that they understand that to mean that they are allowed to send signals to specific threads when that is appropriate. But they are as confused by POSIX's verbiage as I am! > I fail to see why it isn't desirable to make all signals occur > in the main thread, on systems where this is possible. Oh, THAT's easy.
Consider a threaded application running on a multi-CPU machine and consider hardware generated signals (e.g. SIGFPE, SIGSEGV etc.) Sending them to the master thread involves either moving them between CPUs or moving the master thread; both are inefficient and neither may be possible. [ I have brought systems down with signals that did have to be handled on a particular CPU, by flooding that with signals from dozens of others (yes, big SMPs) and blocking out high-priority interrupts. The efficiency point can be serious. ] That also applies to many of the signals that do not reach programs, such as TLB misses, ECC failure etc. But, in those cases, what does Python or even POSIX need to know about them? Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From nmm1 at cus.cam.ac.uk Thu Oct 12 16:15:47 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Thu, 12 Oct 2006 15:15:47 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions Message-ID: "Martin v. Löwis" wrote: Michael Hudson schrieb: > > >> According to [1], all python needs to do to avoid this problem is > >> block all signals in all but the main thread; > > > > Argh, no: then people who call system() from non-main threads end up > > running subprocesses with all signals masked, which breaks other > > things in very mysterious ways. Been there... > > Python should register a pthread_atfork handler then, which clears > the signal mask. Would that not work? No. It's not the only such problem. Personally, I think that anyone who calls system(), fork(), spawn() or whatever from threads is cuckoo. It is precisely the sort of thing that is asking for trouble, because there are so many ways of doing it 'right' that you can't be sure exactly what mental model the system developers will have.
Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From Audun.Ostrem.Nordal at cern.ch Tue Oct 10 16:28:31 2006 From: Audun.Ostrem.Nordal at cern.ch (Audun Ostrem Nordal) Date: Tue, 10 Oct 2006 16:28:31 +0200 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: Message-ID: You may already know about a similar project a friend of mine (hi, Steffen!) did a few years ago called Python Object Sharing (POSH). This was however unix specific and relied on fork and SYSV IPC iirc. I see he has a SF project page here: http://poshmodule.sourceforge.net/ (doesn't seem to be a lot of activity there, though). Best regards __ Audun Ostrem Nordal tel: +41.22.76.74427 CERN IT/IS 1211 Geneve 23 Switzerland

> -----Original Message-----
> From: python-dev-bounces+audun=cern.ch at python.org
> [mailto:python-dev-bounces+audun=cern.ch at python.org] On
> Behalf Of Richard Oudkerk
> Sent: Monday, October 09, 2006 1:59 PM
> To: python-dev at python.org
> Subject: [Python-Dev] Cloning threading.py using proccesses
>
> I am not sure how sensible the idea is, but I have had a
> first stab at writing a module processing.py which is a near
> clone of threading.py but uses processes and sockets for
> communication. (It is one way of avoiding the GIL.)
>
> I have tested it on unix and windows and it seems to work pretty well.
> (Getting round the lack of os.fork on windows is a bit
> awkward.) There is also another module dummy_processing.py
> which has the same api but is just a wrapper round threading.py.
>
> Queues, Locks, RLocks, Conditions, Semaphores and some other
> shared objects are implemented.
>
> People are welcome to try out the tests in test_processing.py
> contained in the zipfile. More information is included in
> the README file.
>
> As a quick example, the code
>
> . from processing import Process, Queue, ObjectManager
> .
> . def f(token):
> .     q = proxy(token)
> .     for i in range(10):
> .         q.put(i*i)
> .     q.put('STOP')
> .
> . if __name__ == '__main__':
> .     manager = ObjectManager()
> .     token = manager.new(Queue)
> .     queue = proxy(token)
> .
> .     t = Process(target=f, args=[token])
> .     t.start()
> .
> .     result = None
> .     while result != 'STOP':
> .         result = queue.get()
> .         print result
> .
> .     t.join()
>
> is not very different from the normal threaded equivalent
>
> . from threading import Thread
> . from Queue import Queue
> .
> . def f(q):
> .     for i in range(10):
> .         q.put(i*i)
> .     q.put('STOP')
> .
> . if __name__ == '__main__':
> .     queue = Queue()
> .
> .     t = Thread(target=f, args=[queue])
> .     t.start()
> .
> .     result = None
> .     while result != 'STOP':
> .         result = queue.get()
> .         print result
> .
> .     t.join()
>
> Richard
>

From gregwillden at gmail.com Tue Oct 10 21:40:59 2006 From: gregwillden at gmail.com (Greg Willden) Date: Tue, 10 Oct 2006 14:40:59 -0500 Subject: [Python-Dev] ConfigParser: whitespace leading comment lines Message-ID: <903323ff0610101240p2f4e0a18g18d34d1a800624ec@mail.gmail.com> Hello all, I'd like to propose the following change to ConfigParser.py. I won't call it a bug-fix because I don't know the relevant standards. This change will enable multiline comments as follows:

[section]
item=value
    ;first of multiline comment
    ;second of multiline comment

Right now the behaviour is

In [19]: cfg.get('section','item')
Out[19]: 'value\n;second of multiline comment'

It's a one-line change. RawConfigParser._read lines 434-437

      # comment or blank line?
-     if line.strip() == '' or line[0] in '#;':
+     if line.strip() == '' or line.strip()[0] in '#;':
          continue

Regards, Greg Willden (Not a member of python-dev) -- Linux. Because rebooting is for adding hardware. -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20061010/76a9e204/attachment-0001.html From kristjan at ccpgames.com Wed Oct 11 13:29:00 2006 From: kristjan at ccpgames.com (Kristján V. Jónsson) Date: Wed, 11 Oct 2006 11:29:00 -0000 Subject: [Python-Dev] Python 2.5 performance Message-ID: <129CEF95A523704B9D46959C922A280002FE99F3@nemesis.central.ccp.cc> Hello there. I just got round to do some comparative runs of 2.5 32 bit Release, built with visual studio 2003 and 2005. Here the figures (pybench with default arguments)

.NET 2003:

Test minimum average operation overhead
-------------------------------------------------------------------------------
BuiltinFunctionCalls: 262ms 304ms 0.60us 0.251ms
BuiltinMethodLookup: 232ms 267ms 0.25us 0.312ms
CompareFloats: 148ms 170ms 0.14us 0.377ms
CompareFloatsIntegers: 183ms 216ms 0.24us 0.261ms
CompareIntegers: 144ms 163ms 0.09us 0.527ms
CompareInternedStrings: 157ms 186ms 0.12us 1.606ms
CompareLongs: 153ms 174ms 0.17us 0.300ms
CompareStrings: 156ms 198ms 0.20us 1.166ms
CompareUnicode: 180ms 205ms 0.27us 0.731ms
ConcatStrings: 410ms 457ms 0.91us 0.579ms
ConcatUnicode: 473ms 610ms 2.03us 0.466ms
CreateInstances: 248ms 290ms 2.59us 0.432ms
CreateNewInstances: 206ms 243ms 2.89us 0.352ms
CreateStringsWithConcat: 164ms 200ms 0.20us 0.971ms
CreateUnicodeWithConcat: 268ms 295ms 0.74us 0.343ms
DictCreation: 152ms 186ms 0.47us 0.358ms
DictWithFloatKeys: 378ms 410ms 0.46us 0.660ms
DictWithIntegerKeys: 133ms 161ms 0.13us 0.907ms
DictWithStringKeys: 152ms 184ms 0.15us 0.927ms
ForLoops: 125ms 133ms 5.32us 0.069ms
IfThenElse: 109ms 131ms 0.10us 1.019ms
ListSlicing: 193ms 223ms 15.90us 0.072ms
NestedForLoops: 147ms 164ms 0.11us 0.021ms
NormalClassAttribute: 176ms 195ms 0.16us 0.579ms
NormalInstanceAttribute: 171ms 198ms 0.17us 0.598ms
PythonFunctionCalls: 207ms 240ms 0.73us 0.326ms
PythonMethodCalls: 234ms 287ms 1.27us 0.163ms
Recursion: 294ms 328ms 6.56us 0.563ms
SecondImport: 191ms 210ms 2.10us 0.241ms
SecondPackageImport: 197ms 220ms 2.20us 0.217ms
SecondSubmoduleImport: 257ms 276ms 2.76us 0.213ms
SimpleComplexArithmetic: 191ms 208ms 0.24us 0.445ms
SimpleDictManipulation: 158ms 178ms 0.15us 0.625ms
SimpleFloatArithmetic: 183ms 211ms 0.16us 0.703ms
SimpleIntFloatArithmetic: 122ms 133ms 0.10us 0.745ms
SimpleIntegerArithmetic: 106ms 121ms 0.09us 0.680ms
SimpleListManipulation: 132ms 149ms 0.13us 0.750ms
SimpleLongArithmetic: 170ms 198ms 0.30us 0.322ms
SmallLists: 246ms 274ms 0.40us 0.437ms
SmallTuples: 204ms 235ms 0.43us 0.497ms
SpecialClassAttribute: 177ms 201ms 0.17us 0.561ms
SpecialInstanceAttribute: 257ms 290ms 0.24us 0.598ms
StringMappings: 881ms 949ms 3.77us 0.584ms
StringPredicates: 321ms 366ms 0.52us 3.207ms
StringSlicing: 243ms 286ms 0.51us 1.032ms
TryExcept: 87ms 110ms 0.05us 0.957ms
TryRaiseExcept: 164ms 197ms 3.08us 0.434ms
TupleSlicing: 195ms 230ms 0.88us 0.065ms
UnicodeMappings: 158ms 187ms 5.20us 0.699ms
UnicodePredicates: 191ms 233ms 0.43us 3.954ms
UnicodeProperties: 209ms 251ms 0.63us 3.234ms
UnicodeSlicing: 306ms 345ms 0.70us 0.933ms
-------------------------------------------------------------------------------
Totals: 11202ms 12875ms

.NET 2005:

Test minimum average operation overhead
-------------------------------------------------------------------------------
BuiltinFunctionCalls: 254ms 279ms 0.55us 0.280ms
BuiltinMethodLookup: 269ms 290ms 0.28us 0.327ms
CompareFloats: 136ms 147ms 0.12us 0.375ms
CompareFloatsIntegers: 158ms 178ms 0.20us 0.268ms
CompareIntegers: 118ms 141ms 0.08us 0.603ms
CompareInternedStrings: 152ms 203ms 0.14us 1.666ms
CompareLongs: 152ms 171ms 0.16us 0.335ms
CompareStrings: 118ms 140ms 0.14us 1.374ms
CompareUnicode: 160ms 180ms 0.24us 0.730ms
ConcatStrings: 430ms 472ms 0.94us 0.681ms
ConcatUnicode: 488ms 535ms 1.78us 0.458ms
CreateInstances: 249ms 286ms 2.56us 0.437ms
CreateNewInstances: 220ms 254ms 3.02us 0.356ms
CreateStringsWithConcat: 174ms 204ms 0.20us 1.123ms
CreateUnicodeWithConcat: 271ms 294ms 0.74us 0.348ms
DictCreation: 151ms 169ms 0.42us 0.365ms
DictWithFloatKeys: 350ms 387ms 0.43us 0.666ms
DictWithIntegerKeys: 140ms 151ms 0.13us 1.020ms
DictWithStringKeys: 154ms 176ms 0.15us 1.070ms
ForLoops: 96ms 111ms 4.42us 0.069ms
IfThenElse: 115ms 130ms 0.10us 0.697ms
ListSlicing: 221ms 261ms 18.66us 0.093ms
NestedForLoops: 146ms 167ms 0.11us 0.022ms
NormalClassAttribute: 182ms 205ms 0.17us 0.502ms
NormalInstanceAttribute: 174ms 192ms 0.16us 0.457ms
PythonFunctionCalls: 203ms 221ms 0.67us 0.337ms
PythonMethodCalls: 266ms 309ms 1.37us 0.149ms
Recursion: 286ms 329ms 6.57us 0.459ms
SecondImport: 170ms 197ms 1.97us 0.181ms
SecondPackageImport: 187ms 215ms 2.15us 0.178ms
SecondSubmoduleImport: 243ms 275ms 2.75us 0.215ms
SimpleComplexArithmetic: 177ms 199ms 0.23us 0.370ms
SimpleDictManipulation: 159ms 185ms 0.15us 0.498ms
SimpleFloatArithmetic: 177ms 196ms 0.15us 1.502ms
SimpleIntFloatArithmetic: 109ms 126ms 0.10us 0.574ms
SimpleIntegerArithmetic: 108ms 124ms 0.09us 0.611ms
SimpleListManipulation: 145ms 169ms 0.14us 0.619ms
SimpleLongArithmetic: 167ms 190ms 0.29us 0.324ms
SmallLists: 247ms 274ms 0.40us 0.339ms
SmallTuples: 204ms 224ms 0.42us 0.429ms
SpecialClassAttribute: 193ms 216ms 0.18us 0.558ms
SpecialInstanceAttribute: 255ms 280ms 0.23us 0.470ms
StringMappings: 297ms 321ms 1.28us 0.474ms
StringPredicates: 229ms 274ms 0.39us 3.892ms
StringSlicing: 238ms 258ms 0.46us 0.962ms
TryExcept: 86ms 102ms 0.05us 0.755ms
TryRaiseExcept: 155ms 173ms 2.70us 0.357ms
TupleSlicing: 188ms 217ms 0.83us 0.050ms
UnicodeMappings: 103ms 118ms 3.29us 0.595ms
UnicodePredicates: 176ms 207ms 0.38us 3.950ms
UnicodeProperties: 187ms 212ms 0.53us 3.228ms
UnicodeSlicing: 312ms 342ms 0.70us 0.834ms
-------------------------------------------------------------------------------
Totals: 10343ms 11677ms

This is an improvement of more than 7%.
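[Editor's note: the claimed percentages can be sanity-checked against the Totals lines (minimum column) of the runs quoted in this message.]

```python
# "Totals" minimum-column figures from the three pybench runs above.
vc2003, vc2005, vc2005_pgo = 11202, 10343, 9974

gain_2005 = (1 - vc2005 / vc2003) * 100       # ~7.7%: "more than 7%"
gain_pgo = (1 - vc2005_pgo / vc2005) * 100    # ~3.6%: "another 3.5 %"
gain_total = (1 - vc2005_pgo / vc2003) * 100  # ~11.0%: "more than 10%"
```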
In addition, here is a run of the PGO optimized .NET 2005:

Test minimum average operation overhead
-------------------------------------------------------------------------------
BuiltinFunctionCalls: 232ms 250ms 0.49us 0.330ms
BuiltinMethodLookup: 276ms 296ms 0.28us 0.382ms
CompareFloats: 130ms 142ms 0.12us 0.451ms
CompareFloatsIntegers: 150ms 166ms 0.18us 0.326ms
CompareIntegers: 130ms 155ms 0.09us 0.729ms
CompareInternedStrings: 152ms 197ms 0.13us 1.947ms
CompareLongs: 136ms 146ms 0.14us 0.390ms
CompareStrings: 151ms 174ms 0.17us 1.583ms
CompareUnicode: 131ms 167ms 0.22us 0.965ms
ConcatStrings: 417ms 485ms 0.97us 0.681ms
ConcatUnicode: 483ms 551ms 1.84us 0.484ms
CreateInstances: 224ms 252ms 2.25us 0.600ms
CreateNewInstances: 186ms 216ms 2.58us 0.407ms
CreateStringsWithConcat: 155ms 175ms 0.18us 1.264ms
CreateUnicodeWithConcat: 275ms 306ms 0.76us 0.437ms
DictCreation: 160ms 186ms 0.47us 0.443ms
DictWithFloatKeys: 349ms 375ms 0.42us 0.924ms
DictWithIntegerKeys: 143ms 173ms 0.14us 1.296ms
DictWithStringKeys: 157ms 177ms 0.15us 1.184ms
ForLoops: 140ms 155ms 6.21us 0.074ms
IfThenElse: 107ms 127ms 0.09us 0.955ms
ListSlicing: 217ms 256ms 18.29us 0.103ms
NestedForLoops: 166ms 194ms 0.13us 0.018ms
NormalClassAttribute: 163ms 179ms 0.15us 0.564ms
NormalInstanceAttribute: 151ms 169ms 0.14us 0.536ms
PythonFunctionCalls: 210ms 235ms 0.71us 0.313ms
PythonMethodCalls: 237ms 260ms 1.15us 0.167ms
Recursion: 285ms 334ms 6.68us 0.538ms
SecondImport: 147ms 169ms 1.69us 0.243ms
SecondPackageImport: 155ms 200ms 2.00us 0.215ms
SecondSubmoduleImport: 202ms 234ms 2.34us 0.203ms
SimpleComplexArithmetic: 162ms 187ms 0.21us 0.446ms
SimpleDictManipulation: 162ms 181ms 0.15us 0.627ms
SimpleFloatArithmetic: 171ms 201ms 0.15us 1.335ms
SimpleIntFloatArithmetic: 119ms 137ms 0.10us 0.659ms
SimpleIntegerArithmetic: 114ms 128ms 0.10us 0.668ms
SimpleListManipulation: 145ms 161ms 0.14us 0.764ms
SimpleLongArithmetic: 161ms 178ms 0.27us 0.423ms
SmallLists: 234ms 271ms 0.40us 0.454ms
SmallTuples: 182ms 203ms 0.38us 0.497ms
SpecialClassAttribute: 174ms 201ms 0.17us 0.716ms
SpecialInstanceAttribute: 230ms 252ms 0.21us 0.558ms
StringMappings: 285ms 313ms 1.24us 0.514ms
StringPredicates: 233ms 275ms 0.39us 3.475ms
StringSlicing: 225ms 242ms 0.43us 1.037ms
TryExcept: 78ms 89ms 0.04us 0.961ms
TryRaiseExcept: 133ms 156ms 2.44us 0.454ms
TupleSlicing: 186ms 202ms 0.77us 0.078ms
UnicodeMappings: 103ms 118ms 3.29us 0.520ms
UnicodePredicates: 186ms 216ms 0.40us 3.414ms
UnicodeProperties: 180ms 214ms 0.54us 2.530ms
UnicodeSlicing: 299ms 318ms 0.65us 0.815ms
-------------------------------------------------------------------------------
Totals: 9974ms 11345ms

This is an improvement of another 3.5 %. In all, we have a performance increase of more than 10%. Granted, this is from a single set of runs, but I think we should start considering to make PCBuild8 a "supported" build. Cheers, Kristján -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061011/49a0c38e/attachment-0001.htm From r.m.oudkerk at googlemail.com Wed Oct 11 17:14:27 2006 From: r.m.oudkerk at googlemail.com (Richard Oudkerk) Date: Wed, 11 Oct 2006 16:14:27 +0100 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: References: <20061010084306.09AE.JCARLSON@uci.edu> Message-ID:

On 10/10/06, Fredrik Lundh wrote:
> Josiah Carlson wrote:
>
> > Presumably with this library you have created, you have also written a
> > fast object encoder/decoder (like marshal or pickle). If it isn't any
> > faster than cPickle or marshal, then users may bypass the module and opt
> > for fork/etc. + XML-RPC
>
> XML-RPC isn't close to marshal and cPickle in performance, though, so
> that statement is a bit misleading.
>
> the really interesting thing here is a ready-made threading-style API, I
> think. reimplementing queues, locks, and semaphores can be a reasonable
> amount of work; might as well use an existing implementation.
> The module uses cPickle. As for speed, on my old laptop I get maybe 1300 objects through a queue a second. For many purposes this might be too slow, in which cases you are better off sticking to threading; for many other cases that should not be a problem. It should be quite possible to connect to an ObjectServer on a different machine, though I have not tried it. Although I reuse Queue, I wrote locks, semaphores and conditions from scratch -- I could not see a sensible way to use the original implementations. (The implementations of those classes are actually quite a bit shorter than the ones in threading.py.) By the way, on windows the example files currently need to be executed from commandline rather than clicked on (but that is easily fixable). From r.m.oudkerk at googlemail.com Wed Oct 11 17:20:44 2006 From: r.m.oudkerk at googlemail.com (Richard Oudkerk) Date: Wed, 11 Oct 2006 16:20:44 +0100 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: <20061010130901.09B1.JCARLSON@uci.edu> References: <20061010084306.09AE.JCARLSON@uci.edu> <20061010130901.09B1.JCARLSON@uci.edu> Message-ID: On 10/10/06, Josiah Carlson wrote: > > the really interesting thing here is a ready-made threading-style API, I > > think. reimplementing queues, locks, and semaphores can be a reasonable > > amount of work; might as well use an existing implementation. > > Really, it is a matter of asking what kind of API is desirable. Do we > want to have threading plus other stuff be the style of API that we want > to replicate? Do we want to have shared queue objects, or would an > XML-RPC-esque remote.queue_put('queue_X', value) and > remote.queue_get('queue_X', blocking=1) be better? Whatever the API is, I think it is useful if you can swap between threads and processes just by changing the import line. That way you can write applications without deciding upfront which to use.
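[Editor's illustration: the "swap by changing the import line" idea can be made concrete by writing the worker against the common threading-style API and injecting the concrete classes. This sketch uses the standard library's threading.Thread and queue.Queue (modern names); Richard's Process and proxied Queue would slot into the same place.]

```python
import queue
import threading


def run_with(task_cls, queue_cls):
    # Code written against the shared threading-style API; which concrete
    # classes get passed in decides whether threads or processes are used.
    q = queue_cls()

    def f():
        for i in range(10):
            q.put(i * i)
        q.put('STOP')

    t = task_cls(target=f)
    t.start()
    results = []
    while True:
        item = q.get()
        if item == 'STOP':
            break
        results.append(item)
    t.join()
    return results


print(run_with(threading.Thread, queue.Queue))  # squares 0, 1, 4, ..., 81
```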
From python at rcn.com Thu Oct 12 17:12:43 2006 From: python at rcn.com (Raymond Hettinger) Date: Thu, 12 Oct 2006 08:12:43 -0700 Subject: [Python-Dev] Python 2.5 performance References: <129CEF95A523704B9D46959C922A280002FE99F3@nemesis.central.ccp.cc> Message-ID: <00df01c6ee10$ded9c920$ea146b0a@RaymondLaptop1> > From: Kristján V. Jónsson > I think we should start considering to make PCBuild8 a "supported" build. +1 and not just for the free speed-up. VC8 is what more and more Windows developers will have on their machines. Without a supported build, it becomes much harder to make patches or build compatible extensions. Raymond From snaury at gmail.com Thu Oct 12 17:15:07 2006 From: snaury at gmail.com (Alexey Borzenkov) Date: Thu, 12 Oct 2006 19:15:07 +0400 Subject: [Python-Dev] Why spawnvp not implemented on Windows? Message-ID: Hi all, I've been looking at python 2.5 today and what I noticed is the absence of spawnvp with this comment in os.py:

# At the moment, Windows doesn't implement spawnvp[e],
# so it won't have spawnlp[e] either.

I'm wondering, why so? Searching MSDN I can see that these functions are implemented in CRT: spawnvp: http://msdn2.microsoft.com/en-us/library/275khfab.aspx spawnvpe: http://msdn2.microsoft.com/en-us/library/h565xwht.aspx I can also see that spawnvp and spawnvpe are currently wrapped in posixmodule.c, but for some reason on OS/2 only. Forgive me if I'm wrong but shouldn't it work when #if defined(PYOS_OS2) is changed to #if defined(PYOS_OS2) || defined(MS_WINDOWS) around spawnvp and spawnvpe wrappers and in posix_methods? At least when I did it with my copy, nt.spawnvp seems to work fine...
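[Editor's illustration: pending such a change in posixmodule.c, the PATH-searching behaviour of the CRT's spawnvp can be emulated in pure Python on top of os.spawnv. spawnvp_emulated is a hypothetical helper for illustration, not a stdlib function.]

```python
import os


def spawnvp_emulated(mode, file, args):
    # Emulate spawnvp: if 'file' has a directory component, spawn it
    # directly; otherwise search each PATH entry for an executable match.
    if os.path.dirname(file):
        return os.spawnv(mode, file, args)
    for d in os.environ.get("PATH", "").split(os.pathsep):
        candidate = os.path.join(d, file)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return os.spawnv(mode, candidate, args)
    raise OSError("%s not found on PATH" % file)
```

With os.P_WAIT the call blocks and returns the child's exit status, mirroring the CRT semantics Alexey points at.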
From barry at python.org Thu Oct 12 17:36:37 2006 From: barry at python.org (Barry Warsaw) Date: Thu, 12 Oct 2006 11:36:37 -0400 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <200610121808.47010.anthony@interlink.com.au> References: <200610121808.47010.anthony@interlink.com.au> Message-ID: <2514DA1C-F5A1-4144-9068-006A933C516C@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Oct 12, 2006, at 4:08 AM, Anthony Baxter wrote: > I've had a couple of queries about whether PSF-2006-001 merits a > 2.3.6. > Personally, I lean towards "no" - 2.4 was nearly two years ago now. > But I'm > open to other opinions - I guess people see the phrase "buffer > overrun" and > they get scared. > > Plus once 2.4.4 final is out next week, I'll have cut 12 releases > since > March. Assuming a 2.5.1 before March (very likely) that'll be 14 > releases > in 12 months. 16 releases in 12 months would just about make me go > crazy. I've offered in the past to dust off my release manager cap and do a 2.3.6 release. Having not done one in a long while, the most daunting part for me is getting the website updated, since I have none of those tools installed. I'm still willing to do a 2.3.6, though the last time this came up the response was too underwhelming to care. I'm not sure this advisory is enough to change people's minds about that -- I'm sure any affected downstream distro is fully capable of patching and re-releasing their own packages. Since this doesn't affect the binaries /we/ release, I'm not sure I care enough either. 
- -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRS5hD3EjvBPtnXfVAQIlLgP/Rz5ahaeus0VLJT0HmyZUYBf07Crr2e1K KgCoEDqXZq+LyF7B8bqokXZ4uFisBbQTREM3d+8vYEHC9kcQpt0FurkSFc47G0gj rJvm0XbGkhXFGdPqrTwUoT033f/bhabpEILDkNJx6bB+Jk5G23EyTKRRDB531QvY qC6ttgGRfVA= =dECg -----END PGP SIGNATURE----- From tjreedy at udel.edu Thu Oct 12 19:34:09 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 12 Oct 2006 13:34:09 -0400 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun References: <200610121808.47010.anthony@interlink.com.au> <2514DA1C-F5A1-4144-9068-006A933C516C@python.org> Message-ID: "Barry Warsaw" wrote in message news:2514DA1C-F5A1-4144-9068-006A933C516C at python.org... > -----BEGIN PGP SIGNED MESSAGE----- > I've offered in the past to dust off my release manager cap and do a > 2.3.6 release. Having not done one in a long while, the most > daunting part for me is getting the website updated, since I have > none of those tools installed. > > I'm still willing to do a 2.3.6, though the last time this came up > the response was too underwhelming to care. I'm not sure this > advisory is enough to change people's minds about that -- I'm sure > any affected downstream distro is fully capable of patching and re- > releasing their own packages. Since this doesn't affect the > binaries /we/ release, I'm not sure I care enough either. Perhaps all that is needed from both a practical and public relations viewpoint is the release of a 2.3.5U4 security patch as a separate file listed just after 2.3.5 on the source downloads page (if this has not been done already). Add a note (or link to a note) to the effect that it should be applied if one has or is going to compile a wide Unicode build for use in an environment exposed to untrusted Unicode text. 
tjr From barry at python.org Thu Oct 12 19:55:17 2006 From: barry at python.org (Barry Warsaw) Date: Thu, 12 Oct 2006 13:55:17 -0400 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <2514DA1C-F5A1-4144-9068-006A933C516C@python.org> Message-ID: <00467286-2218-460F-9B46-54A59F9CC312@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Oct 12, 2006, at 1:34 PM, Terry Reedy wrote: > Perhaps all that is needed from both a practical and public relations > viewpoint is the release of a 2.3.5U4 security patch as a separate > file > listed just after 2.3.5 on the source downloads page (if this has > not been > done already). I don't currently have the ability to update the website, but I think the download page should have a big red star that points to the security patch. The 2.3.5 page should probably be updated with a link to the patch too. - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRS6BjnEjvBPtnXfVAQKssAQAnrgoMFbuAQRFSAReCYLBovsXJK481NdB gTk/gaAAXe15Ko+HN0gr1EF7Mpd9a8h+5UaWyiQo+2dEJFPYr8LKcLhVLRO75jwK A7oeXl859cUjwVK1Lc6uR/gFXUIhCsd8kujKb3lE71K6ygVtcqHwxr4OcMlMe/+j YExPu6zELjk= =NcuJ -----END PGP SIGNATURE----- From snaury at gmail.com Thu Oct 12 20:32:23 2006 From: snaury at gmail.com (Alexey Borzenkov) Date: Thu, 12 Oct 2006 22:32:23 +0400 Subject: [Python-Dev] Why spawnvp not implemented on Windows? In-Reply-To: References: Message-ID: On 10/12/06, Alexey Borzenkov wrote: > At least when I did it with my copy, nt.spawnvp seems to work fine... Hi everyone again. I've created patch for spawn*p*, as well as for exec*p* against trunk, so that when possible it uses crt's execvp[e] (defined via HAVE_EXECVP, if there are other platforms that have it they will need to define HAVE_EXECVP and HAVE_SPAWNVP). 
Fix is in os.py and posixmodule.c: http://snaury.googlepages.com/python-win32-spawn_p_.patch Should I submit it to sourceforge as a patch, or someone can review it as is? From aahz at pythoncraft.com Thu Oct 12 20:39:41 2006 From: aahz at pythoncraft.com (Aahz) Date: Thu, 12 Oct 2006 11:39:41 -0700 Subject: [Python-Dev] Why spawnvp not implemented on Windows? In-Reply-To: References: Message-ID: <20061012183940.GA13499@panix.com> On Thu, Oct 12, 2006, Alexey Borzenkov wrote: > > Should I submit it to sourceforge as a patch, or someone can review it as is? Always submit patches; that guarantees your work won't get lost. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "If you don't know what your program is supposed to do, you'd better not start writing it." --Dijkstra From anthony at interlink.com.au Thu Oct 12 21:27:59 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri, 13 Oct 2006 05:27:59 +1000 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> Message-ID: <200610130528.01672.anthony@interlink.com.au> On Thursday 12 October 2006 18:18, Fredrik Lundh wrote: > Anthony Baxter wrote: > > 16 releases in 12 months would just about make me go crazy. > > is there any way we could further automate or otherwise streamline or > distribute the release process ? It's already pretty heavily automated (see welease.py in the SVN sandbox). The killer problem is pyramid (the system for the website). Here's (roughly) a breakdown of the workload: - Update the 10 or so files that need the date and version number (about 3m) - Run welease.py, select the branch, enter the version number, press 4 buttons, one after the other. It complains and stops if something goes wrong. 
(elapsed time about 5-10m, actual "work" time < 30s) - Wait for the Mac/Win/Doc builders (elapsed, 6-12h, depending on timezones, actual "work" time 0s) - Sign binaries and put in place on website (maybe 2m work, plus 5-10m to scp up to dinsdale) - Update webpages (between 30m and an hour, depending on how much I have to fight with pyramid. I still need to go update the old release pages putting the warnings on them, so there's probably another hour of work today) I've mentioned this on pydotorg enough times, I don't feel I can continue to complain about it (because I can't offer the time to make it better) but pyramid is *not* *good* from my point of view. The older system with Makefiles, ht2html and rsync took maybe 1/4 to 1/3 as long. > ideally, releasing (earlier release + well-defined patch set) should be > fairly trivial, compared to releasing (new release from trunk). what do > we have to do to make it easier to handle that case? Mostly it is easy for me, with the one huge caveat. As far as I know, the Mac build is a single command to run for Ronald, and the Doc build similarly for Fred. I don't know what Martin has to do for the Windows build. -- Anthony Baxter It's never too late to have a happy childhood. From barry at python.org Thu Oct 12 22:25:55 2006 From: barry at python.org (Barry Warsaw) Date: Thu, 12 Oct 2006 16:25:55 -0400 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <200610130528.01672.anthony@interlink.com.au> References: <200610121808.47010.anthony@interlink.com.au> <200610130528.01672.anthony@interlink.com.au> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Oct 12, 2006, at 3:27 PM, Anthony Baxter wrote: > Mostly it is easy for me, with the one huge caveat. As far as I > know, the Mac > build is a single command to run for Ronald, and the Doc build > similarly for > Fred. I don't know what Martin has to do for the Windows build. Why can't we get buildbot to do most or all of this? 
At work, we have buildbot slaves that post installers to a share after successful checkout, build, and test on all our supported platforms. - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRS6k2HEjvBPtnXfVAQJeawP7BTqVw7tN80h5lB5UZp4MuN2Q/3KWapIi lYZeBqoaiouXKIkKsHbCVb1/OeZQRnDwEqWPu0xKfzlteYUchmDh2h53nzfynyyS PdJ5FaKcAk0LBjR0JsSZKd6TEWxKZZHs04V2LiKZpmsICG8g7uH954wleyGLTl2h 7VZ1aVxGuko= =1Ito -----END PGP SIGNATURE----- From rasky at develer.com Thu Oct 12 22:29:46 2006 From: rasky at develer.com (Giovanni Bajo) Date: Thu, 12 Oct 2006 22:29:46 +0200 Subject: [Python-Dev] Python 2.5 performance References: <129CEF95A523704B9D46959C922A280002FE99F3@nemesis.central.ccp.cc> Message-ID: <11ac01c6ee3d$2830a450$e303030a@trilan> Kristj?n V. J?nsson wrote: > This is an improvement of another 3.5 %. > In all, we have a performance increase of more than 10%. > Granted, this is from a single set of runs, but I think we should > start considering to make PCBuild8 a "supported" build. Kristj?n, I wonder if the performance improvement comes from ceval.c only (or maybe a few other selected files). Is it possible to somehow link the PGO-optimized ceval.obj into the VS2003 project? -- Giovanni Bajo From ronaldoussoren at mac.com Thu Oct 12 22:38:28 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Thu, 12 Oct 2006 22:38:28 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <200610130528.01672.anthony@interlink.com.au> Message-ID: <0EFD836B-CB42-4F2D-8B82-883758487D87@mac.com> On Oct 12, 2006, at 10:25 PM, Barry Warsaw wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On Oct 12, 2006, at 3:27 PM, Anthony Baxter wrote: > >> Mostly it is easy for me, with the one huge caveat. As far as I >> know, the Mac >> build is a single command to run for Ronald, and the Doc build >> similarly for >> Fred. I don't know what Martin has to do for the Windows build. 
> > Why can't we get buildbot to do most or all of this? At work, we > have buildbot slaves that post installers to a share after successful > checkout, build, and test on all our supported platforms. The windows build is a single command, but I test the output on 3 different platforms (10.3/ppc, 10.4/ppc and 10.4/x86). If buildbot supports such a configuration I'd be very interested (and not just for Python itself). Ronald -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3562 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061012/85d9353c/attachment.bin From anthony at interlink.com.au Thu Oct 12 22:43:40 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri, 13 Oct 2006 06:43:40 +1000 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <200610130528.01672.anthony@interlink.com.au> Message-ID: <200610130643.46552.anthony@interlink.com.au> On Friday 13 October 2006 06:25, Barry Warsaw wrote: > On Oct 12, 2006, at 3:27 PM, Anthony Baxter wrote: > > Mostly it is easy for me, with the one huge caveat. As far as I > > know, the Mac > > build is a single command to run for Ronald, and the Doc build > > similarly for > > Fred. I don't know what Martin has to do for the Windows build. > > Why can't we get buildbot to do most or all of this? At work, we > have buildbot slaves that post installers to a share after successful > checkout, build, and test on all our supported platforms. Speaking for myself, I'd rather do it by hand, if it's not a lot of work (which it isn't) - I don't like the idea of "official" releases just being an automated thing. If you're instead just talking about daily builds, maybe, but we'd need to have some new way to do versioning for these. -- Anthony Baxter It's never too late to have a happy childhood. 
From martin at v.loewis.de Thu Oct 12 22:49:32 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 12 Oct 2006 22:49:32 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <200610130528.01672.anthony@interlink.com.au> Message-ID: <452EAA5C.9090806@v.loewis.de> Barry Warsaw schrieb: > Why can't we get buildbot to do most or all of this? Very easy. Because somebody has to set it up. I estimate a man month or so before it works. Regards, Martin From martin at v.loewis.de Thu Oct 12 22:50:54 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 12 Oct 2006 22:50:54 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <200610130528.01672.anthony@interlink.com.au> References: <200610121808.47010.anthony@interlink.com.au> <200610130528.01672.anthony@interlink.com.au> Message-ID: <452EAAAE.2050200@v.loewis.de> Anthony Baxter schrieb: > Mostly it is easy for me, with the one huge caveat. As far as I know, the Mac > build is a single command to run for Ronald, and the Doc build similarly for > Fred. I don't know what Martin has to do for the Windows build. Actually, for 2.3.x, I wouldn't do the Windows builds. I think Thomas Heller did the 2.3.x series. Regards, Martin From g.brandl at gmx.net Thu Oct 12 21:30:49 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 12 Oct 2006 21:30:49 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <2514DA1C-F5A1-4144-9068-006A933C516C@python.org> References: <200610121808.47010.anthony@interlink.com.au> <2514DA1C-F5A1-4144-9068-006A933C516C@python.org> Message-ID: Barry Warsaw wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On Oct 12, 2006, at 4:08 AM, Anthony Baxter wrote: > >> I've had a couple of queries about whether PSF-2006-001 merits a >> 2.3.6. 
>> Personally, I lean towards "no" - 2.4 was nearly two years ago now. >> But I'm >> open to other opinions - I guess people see the phrase "buffer >> overrun" and >> they get scared. >> >> Plus once 2.4.4 final is out next week, I'll have cut 12 releases >> since >> March. Assuming a 2.5.1 before March (very likely) that'll be 14 >> releases >> in 12 months. 16 releases in 12 months would just about make me go >> crazy. > > I've offered in the past to dust off my release manager cap and do a > 2.3.6 release. Having not done one in a long while, the most > daunting part for me is getting the website updated, since I have > none of those tools installed. I'm I the only one who feels that the website is a big workflow problem? Georg From martin at v.loewis.de Thu Oct 12 22:57:38 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 12 Oct 2006 22:57:38 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> Message-ID: <452EAC42.5040703@v.loewis.de> Fredrik Lundh schrieb: > ideally, releasing (earlier release + well-defined patch set) should be > fairly trivial, compared to releasing (new release from trunk). what do > we have to do to make it easier to handle that case? For the Windows release, I doubt there is much one can do. The time-consuming part is to run the MSI file, on three different architectures, and in various combinations (admin/no-admin, default directory/Program Files, upgrade/no-upgrade). I don't always do all of them, but still it takes a while; I usually need an hour to make a release. Plus, sometimes something goes wrong: there might a backport that doesn't work on Windows, or it might be that I broke my build environment somehow (which I normally keep across releases - if I have to start from scratch on a fresh machine, it takes much longer: a day or so). 
Regards, Martin From martin at v.loewis.de Thu Oct 12 23:00:09 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 12 Oct 2006 23:00:09 +0200 Subject: [Python-Dev] Python 2.5 performance In-Reply-To: <129CEF95A523704B9D46959C922A280002FE99F3@nemesis.central.ccp.cc> References: <129CEF95A523704B9D46959C922A280002FE99F3@nemesis.central.ccp.cc> Message-ID: <452EACD9.9090001@v.loewis.de> Kristj?n V. J?nsson schrieb: > This is an improvement of another 3.5 %. > In all, we have a performance increase of more than 10%. > Granted, this is from a single set of runs, but I think we should start > considering to make PCBuild8 a "supported" build. What do you mean by that? That Python 2.5.1 should be compiled with VC 2005? Something else (if so, what)? Regards, Martin From greg at electricrain.com Thu Oct 12 23:03:10 2006 From: greg at electricrain.com (Gregory P. Smith) Date: Thu, 12 Oct 2006 14:03:10 -0700 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <200610130643.46552.anthony@interlink.com.au> References: <200610121808.47010.anthony@interlink.com.au> <200610130528.01672.anthony@interlink.com.au> <200610130643.46552.anthony@interlink.com.au> Message-ID: <20061012210310.GC6015@zot.electricrain.com> On Fri, Oct 13, 2006 at 06:43:40AM +1000, Anthony Baxter wrote: > On Friday 13 October 2006 06:25, Barry Warsaw wrote: > > On Oct 12, 2006, at 3:27 PM, Anthony Baxter wrote: > > > Mostly it is easy for me, with the one huge caveat. As far as I > > > know, the Mac > > > build is a single command to run for Ronald, and the Doc build > > > similarly for > > > Fred. I don't know what Martin has to do for the Windows build. > > > > Why can't we get buildbot to do most or all of this? At work, we > > have buildbot slaves that post installers to a share after successful > > checkout, build, and test on all our supported platforms. 
> > Speaking for myself, I'd rather do it by hand, if it's not a lot of work > (which it isn't) - I don't like the idea of "official" releases just being > an automated thing. IMHO thats a backwards view; I'm with Barry. Requiring human intervention to do anything other than press the big green "go" button to launch the "official" release build process is an opportunity for human error. the same goes for testing the built release installers and tarballs. three macs with some virtual machines could take care of this (damn apple for not allowing their stupid OS to be virtualized). that said, i'm not volunteering to setup an automated system for this but i've got good ideas how to do it if i ever find time or someone wants to chat offline. :( as for buildbot, i haven't looked at its design but from the chatter i've seen i was under the impression that it operates on a continually updated sandbox rather than a 100% fresh checkout for each build? if thats true (is it?) i'd prefer to see a build system setup to do a fresh checkout+build of everything (including externals) in a new directory for each build in use. thats what we do at work. none of the above even considers the web site updating problem.. greg From greg at electricrain.com Thu Oct 12 23:04:31 2006 From: greg at electricrain.com (Gregory P. Smith) Date: Thu, 12 Oct 2006 14:04:31 -0700 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <2514DA1C-F5A1-4144-9068-006A933C516C@python.org> Message-ID: <20061012210431.GD6015@zot.electricrain.com> On Thu, Oct 12, 2006 at 09:30:49PM +0200, Georg Brandl wrote: > Barry Warsaw wrote: > > I've offered in the past to dust off my release manager cap and do a > > 2.3.6 release. Having not done one in a long while, the most > > daunting part for me is getting the website updated, since I have > > none of those tools installed. 
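[The fresh-checkout-per-build policy Greg describes is easy to sketch in outline. The checkout and build steps below are placeholders (callables), not real svn or make invocations, and the function name is hypothetical:]

```python
import os
import tempfile
import time

def fresh_build(checkout, steps):
    """Run a build the way Greg suggests: a brand-new timestamped
    directory and a pristine checkout for every build, never an
    incremental update of an existing sandbox."""
    workdir = tempfile.mkdtemp(prefix=time.strftime("build-%Y%m%d%H%M%S-"))
    source = os.path.join(workdir, "source")
    checkout(source)        # e.g. 'svn checkout <url> source', externals included
    for step in steps:      # e.g. configure, make, make test, package
        step(source)
    return source
```

[Because `workdir` is freshly created on every call, no state from a previous build can leak into the next one, which is the property Greg is after.]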
> > I'm I the only one who feels that the website is a big workflow problem? nope, you're not. From martin at v.loewis.de Thu Oct 12 23:04:52 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 12 Oct 2006 23:04:52 +0200 Subject: [Python-Dev] Why spawnvp not implemented on Windows? In-Reply-To: References: Message-ID: <452EADF4.9010800@v.loewis.de> Alexey Borzenkov schrieb: > Should I submit it to sourceforge as a patch, or someone can review it as is? Please consider also exposing _wspawnvp, depending on whether path argument is a Unicode object or not. See PEP 277 for guidance. Since this would go into 2.6, support for Windows 95 isn't mandatory. Regards, Martin From greg at electricrain.com Thu Oct 12 23:07:44 2006 From: greg at electricrain.com (Gregory P. Smith) Date: Thu, 12 Oct 2006 14:07:44 -0700 Subject: [Python-Dev] Python 2.5 performance In-Reply-To: <452EACD9.9090001@v.loewis.de> References: <129CEF95A523704B9D46959C922A280002FE99F3@nemesis.central.ccp.cc> <452EACD9.9090001@v.loewis.de> Message-ID: <20061012210744.GE6015@zot.electricrain.com> On Thu, Oct 12, 2006 at 11:00:09PM +0200, "Martin v. L?wis" wrote: > Kristj?n V. J?nsson schrieb: > > This is an improvement of another 3.5 %. > > In all, we have a performance increase of more than 10%. > > Granted, this is from a single set of runs, but I think we should start > > considering to make PCBuild8 a "supported" build. > > What do you mean by that? That Python 2.5.1 should be compiled with > VC 2005? Something else (if so, what)? i read that as just suggesting that updates should be checked into the release25-maint tree to get PCBuild8 working out of the box for anyone who wants to build python from source with vs2005. Since 2.5 has already shipped built with vs2003 all of the 2.5.x releases should continue to use that so that third party binary modules continue to work across 2.5.x versions. 
-g From martin at v.loewis.de Thu Oct 12 23:07:56 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 12 Oct 2006 23:07:56 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <20061012210310.GC6015@zot.electricrain.com> References: <200610121808.47010.anthony@interlink.com.au> <200610130528.01672.anthony@interlink.com.au> <200610130643.46552.anthony@interlink.com.au> <20061012210310.GC6015@zot.electricrain.com> Message-ID: <452EAEAC.1020007@v.loewis.de> Gregory P. Smith schrieb: > three macs with some virtual machines could take care of this (damn > apple for not allowing their stupid OS to be virtualized). that said, > i'm not volunteering to setup an automated system for this but i've > got good ideas how to do it if i ever find time or someone wants to > chat offline. :( Of course, that makes the idea die here and now. Without volunteers to do the actual work, it just won't happen. > as for buildbot, i haven't looked at its design but from the chatter > i've seen i was under the impression that it operates on a continually > updated sandbox rather than a 100% fresh checkout for each build? if > thats true (is it?) i'd prefer to see a build system setup to do a > fresh checkout+build of everything (including externals) in a new > directory for each build in use. thats what we do at work. Buildbot could do that easily; in fact, I had to explicitly configure it to not start from scratch each time, to reduce the network traffic of the donated machines. Regards, Martin From theller at python.net Thu Oct 12 22:15:05 2006 From: theller at python.net (Thomas Heller) Date: Thu, 12 Oct 2006 22:15:05 +0200 Subject: [Python-Dev] Exceptions and slicing In-Reply-To: References: <45119D6C.2050005@v.loewis.de> Message-ID: Thomas Heller schrieb: > Martin v. L?wis schrieb: >> Thomas Heller schrieb: >>> 1. The __str__ of a WindowsError instance hides the 'real' windows >>> error number. 
So, in 2.4 "print error_instance" would print >>> for example: >>> >>> [Errno 1002] Das Fenster kann die gesendete Nachricht nicht verarbeiten. >>> >>> while in 2.5: >>> >>> [Error 22] Das Fenster kann die gesendete Nachricht nicht verarbeiten. >> >> That's a bug. I changed the string deliberately from Errno to error to >> indicate that it is not an errno, but a GetLastError. Can you come up >> with a patch? > > Yes, but not today. I submitted a patch for this issue: http://python.org/sf/1576174 Thomas From anthony at interlink.com.au Thu Oct 12 23:12:49 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri, 13 Oct 2006 07:12:49 +1000 Subject: [Python-Dev] Python 2.5 performance In-Reply-To: <452EACD9.9090001@v.loewis.de> References: <129CEF95A523704B9D46959C922A280002FE99F3@nemesis.central.ccp.cc> <452EACD9.9090001@v.loewis.de> Message-ID: <200610130712.51127.anthony@interlink.com.au> On Friday 13 October 2006 07:00, Martin v. L?wis wrote: > Kristj?n V. J?nsson schrieb: > > This is an improvement of another 3.5 %. > > In all, we have a performance increase of more than 10%. > > Granted, this is from a single set of runs, but I think we should start > > considering to make PCBuild8 a "supported" build. > > What do you mean by that? That Python 2.5.1 should be compiled with > VC 2005? Something else (if so, what)? I don't think we should switch the "official" compiler for a point release. I'm happy to say something like "we make the PCbuild8 environment a supported compiler", which means we need, at a bare minimum, a buildbot slave for that compiler/platform. Kristj?n, is this something you can offer? Without a buildbot for that compiler, I don't think we can claim it's supported. There's plenty of platforms we "support" which don't have buildslaves, but they're all variants of Unix - I'm happy that they are all mostly[1] sane. Anthony [1] Offer void on some versions of HP/UX, Irix, AIX -- Anthony Baxter It's never too late to have a happy childhood. 
From anthony at interlink.com.au Thu Oct 12 23:13:58 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri, 13 Oct 2006 07:13:58 +1000 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <2514DA1C-F5A1-4144-9068-006A933C516C@python.org> Message-ID: <200610130714.00673.anthony@interlink.com.au> On Friday 13 October 2006 05:30, Georg Brandl wrote: > I'm I the only one who feels that the website is a big workflow problem? Assuming you meant "Am I", then I absolutely agree with you. -- Anthony Baxter It's never too late to have a happy childhood. From barry at python.org Thu Oct 12 23:34:37 2006 From: barry at python.org (Barry Warsaw) Date: Thu, 12 Oct 2006 17:34:37 -0400 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <20061012210310.GC6015@zot.electricrain.com> References: <200610121808.47010.anthony@interlink.com.au> <200610130528.01672.anthony@interlink.com.au> <200610130643.46552.anthony@interlink.com.au> <20061012210310.GC6015@zot.electricrain.com> Message-ID: <1C323968-BA50-4D36-B5E2-5B4B10306627@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Oct 12, 2006, at 5:03 PM, Gregory P. Smith wrote: > IMHO thats a backwards view; I'm with Barry. Requiring human > intervention to do anything other than press the big green "go" button > to launch the "official" release build process is an opportunity for > human error. the same goes for testing the built release installers > and tarballs. Oh yes, that's an important step I forgot to mention. At work, we also run automated tests of the built installers, so we have a high degree of confidence that what our buildbot farm produces at least passes the sniff test (/then/ our QA dept takes over from there). The files we upload then are named by product, platform, version, revision id, and date. 
It takes a manual step to delete old builds, but we have big disks so we generally don't do that except for EOL'd versions. The nice thing about that is that you can go back to almost any build and pull down a working installer. Greg hints at a major benefit of this: the knowledge for how to successfully build products is contained in scripts that are themselves revision controlled. A wiki page providing an overview and the starting points are still needed but rarely consulted. > i'm not volunteering to setup an automated system for this but i've > got good ideas how to do it if i ever find time or someone wants to > chat offline. :( I wish I had the cycles to volunteer to help out implementing this. :( - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRS608nEjvBPtnXfVAQIypAQAtiantWJkvStPYR8tnl+AU+HzI7bZ54s1 oX8Ni0/1IbZQwYloV6UMmhwisirZ5bwAtNWfZnd3UQXFhrCC1MGlRMOWP/y6AwS2 /gSzUV9A1dxUE9iVdPy50gEMFrzrZ32g16+FsHzae/9FgklB+GjogAuYmr2vbxd4 SrB1dgEHnXg= =6rIv -----END PGP SIGNATURE----- From barry at python.org Thu Oct 12 23:38:26 2006 From: barry at python.org (Barry Warsaw) Date: Thu, 12 Oct 2006 17:38:26 -0400 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <452EAEAC.1020007@v.loewis.de> References: <200610121808.47010.anthony@interlink.com.au> <200610130528.01672.anthony@interlink.com.au> <200610130643.46552.anthony@interlink.com.au> <20061012210310.GC6015@zot.electricrain.com> <452EAEAC.1020007@v.loewis.de> Message-ID: <3191856D-5AFD-4418-B99C-8BE07BA9F1F7@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Oct 12, 2006, at 5:07 PM, Martin v. L?wis wrote: > Of course, that makes the idea die here and now. Without volunteers > to do the actual work, it just won't happen. True, and there's no carrot/stick of a salary to entice people into doing what is mostly thankless grunt work. 
;) OTOH, there's always new blood with lots of time on their hands coming into the community looking for a way to distinguish themselves (read: students :). Maybe someone will step forward and win a little lemony slice of net.fame. - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRS610nEjvBPtnXfVAQKtiwP/a+BfIhLupcDQwfY6AhXNxjvnh+scjqTd nutSPfHR8qdbDxAxq6YcBkMeIh55XP0QSu+gYSdDDj9dGkIP0FGhurpZVW1WFrye KEBapAmnPUnC8X5kAj0Wrw6BXacchilrH3cpC1psDtlT58TgAsUxtjmYsSKEI0ZP l+tx3jlp2Ck= =vbwS -----END PGP SIGNATURE----- From snaury at gmail.com Thu Oct 12 23:53:10 2006 From: snaury at gmail.com (Alexey Borzenkov) Date: Fri, 13 Oct 2006 01:53:10 +0400 Subject: [Python-Dev] Why spawnvp not implemented on Windows? In-Reply-To: <452EADF4.9010800@v.loewis.de> References: <452EADF4.9010800@v.loewis.de> Message-ID: On 10/13/06, "Martin v. Löwis" wrote: > Please consider also exposing _wspawnvp, depending on whether path > argument is a Unicode object or not. See PEP 277 for guidance. > Since this would go into 2.6, support for Windows 95 isn't mandatory. Umm... do you mean that spawn*p* on python 2.5 is an absolute no? From brett at python.org Thu Oct 12 23:55:03 2006 From: brett at python.org (Brett Cannon) Date: Thu, 12 Oct 2006 14:55:03 -0700 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <200610130714.00673.anthony@interlink.com.au> References: <200610121808.47010.anthony@interlink.com.au> <2514DA1C-F5A1-4144-9068-006A933C516C@python.org> <200610130714.00673.anthony@interlink.com.au> Message-ID: On 10/12/06, Anthony Baxter wrote: > > On Friday 13 October 2006 05:30, Georg Brandl wrote: > > I'm I the only one who feels that the website is a big workflow problem? > > Assuming you meant "Am I", then I absolutely agree with you. I have not touched the web site since the Pyramid switch and thus am not that active, so what I am about to say may be slightly off, but ... 
I know AMK was experimenting with rest2web as a possible way to do the web site. There has also been talk about trying out another system. But I also know some people would rather put the effort into improving Pyramid. Once again, it's a matter of people putting the time in to make a switch happen to a system that the site maintainers would be happy with. -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061012/cf302456/attachment.html From ronaldoussoren at mac.com Thu Oct 12 23:59:16 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Thu, 12 Oct 2006 23:59:16 +0200 Subject: [Python-Dev] Why spawnvp not implemented on Windows? In-Reply-To: References: <452EADF4.9010800@v.loewis.de> Message-ID: <22924045-80C8-4BAA-A15B-964F4A44841C@mac.com> On Oct 12, 2006, at 11:53 PM, Alexey Borzenkov wrote: > On 10/13/06, "Martin v. Löwis" wrote: >> Please consider also exposing _wspawnvp, depending on whether path >> argument is a Unicode object or not. See PEP 277 for guidance. >> Since this would go into 2.6, support for Windows 95 isn't mandatory. > > Umm... do you mean that spawn*p* on python 2.5 is an absolute no? Unless you have a time machine and manage to get it in before 2.5.0 is released :-). Micro releases (2.5.1, 2.5.2, ...) only contain bugfixes, not new features. Ronald -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3562 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061012/9d4bec90/attachment.bin From martin at v.loewis.de Fri Oct 13 00:03:18 2006 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Fri, 13 Oct 2006 00:03:18 +0200 Subject: [Python-Dev] Why spawnvp not implemented on Windows?
In-Reply-To: References: <452EADF4.9010800@v.loewis.de> Message-ID: <452EBBA6.8040106@v.loewis.de> Alexey Borzenkov schrieb: > On 10/13/06, "Martin v. Löwis" wrote: >> Please consider also exposing _wspawnvp, depending on whether path >> argument is a Unicode object or not. See PEP 277 for guidance. >> Since this would go into 2.6, support for Windows 95 isn't mandatory. > > Umm... do you mean that spawn*p* on python 2.5 is an absolute no? Yes. No new features can be added to Python 2.5.x; Python 2.5 has already been released. Regards, Martin From martin at v.loewis.de Fri Oct 13 00:06:09 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 13 Oct 2006 00:06:09 +0200 Subject: [Python-Dev] Python 2.5 performance In-Reply-To: <20061012210744.GE6015@zot.electricrain.com> References: <129CEF95A523704B9D46959C922A280002FE99F3@nemesis.central.ccp.cc> <452EACD9.9090001@v.loewis.de> <20061012210744.GE6015@zot.electricrain.com> Message-ID: <452EBC51.2050509@v.loewis.de> Gregory P. Smith schrieb: > i read that as just suggesting that updates should be checked into the > release25-maint tree to get PCBuild8 working out of the box for anyone > who wants to build python from source with vs2005. That's passive voice ("should be checked"). I think it is unrealistic to expect that anybody making changes will make them to PCbuild8 as well if they are relevant; in many cases, no changes are made to the Windows build process at all. Fortunately, Kristjan has volunteered to maintain PCbuild8, and that's fine with me.
Regards, Martin From fuzzyman at voidspace.org.uk Fri Oct 13 00:07:08 2006 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Thu, 12 Oct 2006 23:07:08 +0100 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <2514DA1C-F5A1-4144-9068-006A933C516C@python.org> <200610130714.00673.anthony@interlink.com.au> Message-ID: <452EBC8C.4080800@voidspace.org.uk> Brett Cannon wrote: > On 10/12/06, *Anthony Baxter* > wrote: > > On Friday 13 October 2006 05:30, Georg Brandl wrote: > > I'm I the only one who feels that the website is a big workflow > problem? > > Assuming you meant "Am I", then I absolutely agree with you. > > > I have touched the web site since the Pyramid switch and thus am not > that active, so what I am about to say may be slightly off, but ... > > I know AMK was experimenting with rest2web as a possible way to do the > web site. +1 for rest2web ;-) > There has also been talk about trying out another system. But I also > know some people would rather put the effort into improving Pyramid. > Actually from the little I looked at it, pyramid seemed a very good system. Particularly the SVN integration. If rest2web is a serious option and needs any customisation, I'd be happy to look into it. Michael Foord > Once again, it's a matter of people putting the time in to make a > switch happen to a system that the site maintainers would be happy with. > > -Brett > > ------------------------------------------------------------------------ > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk > > ------------------------------------------------------------------------ > > No virus found in this incoming message. > Checked by AVG Free Edition. 
> Version: 7.1.408 / Virus Database: 268.13.2/472 - Release Date: 11/10/2006 From martin at v.loewis.de Fri Oct 13 00:14:55 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 13 Oct 2006 00:14:55 +0200 Subject: [Python-Dev] Plea to distribute debugging lib In-Reply-To: References: <20051104202824.GA19678@discworld.dyndns.org> <20051202025557.GA22377@ActiveState.com> Message-ID: <452EBE5F.2040609@v.loewis.de> Dave Abrahams schrieb: > The only problem here is that there appears to be a lag in the release of > ActivePython after Python itself is released. > > Is there any chance of putting up just the debugging libraries a little earlier? I may be out of context here: what is the precise problem in producing them yourself? Why do you need somebody else to do it for you? Regards, Martin From anthony at interlink.com.au Fri Oct 13 01:06:49 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri, 13 Oct 2006 09:06:49 +1000 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <1C323968-BA50-4D36-B5E2-5B4B10306627@python.org> References: <200610121808.47010.anthony@interlink.com.au> <20061012210310.GC6015@zot.electricrain.com> <1C323968-BA50-4D36-B5E2-5B4B10306627@python.org> Message-ID: <200610130906.55129.anthony@interlink.com.au> On Friday 13 October 2006 07:34, Barry Warsaw wrote: > > i'm not volunteering to setup an automated system for this but i've > > got good ideas how to do it if i ever find time or someone wants to > > chat offline. :( > > I wish I had the cycles to volunteer to help out implementing this. :( Well, regardless of anything else, without someone doing it, it's not going to happen. I don't have the time to spend doing this. Right now, the amount of work this would save me is minimal, so I also have little or no incentive to do it.
The thing that does take the time is the website - fixing that is a major investment of time, which I also don't have. Yes, had I spent the probably 20+ hours I've spent doing website stuff I could have made it a bit better, but that's what I know _now_ :) -- Anthony Baxter It's never too late to have a happy childhood. From greg.ewing at canterbury.ac.nz Fri Oct 13 01:27:27 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 13 Oct 2006 12:27:27 +1300 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: References: <20061010130901.09B1.JCARLSON@uci.edu> <452CAA0C.6030306@egenix.com> <20061011090701.09CA.JCARLSON@uci.edu> <452D7E92.4050206@canterbury.ac.nz> Message-ID: <452ECF5F.7040204@canterbury.ac.nz> Fredrik Lundh wrote: > marshal hasn't changed in many years: Maybe not, but I was given to understand that it's regarded as a private format that's not guaranteed to remain constant across versions. So even if it happens not to change, it wouldn't be wise to rely on that. -- Greg From dave at boost-consulting.com Fri Oct 13 00:53:55 2006 From: dave at boost-consulting.com (David Abrahams) Date: Thu, 12 Oct 2006 18:53:55 -0400 Subject: [Python-Dev] Plea to distribute debugging lib In-Reply-To: <452EBE5F.2040609@v.loewis.de> (Martin v. =?utf-8?Q?L=C3=B6wi?= =?utf-8?Q?s's?= message of "Fri, 13 Oct 2006 00:14:55 +0200") References: <20051104202824.GA19678@discworld.dyndns.org> <20051202025557.GA22377@ActiveState.com> <452EBE5F.2040609@v.loewis.de> Message-ID: <8764eprvr0.fsf@pereiro.luannocracy.com> "Martin v. Löwis" writes: > Dave Abrahams schrieb: >> The only problem here is that there appears to be a lag in the release of >> ActivePython after Python itself is released. >> >> Is there any chance of putting up just the debugging libraries a little earlier? > > I may be out of context here: what is the precise problem in producing > them yourself? Why do you need somebody else to do it for you?
At the moment I have too weak a server to provide those files, but that will change very soon. All that said, the Python and ActiveState teams need to be aware of each and every Python release and go through a standard release procedure anyway, whereas -- except for this problem -- I would not. I'm willing to try to add it if that's what works, and of course it's easy for me to say, but I think it adds a lot more overhead for me than it would for the other two groups. -- Dave Abrahams Boost Consulting www.boost-consulting.com From snaury at gmail.com Fri Oct 13 02:46:23 2006 From: snaury at gmail.com (Alexey Borzenkov) Date: Fri, 13 Oct 2006 04:46:23 +0400 Subject: [Python-Dev] Why spawnvp not implemented on Windows? In-Reply-To: <452EBBA6.8040106@v.loewis.de> References: <452EADF4.9010800@v.loewis.de> <452EBBA6.8040106@v.loewis.de> Message-ID: Forgot to include python-dev... On 10/13/06, "Martin v. Löwis" wrote: > > Umm... do you mean that spawn*p* on python 2.5 is an absolute no? > Yes. No new features can be added to Python 2.5.x; Python 2.5 has > already been released. Ugh... that's just not fair. Because of this there will be no spawn*p* in python for another two years. x_x I have a workaround for this, that tweaks os module: [...snip wrong code...]
It should have been:

if (not (hasattr(os, 'spawnvpe') or hasattr(os, 'spawnvp'))
        and hasattr(os, 'spawnve') and hasattr(os, 'spawnv')):
    def _os__spawnvpe(mode, file, args, env=None):
        import sys
        from errno import ENOENT, ENOTDIR
        from os import path, spawnve, spawnv, environ, defpath, pathsep, error
        if env is not None:
            func = spawnve
            argrest = (args, env)
        else:
            func = spawnv
            argrest = (args,)
            env = environ
        head, tail = path.split(file)
        if head:
            return func(mode, file, *argrest)
        if 'PATH' in env:
            envpath = env['PATH']
        else:
            envpath = defpath
        PATH = envpath.split(pathsep)
        if os.name == 'nt' or os.name == 'os2':
            PATH.insert(0, '')
        saved_exc = None
        saved_tb = None
        for dir in PATH:
            fullname = path.join(dir, file)
            try:
                return func(mode, fullname, *argrest)
            except error, e:
                tb = sys.exc_info()[2]
                if (e.errno != ENOENT and e.errno != ENOTDIR
                        and saved_exc is None):
                    saved_exc = e
                    saved_tb = tb
        if saved_exc:
            raise error, saved_exc, saved_tb
        raise error, e, tb

    def _os_spawnvp(mode, file, args):
        return os._spawnvpe(mode, file, args)

    def _os_spawnvpe(mode, file, args, env):
        return os._spawnvpe(mode, file, args, env)

    def _os_spawnlp(mode, file, *args):
        return os._spawnvpe(mode, file, args)

    def _os_spawnlpe(mode, file, *args):
        return os._spawnvpe(mode, file, args[:-1], args[-1])

    os._spawnvpe = _os__spawnvpe
    os.spawnvp = _os_spawnvp
    os.spawnvpe = _os_spawnvpe
    os.spawnlp = _os_spawnlp
    os.spawnlpe = _os_spawnlpe
    os.__all__.extend(["spawnvp", "spawnvpe", "spawnlp", "spawnlpe"])

But the fact that I have to use similar code anywhere I need to use spawnlp is not fair. Notice that _spawnvpe is simply a clone of _execvpe from os.py, maybe if the problem is new API in c source, this approach could be used in os.py? P.S. Although it's a bit stretching, one might also say that implementing spawn*p* on windows is not actually a new feature, and rather is a bugfix for misfeature. Why every other platform can benefit from spawn*p* and only Windows can't?
This just makes os.spawn*p* useless: it becomes unreliable and can't be used in portable code at all. From barry at python.org Fri Oct 13 03:03:47 2006 From: barry at python.org (Barry Warsaw) Date: Thu, 12 Oct 2006 21:03:47 -0400 Subject: [Python-Dev] Why spawnvp not implemented on Windows? In-Reply-To: References: <452EADF4.9010800@v.loewis.de> <452EBBA6.8040106@v.loewis.de> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Oct 12, 2006, at 8:46 PM, Alexey Borzenkov wrote: > Ugh... that's just not fair. Because of this there will be no spawn*p* > in python for another two years. x_x Correct, but don't let that stop you. That's what distutils and the Cheeseshop are for. - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRS7l+XEjvBPtnXfVAQJy6gP/RkGcTXDCBYM/WL/X+sNiTp6ydvFPg20u SrxUb/vQpNVkjA2GkFJJAXArnsxn8LB2MC+rPDRkkNMYcFw5JAUcf0IR1L+AdFnC h+68f03XDzbeB8uqVrQ6xObEPXmanvhx1uCrApqFq+zOzqMNlbzUlyGCTLu0Cw9v CYLa+aaKFAA= =dX0B -----END PGP SIGNATURE----- From tim.peters at gmail.com Fri Oct 13 03:04:04 2006 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 12 Oct 2006 21:04:04 -0400 Subject: [Python-Dev] Why spawnvp not implemented on Windows? In-Reply-To: References: <452EADF4.9010800@v.loewis.de> <452EBBA6.8040106@v.loewis.de> Message-ID: <1f7befae0610121804y539d5571wa9b717b10b0d80da@mail.gmail.com> [Alexey Borzenkov] >>> Umm... do you mean that spawn*p* on python 2.5 is an absolute no? [Martin v. Löwis] >> Yes. No new features can be added to Python 2.5.x; Python 2.5 has >> already been released. [Alexey Borzenkov] > Ugh... that's just not fair. Because of this there will be no spawn*p* > in python for another two years. x_x Or the last 15 years. Yet somehow people still have kids ;-) > ... > But the fact that I have to use similar code anywhere I need to use > spawnlp is not fair. "Fair" is a very strange word here. Pain in the ass, sure, but not fair? Doesn't make sense. > ... > P.S.
Although it's a bit stretching, one might also say that > implementing spawn*p* on windows is not actually a new feature, and > rather is a bugfix for misfeature. No. Introducing any new function is obviously a new feature, which would become acutely and catastrophically visible as soon as someone released code using the new function in 2.5.1, and someone tried to /use/ that new code under 2.5.0. Micro releases of Python do not introduce new features -- take that as given. It's been tried before, for what appeared to be "very good reasons" at the time, and we lived to regret it deeply. It won't happen again. > Why every other platform can benefit from spawn*p* and only Windows can't? Just the obvious reason: because so far nobody cared enough to do the work of writing code, docs and tests for some of these functions on Windows. > This just makes os.spawn*p* useless: it becomes unreliable and can't be > used in portable code at all. It's certainly true that it can't be used in portable code, at least not before Python 2.6. From steve at holdenweb.com Fri Oct 13 04:56:35 2006 From: steve at holdenweb.com (Steve Holden) Date: Fri, 13 Oct 2006 03:56:35 +0100 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <452EBC8C.4080800@voidspace.org.uk> References: <200610121808.47010.anthony@interlink.com.au> <2514DA1C-F5A1-4144-9068-006A933C516C@python.org> <200610130714.00673.anthony@interlink.com.au> <452EBC8C.4080800@voidspace.org.uk> Message-ID: Michael Foord wrote: > Brett Cannon wrote: > >>On 10/12/06, *Anthony Baxter* >> wrote: >> >> On Friday 13 October 2006 05:30, Georg Brandl wrote: >> > I'm I the only one who feels that the website is a big workflow >> problem? >> >> Assuming you meant "Am I", then I absolutely agree with you. >> >> >>I have touched the web site since the Pyramid switch and thus am not >>that active, so what I am about to say may be slightly off, but ... 
>> >>I know AMK was experimenting with rest2web as a possible way to do the >>web site. > > +1 for rest2web ;-) > > >>There has also been talk about trying out another system. But I also >>know some people would rather put the effort into improving Pyramid. >> > > Actually from the little I looked at it, pyramid seemed a very good > system. Particularly the SVN integration. > The real problem is the more or less complete lack of incremental rebuild, which does make site generation time-consuming. The advantage of pyramid implementation was the regularisation of the site data. I think we probably need to look at taking the now more-or-less regular data structures used to drive pyramid and find some way to use them (still with source control, but hopefully with much less verbiage) to drive something like Django. To retain the advantages of source control this might mean using scripts to generate database content from SVN-controlled data files. Or something [waves hands vaguely and steps back hopefully]. regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From martin at v.loewis.de Fri Oct 13 06:01:16 2006 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Fri, 13 Oct 2006 06:01:16 +0200 Subject: [Python-Dev] Plea to distribute debugging lib In-Reply-To: <8764eprvr0.fsf@pereiro.luannocracy.com> References: <20051104202824.GA19678@discworld.dyndns.org> <20051202025557.GA22377@ActiveState.com> <452EBE5F.2040609@v.loewis.de> <8764eprvr0.fsf@pereiro.luannocracy.com> Message-ID: <452F0F8C.10708@v.loewis.de> David Abrahams schrieb: > At the moment I have too weak a server to provide those files, but > that will change very soon. 
All that said, the Python and ActiveState > teams need to be aware of each and every Python release and go through > a standard release procedure anyway, whereas -- except for this > problem -- I would not. I'm willing to try to add it if that's what > works, and of course it's easy for me to say, but I think it adds a > lot more overhead for me than it would for the other two groups. It's a significant amount of work, either way. It will be larger for you when you do it the first time; after that, it will be the same amount of work for you that it would be for me. It will be easier for you than for me as you won't be acting under time pressure (whereas additional actions from me will delay the entire Python release, which, due to timezones, already significantly suffers from the need to create Windows binaries). I'm not sure whether you are requesting these for yourself or for somebody else. If for somebody else, that somebody else should seriously consider building Python himself, and publishing the result. Regards, Martin From martin at v.loewis.de Fri Oct 13 06:20:44 2006 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Fri, 13 Oct 2006 06:20:44 +0200 Subject: [Python-Dev] Why spawnvp not implemented on Windows? In-Reply-To: References: <452EADF4.9010800@v.loewis.de> <452EBBA6.8040106@v.loewis.de> Message-ID: <452F141C.3050801@v.loewis.de> Alexey Borzenkov schrieb: > On 10/13/06, "Martin v. Löwis" wrote: >> > Umm... do you mean that spawn*p* on python 2.5 is an absolute no? >> Yes. No new features can be added to Python 2.5.x; Python 2.5 has >> already been released. > > Ugh... that's just not fair. Because of this there will be no spawn*p* > in python for another two years. x_x It may be inconvenient, but it is certainly fair: the same rule is applied to *all* proposed new features. It would be unfair if that feature was accepted, and other features were rejected. Please try to see this from "our" view.
If new features are added to a bugfix release (say, 2.5.1), then users (programmers) would quickly consider Python as unstable, moving target. They would use the feature, claiming that you need Python 2.5, and not knowing that it is really 2.5.*1* that you need. Users would try to run the program, and find out that it doesn't work, and complain to the author. Unhappy users, unhappy programmers, and unhappy maintainers (as the programmers would then complain which idiot allowed that feature in - they do use strong language at times). It happened once, in 2.2.1 (IIRC) with the introduction of True and False. It was very painful and lead to a lot of bad code, and it still hasn't settled. As you already have a work-around: what is the problem waiting for 2.6, for you personally? If you want to see the feature eventually, please do submit it to sourceforge, anyway. Regards, Martin From anthony at interlink.com.au Fri Oct 13 07:05:22 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri, 13 Oct 2006 15:05:22 +1000 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <452EBC8C.4080800@voidspace.org.uk> Message-ID: <200610131505.24258.anthony@interlink.com.au> On Friday 13 October 2006 12:56, Steve Holden wrote: > The real problem is the more or less complete lack of incremental > rebuild, which does make site generation time-consuming. That's _part_ of it. There's other issues. For instance, there's probably 4 places where the "list of releases" is stored. Every time I do a release, I need to update all of these. If it's a new release, I also have to update the apache config for the /X.Y.Z redirect (anyone who thinks a default URL of www.python.org/download/releases/X.Y.Z is a good idea needs to quit drinking before lunchtime ) Creating a new release area, or hell, even a new page, is a whole pile of fiddly files. 
These still don't make sense to me - I end up copying an existing page each time, then reading through them looking for the relevant pieces of text. Personally, I can mostly deal with the reST now, although it still trips me up on a regular basis. YAML as well is just way more complexity - I don't understand the syntax, but it appears to offer massively more than we actually use. > The advantage of pyramid implementation was the regularisation of the > site data. Sure - and hopefully if we go down another path we can get that out. > To retain the advantages of source control this might mean using scripts > to generate database content from SVN-controlled data files. Or > something [waves hands vaguely and steps back hopefully]. The other thing to watch out for is that I (or whoever) can still do local work on a bunch of different files, then check it in all in one hit once it's done and checked. This was an issue I had with the various wiki-based proposals, I haven't seen many wikis that allow this. -- Anthony Baxter It's never too late to have a happy childhood. From anthony at interlink.com.au Fri Oct 13 07:11:21 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri, 13 Oct 2006 15:11:21 +1000 Subject: [Python-Dev] Why spawnvp not implemented on Windows? In-Reply-To: References: <452EBBA6.8040106@v.loewis.de> Message-ID: <200610131511.23131.anthony@interlink.com.au> On Friday 13 October 2006 10:46, Alexey Borzenkov wrote: > But the fact that I have to use similar code anywhere I need to use > spawnlp is not fair. Notice that _spawnvpe is simply a clone of > _execvpe from os.py, maybe if the problem is new API in c source, this > approach could be used in os.py? Oddly, "fair" isn't a constraint in PEP-0006. Backwards and forwards compatibility between all point releases in a major release is. Adding it to os.py rather than C code doesn't make a difference. > P.S. 
Although it's a bit stretching, one might also say that > implementing spawn*p* on windows is not actually a new feature, and > rather is a bugfix for misfeature. Why every other platform can > benefit from spawn*p* and only Windows can't? This just makes > os.spawn*p* useless: it becomes unreliable and can't be used in > portable code at all. "One" might say that. I wouldn't. It stays out until 2.6. Sorry Anthony -- Anthony Baxter It's never too late to have a happy childhood. From warner at lothar.com Fri Oct 13 07:18:33 2006 From: warner at lothar.com (Brian Warner) Date: Thu, 12 Oct 2006 22:18:33 -0700 (PDT) Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun Message-ID: <20061012.221833.74733401.warner@lothar.com> "Gregory P. Smith" writes: > as for buildbot, i haven't looked at its design but from the chatter > i've seen i was under the impression that it operates on a continually > updated sandbox rather than a 100% fresh checkout for each build? It's a configuration option. If you use mode="update" then your builds will re-use the same source directory over and over, if you use mode="clobber" then your builds will get a brand new checkout each time, and if you use mode="copy" then the source is updated in-place in one directory, but each build is performed from a copy of that checkout. Each offers different tradeoffs between disk usage, network usage, and which sorts of Makefile bugs they are likely to discover. 
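The three checkout modes Brian describes correspond to the `mode` argument of a source step in the buildbot master configuration. As a sketch only (class and argument names follow the buildbot-0.7-era API, and the repository URL is a made-up placeholder — check the docs for the version actually deployed):

```python
# master.cfg fragment; only meaningful inside a running buildbot master.
from buildbot.process import factory
from buildbot.steps.source import SVN
from buildbot.steps.shell import Compile, Test

f = factory.BuildFactory()
# mode="update" reuses one source tree across builds (fast, least disk and
# network use, but stale files can mask Makefile bugs); mode="clobber"
# checks out from scratch every build; mode="copy" keeps one pristine
# updated checkout and builds from a copy of it, trading disk space for
# clean builds without a full re-checkout.
f.addStep(SVN(svnurl="http://svn.example.org/python/trunk", mode="copy"))
f.addStep(Compile(command=["make"]))
f.addStep(Test(command=["make", "test"]))
```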
cheers, -Brian (Buildbot author) From fredrik at pythonware.com Fri Oct 13 07:35:23 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 13 Oct 2006 07:35:23 +0200 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: <452ECF5F.7040204@canterbury.ac.nz> References: <20061010130901.09B1.JCARLSON@uci.edu> <452CAA0C.6030306@egenix.com> <20061011090701.09CA.JCARLSON@uci.edu> <452D7E92.4050206@canterbury.ac.nz> <452ECF5F.7040204@canterbury.ac.nz> Message-ID: Greg Ewing wrote: > Fredrik Lundh wrote: > >> marshal hasn't changed in many years: > > Maybe not, but I was given to understand that it's > regarded as a private format that's not guaranteed > to remain constant across versions. So even if > it happens not to change, it wouldn't be wise to > rely on that. but given that the format *has* been stable for many years, surely it would make more sense to just codify that fact, rather than developing Yet Another Serialization Format instead? From fredrik at pythonware.com Fri Oct 13 08:37:00 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 13 Oct 2006 08:37:00 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <2514DA1C-F5A1-4144-9068-006A933C516C@python.org> <200610130714.00673.anthony@interlink.com.au> Message-ID: Brett Cannon wrote: > I know AMK was experimenting with rest2web as a possible way to do the > web site. There has also been talk about trying out another system. > But I also know some people would rather put the effort into improving > Pyramid. You forgot the ponies! > Once again, it's a matter of people putting the time in to make a switch > happen to a system that the site maintainers would be happy with. The people behind the current system and process has invested way too much energy and prestige in the current system to ever accept that the result is pretty lousy as a site, and complete rubbish as technology. 
It's about sunk costs, not cost- and time-effective solutions. For reference, here's my effbot.org release procedure: 1) upload the distribution files one by one, as soon as they're available. all links and stuff will appear automatically 2) update the associated description text through the web, when necessary, as an HTML fragment. click "save" to publish. 3) mail out an announcement when everything looks good. Maybe I should offer Anthony to do the releases via effbot.org instead? From fredrik at pythonware.com Fri Oct 13 08:40:40 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 13 Oct 2006 08:40:40 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <200610131505.24258.anthony@interlink.com.au> References: <200610121808.47010.anthony@interlink.com.au> <452EBC8C.4080800@voidspace.org.uk> <200610131505.24258.anthony@interlink.com.au> Message-ID: Anthony Baxter wrote: > The other thing to watch out for is that I (or whoever) can still do local > work on a bunch of different files the point of my previous post is that you *shouldn't* have to edit a bunch of different files to make a new release. From anthony at interlink.com.au Fri Oct 13 08:44:54 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri, 13 Oct 2006 16:44:54 +1000 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> Message-ID: <200610131644.57560.anthony@interlink.com.au> > For reference, here's my effbot.org release procedure: > > 1) upload the distribution files one by one, as soon as they're > available. all links and stuff will appear automatically > > 2) update the associated description text through the web, when > necessary, as an HTML fragment. click "save" to publish. > > 3) mail out an announcement when everything looks good. > > Maybe I should offer Anthony to do the releases via effbot.org instead? 
First off - I'm not going to be posting 10M or 16M files through a web-browser. That's insane :-) The bit of the website that's dealing with the actual files is not the tricky bit - I have a dinky little python script that generates the download table. The problems are with the other bits of the pages. I keep thinking "next release, I'll automate it further", but never have time on the day. -- Anthony Baxter It's never too late to have a happy childhood. From fredrik at pythonware.com Fri Oct 13 08:45:12 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 13 Oct 2006 08:45:12 +0200 Subject: [Python-Dev] Why spawnvp not implemented on Windows? In-Reply-To: References: <452EADF4.9010800@v.loewis.de> <452EBBA6.8040106@v.loewis.de> Message-ID: Alexey Borzenkov wrote: > P.S. Although it's a bit stretching, one might also say that > implementing spawn*p* on windows is not actually a new feature, and > rather is a bugfix for misfeature. Why every other platform can > benefit from spawn*p* and only Windows can't? This just makes > os.spawn*p* useless: it becomes unreliable and can't be used in > portable code at all. any reason you cannot just use the "subprocess" module instead, like everyone else? From fredrik at pythonware.com Fri Oct 13 08:59:46 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 13 Oct 2006 08:59:46 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <200610131644.57560.anthony@interlink.com.au> References: <200610121808.47010.anthony@interlink.com.au> <200610131644.57560.anthony@interlink.com.au> Message-ID: Anthony Baxter wrote: >> For reference, here's my effbot.org release procedure: >> >> 1) upload the distribution files one by one, as soon as they're >> available. all links and stuff will appear automatically >> >> 2) update the associated description text through the web, when >> necessary, as an HTML fragment. click "save" to publish. 
>> >> 3) mail out an announcement when everything looks good. >> >> Maybe I should offer Anthony to do the releases via effbot.org instead? > > First off - I'm not going to be posting 10M or 16M files through a > web-browser. That's insane :-) oh, I only edit the pages through the web, not the files. there's nothing wrong with scp or sftp or rsync-over-ssh or whatever you're using today. > The bit of the website that's dealing with the actual files is not the tricky > bit - I have a dinky little python script that generates the download table. yeah, but *you* are doing it. if the server did that, Martin and other trusted contributors could upload the files as soon as they're available, instead of first transferring them to you, and then waiting for you to find yet another precious time slot to spend on this release. > The problems are with the other bits of the pages. I keep thinking "next > release, I'll automate it further", but never have time on the day. that's why you have to have an overall infrastructure that lets you make incremental tweaks to the tool chain, so things can get a little better all the time. Pyramid obviously isn't such a system. From anthony at interlink.com.au Fri Oct 13 10:02:53 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri, 13 Oct 2006 18:02:53 +1000 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <200610131644.57560.anthony@interlink.com.au> Message-ID: <200610131802.58681.anthony@interlink.com.au> On Friday 13 October 2006 16:59, Fredrik Lundh wrote: > yeah, but *you* are doing it. if the server did that, Martin and > other trusted contributors could upload the files as soon as they're > available, instead of first transferring them to you, and then waiting > for you to find yet another precious time slot to spend on this release. Sure - I get that. There's a couple of reasons for me doing it. 
First is gpg signing the release files, which has to happen on my local machine. There's also the variation in who actually builds the releases; at least one of the Mac builds was done by Bob I. But there could be ways around this. I don't want to have to ensure every builder has scp, and I'd also prefer for it to all "go live" at once. A while back, the Mac installer would follow up "some time" after the Windows and source builds. Every release, I'd get emails saying "where's the mac build?!" > > The problems are with the other bits of the pages. I keep thinking "next > > release, I'll automate it further", but never have time on the day. > > that's why you have to have an overall infrastructure that lets you make > incremental tweaks to the tool chain, so things can get a little better > all the time. Pyramid obviously isn't such a system. I can't disagree with this. -- Anthony Baxter It's never too late to have a happy childhood. From larry at hastings.org Fri Oct 13 10:10:52 2006 From: larry at hastings.org (Larry Hastings) Date: Fri, 13 Oct 2006 01:10:52 -0700 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: <452A16B1.9070109@egenix.com> References: <4523F890.9060804@hastings.org> <20061005192858.GA9435@zot.electricrain.com> <17702.13238.684094.6289@montanaro.dyndns.org> <4529E28E.3070800@hastings.org> <452A16B1.9070109@egenix.com> Message-ID: <452F4A0C.7070101@hastings.org> I've uploaded a new patch to Sourceforge in response to feedback: * I purged all // comments and fixed all > 80 characters added by my patch, as per Neil Norwitz. * I added a definition of max() for those who don't already have one, as per skip at pobox.com. It now compiles cleanly on Linux again without modification; sorry for not checking that since the original patch. I've also uploaded my hacked-together benchmark script, for all that's worth. 
That patch tracker page again: http://sourceforge.net/tracker/index.php?func=detail&aid=1569040&group_id=5470&atid=305470 M.-A. Lemburg wrote: > When comparing results, please look at the minimum runtime. > The average times are just given to indicate how much the mintime > differs from the average of all runs. > I'll do that next time. In the meantime, I've also uploaded a zip file containing the results of my benchmarking, including the stdout from the run and the "-f" file which contains the pickled output. So you can examine my results yourself, including doing analysis on the pickled data if you like. > If however the speedups are not consistent across several runs of > pybench, then it's likely that you have some background activity > going on on the machine which causes a slowdown in the unmodified > run you chose as basis for the comparison. > The machine is dual-core, and was quiescent at the time. XP's scheduler is hopefully good enough to just leave the process running on one core. I ran the benchmarks just once on my Linux 2.6 machine; it's a dual-CPU P3 933EB (or maybe just 866EB, I forget). It's faster overall there too, by 1.9% (minimum run-time). The two tests I expected to be faster ("ConcatStrings" and "CreateStringsWithConcat") were consistently much faster; beyond that the results don't particularly resemble the results from my XP machine. (I uploaded those .txt and .pickle files too.) The mystery overall speedup continues, not that I find it unwelcome. :) > Just to make sure: you are using pybench 2.0, right ? > I sure was. And I used stringbench.py downloaded from here: http://svn.python.org/projects/sandbox/branches/jim-fix-setuptools-cli/stringbench/stringbench.py Cheers, /larry/ From ncoghlan at gmail.com Fri Oct 13 11:07:12 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 13 Oct 2006 19:07:12 +1000 Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS? 
In-Reply-To: References: <34FE2A7A34BC3544BC3127D023DF3D12128746@EWTEXCH.office.bhtrader.com> <452C6FD8.8070403@v.loewis.de> Message-ID: <452F5740.8000504@gmail.com> Fredrik Lundh wrote: > Martin v. Löwis wrote: > >> Of course, if everybody would always recompile all extension modules >> for a new Python feature release, those flags weren't necessary. > > a dynamic registration approach would be even better, with a single entry point > used to register all methods and hooks your C extension has implemented, and > code on the other side that builds a properly initialized type descriptor from that > set, using fallback functions and error stubs where needed. > > e.g. the impossible-to-write-from-scratch NoddyType struct initialization in > > http://docs.python.org/ext/node24.html > > would collapse to > > static PyTypeObject NoddyType; Wouldn't that have to be a pointer to allow the Python runtime complete control of the structure size without recompiling the extension?: static PyTypeObject *NoddyType; NoddyType = PyType_Alloc("noddy.Noddy"); if (!NoddyType) return; PyType_Register(NoddyType, PY_TP_DEALLOC, Noddy_dealloc); PyType_Register(NoddyType, PY_TP_DOC, "Noddy objects"); PyType_Register(NoddyType, PY_TP_TRAVERSE, Noddy_traverse); PyType_Register(NoddyType, PY_TP_CLEAR, Noddy_clear); PyType_Register(NoddyType, PY_TP_METHODS, Noddy_methods); PyType_Register(NoddyType, PY_TP_MEMBERS, Noddy_members); PyType_Register(NoddyType, PY_TP_INIT, Noddy_init); PyType_Register(NoddyType, PY_TP_NEW, Noddy_new); if (PyType_Ready(NoddyType) < 0) return; Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From fredrik at pythonware.com Fri Oct 13 11:22:09 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 13 Oct 2006 11:22:09 +0200 Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS?
References: <34FE2A7A34BC3544BC3127D023DF3D12128746@EWTEXCH.office.bhtrader.com> <452C6FD8.8070403@v.loewis.de> <452F5740.8000504@gmail.com> Message-ID: Nick Coghlan wrote: > > would collapse to > > > > static PyTypeObject NoddyType; > > Wouldn't that have to be a pointer to allow the Python runtime complete > control of the structure size without recompiling the extension?: > > static PyTypeObject *NoddyType; yeah, that's a silly typo. or maybe I was thinking of something really clever that I can no longer remember. > NoddyType = PyType_Alloc("noddy.Noddy"); > if (!NoddyType) > return; the fewer places you have to check for an error, the less chance you have to forget to do it. my proposal implied that the NULL check should be done in Ready. I've posted a slightly cleaned up version of my rough proposal here: http://effbot.org/zone/idea-register-type.htm From ncoghlan at gmail.com Fri Oct 13 11:25:38 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 13 Oct 2006 19:25:38 +1000 Subject: [Python-Dev] Python 2.5 performance In-Reply-To: <00df01c6ee10$ded9c920$ea146b0a@RaymondLaptop1> References: <129CEF95A523704B9D46959C922A280002FE99F3@nemesis.central.ccp.cc> <00df01c6ee10$ded9c920$ea146b0a@RaymondLaptop1> Message-ID: <452F5B92.8060702@gmail.com> Raymond Hettinger wrote: >> From: Kristján V. Jónsson >> I think we should start considering to make PCBuild8 a "supported" build. > > +1 and not just for the free speed-up. VC8 is what more and more Windows > developers will have on their machines. Without a supported build, it becomes > much harder to make patches or build compatible extensions. It also makes hobbyist hacking on the core more straightforward, as it makes it possible to use VC++ Express Edition to try out changes locally. Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From rasky at develer.com Fri Oct 13 11:48:22 2006 From: rasky at develer.com (Giovanni Bajo) Date: Fri, 13 Oct 2006 11:48:22 +0200 Subject: [Python-Dev] Proposal: No more standard library additions References: <452F26A3.7060506@acm.org> <2ab101c6ee90$83796470$9a4c2a97@bagio> <52208.62.39.9.251.1160728013.squirrel@webmail.nerim.net> Message-ID: <2f5401c6eeac$b8861b60$9a4c2a97@bagio> Antoine wrote: >> The standard library is not about ease of installation. It is >> about having >> a consistent fixed codebase to work with. I don't want to go >> Perl/CPAN, where you have 3-4 alternatives to do thing A which will >> never interoperate >> with whatever you chose among the 3-4 alternatives to do thing B. > > Currently in Python: > http://docs.python.org/lib/module-xml.dom.html > http://docs.python.org/lib/module-xml.dom.minidom.html > http://docs.python.org/lib/module-xml.sax.html > http://docs.python.org/lib/module-xml.parsers.expat.html > http://docs.python.org/lib/module-xml.etree.ElementTree.html > > The problem of "consistent fixed codebase" is that standards get > higher, so eventually those old stable modules lose popularity in > favor of newer, better modules. Those are different paradigms of "doing XML". For instance, the standard library was missing a "pythonic" library to do XML processing, and several arose. ElementTree (fortunately) won and joined the standard distribution. This should alleviate the need for other libraries in the future. Instead of looking at what we have inside, look outside. There are dozens of different XML "pythonic" libraries. I have fought in the past with programs that required large XML frameworks, that in turn required to be downloaded, built, installed, and *understood* to make the required modifications to the programs themselves.
This slowed down my own development, and caused infinite headaches because of version incompatibilities (A requires the XML library B, but only versions < 1.2, otherwise you can use A 2.0, which needs Python 2.4+, and then you can use the latest B; etc. etc. repeat and complicate ad libitum). A single version number (that of Python) and a large fixed set of libraries anybody can use is a *strong* PLUS. Then, there is the opposite phenomenon, which is interesting as well. I met many perl programmers who simply re-invented their little wheel every time. They were mostly system administrators, so they *knew* very well what a hell the dependency chains are for both programmers and users. Thus, since perl does not have a standard library, they simply did not import *any* module. This way, the program is "easier" to ship, distribute and use, but it's harder to code, read and fix, and contains unnecessary duplication with everybody else's scripts. Need to send an e-mail? Why use a library: just paste chunks of cut&pasted mail headers (with MIME, etc.) and do some basic string substitution; and the SMTP protocol is easy, just open a socket and dump some strings to it; or you can use 'sendmail' which is available on any UNIX (and there goes portability, just because they did not want to evaluate and choose one of the 6 Perl SMTP libraries... and rightfully so!). > Therefore, you have to obsolete old stuff if you want there to be > only One Obvious Way To Do It. I'm totally in favor of obsoletion and removal of old cruft from the standard library. I'm totally against *not* having a standard library.
Giovanni Bajo From rasky at develer.com Fri Oct 13 11:50:23 2006 From: rasky at develer.com (Giovanni Bajo) Date: Fri, 13 Oct 2006 11:50:23 +0200 Subject: [Python-Dev] [py3k] Re: Proposal: No more standard library additions References: <452F26A3.7060506@acm.org> <2ab101c6ee90$83796470$9a4c2a97@bagio><52208.62.39.9.251.1160728013.squirrel@webmail.nerim.net> <2f5401c6eeac$b8861b60$9a4c2a97@bagio> Message-ID: <2f9401c6eead$00be4420$9a4c2a97@bagio> I apologize, this had to go to python-3000 at . From bob at redivi.com Fri Oct 13 12:35:46 2006 From: bob at redivi.com (Bob Ippolito) Date: Fri, 13 Oct 2006 03:35:46 -0700 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <200610131802.58681.anthony@interlink.com.au> References: <200610121808.47010.anthony@interlink.com.au> <200610131644.57560.anthony@interlink.com.au> <200610131802.58681.anthony@interlink.com.au> Message-ID: <6a36e7290610130335u179c5929r65d04c121550d8a4@mail.gmail.com> On 10/13/06, Anthony Baxter wrote: > On Friday 13 October 2006 16:59, Fredrik Lundh wrote: > > yeah, but *you* are doing it. if the server did that, Martin and > > other trusted contributors could upload the files as soon as they're > > available, instead of first transferring them to you, and then waiting > > for you to find yet another precious time slot to spend on this release. > > Sure - I get that. There's a couple of reasons for me doing it. First is gpg > signing the release files, which has to happen on my local machine. There's > also the variation in who actually builds the releases; at least one of the > Mac builds was done by Bob I. But there could be ways around this. I don't > want to have to ensure every builder has scp, and I'd also prefer for it to > all "go live" at once. A while back, the Mac installer would follow up "some > time" after the Windows and source builds. Every release, I'd get emails > saying "where's the mac build?!" 
With most consumer connections it's a lot faster to download than to upload. Perhaps it would save you a few minutes if the contributors uploaded directly to the destination (or to some other fast server) and you could download and sign it, rather than having to scp it back up somewhere from your home connection. To be fair, (thanks to Ronald) the Mac build is entirely automated by a script with the caveat that you should be a little careful about what your environment looks like (e.g. don't install fink or macports, or to move them out of the way when building). It downloads all of the third party dependencies, builds them with some special flags to make it universal, builds Python, and then wraps it up in an installer package. Given any Mac OS X 10.4 machine, the builds could happen automatically. Apple could probably provide one if someone asked. They did it for Twisted. Or maybe the Twisted folks could appropriate part of that machine's time to also build Python. -bob From fredrik at pythonware.com Fri Oct 13 13:06:16 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 13 Oct 2006 13:06:16 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun References: <200610121808.47010.anthony@interlink.com.au><200610131644.57560.anthony@interlink.com.au><200610131802.58681.anthony@interlink.com.au> <6a36e7290610130335u179c5929r65d04c121550d8a4@mail.gmail.com> Message-ID: Anthony: >> Sure - I get that. There's a couple of reasons for me doing it. First is gpg >> signing the release files, which has to happen on my local machine. There's >> also the variation in who actually builds the releases; at least one of the >> Mac builds was done by Bob I. But there could be ways around this. I don't >> want to have to ensure every builder has scp scp or scp access? the former isn't much of a requirement, really. 
I would be surprised to find a developer that didn't already have it on all machines, or knew how to run it off the internet (type "putty download" into google and click "I feel lucky"). >> all "go live" at once. A while back, the Mac installer would follow up "some >> time" after the Windows and source builds. Every release, I'd get emails >> saying "where's the mac build?!" that's a worthwhile goal, now that we have plenty of build volunteers, but I think that could be solved simply by delaying the *public* announcement until everything is in place. this is open source, after all - we don't need to hide how we're doing things. Bob Ippolito wrote: > With most consumer connections it's a lot faster to download than to > upload. Perhaps it would save you a few minutes if the contributors > uploaded directly to the destination (or to some other fast server) > and you could download and sign it, rather than having to scp it back > up somewhere from your home connection. that's another interesting advantage of a more asynchronous release process. if we can reduce the costly parts to a few 8-minute slots, it's a lot easier for any busy developer to find the time, even on a hectic day. and if we can dis- tribute those slots, things will be even easier. From anthony at interlink.com.au Fri Oct 13 13:09:06 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri, 13 Oct 2006 21:09:06 +1000 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <6a36e7290610130335u179c5929r65d04c121550d8a4@mail.gmail.com> References: <200610121808.47010.anthony@interlink.com.au> <200610131802.58681.anthony@interlink.com.au> <6a36e7290610130335u179c5929r65d04c121550d8a4@mail.gmail.com> Message-ID: <200610132109.11329.anthony@interlink.com.au> On Friday 13 October 2006 20:35, Bob Ippolito wrote: > With most consumer connections it's a lot faster to download than to > upload. 
Perhaps it would save you a few minutes if the contributors > uploaded directly to the destination (or to some other fast server) > and you could download and sign it, rather than having to scp it back > up somewhere from your home connection. I actually pull them down to both dinsdale and home, then verify they're the same with SHA and MD5 before signing, and uploading the keys. The only thing I upload directly are the keys and the source tarballs. > Given any Mac OS X 10.4 machine, the builds could happen > automatically. Apple could probably provide one if someone asked. They > did it for Twisted. Or maybe the Twisted folks could appropriate part > of that machine's time to also build Python. We have one, macteagle. For some reason builds fail on it right now - Ronald might be able to supply more details as to why. Anthony -- Anthony Baxter It's never too late to have a happy childhood. From snaury at gmail.com Fri Oct 13 13:22:58 2006 From: snaury at gmail.com (Alexey Borzenkov) Date: Fri, 13 Oct 2006 15:22:58 +0400 Subject: [Python-Dev] Why spawnvp not implemented on Windows? In-Reply-To: References: <452EADF4.9010800@v.loewis.de> <452EBBA6.8040106@v.loewis.de> Message-ID: On 10/13/06, Fredrik Lundh wrote: > any reason you cannot just use the "subprocess" module instead, like > everyone else? Oh! Wow! I just simply didn't know of its existence (I'm pretty much new to python), and both distutils and SCons (I was looking inside them because they are major build systems and surely had to execute compilers somehow), and upon seeing that each of them invented their own method of searching path created a delusion as if inventing custom workarounds was the only way... Sorry... x_x From fredrik at pythonware.com Fri Oct 13 13:26:46 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 13 Oct 2006 13:26:46 +0200 Subject: [Python-Dev] Why spawnvp not implemented on Windows?
References: <452EADF4.9010800@v.loewis.de><452EBBA6.8040106@v.loewis.de> Message-ID: Alexey Borzenkov wrote: >> any reason you cannot just use the "subprocess" module instead, like >> everyone else? > > Oh! Wow! I just simply didn't know of its existence (I'm pretty much > new to python), and both distutils and SCons (I was looking inside > them because they are major build systems and surely had to execute > compilers somehow), and upon seeing that each of them invented their > own method of searching path created a delusion as if inventing custom > workarounds was the only way... Sorry... x_x no problem. someone should really update the documentation to make sure that os.spawn and os.popen and commands and popen2 and all the other 80%-solutions at least point to the subprocess module... (and if the library reference had been stored in a wiki, I'd have fixed that before anyone else even got this mail...) From theller at python.net Fri Oct 13 13:30:14 2006 From: theller at python.net (Thomas Heller) Date: Fri, 13 Oct 2006 13:30:14 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <452EAAAE.2050200@v.loewis.de> References: <200610121808.47010.anthony@interlink.com.au> <200610130528.01672.anthony@interlink.com.au> <452EAAAE.2050200@v.loewis.de> Message-ID: Martin v. Löwis schrieb: > Anthony Baxter schrieb: >> Mostly it is easy for me, with the one huge caveat. As far as I know, the Mac >> build is a single command to run for Ronald, and the Doc build similarly for >> Fred. I don't know what Martin has to do for the Windows build. > > Actually, for 2.3.x, I wouldn't do the Windows builds. I think Thomas > Heller did the 2.3.x series. Yes. But I've switched machines since I last built an installer, and I do not have all of the needed software installed any longer, for example the Wise Installer.
Thomas From ronaldoussoren at mac.com Fri Oct 13 13:37:05 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Fri, 13 Oct 2006 13:37:05 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <200610132109.11329.anthony@interlink.com.au> References: <200610121808.47010.anthony@interlink.com.au> <200610131802.58681.anthony@interlink.com.au> <6a36e7290610130335u179c5929r65d04c121550d8a4@mail.gmail.com> <200610132109.11329.anthony@interlink.com.au> Message-ID: <15446848.1160739425294.JavaMail.ronaldoussoren@mac.com> On Friday, October 13, 2006, at 01:10PM, Anthony Baxter wrote: >On Friday 13 October 2006 20:35, Bob Ippolito wrote: >> With most consumer connections it's a lot faster to download than to >> upload. Perhaps it would save you a few minutes if the contributors >> uploaded directly to the destination (or to some other fast server) >> and you could download and sign it, rather than having to scp it back >> up somewhere from your home connection. > >I actually pull them down to both dinsdale and home, then verify they're the >same with SHA and MD5 before signing, and uploading the keys. The only thing >I upload directly are the keys and the source tarballs. > > >> Given any Mac OS X 10.4 machine, the builds could happen >> automatically. Apple could probably provide one if someone asked. They >> did it for Twisted. Or maybe the Twisted folks could appropriate part >> of that machine's time to also build Python. > >We have one, macteagle. For some reason builds fail on it right now - Ronald >might be able to supply more details as to why. IIRC it has the wrong version of Xcode installed (or rather another one than I use and test with). It also has darwinports installed at the default location, which can cause problems because the setup.py adds that directory to the include/link paths. 
I don't want to release installers that require that the user has darwinports installed :-) I can supply a newer version of Xcode if someone with an admin account is willing to install that. I don't know if the admin of that machine has GUI access to the machine; if not, I'd have to investigate how to ensure that the proper subpackages get installed using a command-line install (using RemoteDesktop to administer servers has spoiled me a bit in that regard). I guess this comes down to the usual problem: I have a working setup for building the mac installer and fixing macteagle takes time which I don't have available in great amounts (who does?). Ronald From ronaldoussoren at mac.com Fri Oct 13 13:44:15 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Fri, 13 Oct 2006 13:44:15 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <6a36e7290610130335u179c5929r65d04c121550d8a4@mail.gmail.com> References: <200610121808.47010.anthony@interlink.com.au> <200610131644.57560.anthony@interlink.com.au> <200610131802.58681.anthony@interlink.com.au> <6a36e7290610130335u179c5929r65d04c121550d8a4@mail.gmail.com> Message-ID: <7635335.1160739855167.JavaMail.ronaldoussoren@mac.com> On Friday, October 13, 2006, at 12:36PM, Bob Ippolito wrote: > >To be fair, (thanks to Ronald) the Mac build is entirely automated by >a script with the caveat that you should be a little careful about >what your environment looks like (e.g. don't install fink or macports, >or to move them out of the way when building). That (the "don't install Fink or macports" part) is because setup.py explicitly adds those directories to the library and include search path. IMHO that is a misfeature because it is much too easy to accidentally contaminate a build that way. Fink and macports can easily add their directories to the search paths using OPTS and LDFLAGS, there's no need to automate this in setup.py.
The beauty of macports is that /opt/local is the default prefix, but you can easily pick another prefix and most ports work fine that way (or rather not worse than with the default prefix). Ronald From steve at holdenweb.com Fri Oct 13 13:53:18 2006 From: steve at holdenweb.com (Steve Holden) Date: Fri, 13 Oct 2006 12:53:18 +0100 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <452EBC8C.4080800@voidspace.org.uk> <200610131505.24258.anthony@interlink.com.au> Message-ID: Fredrik Lundh wrote: > Anthony Baxter wrote: > > >>The other thing to watch out for is that I (or whoever) can still do local >>work on a bunch of different files > > > the point of my previous post is that you *shouldn't* have to edit a > bunch of different files to make a new release. > Indeed. I seem to remember suggesting a while ago on pydotorg that whatever replaces pyramid should cater to groups such as the release team by allowing everything necessary to be generated from a simple set of data that wouldn't be difficult to maintain. Anthony has enough on his plate without having to fight the web server too ... regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From steve at holdenweb.com Fri Oct 13 14:00:36 2006 From: steve at holdenweb.com (Steve Holden) Date: Fri, 13 Oct 2006 13:00:36 +0100 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <2514DA1C-F5A1-4144-9068-006A933C516C@python.org> <200610130714.00673.anthony@interlink.com.au> Message-ID: Fredrik Lundh wrote: > Brett Cannon wrote: > > >>I know AMK was experimenting with rest2web as a possible way to do the >>web site. There has also been talk about trying out another system. 
>>But I also know some people would rather put the effort into improving Pyramid. > > You forgot the ponies! > >>Once again, it's a matter of people putting the time in to make a switch >>happen to a system that the site maintainers would be happy with. > > The people behind the current system and process have invested way too > much energy and prestige in the current system to ever accept that the > result is pretty lousy as a site, and complete rubbish as technology. > It's about sunk costs, not cost- and time-effective solutions. > I don't believe that's true, but I'm certainly not the one with the most time invested in pyramid. Tim Parkin is on record as saying he'd be willing to help with a(nother) migration project. I think there's a general appreciation of pyramid's strengths *and* deficiencies. > For reference, here's my effbot.org release procedure: > > 1) upload the distribution files one by one, as soon as they're > available. all links and stuff will appear automatically > > 2) update the associated description text through the web, when > necessary, as an HTML fragment. click "save" to publish. > > 3) mail out an announcement when everything looks good. > > Maybe I should offer Anthony to do the releases via effbot.org instead? > You can try. Or you can start to promote Django again ... regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From dave at boost-consulting.com Fri Oct 13 14:36:47 2006 From: dave at boost-consulting.com (David Abrahams) Date: Fri, 13 Oct 2006 08:36:47 -0400 Subject: [Python-Dev] Plea to distribute debugging lib In-Reply-To: <452F0F8C.10708@v.loewis.de> (Martin v.
Löwis's message of "Fri, 13 Oct 2006 06:01:16 +0200") References: <20051104202824.GA19678@discworld.dyndns.org> <20051202025557.GA22377@ActiveState.com> <452EBE5F.2040609@v.loewis.de> <8764eprvr0.fsf@pereiro.luannocracy.com> <452F0F8C.10708@v.loewis.de> Message-ID: <87iriowfxc.fsf@pereiro.luannocracy.com> "Martin v. Löwis" writes: > I'm not sure whether you are requesting these for yourself or for > somebody else. If for somebody else, that somebody else should seriously > consider building Python himself, and publishing the result. I'm requesting it for the many Boost.Python (heck, all Python 'C' API) users who find it a usability hurdle when their first Visual Studio projects fail to work properly in the default mode (debug) just because they don't have the right Python libraries. -- Dave Abrahams Boost Consulting www.boost-consulting.com From rasky at develer.com Fri Oct 13 18:21:05 2006 From: rasky at develer.com (Giovanni Bajo) Date: Fri, 13 Oct 2006 18:21:05 +0200 Subject: [Python-Dev] Why spawnvp not implemented on Windows? References: <452EADF4.9010800@v.loewis.de><452EBBA6.8040106@v.loewis.de> Message-ID: <04b701c6eee3$9525d7f0$e303030a@trilan> Alexey Borzenkov wrote: > Oh! Wow! I just simply didn't know of its existence (I'm pretty much > new to python), and both distutils and SCons (I was looking inside > them because they are major build systems and surely had to execute > compilers somehow), and upon seeing that each of them invented their > own method of searching path created a delusion as if inventing custom > workarounds was the only way... Sorry... x_x SCons is still compatible with Python 1.5. Distutils was written in the 1.5-1.6 timeframe; it has been updated since, but it is basically unmaintained at this point (if you exclude the setuptools stuff, which is its disputed maintenance/evolution). subprocess was introduced in Python 2.4.
-- Giovanni Bajo From theller at python.net Fri Oct 13 20:20:55 2006 From: theller at python.net (Thomas Heller) Date: Fri, 13 Oct 2006 20:20:55 +0200 Subject: [Python-Dev] Modulefinder Message-ID: I have patched Lib/modulefinder.py to work with absolute and relative imports. It is also faster now, and has basic unittests in Lib/test/test_modulefinder.py. The work was done in a theller_modulefinder SVN branch. If nobody objects, I will merge this into trunk, and possibly also into release25-maint, when I have time. Thanks, Thomas From jcarlson at uci.edu Fri Oct 13 21:02:06 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri, 13 Oct 2006 12:02:06 -0700 Subject: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: <452F4A0C.7070101@hastings.org> References: <452A16B1.9070109@egenix.com> <452F4A0C.7070101@hastings.org> Message-ID: <20061013115748.09F2.JCARLSON@uci.edu> Larry Hastings wrote: [snip] > The machine is dual-core, and was quiescent at the time. XP's scheduler > is hopefully good enough to just leave the process running on one core. It's not. Go into the task manager (accessible via Ctrl+Alt+Del by default) and change the process' affinity to the second core. In my experience, running on the second core (in both 2k and XP) tends to produce slightly faster results. Linux tends to keep processes on a single core for a few seconds at a time. - Josiah From pandyacus at gmail.com Fri Oct 13 21:44:40 2006 From: pandyacus at gmail.com (Chetan Pandya) Date: Fri, 13 Oct 2006 12:44:40 -0700 Subject: [Python-Dev] Cloning threading.py using processes Message-ID: I just got around to reading the messages. When I first saw this, I thought it was so that processes could share and work on shared objects. That is where the locks are required. However, all shared objects are managed by the object manager and thus all such operations are in effect sequential, even acquires on different locks.
Thus other shared objects in the object manager will actually not require any (additional) synchronization. Of course, the argument here is that it is still possible to use that code. Cleanup of shared objects seems to be another thing to look out for. This is a problem that subprocesses seem to avoid and has been already suggested. -Chetan On 10/11/06, python-dev-request at python.org wrote: > > Message: 5 > Date: Wed, 11 Oct 2006 10:23:40 +0200 > From: "M.-A. Lemburg" > Subject: Re: [Python-Dev] Cloning threading.py using proccesses > To: Josiah Carlson > Cc: python-dev at python.org > Message-ID: <452CAA0C.6030306 at egenix.com> > Content-Type: text/plain; charset=ISO-8859-1 > > Josiah Carlson wrote: > > Fredrik Lundh wrote: > >> Josiah Carlson wrote: > >> > >>> Presumably with this library you have created, you have also written a > >>> fast object encoder/decoder (like marshal or pickle). If it isn't any > >>> faster than cPickle or marshal, then users may bypass the module and > opt > >>> for fork/etc. + XML-RPC > >> XML-RPC isn't close to marshal and cPickle in performance, though, so > >> that statement is a bit misleading. > > > > You are correct, it is misleading, and relies on a few unstated > > assumptions. > > > > In my own personal delving into process splitting, RPC, etc., I usually > > end up with one of two cases; I need really fast call/return, or I need > > not slow call/return. The not slow call/return is (in my opinion) > > satisfactorally solved with XML-RPC. But I've personally not been > > satisfied with the speed of any remote 'fast call/return' packages, as > > they usually rely on cPickle or marshal, which are slow compared to > > even moderately fast 100mbit network connections. When we are talking > > about local connections, I have even seen cases where the > > cPickle/marshal calls can make it so that forking the process is faster > > than encoding the input to a called function. > > This is hard to believe. 
I've been in that business for a few > years and so far have not found an OS/hardware/network combination > with the mentioned features. > > Usually the worst part in performance breakdown for RPC is network > latency, ie. time to connect, waiting for the packets to come through, > etc. and this parameter doesn't really depend on the OS or hardware > you're running the application on, but is more a factor of which > network hardware, architecture and structure is being used. > > It also depends a lot on what you send as arguments, of course, > but I assume that you're not pickling a gazillion objects :-) > > > I've had an idea for a fast object encoder/decoder (with limited support > > for certain built-in Python objects), but I haven't gotten around to > > actually implementing it as of yet. > > Would be interesting to look at. > > BTW, did you know about http://sourceforge.net/projects/py-xmlrpc/ ? > > -- > Marc-Andre Lemburg > eGenix.com > > Professional Python Services directly from the Source (#1, Oct 11 2006) > >>> Python/Zope Consulting and Support ... http://www.egenix.com/ > >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ > >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ > ________________________________________________________________________ > > ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: > > > ------------------------------ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/python-dev/attachments/20061013/b836ab79/attachment.html From martin at v.loewis.de Sat Oct 14 00:10:52 2006 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Sat, 14 Oct 2006 00:10:52 +0200 Subject: [Python-Dev] Plea to distribute debugging lib In-Reply-To: <87iriowfxc.fsf@pereiro.luannocracy.com> References: <20051104202824.GA19678@discworld.dyndns.org> <20051202025557.GA22377@ActiveState.com> <452EBE5F.2040609@v.loewis.de> <8764eprvr0.fsf@pereiro.luannocracy.com> <452F0F8C.10708@v.loewis.de> <87iriowfxc.fsf@pereiro.luannocracy.com> Message-ID: <45300EEC.7080003@v.loewis.de> David Abrahams schrieb: >> I'm not sure whether you are requesting these for yourself or for >> somebody else. If for somebody else, that somebody else should seriously >> consider building Python himself, and publishing the result. > > I'm requesting it for the many Boost.Python (heck, all Python 'C' API) > users who find it a usability hurdle when their first visual studio > projects fail to work properly in the default mode (debug) just > because they don't have the right Python libraries. And there is not one of them who would be willing and able to build a debug release, and distribute that???? Regards, Martin From martin at v.loewis.de Sat Oct 14 00:23:38 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 14 Oct 2006 00:23:38 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <452EBC8C.4080800@voidspace.org.uk> <200610131505.24258.anthony@interlink.com.au> Message-ID: <453011EA.2090800@v.loewis.de> Steve Holden schrieb: >>> The other thing to watch out for is that I (or whoever) can still do local >>> work on a bunch of different files >> >> the point of my previous post is that you *shouldn't* have to edit a >> bunch of different files to make a new release. >> > Indeed. 
I seem to remember suggesting a while ago on pydotorg that > whatever replaces pyramid should cater to groups such as the release > team by allowing everything necessary to be generated from a simple set > of data that wouldn't be difficult to maintain. Anthony has enough on > his plate without having to fight the web server too ... There is always some sort of text that accompanies a release. That has to be edited to be correct; a machine can't do that. Regards, Martin From martin at v.loewis.de Sat Oct 14 00:24:39 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 14 Oct 2006 00:24:39 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <200610130528.01672.anthony@interlink.com.au> <452EAAAE.2050200@v.loewis.de> Message-ID: <45301227.5020805@v.loewis.de> Thomas Heller schrieb: > Yes. But I've switched machines since I last build an installer, and I do not > have all of the needed software installed any longer, for example the Wise Installer. Ok. So we are technically incapable of producing the Windows binaries of another 2.3.x release, then? Regards, Martin From jcarlson at uci.edu Sat Oct 14 01:46:20 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri, 13 Oct 2006 16:46:20 -0700 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <45301227.5020805@v.loewis.de> References: <45301227.5020805@v.loewis.de> Message-ID: <20061013164246.09F8.JCARLSON@uci.edu> "Martin v. L?wis" wrote: > > Thomas Heller schrieb: > > Yes. But I've switched machines since I last build an installer, and I do not > > have all of the needed software installed any longer, for example the Wise Installer. > > Ok. So we are technically incapable of producing the Windows binaries of > another 2.3.x release, then? I've got a build setup for 2.3.x, but I lack the Wise Installer. 
It may be possible to use the 2.4 or 2.5 .msi creation tools, if that was sufficient. - Josiah From tim.peters at gmail.com Sat Oct 14 01:53:12 2006 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 13 Oct 2006 19:53:12 -0400 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <45301227.5020805@v.loewis.de> References: <200610121808.47010.anthony@interlink.com.au> <200610130528.01672.anthony@interlink.com.au> <452EAAAE.2050200@v.loewis.de> <45301227.5020805@v.loewis.de> Message-ID: <1f7befae0610131653h52fd3bfcnd33af7c08f6fe9d@mail.gmail.com> [Thomas Heller] >> Yes. But I've switched machines since I last build an installer, and I do not >> have all of the needed software installed any longer, for example the Wise >> Installer. [Martin v. L?wis] > Ok. So we are technically incapable of producing the Windows binaries of > another 2.3.x release, then? FYI, I still have the Wise Installer. But since my understanding is that the "Unicode buffer overrun" thingie is a non-issue on Windows, I've got no interest in wrestling with a 2.3.6 for Windows. From dave at boost-consulting.com Sat Oct 14 02:51:44 2006 From: dave at boost-consulting.com (David Abrahams) Date: Fri, 13 Oct 2006 20:51:44 -0400 Subject: [Python-Dev] Plea to distribute debugging lib In-Reply-To: <45300EEC.7080003@v.loewis.de> (Martin v. =?utf-8?Q?L=C3=B6wi?= =?utf-8?Q?s's?= message of "Sat, 14 Oct 2006 00:10:52 +0200") References: <20051104202824.GA19678@discworld.dyndns.org> <20051202025557.GA22377@ActiveState.com> <452EBE5F.2040609@v.loewis.de> <8764eprvr0.fsf@pereiro.luannocracy.com> <452F0F8C.10708@v.loewis.de> <87iriowfxc.fsf@pereiro.luannocracy.com> <45300EEC.7080003@v.loewis.de> Message-ID: <877iz3u3bz.fsf@pereiro.luannocracy.com> "Martin v. L?wis" writes: > David Abrahams schrieb: >>> I'm not sure whether you are requesting these for yourself or for >>> somebody else. 
If for somebody else, that somebody else should seriously >>> consider building Python himself, and publishing the result. >> >> I'm requesting it for the many Boost.Python (heck, all Python 'C' API) >> users who find it a usability hurdle when their first visual studio >> projects fail to work properly in the default mode (debug) just >> because they don't have the right Python libraries. > > And there is not one of them who would be willing and able to build > a debug release, and distribute that???? I don't know. -- Dave Abrahams Boost Consulting www.boost-consulting.com From martin at v.loewis.de Sat Oct 14 07:58:59 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 14 Oct 2006 07:58:59 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <1f7befae0610131653h52fd3bfcnd33af7c08f6fe9d@mail.gmail.com> References: <200610121808.47010.anthony@interlink.com.au> <200610130528.01672.anthony@interlink.com.au> <452EAAAE.2050200@v.loewis.de> <45301227.5020805@v.loewis.de> <1f7befae0610131653h52fd3bfcnd33af7c08f6fe9d@mail.gmail.com> Message-ID: <45307CA3.1070100@v.loewis.de> Tim Peters schrieb: > FYI, I still have the Wise Installer. But since my understanding is > that the "Unicode buffer overrun" thingie is a non-issue on Windows, > I've got no interest in wrestling with a 2.3.6 for Windows. In 2.3.6, there wouldn't just be that change, but also a few other changes that have been collected, some relevant for Windows as well: there are several updates to the email package, and a fix to pcre to prevent a buffer overrun. I'm not saying that you should produce a Windows binary then, just that it would be good if one was produced if there was another release. Of course, people might also get the binaries from ActiveState should they produce some. 
Regards, Martin From martin at v.loewis.de Sat Oct 14 07:50:43 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 14 Oct 2006 07:50:43 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <20061013164246.09F8.JCARLSON@uci.edu> References: <45301227.5020805@v.loewis.de> <20061013164246.09F8.JCARLSON@uci.edu> Message-ID: <45307AB3.8010504@v.loewis.de> Josiah Carlson schrieb: > I've got a build setup for 2.3.x, but I lack the Wise Installer. It may > be possible to use the 2.4 or 2.5 .msi creation tools, if that was > sufficient. I don't think that would be appropriate. There are differences in usage which might be significant to some users, e.g. in automated install scenarios. We should attempt not to break this. Regards, Martin From aahz at pythoncraft.com Sat Oct 14 19:38:04 2006 From: aahz at pythoncraft.com (Aahz) Date: Sat, 14 Oct 2006 10:38:04 -0700 Subject: [Python-Dev] ConfigParser: whitespace leading comment lines In-Reply-To: <903323ff0610101240p2f4e0a18g18d34d1a800624ec@mail.gmail.com> References: <903323ff0610101240p2f4e0a18g18d34d1a800624ec@mail.gmail.com> Message-ID: <20061014173804.GA25333@panix.com> On Tue, Oct 10, 2006, Greg Willden wrote: > > I'd like to propose the following change to ConfigParser.py. > I won't call it a bug-fix because I don't know the relevant standards. Go ahead and submit a patch; it's guaranteed you won't get progress without it. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "If you don't know what your program is supposed to do, you'd better not start writing it." --Dijkstra From arigo at tunes.org Sun Oct 15 11:30:25 2006 From: arigo at tunes.org (Armin Rigo) Date: Sun, 15 Oct 2006 11:30:25 +0200 Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS? 
In-Reply-To: References: <34FE2A7A34BC3544BC3127D023DF3D12128746@EWTEXCH.office.bhtrader.com> <452F5740.8000504@gmail.com> Message-ID: <20061015093020.GA2162@code0.codespeak.net> Hi Fredrik, On Fri, Oct 13, 2006 at 11:22:09AM +0200, Fredrik Lundh wrote: > > > static PyTypeObject NoddyType; > > static PyTypeObject *NoddyType; > > yeah, that's a silly typo. Ah, then ignore my previous remark. Armin From steve at holdenweb.com Sun Oct 15 13:23:57 2006 From: steve at holdenweb.com (Steve Holden) Date: Sun, 15 Oct 2006 12:23:57 +0100 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <453011EA.2090800@v.loewis.de> References: <200610121808.47010.anthony@interlink.com.au> <452EBC8C.4080800@voidspace.org.uk> <200610131505.24258.anthony@interlink.com.au> <453011EA.2090800@v.loewis.de> Message-ID: Martin v. L?wis wrote: > Steve Holden schrieb: > >>>>The other thing to watch out for is that I (or whoever) can still do local >>>>work on a bunch of different files >>> >>>the point of my previous post is that you *shouldn't* have to edit a >>>bunch of different files to make a new release. >>> >> >>Indeed. I seem to remember suggesting a while ago on pydotorg that >>whatever replaces pyramid should cater to groups such as the release >>team by allowing everything necessary to be generated from a simple set >>of data that wouldn't be difficult to maintain. Anthony has enough on >>his plate without having to fight the web server too ... > > > There is always some sort of text that accompanies a release. That has > to be edited to be correct; a machine can't do that. > OK. 
^everything^the content structure and many of the files^ regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From ocean at m2.ccsnet.ne.jp Sun Oct 15 13:21:18 2006 From: ocean at m2.ccsnet.ne.jp (ocean) Date: Sun, 15 Oct 2006 20:21:18 +0900 Subject: [Python-Dev] VC6 support on release25-maint Message-ID: <000d01c6f04c$092be450$0300a8c0@whiterabc2znlh> Hello. I noticed VisualC++6 support came back. I'm glad with that, but still it seems incomplete. (for example, _sqlite3 support) Maybe does this patch help process? On my machine, testcases other than distutils runs fine. http://sourceforge.net/tracker/?func=detail&aid=1457736&group_id=5470&atid=305470 From anthony at interlink.com.au Sun Oct 15 13:42:05 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Sun, 15 Oct 2006 21:42:05 +1000 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <453011EA.2090800@v.loewis.de> Message-ID: <200610152142.07869.anthony@interlink.com.au> On Sunday 15 October 2006 21:23, Steve Holden wrote: > Martin v. L?wis wrote: > > Steve Holden schrieb: > >>>>The other thing to watch out for is that I (or whoever) can still do > >>>> local work on a bunch of different files > >>> > >>>the point of my previous post is that you *shouldn't* have to edit a > >>>bunch of different files to make a new release. > >> > >>Indeed. I seem to remember suggesting a while ago on pydotorg that > >>whatever replaces pyramid should cater to groups such as the release > >>team by allowing everything necessary to be generated from a simple set > >>of data that wouldn't be difficult to maintain. Anthony has enough on > >>his plate without having to fight the web server too ... > > > > There is always some sort of text that accompanies a release. 
That has > > to be edited to be correct; a machine can't do that. > > OK. > > ^everything^the content structure and many of the files^ If you compare the various pieces that make up the release pages, you'll see that much of it is boilerplate, true. There's two cases worth mentioning: First release of a new series (2.4.4c1, 2.5a1). This involves making the new directory and all the little fiddly files. In practice, this is done by recursively copying the previous release and removing the .ssh directories so that it can be re-added. I then go through and update the files. Subsequent release. This is still largely a manual process - I search for all the references to the previous release, update them, then read through it for missed bits. I then update the text bits that need to be changed. There's all sorts of minor variations there - for instance, often in a non-final release, we don't have an unpacked version of the documentation (but sometimes we do, wah). The killer bits for me are all the other places. For instance, updating the sidebar menu quicklinks for 2.4.4 to 2.5. There's just too many files, and the structure of pyramid's files still doesn't make sense to me. Anthony -- Anthony Baxter It's never too late to have a happy childhood. From martin at v.loewis.de Sun Oct 15 13:49:01 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 15 Oct 2006 13:49:01 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: <200610152142.07869.anthony@interlink.com.au> References: <200610121808.47010.anthony@interlink.com.au> <453011EA.2090800@v.loewis.de> <200610152142.07869.anthony@interlink.com.au> Message-ID: <4532202D.2080607@v.loewis.de> Anthony Baxter schrieb: > Subsequent release. This is still largely a manual process - I search for all > the references to the previous release, update them, then read through it for > missed bits. I then update the text bits that need to be changed. 
There's all > sorts of minor variations there - for instance, often in a non-final release, > we don't have an unpacked version of the documentation (but sometimes we do, > wah). If that's a source of pain, we can standardize (assuming you are talking about the .chm file). Which way would you like it? It really doesn't matter to me either way - I just didn't think of it causing problems. Regards, Martin From martin at v.loewis.de Sun Oct 15 14:05:32 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 15 Oct 2006 14:05:32 +0200 Subject: [Python-Dev] VC6 support on release25-maint In-Reply-To: <000d01c6f04c$092be450$0300a8c0@whiterabc2znlh> References: <000d01c6f04c$092be450$0300a8c0@whiterabc2znlh> Message-ID: <4532240C.3060906@v.loewis.de> ocean schrieb: > Hello. I noticed VisualC++6 support came back. I'm glad with that, > but still it seems incomplete. (for example, _sqlite3 support) Maybe > does this patch help process? These changes were all contributed by Larry Hastings. For some reason, I missed/forgot about your patch. Can you please update it? Regards, Martin From martin at v.loewis.de Sun Oct 15 14:59:57 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 15 Oct 2006 14:59:57 +0200 Subject: [Python-Dev] Cloning threading.py using proccesses In-Reply-To: References: <20061010130901.09B1.JCARLSON@uci.edu> <452CAA0C.6030306@egenix.com> <20061011090701.09CA.JCARLSON@uci.edu> <452D7E92.4050206@canterbury.ac.nz> <452ECF5F.7040204@canterbury.ac.nz> Message-ID: <453230CD.60408@v.loewis.de> Fredrik Lundh schrieb: > but given that the format *has* been stable for many years, surely it > would make more sense to just codify that fact, rather than developing > Yet Another Serialization Format instead? There have been minor changes over time, e.g. 
r26146 (gvanrossum) introduced TYPE_TRUE and TYPE_FALSE, r36242 (loewis) introduced TYPE_INTERNED and TYPE_STRINGREF, and r38266 (rhettinger) introduced TYPE_SET and TYPE_FROZENSET. With these changes, old dumps can load in new versions, but not vice versa. Furthermore, r27219 (nnorwitz) changed the co_argcount, co_nlocals, co_stacksize, co_flags, and co_firstlineno fields from short to long; unmarshalling from an old version would just crash/read garbage. So how would you propose to deal with such changes in the future? Regards, Martin From martin at v.loewis.de Sun Oct 15 15:13:21 2006 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 15 Oct 2006 15:13:21 +0200 Subject: [Python-Dev] os.utime on directories: bug fix or new feature? Message-ID: <453233F1.2070202@v.loewis.de> In Python 2.5.0 and earlier, it is not possible to modify the time stamps of a directory (mtime and atime) on Windows. The reason is that you cannot "open" (CreateFile) a directory. On W9x, it isn't possible, period. On WNT+, it's possible if you pass FILE_FLAG_BACKUP_SEMANTICS to CreateFile. I just applied patch #1576166 to the trunk which does that. Should I backport the patch to 2.5, as it is a bug that you can modify the time stamps of regular files but not directories? Or should I not backport as it is a new feature that you can now adjust the time stamps of a directory, and couldn't before? Anthony, can you please pronounce? Regards, Martin From aahz at pythoncraft.com Sun Oct 15 15:35:21 2006 From: aahz at pythoncraft.com (Aahz) Date: Sun, 15 Oct 2006 06:35:21 -0700 Subject: [Python-Dev] os.utime on directories: bug fix or new feature? In-Reply-To: <453233F1.2070202@v.loewis.de> References: <453233F1.2070202@v.loewis.de> Message-ID: <20061015133521.GA22874@panix.com> On Sun, Oct 15, 2006, "Martin v. L?wis" wrote: > > Should I backport the patch to 2.5, as it is a bug that you can modify > the time stamps of regular files but not directories? 
Or should I > not backport as it is a new feature that you can now adjust the time > stamps of a directory, and couldn't before? My vote is that it's a bugfix but should be treated as a new feature and rejected for 2.5, based on the standard argument about capabilities and the problems with bugfix releases having new capabilities. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "If you don't know what your program is supposed to do, you'd better not start writing it." --Dijkstra From anthony at interlink.com.au Sun Oct 15 16:01:42 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Mon, 16 Oct 2006 00:01:42 +1000 Subject: [Python-Dev] os.utime on directories: bug fix or new feature? In-Reply-To: <20061015133521.GA22874@panix.com> References: <453233F1.2070202@v.loewis.de> <20061015133521.GA22874@panix.com> Message-ID: <200610160001.45322.anthony@interlink.com.au> On Sunday 15 October 2006 23:35, Aahz wrote: > On Sun, Oct 15, 2006, "Martin v. L?wis" wrote: > > Should I backport the patch to 2.5, as it is a bug that you can modify > > the time stamps of regular files but not directories? Or should I > > not backport as it is a new feature that you can now adjust the time > > stamps of a directory, and couldn't before? > > My vote is that it's a bugfix but should be treated as a new feature and > rejected for 2.5, based on the standard argument about capabilities and > the problems with bugfix releases having new capabilities. Since it wasn't possible in earlier than 2.5 either, I'd say it's on the edge of being a bugfix. Let's be conservative and not backport it, since it's also a pretty marginal feature. Anthony -- Anthony Baxter It's never too late to have a happy childhood. 
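The directory-timestamp behavior under discussion is easy to check from Python. A small sketch (my illustration, not from the thread) of applying os.utime to a directory; on POSIX this has always worked, while on Windows it needed the FILE_FLAG_BACKUP_SEMANTICS change from patch #1576166:

```python
# Sketch: setting a directory's access/modification times with os.utime.
# Works on POSIX everywhere; on Windows only with the CreateFile
# FILE_FLAG_BACKUP_SEMANTICS fix discussed above.
import os
import tempfile

d = tempfile.mkdtemp()
try:
    os.utime(d, (1160000000, 1160000000))  # (atime, mtime), seconds since epoch
    assert int(os.stat(d).st_mtime) == 1160000000
finally:
    os.rmdir(d)
```

Before the fix, the os.utime call above raised an access-denied error on Windows for directories, even though it succeeded for regular files.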
From barry at barrys-emacs.org Sun Oct 15 20:50:22 2006 From: barry at barrys-emacs.org (Barry Scott) Date: Sun, 15 Oct 2006 19:50:22 +0100 Subject: [Python-Dev] Problem building module against Mac Python 2.4 and Python 2.5 Message-ID: <94B4C274-1414-4AD0-AE70-E16DB2290E65@barrys-emacs.org> This may be down to my lack of knowledge of Mac OS X development. I want to build my python extension for Python 2.3, 2.4 and 2.5 on the same Mac. Building against Python 2.3 and Python 2.4 has been working well for a long time.
But > after I installed Python 2.5 it seems that I can no longer link a > against Python 2.4 > without changing sym link /Library/Frameworks/Python.framework/ > Versions/Current > to point at the one I want to build against. > > The problem did not arise with Python 2.3 and Python 2.4 because > Python 2.3 > is in /System/Library and Python 2.4 is in /LIbrary. Telling ld which > framework > folder to look in allows both to be linked against. > > Is there a way to force ld to use a particular version of the python > framework or do > I have to change the symlink each time I build against a different > version? > > This type of problem does not happen on Windows or Unix by design. Use an absolute path to the library rather than -framework. Or use distutils! -bob From ronaldoussoren at mac.com Sun Oct 15 22:11:12 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Sun, 15 Oct 2006 22:11:12 +0200 Subject: [Python-Dev] Problem building module against Mac Python 2.4 and Python 2.5 In-Reply-To: <6a36e7290610151241y55e1078dx5f11126e31bbb01f@mail.gmail.com> References: <94B4C274-1414-4AD0-AE70-E16DB2290E65@barrys-emacs.org> <6a36e7290610151241y55e1078dx5f11126e31bbb01f@mail.gmail.com> Message-ID: <58929AAE-9357-4EE5-BD46-8597A343AACE@mac.com> On Oct 15, 2006, at 9:41 PM, Bob Ippolito wrote: > On 10/15/06, Barry Scott wrote: >> This may be down to my lack of knowledge of Mac OS X development. >> >> I want to build my python extension for Python 2.3, 2.4 and 2.5 on >> the same Mac. >> Build Python 2.3 and Python 2.4 has been working well for a long >> time. But >> after I installed Python 2.5 it seems that I can no longer link a >> against Python 2.4 >> without changing sym link /Library/Frameworks/Python.framework/ >> Versions/Current >> to point at the one I want to build against. >> >> The problem did not arise with Python 2.3 and Python 2.4 because >> Python 2.3 >> is in /System/Library and Python 2.4 is in /LIbrary. 
Telling ld which >> framework >> folder to look in allows both to be linked against. >> >> Is there a way to force ld to use a particular version of the python >> framework or do >> I have to change the symlink each time I build against a different >> version? >> >> This type of problem does not happen on Windows or Unix by design. > > Use an absolute path to the library rather than -framework. That is, add '/Library/Frameworks/Python.framework/Versions/2.4/ Python' to the link command instead of '-framework Python'. > > Or use distutils! That's definitely advisable anyway, that way you'll automaticly get the right flags to compile and link the extension :-) Ronald -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3562 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061015/646f0311/attachment.bin From kbk at shore.net Mon Oct 16 04:20:15 2006 From: kbk at shore.net (Kurt B. 
Kaiser) Date: Sun, 15 Oct 2006 22:20:15 -0400 (EDT) Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200610160220.k9G2KFNN020854@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches :  431 open ( +3) / 3425 closed ( +8) / 3856 total (+11)
Bugs    :  916 open (-23) / 6273 closed (+44) / 7189 total (+21)
RFE     :  244 open ( +4) /  240 closed ( +1) /  484 total ( +5)

New / Reopened Patches
______________________

typo in PC/_msi.c (2006-10-07)
CLOSED http://python.org/sf/1572724  opened by  jose nazario

Fix for segfault in ISO 2022 codecs (2006-10-07)
CLOSED http://python.org/sf/1572832  opened by  Ray Chason

let quit and exit really exit (2006-10-09)
CLOSED http://python.org/sf/1573835  opened by  Gerrit Holl

urllib2 - Fix line breaks in authorization headers (2006-10-09)
       http://python.org/sf/1574068  opened by  Scott Dial

Add %var% support to ntpath.expandvars (2006-10-09)
       http://python.org/sf/1574252  opened by  Chip Norkus

Mailbox will lock properly after flush() (2006-10-11)
       http://python.org/sf/1575506  opened by  Philippe Gauthier

Support spawnvp[e] + use native execvp[e] on win32 (2006-10-12)
       http://python.org/sf/1576120  opened by  Snaury

os.utime acess denied with directories on win32 (2006-10-12)
CLOSED http://python.org/sf/1576166  opened by  Snaury

os.execvp[e] on win32 fails for current directory (2006-10-13)
       http://python.org/sf/1576313  opened by  Snaury

Fix VC6 build, remove redundant files for VC7 build (2006-10-14)
CLOSED http://python.org/sf/1576954  opened by  Larry Hastings

Fix VC6 build, remove redundant files for VC7 build (2006-10-14)
CLOSED http://python.org/sf/1577078  opened by  Larry Hastings

Add _ctypes, _ctypes_test, and _elementtree to VC6 build (2006-10-15)
CLOSED http://python.org/sf/1577551  opened by  Larry Hastings

newline in -DSVNVERSION=\"`LANG=C svnversion .`\" (2006-10-15)
CLOSED http://python.org/sf/1577756  opened by  Daniel Str?nger

Patches Closed
______________

typo in PC/_msi.c (2006-10-07)
       http://python.org/sf/1572724  closed by  gbrandl

Fix for segfault in ISO 2022 codecs (2006-10-08)
       http://python.org/sf/1572832  closed by  perky

fix crash with continue in nested try/finally (2006-08-18)
       http://python.org/sf/1542451  closed by  gbrandl

let quit and exit really exit (2006-10-09)
       http://python.org/sf/1573835  closed by  mwh

Fix for Lib/test/crashers/gc_inspection.py (2006-07-04)
       http://python.org/sf/1517042  closed by  gbrandl

os.utime acess denied with directories on win32 (2006-10-12)
       http://python.org/sf/1576166  closed by  loewis

Fix VC6 build, remove redundant files for VC7 build (2006-10-14)
       http://python.org/sf/1576954  closed by  loewis

Fix VC6 build, remove redundant files for VC7 build (2006-10-14)
       http://python.org/sf/1577078  deleted by  lhastings

Add _ctypes, _ctypes_test, and _elementtree to VC6 build (2006-10-15)
       http://python.org/sf/1577551  closed by  loewis

newline in -DSVNVERSION=\"`LANG=C svnversion .`\" (2006-10-15)
       http://python.org/sf/1577756  deleted by  schmaller

New / Reopened Bugs
___________________

cElementTree.SubElement doesn't recognize keyword "attrib" (2006-10-07)
CLOSED http://python.org/sf/1572710  opened by  Mark Stephens

import org.python.core imports local org.py (2006-10-08)
CLOSED http://python.org/sf/1573180  opened by  E.-O. Le Bigot

ctypes unit test fails (test_macholib.py) under MacOS 10.4.7 (2006-08-21)
CLOSED http://python.org/sf/1544102  reopened by  ronaldoussoren

struct module doesn't use weakref for cache (2006-10-08)
CLOSED http://python.org/sf/1573394  opened by  Mark Flacy

sqlite3 documentation on rowcount is contradictory (2006-10-10)
       http://python.org/sf/1573854  opened by  Seo Sanghyeon

if(status = ERROR_MORE_DATA) (2006-10-09)
CLOSED http://python.org/sf/1573928  opened by  Helmut Grohne

WSGI, cgi.FieldStorage incompatibility (2006-10-09)
       http://python.org/sf/1573931  opened by  Michael Kerrin

isinstance swallows exceptions (2006-10-09)
       http://python.org/sf/1574217  opened by  Brian Harring

os.popen with os.close gives error message (2006-10-10)
       http://python.org/sf/1574310  opened by  dtrosset

Error with callback function and as_parameter with NumPy ndp (2006-10-10)
       http://python.org/sf/1574584  opened by  Albert Strasheim

ctypes: Pointer-to-pointer unchanged in callback (2006-10-10)
       http://python.org/sf/1574588  opened by  Albert Strasheim

ctypes: Returning c_void_p from callback doesn't work (2006-10-10)
       http://python.org/sf/1574593  opened by  Albert Strasheim

Request wave support > 16 bit samples (2006-10-11)
       http://python.org/sf/1575020  opened by  Murray Lang

isSequenceType returns True for dict subclasses (<> 2.3) (2006-10-11)
       http://python.org/sf/1575169  opened by  Martin Gfeller

typo: section 2.1 -> property (2006-10-12)
CLOSED http://python.org/sf/1575746  opened by  Antoine De Groote

Missing notice on environment setting LD_LIBRARY_PATH (2006-10-12)
CLOSED http://python.org/sf/1575803  opened by  Anastasios Hatzis

from_param and _as_parameter_ truncating 64-bit value (2006-10-12)
       http://python.org/sf/1575945  opened by  Albert Strasheim

str(WindowsError) wrong (2006-10-12)
       http://python.org/sf/1576174  opened by  Thomas Heller

ConfigParser: whitespace leading comment lines (2006-10-12)
       http://python.org/sf/1576208  opened by  gregwillden

functools.wraps fails on builtins (2006-10-12)
       http://python.org/sf/1576241  opened by  kajiuma

Example typo in section 4 of 'Installing Python Modules' (2006-10-13)
       http://python.org/sf/1576348  opened by  ytrewq1

enable-shared .dso location (2006-10-12)
CLOSED http://python.org/sf/1576394  opened by  Mike Klaas

cStringIO misbehaving with unicode (2006-10-13)
CLOSED http://python.org/sf/1576443  opened by  Yang Zhang

ftplib doesn't follow standard (2006-10-13)
       http://python.org/sf/1576598  opened by  Denis S. Otkidach

dict keyerror formatting and tuples (2006-10-13)
       http://python.org/sf/1576657  opened by  M.-A. Lemburg

potential buffer overflow in complexobject.c (2006-10-13)
       http://python.org/sf/1576861  opened by  Jochen Voss

GetFileAttributesExA and Win95 (2006-09-29)
CLOSED http://python.org/sf/1567666  reopened by  giomach

Bugs Closed
___________

csv "dialect = 'excel-tab'" to use excel_tab (2006-10-06)
       http://python.org/sf/1572471  closed by  montanaro

cElementTree.SubElement doesn't recognize keyword "attrib" (2006-10-07)
       http://python.org/sf/1572710  closed by  effbot

tabs missing in idle options configure (2006-09-28)
       http://python.org/sf/1567450  closed by  kbk

IDLE doesn't load - apparently without firewall problems (2006-09-22)
       http://python.org/sf/1563630  closed by  kbk

Let assign to as raise SyntaxWarning as well (2003-02-23)
       http://python.org/sf/691733  closed by  nnorwitz

cvs update warnings (2003-07-02)
       http://python.org/sf/764447  closed by  nnorwitz

Minor floatobject.c bug (2003-08-15)
       http://python.org/sf/789159  closed by  nnorwitz

another threads+readline+signals nasty (2004-06-11)
       http://python.org/sf/971213  closed by  nnorwitz

init_types (2006-09-30)
       http://python.org/sf/1568243  closed by  gbrandl

import org.python.core imports local org.py (2006-10-08)
       http://python.org/sf/1573180  closed by  gbrandl

PGIRelease linkage fails on pgodb80.dll (2006-10-02)
       http://python.org/sf/1569517  closed by  krisvale

missing _typesmodule.c,Visual Studio 2005 pythoncore.vcproj (2006-09-29)
       http://python.org/sf/1567910  closed by  krisvale

ctypes unit test fails (test_macholib.py) under MacOS 10.4.7 (2006-08-21)
       http://python.org/sf/1544102  closed by  ronaldoussoren

Tutorial: incorrect info about package importing and mac (2006-09-17)
       http://python.org/sf/1560114  closed by  gbrandl

struct module doesn't use weakref for cache (2006-10-08)
       http://python.org/sf/1573394  closed by  etrepum

setup() keyword have to be list (doesn't work with tuple) (2006-08-23)
       http://python.org/sf/1545341  closed by  akuchling

if(status = ERROR_MORE_DATA) (2006-10-09)
       http://python.org/sf/1573928  closed by  gbrandl

os.stat() subsecond file mode time is incorrect on Windows (2006-09-25)
       http://python.org/sf/1565150  closed by  loewis

2.6 changes stomp on 2.5 docs (2006-09-23)
       http://python.org/sf/1564039  closed by  sf-robot

Cannot use high-numbered sockets in 2.4.3 (2006-05-24)
       http://python.org/sf/1494314  closed by  anthonybaxter

typo: section 2.1 -> property (2006-10-12)
       http://python.org/sf/1575746  closed by  gbrandl

-Qnew switch doesn't work (2003-09-26)
       http://python.org/sf/813342  closed by  gbrandl

sets missing from standard types list in ref (2006-09-26)
       http://python.org/sf/1565919  closed by  gbrandl

make plistlib.py available in every install (2006-09-25)
       http://python.org/sf/1565129  closed by  gbrandl

inspect module and class startlineno (2006-09-01)
       http://python.org/sf/1550524  closed by  gbrandl

Pdb parser bug (2006-08-30)
       http://python.org/sf/1549574  closed by  gbrandl

Missing notice on environment setting LD_LIBRARY_PATH (2006-10-12)
       http://python.org/sf/1575803  closed by  loewis

shlex (or perhaps cStringIO) and unicode strings (2006-08-29)
       http://python.org/sf/1548891  closed by  gbrandl

Build of 2.4.3 on fedora core 5 fails to find asm/msr.h (2006-09-03)
       http://python.org/sf/1551238  closed by  gbrandl

urlparse.urljoin odd behaviour (2006-08-25)
       http://python.org/sf/1546628  closed by  gbrandl

inconsistent treatment of NULs in int() (2006-08-23)
       http://python.org/sf/1545497  closed by  gbrandl

Move fpectl elsewhere in library reference (2006-09-11)
       http://python.org/sf/1556261  closed by  gbrandl

Fix Lib/test/test___all__.py (2006-08-22)
       http://python.org/sf/1544295  closed by  gbrandl

smeared title when installing (2004-10-18)
       http://python.org/sf/1049615  closed by  gbrandl

nit for builtin sum doc (2005-09-07)
       http://python.org/sf/1283491  closed by  gbrandl

site-packages & build-dir python (2002-07-25)
       http://python.org/sf/586700  closed by  gbrandl

__name__ doesn't show up in dir() of class (2006-08-03)
       http://python.org/sf/1534014  closed by  gbrandl

Interpreter crash: filter() + gc.get_referrers() (2006-07-05)
       http://python.org/sf/1517663  closed by  gbrandl

Better/faster implementation of os.path.basename/dirname (2006-09-17)
       http://python.org/sf/1560179  closed by  gbrandl

enable-shared .dso location (2006-10-13)
       http://python.org/sf/1576394  closed by  loewis

cStringIO misbehaving with unicode (2006-10-13)
       http://python.org/sf/1576443  closed by  gbrandl

GetFileAttributesExA and Win95 (2006-09-29)
       http://python.org/sf/1567666  closed by  loewis

site-packages isn't created before install_egg_info (2006-09-27)
       http://python.org/sf/1566719  closed by  sf-robot

New / Reopened RFE
__________________

release GIL while doing I/O operations in the mmap module (2006-10-08)
       http://python.org/sf/1572968  opened by  Lukas Lalinsky

RFE Closed
__________

Print identical floats consistently (2006-08-05)
       http://python.org/sf/1534942  closed by  gbrandl

From kristjan at ccpgames.com Mon Oct 16 15:07:09 2006 From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_V=2E_J=F3nsson?=) Date: Mon, 16 Oct 2006 13:07:09 -0000 Subject: [Python-Dev] Python 2.5 performance Message-ID: <129CEF95A523704B9D46959C922A280002FE99FE@nemesis.central.ccp.cc> Well, it ought to be possible. I can turn off the instrumentation on the other modules, and see what happens. K > -----Original Message----- > From: Giovanni Bajo [mailto:rasky at develer.com] > Sent: 12. okt?ber 2006 20:30 > To: Kristj?n V.
J?nsson > Cc: python-dev at python.org > Subject: Re: Python 2.5 performance > > Kristj?n V. J?nsson wrote: > > > This is an improvement of another 3.5 %. > > In all, we have a performance increase of more than 10%. > > Granted, this is from a single set of runs, but I think we should > > start considering to make PCBuild8 a "supported" build. > > Kristj?n, I wonder if the performance improvement comes from > ceval.c only (or maybe a few other selected files). Is it > possible to somehow link the PGO-optimized ceval.obj into the > VS2003 project? > -- > Giovanni Bajo > > From kristjan at ccpgames.com Mon Oct 16 15:09:44 2006 From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_V=2E_J=F3nsson?=) Date: Mon, 16 Oct 2006 13:09:44 -0000 Subject: [Python-Dev] Python 2.5 performance Message-ID: <129CEF95A523704B9D46959C922A280002FE99FF@nemesis.central.ccp.cc> I must confess that I am not familiar with the buildbots. I could imagine that it would be difficult to set up internally due to security concerns, but I can voice the issue here. K > -----Original Message----- > From: Anthony Baxter [mailto:anthony at interlink.com.au] > Sent: 12. okt?ber 2006 21:13 > To: python-dev at python.org > Cc: Martin v. L?wis; Kristj?n V. J?nsson > Subject: Re: [Python-Dev] Python 2.5 performance > > On Friday 13 October 2006 07:00, Martin v. L?wis wrote: > > Kristj?n V. J?nsson schrieb: > > > This is an improvement of another 3.5 %. > > > In all, we have a performance increase of more than 10%. > > > Granted, this is from a single set of runs, but I think we should > > > start considering to make PCBuild8 a "supported" build. > > > > What do you mean by that? That Python 2.5.1 should be > compiled with VC > > 2005? Something else (if so, what)? > > I don't think we should switch the "official" compiler for a > point release. 
> I'm happy to say something like "we make the PCbuild8 > environment a supported compiler", which means we need, at a > bare minimum, a buildbot slave for that compiler/platform. > Kristj?n, is this something you can offer? > > Without a buildbot for that compiler, I don't think we can > claim it's supported. There's plenty of platforms we > "support" which don't have buildslaves, but they're all > variants of Unix - I'm happy that they are all mostly[1] sane. > > Anthony > > [1] Offer void on some versions of HP/UX, Irix, AIX > -- > Anthony Baxter > It's never too late to have a happy childhood. > From martin at v.loewis.de Mon Oct 16 21:37:56 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 16 Oct 2006 21:37:56 +0200 Subject: [Python-Dev] Promoting PCbuild8 (Was: Python 2.5 performance) In-Reply-To: <129CEF95A523704B9D46959C922A280002FE99FF@nemesis.central.ccp.cc> References: <129CEF95A523704B9D46959C922A280002FE99FF@nemesis.central.ccp.cc> Message-ID: <4533DF94.9010404@v.loewis.de> Kristj?n V. J?nsson schrieb: > I must confess that I am not familiar with the buildbots. The challenge and work-load is primarily initially in setting it up; in this case (for PCbuild8), there is work for both the master and the slave sides (probably, new scripts in Tools/buildbot will have to be created). > I could > imagine that it would be difficult to set up internally due to > security concerns, but I can voice the issue here. It's not mandatory, of course: neither that there is a PCbuild8 buildbot at all, or that it is hosted at ccpgames. It just would reduce the chance that breakage of PCbuild8 goes unnoticed for long. As for the security concerns: the buildbot slave actively opens a networking connection to the master; you don't have to open any additional ports on your firewalls. 
Of course, the master can send the slave arbitrary commands to execute, so if the master is taken over by some attacker, that attacker could easily get control over all slaves also (except that you want to run the slave in a restricted account, so that the attacker would have to find a hole in the slave's operating system, also, before taking the machine over completely). As for making VS 2005 "more official": you also might have meant that the PCbuild directory should be converted to VS 2005. That would have a number of implications (on the buildbots, on changes to Tools/msi, and on potential usage of VS 2007 for Python 2.6), which need to be discussed when somebody actually proposes such a change. Regards, Martin From barry at barrys-emacs.org Mon Oct 16 21:57:04 2006 From: barry at barrys-emacs.org (Barry Scott) Date: Mon, 16 Oct 2006 20:57:04 +0100 Subject: [Python-Dev] Problem building module against Mac Python 2.4 and Python 2.5 In-Reply-To: <58929AAE-9357-4EE5-BD46-8597A343AACE@mac.com> References: <94B4C274-1414-4AD0-AE70-E16DB2290E65@barrys-emacs.org> <6a36e7290610151241y55e1078dx5f11126e31bbb01f@mail.gmail.com> <58929AAE-9357-4EE5-BD46-8597A343AACE@mac.com> Message-ID: >> >> Use an absolute path to the library rather than -framework. > > That is, add '/Library/Frameworks/Python.framework/Versions/2.4/ > Python' to the link command instead of '-framework Python'. Thanks I'll update my builds to do that. >> >> Or use distutils! > > That's definitely advisable anyway, that way you'll automatically get > the right flags to compile and link the extension :-) I call distutils to get some information for CFLAGS and include dirs. I'll look at what I get back for libs and update my build script. All my code is C++ and in the past distutils lacked C++ support so I could not use it and have developed my own solution to the build problem. Does distutils work for C++ code these days?
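[For readers following the archive: distutils of this era does dispatch .cpp sources to the C++ compiler based on the file suffix, though correct C++ linking was historically its weak spot, which is presumably what Barry ran into. A minimal setup.py sketch, with an invented module and file name purely for illustration:]

```python
# Hypothetical setup.py sketch for a C++ extension built with distutils.
# The module name "_barrytest" and its source file are made up here;
# real flags would come from the interpreter's sysconfig data.
from distutils.core import setup, Extension

ext = Extension(
    "_barrytest",               # name of the resulting extension module
    sources=["barrytest.cpp"],  # the .cpp suffix selects the C++ driver
    language="c++",             # make the language choice explicit
)

setup(name="barrytest", version="0.1", ext_modules=[ext])
```

Run as `python setup.py build_ext`; whether the link step picks the C++ linker correctly is exactly the question Barry raises.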
Barry From fredrik at pythonware.com Tue Oct 17 10:54:29 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 17 Oct 2006 10:54:29 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun References: <200610121808.47010.anthony@interlink.com.au> <200610130528.01672.anthony@interlink.com.au> <452EAAAE.2050200@v.loewis.de> <45301227.5020805@v.loewis.de><1f7befae0610131653h52fd3bfcnd33af7c08f6fe9d@mail.gmail.com> <45307CA3.1070100@v.loewis.de> Message-ID: Martin v. L?wis wrote: > In 2.3.6, there wouldn't just be that change, but also a few other > changes that have been collected, some relevant for Windows as well why not just do a "2.3.5+security" source release, and leave the rest to the downstream maintainers? From anthony at interlink.com.au Tue Oct 17 11:02:10 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Tue, 17 Oct 2006 19:02:10 +1000 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <45307CA3.1070100@v.loewis.de> Message-ID: <200610171902.12840.anthony@interlink.com.au> On Tuesday 17 October 2006 18:54, Fredrik Lundh wrote: > Martin v. L?wis wrote: > > In 2.3.6, there wouldn't just be that change, but also a few other > > changes that have been collected, some relevant for Windows as well > > why not just do a "2.3.5+security" source release, and leave the rest to > the downstream maintainers? I think we'd need to renumber it to 2.3.6 at least, otherwise there's the problem of distinguishing between the two. I'd _hope_ that all the downstreams will have picked up the patch (if you know of someone who hasn't, let me know and I'll kick them for you if it would help). But I'm certainly thinking if there's a 2.3.6, it's going to be 2.3.5 with the email fix and the unicode repr() fix, and that's it. No windows or Mac binaries - they'll be pointed to the perfectly fine 2.3.5 binary installers. 
And no, I'm not doing another 2.2 release :) From fredrik at pythonware.com Tue Oct 17 11:03:43 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 17 Oct 2006 11:03:43 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun References: <200610121808.47010.anthony@interlink.com.au> <2514DA1C-F5A1-4144-9068-006A933C516C@python.org> <200610130714.00673.anthony@interlink.com.au> Message-ID: Steve Holden wrote: > Or you can start to promote Django again ... my original plan would still work, I think: http://effbot.org/zone/pydotorg-cache.htm#todo From fredrik at pythonware.com Tue Oct 17 11:09:20 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 17 Oct 2006 11:09:20 +0200 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun References: <200610121808.47010.anthony@interlink.com.au><45307CA3.1070100@v.loewis.de> <200610171902.12840.anthony@interlink.com.au> Message-ID: Anthony Baxter wrote: > > why not just do a "2.3.5+security" source release, and leave the rest to > > the downstream maintainers? > > I think we'd need to renumber it to 2.3.6 at least, otherwise there's the > problem of distinguishing between the two. I'd _hope_ that all the > downstreams will have picked up the patch (if you know of someone who hasn't, > let me know and I'll kick them for you if it would help). in my experience, downstream builders tend to deal with patches just fine; I'm more worried about people who build directly from tarballs (using the good old "wget, tar xvfz, configure, make" mental macro) > But I'm certainly thinking if there's a 2.3.6, it's going to be 2.3.5 with the > email fix and the unicode repr() fix, and that's it. sounds good to me. how much work would that be, and if you're willing to coordinate, is there anything we can do to help? 
From anthony at interlink.com.au Tue Oct 17 11:35:16 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Tue, 17 Oct 2006 19:35:16 +1000 Subject: [Python-Dev] 2.3.6 for the unicode buffer overrun In-Reply-To: References: <200610121808.47010.anthony@interlink.com.au> <200610171902.12840.anthony@interlink.com.au> Message-ID: <200610171935.18447.anthony@interlink.com.au> On Tuesday 17 October 2006 19:09, Fredrik Lundh wrote: > > But I'm certainly thinking if there's a 2.3.6, it's going to be 2.3.5 > > with the email fix and the unicode repr() fix, and that's it. > > sounds good to me. how much work would that be, and if you're willing to > coordinate, is there anything we can do to help? Less than a normal release, since I'm not going to worry about changing the docs, the windows installers or the mac installers. I'll look at it next week, once 2.4.4 final is done. Anthony -- Anthony Baxter It's never too late to have a happy childhood. From grig.gheorghiu at gmail.com Tue Oct 17 16:59:59 2006 From: grig.gheorghiu at gmail.com (Grig Gheorghiu) Date: Tue, 17 Oct 2006 07:59:59 -0700 Subject: [Python-Dev] svn.python.org down Message-ID: <3f09d5a00610170759y2dac4772t157ebeadb878ae1f@mail.gmail.com> FYI -- can't do svn checkouts/updates from the trunk at this point. starting svn operation svn update --revision HEAD in dir /home/twistbot/pybot/trunk.gheorghiu-x86/build (timeout 1200 secs) svn: PROPFIND request failed on '/projects/python/trunk' svn: PROPFIND of '/projects/python/trunk': could not connect to server (http://svn.python.org) Grig -- http://agiletesting.blogspot.com From kristjan at ccpgames.com Tue Oct 17 17:09:45 2006 From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_V=2E_J=F3nsson?=) Date: Tue, 17 Oct 2006 15:09:45 -0000 Subject: [Python-Dev] Promoting PCbuild8 (Was: Python 2.5 performance) Message-ID: <129CEF95A523704B9D46959C922A280002FE9A04@nemesis.central.ccp.cc> Okay, a buildbot then doesn't sound quite that scary. 
Any info somewhere on how to set one up on a windows box? I also wasn't suggesting that we change the PCBuild directory, since I think we definitely want to keep the old support. But I agree that getting regular builds running would be a good thing. An x64 box would be ideal to build both the x86 and x64 versions on. A single bot can manage many platforms, right? I would also need to get the _msi and _sqlite3 modules building (which I haven't yet, since I didn't get their sources.) Kristj?n > -----Original Message----- > From: "Martin v. L?wis" [mailto:martin at v.loewis.de] > Sent: 16. okt?ber 2006 19:38 > To: Kristj?n V. J?nsson > Cc: Anthony Baxter; python-dev at python.org > Subject: Re: [Python-Dev] Promoting PCbuild8 (Was: Python 2.5 > performance) > > Kristj?n V. J?nsson schrieb: > > I must confess that I am not familiar with the buildbots. > > The challenge and work-load is primarily initially in setting > it up; in this case (for PCbuild8), there is work for both > the master and the slave sides (probably, new scripts in > Tools/buildbot will have to be created). > > > I could > > imagine that it would be difficult to set up internally due to > > security concerns, but I can voice the issue here. > > It's not mandatory, of course: neither that there is a > PCbuild8 buildbot at all, or that it is hosted at ccpgames. > It just would reduce the chance that breakage of PCbuild8 > goes unnoticed for long. > > As for the security concerns: the buildbot slave actively > opens a networking connection to the master; you don't have > to open any additional ports on your firewalls.
> > As for making VS 2005 "more official": you also might have > meant that the PCbuild directory should be converted to VS 2005. > That would have a number of implications (on the buildbots, > on changes to Tools/msi, and on potential usage of VS 2007 > for Python 2.6), which need to be discussed when somebody > actually proposes such a change. > > Regards, > Martin > From anthony at interlink.com.au Tue Oct 17 17:28:53 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Wed, 18 Oct 2006 01:28:53 +1000 Subject: [Python-Dev] BRANCH FREEZE release24-maint, Wed 18th Oct, 00:00UTC Message-ID: <200610180128.57266.anthony@interlink.com.au> I'm declaring the branch frozen for 2.4.4 final from 00:00 UTC (that's about 8 hours from now). The release will either be Wednesday 18th or Thursday 19th. There's a blocking bug http://www.python.org/sf/1578513 - I've attached a patch for it, if someone with autoconf knowledge wants to review that it can be checked in. It _should_ be good, and probably needs to be applied to release25-maint and the trunk as well. Anthony -- Anthony Baxter It's never too late to have a happy childhood. From grig.gheorghiu at gmail.com Tue Oct 17 17:38:58 2006 From: grig.gheorghiu at gmail.com (Grig Gheorghiu) Date: Tue, 17 Oct 2006 08:38:58 -0700 Subject: [Python-Dev] Promoting PCbuild8 (Was: Python 2.5 performance) In-Reply-To: <129CEF95A523704B9D46959C922A280002FE9A04@nemesis.central.ccp.cc> References: <129CEF95A523704B9D46959C922A280002FE9A04@nemesis.central.ccp.cc> Message-ID: <3f09d5a00610170838w40217aa9s7f9ac713c6e0c866@mail.gmail.com> On 10/17/06, Kristj?n V. J?nsson wrote: > > Okay, a buildbot then doesn't sound quite that scary. Any info somewhere on how to set one up on a windows box? 
> http://wiki.python.org/moin/BuildbotOnWindows Grig From anthony at interlink.com.au Tue Oct 17 17:48:07 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Wed, 18 Oct 2006 01:48:07 +1000 Subject: [Python-Dev] svn.python.org down In-Reply-To: <3f09d5a00610170759y2dac4772t157ebeadb878ae1f@mail.gmail.com> References: <3f09d5a00610170759y2dac4772t157ebeadb878ae1f@mail.gmail.com> Message-ID: <200610180148.12006.anthony@interlink.com.au> On Wednesday 18 October 2006 00:59, Grig Gheorghiu wrote: > FYI -- can't do svn checkouts/updates from the trunk at this point. > > starting svn operation > svn update --revision HEAD > in dir /home/twistbot/pybot/trunk.gheorghiu-x86/build (timeout 1200 secs) > svn: PROPFIND request failed on '/projects/python/trunk' > svn: PROPFIND of '/projects/python/trunk': could not connect to server > (http://svn.python.org) It works for me. Can you connect to port 22 on svn.python.org? From grig.gheorghiu at gmail.com Tue Oct 17 17:51:07 2006 From: grig.gheorghiu at gmail.com (Grig Gheorghiu) Date: Tue, 17 Oct 2006 08:51:07 -0700 Subject: [Python-Dev] svn.python.org down In-Reply-To: <200610180148.12006.anthony@interlink.com.au> References: <3f09d5a00610170759y2dac4772t157ebeadb878ae1f@mail.gmail.com> <200610180148.12006.anthony@interlink.com.au> Message-ID: <3f09d5a00610170851ic91cdf2x7a9e6a5a687775df@mail.gmail.com> On 10/17/06, Anthony Baxter wrote: > On Wednesday 18 October 2006 00:59, Grig Gheorghiu wrote: > > FYI -- can't do svn checkouts/updates from the trunk at this point. > > > > starting svn operation > > svn update --revision HEAD > > in dir /home/twistbot/pybot/trunk.gheorghiu-x86/build (timeout 1200 secs) > > svn: PROPFIND request failed on '/projects/python/trunk' > > svn: PROPFIND of '/projects/python/trunk': could not connect to server > > (http://svn.python.org) > > It works for me. Can you connect to port 22 on svn.python.org? 
> I can connect with ssh, but svn checkouts fail across the board for all pybots buildslaves: http://www.python.org/dev/buildbot/community/all/ Grig From p.f.moore at gmail.com Tue Oct 17 17:51:18 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 17 Oct 2006 16:51:18 +0100 Subject: [Python-Dev] svn.python.org down In-Reply-To: <200610180148.12006.anthony@interlink.com.au> References: <3f09d5a00610170759y2dac4772t157ebeadb878ae1f@mail.gmail.com> <200610180148.12006.anthony@interlink.com.au> Message-ID: <79990c6b0610170851q7a0cec02h378a449d466fe7f1@mail.gmail.com> On 10/17/06, Anthony Baxter wrote: > On Wednesday 18 October 2006 00:59, Grig Gheorghiu wrote: > > FYI -- can't do svn checkouts/updates from the trunk at this point. > > > > starting svn operation > > svn update --revision HEAD > > in dir /home/twistbot/pybot/trunk.gheorghiu-x86/build (timeout 1200 secs) > > svn: PROPFIND request failed on '/projects/python/trunk' > > svn: PROPFIND of '/projects/python/trunk': could not connect to server > > (http://svn.python.org) > > It works for me. Can you connect to port 22 on svn.python.org? I think it's the HTTP side of things. The ViewCVS interface isn't working either. Paul. From anthony at interlink.com.au Tue Oct 17 17:54:33 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Wed, 18 Oct 2006 01:54:33 +1000 Subject: [Python-Dev] svn.python.org down In-Reply-To: <200610180148.12006.anthony@interlink.com.au> References: <3f09d5a00610170759y2dac4772t157ebeadb878ae1f@mail.gmail.com> <200610180148.12006.anthony@interlink.com.au> Message-ID: <200610180154.34925.anthony@interlink.com.au> Ah - the svn-apache server was down. I've restarted it. We should probably put some monitoring/restarting in place for those servers - if someone wants to volunteer a script I'll add it to cron, or I'll write it myself when I get a chance. 
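[A cron-able watchdog along the lines Anthony asks for could be as small as the sketch below. The probe URL, logger usage, and the apache init-script path are all assumptions for illustration, not the actual python.org setup:]

```shell
#!/bin/sh
# Watchdog sketch for the svn-apache front end: probe the HTTP interface
# and restart apache only when the probe fails.  Paths are assumptions.

URL="http://svn.python.org/projects/"

probe() {
    # succeed only if the URL answers within 30 seconds
    curl -fsS --max-time 30 -o /dev/null "$1"
}

restart_server() {
    logger "svn-apache not responding; restarting"
    /etc/init.d/apache2 restart
}

watchdog() {
    probe "$1" || restart_server
}

# watchdog "$URL"   # <- what the cron entry would invoke
```

Run from cron every few minutes, a healthy server costs one request per run and nothing is restarted.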
(I was testing with svn+ssh, it was the http version that was down) Anthony From p.f.moore at gmail.com Tue Oct 17 17:57:52 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 17 Oct 2006 16:57:52 +0100 Subject: [Python-Dev] svn.python.org down In-Reply-To: <200610180154.34925.anthony@interlink.com.au> References: <3f09d5a00610170759y2dac4772t157ebeadb878ae1f@mail.gmail.com> <200610180148.12006.anthony@interlink.com.au> <200610180154.34925.anthony@interlink.com.au> Message-ID: <79990c6b0610170857y268f7b40ye519c007c065dd59@mail.gmail.com> On 10/17/06, Anthony Baxter wrote: > Ah - the svn-apache server was down. I've restarted it. We should probably put > some monitoring/restarting in place for those servers - if someone wants to > volunteer a script I'll add it to cron, or I'll write it myself when I get a > chance. Working now. Thanks. Paul. From skip at pobox.com Tue Oct 17 18:32:26 2006 From: skip at pobox.com (skip at pobox.com) Date: Tue, 17 Oct 2006 11:32:26 -0500 Subject: [Python-Dev] svn.python.org down In-Reply-To: <200610180154.34925.anthony@interlink.com.au> References: <3f09d5a00610170759y2dac4772t157ebeadb878ae1f@mail.gmail.com> <200610180148.12006.anthony@interlink.com.au> <200610180154.34925.anthony@interlink.com.au> Message-ID: <17717.1434.974171.689688@montanaro.dyndns.org> Anthony> Ah - the svn-apache server was down. I've restarted it. We Anthony> should probably put some monitoring/restarting in place for Anthony> those servers - if someone wants to volunteer a script I'll add Anthony> it to cron, or I'll write it myself when I get a chance. Is this on a machine hosted by xs4all? If so, we can probably just ask them to monitor it from nagios (or whatever tool they use). Skip From brett at python.org Tue Oct 17 20:58:53 2006 From: brett at python.org (Brett Cannon) Date: Tue, 17 Oct 2006 11:58:53 -0700 Subject: [Python-Dev] who is interested on being on a python-dev panel at PyCon? 
Message-ID: For the past couple years there has been the suggestion of having a panel discussion made up of core developers at PyCon. Basically it would provide a way for the community to find out how we do things, where we are going, our views, etc. I have finally decided to step forward and try to organize such a panel. Steve Holden has already graciously stepped forward at my request to be the moderator. That means I just need to fill out the panel. =) Since I am organizing this I am also going to stick my neck out and be on the panel. AMK has also volunteered. Who else is interested? If you think you will be at PyCon (does not have to be a definite "yes" at the moment, just that you are hoping to) and are interested in participating, send me an email. Let me know how good your chances of attending PyCon are in case there are more people volunteering than would reasonably fit on the panel (I am guessing five people would be good, especially if we get folks who fill different roles on python-dev). -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061017/2e834f6d/attachment.html From martin at v.loewis.de Tue Oct 17 21:09:03 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 17 Oct 2006 21:09:03 +0200 Subject: [Python-Dev] Promoting PCbuild8 (Was: Python 2.5 performance) In-Reply-To: <129CEF95A523704B9D46959C922A280002FE9A04@nemesis.central.ccp.cc> References: <129CEF95A523704B9D46959C922A280002FE9A04@nemesis.central.ccp.cc> Message-ID: <45352A4F.2080203@v.loewis.de> Kristj?n V. J?nsson schrieb: > Okay, a buildbot then doesn't sound quite that scary. Any info > somewhere on how to set one up on a windows box? Sure. See http://wiki.python.org/moin/BuildbotOnWindows Feel free to make changes if you find the instructions need to be enhanced.
> I Also wasn't suggesting that we change the PCBuild directory, since > I think we definitely want to keep the old support. Well, at any point in time, there is one and only one "official" build procedure, which is the procedure used to make releases. If VS 2003 is not used anymore, a copy of it is made into PC, and the existing PCBuild directory is converted to the new procedure (or some other directory name is invented; that should *not* be PCbuild8 - we shouldn't have to rename directories each time we switch the compiler). > But I agree that > getting regular builds running would be a good thing. An x64 box > would be ideal to build both the x86 and x64 versions on. A single > bot can manage many platforms, right? A single machine, and a single buildbot installation, yes. But not a single build slave, since there can be only one build procedure per slave. So if we need different procedures (which we likely do: how else could it find out which of them it should do?), we would need two slaves. That should work fine, except that both slaves will typically start simultaneously on the machine, doubling the load. It's possible to tell the master not to build different branches on a single slave (i.e. 2.5 has to wait if trunk is building), but it's not possible to tell it that two slaves reside on the same machine (it might be possible, but I don't know how to do it). > I would also need to get the _msi and _sqlite3 modules building > (which I haven't yet, since I didn't get their sources.) You don't need any additional sources for _msi, and, in fact, my AMD64 and Itanium installers do provide _msi.pyd binaries. 
Regards, Martin From pandyacus at gmail.com Wed Oct 18 12:26:54 2006 From: pandyacus at gmail.com (Chetan Pandya) Date: Wed, 18 Oct 2006 03:26:54 -0700 Subject: [Python-Dev] PATCH submitted: Speed up + for string Re: PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom Message-ID: The discussion on this topic seems to have died down. However, I had a look at the patch and here are some comments: This has the potential to speed up simple string expressions like s = '1' + '2' + '3' + '4' + '5' + '6' + '7' + '8' However, if this is followed by s += '9' this (the 9th string) will cause rendering of the existing value of s and then create another concatenated string. This can, however, be changed, but I have not checked to see if it is worth it. The deallocation code needs to be robust for a complex tree - it is currently not recursive, but needs to be, like the concatenation code. Constructs like s = a + b + c + d + e, where a, b, etc. have been assigned string values earlier, will not benefit from the patch. If the values are generated and concatenated in a single expression, that is another type of construct that will benefit. There are some other changes needed that I can write up if needed. -Chetan On 10/13/06, python-dev-request at python.org wrote: > Date: Fri, 13 Oct 2006 12:02:06 -0700 > From: Josiah Carlson > Subject: Re: [Python-Dev] PATCH submitted: Speed up + for string > concatenation, now as fast as "".join(x) idiom > To: Larry Hastings , python-dev at python.org > Message-ID: <20061013115748.09F2.JCARLSON at uci.edu> > Content-Type: text/plain; charset="US-ASCII" > > > Larry Hastings wrote: > [snip] > > The machine is dual-core, and was quiescent at the time. XP's scheduler > > is hopefully good enough to just leave the process running on one core. > > It's not. Go into the task manager (accessible via Ctrl+Alt+Del by > default) and change the process' affinity to the second core.
In my > experience, running on the second core (in both 2k and XP) tends to > produce slightly faster results. Linux tends to keep processes on a > single core for a few seconds at a time. > > - Josiah > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061018/0aa8795b/attachment.html From kristjan at ccpgames.com Wed Oct 18 12:47:22 2006 From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_V=2E_J=F3nsson?=) Date: Wed, 18 Oct 2006 10:47:22 -0000 Subject: [Python-Dev] PATCH submitted: Speed up + for string Re: PATCHsubmitted: Speed up + for string concatenation, now as fast as "".join(x) idiom Message-ID: <129CEF95A523704B9D46959C922A280002FE9A10@nemesis.central.ccp.cc> Doesn't it end up in a call to PyString_Concat()? That should return a PyStringConcatenationObject too, right? K ________________________________ Construct like s = a + b + c + d + e , where a, b etc. have been assigned string values earlier will not benefit from the patch. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061018/e3d4f6ba/attachment.htm From pandyacus at gmail.com Wed Oct 18 20:01:35 2006 From: pandyacus at gmail.com (Chetan Pandya) Date: Wed, 18 Oct 2006 11:01:35 -0700 Subject: [Python-Dev] PATCH submitted: Speed up + for string Re: PATCHsubmitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: References: <129CEF95A523704B9D46959C922A280002FE9A10@nemesis.central.ccp.cc> Message-ID: My statement wasn't clear enough. Rendering occurs if the string being concatenated is already a concatenation object created by an earlier assignment. In s = a + b + c + d + e + f , there would be rendering of the source string if it is already a concatenation. 
Here is an example that would make it clear: a = "Value a =" a += "anything" # creates a concatenation c = a + b # This would cause rendering of a, and then c will become a concatenation between a and b. c += "Something" # This will not append to the concatenation object, but cause rendering of c and then it will create a concatenation between c and "Something", which will be assigned to c. Now if there is a series of assignments, (1) s = c + "something" # causes rendering of c (2) s += a # causes rendering of s and creates a new concatenation (3) s += b # causes rendering of s and creates a new concatenation (4) s += c # causes rendering of s and creates a new concatenation (5) print s # causes rendering of s If there is a list of strings created and then they are concatenated with +=, I would expect it to be slower because of the additional allocations involved in rendering. -Chetan On 10/18/06, Kristján V. Jónsson wrote: > > Doesn't it end up in a call to PyString_Concat()? That should return a > PyStringConcatenationObject too, right? > K > > ------------------------------ > > > Construct like s = a + b + c + d + e , where a, b etc. have been assigned > string values earlier will not benefit from the patch. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061018/18e0c506/attachment.html From scott+python-dev at scottdial.com Wed Oct 18 21:59:16 2006 From: scott+python-dev at scottdial.com (Scott Dial) Date: Wed, 18 Oct 2006 15:59:16 -0400 Subject: [Python-Dev] PATCH submitted: Speed up + for string Re: PATCHsubmitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: References: <129CEF95A523704B9D46959C922A280002FE9A10@nemesis.central.ccp.cc> Message-ID: <45368794.50508@scottdial.com> Chetan Pandya wrote: > My statement wasn't clear enough.
> > Rendering occurs if the string being concatenated is already a > concatenation object created by an earlier assignment. > I'm not sure how you came to that conclusion. My reading of the patch doesn't suggest that at all. The operation of string_concat would not produce what you suggest. I can't find anything anywhere that would cause the behavior you suggest. The only case where this is true is if the depth of the tree is too great. To revisit your example, I will notate concats as string pairs: a = "Value a =" # new string a += "anything" # ("Value a =", "anything") c = a + b # (("Value a =", "anything"), b) c += "Something" # ((("Value a =", "anything"), b), "Something") So again, for your other example of repeated right-hand concatenation, you do not continually render the concat object, you merely create a new one and attach to the leaves. Once the print is executed, you will force the rendering of the object, but only once that happens. So in contrast to your statement, there are actually fewer allocations of strings and smaller objects being allocated than the current trunk uses. -- Scott Dial scott at scottdial.com scodial at cs.indiana.edu From larry at hastings.org Wed Oct 18 22:04:14 2006 From: larry at hastings.org (Larry Hastings) Date: Wed, 18 Oct 2006 13:04:14 -0700 Subject: [Python-Dev] PATCH submitted: Speed up + for string Re: PATCHsubmitted: Speed up + for string concatenation, now as fast as "".join(x) idiom In-Reply-To: References: <129CEF95A523704B9D46959C922A280002FE9A10@nemesis.central.ccp.cc> Message-ID: <453688BE.2060509@hastings.org> Chetan Pandya wrote: > The deallocation code needs to be robust for a complex tree - it is > currently not recursive, but needs to be, like the concatenation code. It is already both those things. Deallocation is definitely recursive. See Objects/stringobject.c, function (*ahem*) recursive_dealloc.
That Py_DECREF() line is where it recurses into child string concatenation objects. You might have been confused because it is *optimized* for the general case, where the tree only recurses down the left-hand side. For the left-hand side it iterates, instead of recursing, which is both slightly faster and much more robust (unlikely to blow the stack). > Rendering occurs if the string being concatenated is already a > concatenation object created by an earlier assignment. Nope. Rendering only occurs when somebody asks for the string's value, not when merely concatenating. If you add nine strings together, the ninth one fails the "left side has room" test and creates a second object. Try stepping through it. Run Python interactively under the debugger. Let it get to the prompt. Execute some expression like "print 3", just so the interpreter creates its concatenated encoding object (I get "encodings.cp437"). Now, in the debugger, put a breakpoint in the rendering code in recursiveConcatenate(), and another on the "op = (PyStringConcatenationObject *)PyObject_MALLOC()" line in string_concat. Finally, go back to the Python console and concatenate nine strings with this code: x = "" for i in xrange(9): x += "a" You won't hit any breakpoints for rendering, and you'll hit the string concatenation object malloc line twice. (Note that for demonstration purposes, this code is more illustrative than running x = "a" + "b" ... + "i" because the peephole optimizer makes a constant folding pass. It's mostly harmless, but for my code it does mean I create concatenation objects more often.) In the interests of full disclosure, there is *one* scenario where pure string concatenation will cause it to render. Rendering or deallocating a recursive object that's too deep would blow the program stack, so I limit recursion depth on the right seven slots of the recursion object. That's what the "right recursion depth" field is used for. 
If you attempt to concatenate a string concatenation object that's already at the depth limit, it renders the deep object first. The depth limit is 2**14 right now. You can force this to happen by prepending like crazy: x = "" for i in xrange(2**15): x = "a" + x Since my code is careful to be only iterative when rendering and deallocating down the left-hand side of the tree, there is no depth limit for the left-hand side. Step before you leap, /larry/ From mike.klaas at gmail.com Wed Oct 18 22:02:42 2006 From: mike.klaas at gmail.com (Mike Klaas) Date: Wed, 18 Oct 2006 13:02:42 -0700 Subject: [Python-Dev] Segfault in python 2.5 Message-ID: <3d2ce8cb0610181302w5d87716btfd09da833e525c73@mail.gmail.com> [http://sourceforge.net/tracker/index.php?func=detail&aid=1579370&group_id=5470&atid=105470] Hello, I've managed to provoke a segfault in python2.5 (occasionally it's just an "invalid argument to internal function" error). I've posted a traceback and a general idea of what the code consists of in the sourceforge entry. Unfortunately, I've been attempting for hours to reduce the problem to a completely self-contained script, but it is resisting my efforts due to timing problems. Should I continue in that vein, or is it more useful to provide more detailed results from gdb? Thanks, -Mike From mwh at python.net Wed Oct 18 22:08:42 2006 From: mwh at python.net (Michael Hudson) Date: Wed, 18 Oct 2006 22:08:42 +0200 Subject: [Python-Dev] Segfault in python 2.5 In-Reply-To: <3d2ce8cb0610181302w5d87716btfd09da833e525c73@mail.gmail.com> (Mike Klaas's message of "Wed, 18 Oct 2006 13:02:42 -0700") References: <3d2ce8cb0610181302w5d87716btfd09da833e525c73@mail.gmail.com> Message-ID: <8764eh2xpx.fsf@starship.python.net> "Mike Klaas" writes: > [http://sourceforge.net/tracker/index.php?func=detail&aid=1579370&group_id=5470&atid=105470] > > Hello, > > I've managed to provoke a segfault in python2.5 (occasionally it's just an > "invalid argument to internal function" error).
I've posted a > traceback and a general idea of what the code consists of in the > sourceforge entry. I've been reading the bug report with interest, but unless I can reproduce it it's mighty hard for me to debug, as I'm sure you know. > Unfortunately, I've been attempting for hours to > reduce the problem to a completely self-contained script, but it is > resisting my efforts due to timing problems. > > Should I continue in that vein, or is it more useful to provide more > detailed results from gdb? Well, I don't think that there's much point in posting masses of details from gdb. You might want to try to fix the bug yourself, I guess, trying to figure out where the bad pointers come from, etc. Are you absolutely sure that the fault does not lie with any extension modules you may be using? Memory scribbling bugs have been known to cause arbitrarily confusing problems... Cheers, mwh -- I'm not sure that the ability to create routing diagrams similar to pretzels with mad cow disease is actually a marketable skill. -- Steve Levin -- http://home.xnet.com/~raven/Sysadmin/ASR.Quotes.html From Jack.Jansen at cwi.nl Thu Oct 19 00:23:43 2006 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Thu, 19 Oct 2006 00:23:43 +0200 Subject: [Python-Dev] Segfault in python 2.5 In-Reply-To: <8764eh2xpx.fsf@starship.python.net> References: <3d2ce8cb0610181302w5d87716btfd09da833e525c73@mail.gmail.com> <8764eh2xpx.fsf@starship.python.net> Message-ID: <7654C99F-5062-49AC-B604-0CBC9567A586@cwi.nl> On 18-Oct-2006, at 22:08 , Michael Hudson wrote: >> Unfortunately, I've been attempting for hours to >> reduce the problem to a completely self-contained script, but it is >> resisting my efforts due to timing problems. Has anyone ever tried to use helgrind (the valgrind module, not the heavy metal band:-) on Python?
-- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From mike.klaas at gmail.com Thu Oct 19 02:08:51 2006 From: mike.klaas at gmail.com (Mike Klaas) Date: Wed, 18 Oct 2006 17:08:51 -0700 Subject: [Python-Dev] Segfault in python 2.5 In-Reply-To: <8764eh2xpx.fsf@starship.python.net> References: <3d2ce8cb0610181302w5d87716btfd09da833e525c73@mail.gmail.com> <8764eh2xpx.fsf@starship.python.net> Message-ID: <3d2ce8cb0610181708i1eeb13b5qaf56488f406d4fc7@mail.gmail.com> On 10/18/06, Michael Hudson wrote: > "Mike Klaas" writes: > I've been reading the bug report with interest, but unless I can > reproduce it it's mighty hard for me to debug, as I'm sure you know. Indeed. > > Unfortunately, I've been attempting for hours to > > reduce the problem to a completely self-contained script, but it is > > resisting my efforts due to timing problems. > > > > Should I continue in that vein, or is it more useful to provide more > > detailed results from gdb? > > Well, I don't think that there's much point in posting masses of > details from gdb. You might want to try trying to fix the bug > yourself I guess, trying to figure out where the bad pointers come > from, etc. I've peered at the code, but my knowledge of the python core is superficial at best. The fact that it is occurring as a result of a long string of garbage collection/dealloc/etc. and involves threading lowers my confidence further. That said, I'm beginning to think that to reproduce this in a standalone script will require understanding the problem in greater depth regardless... > Are you absolutely sure that the fault does not lie with any extension > modules you may be using? Memory scribbling bugs have been known to > cause arbitrarily confusing problems... I've had sufficient experience being arbitrarily confused to never be sure about such things, but I am quite confident.
The script I posted in the bug report is all stock Python save for the operation in <>'s. That operation is pickling and unpickling (using pickle, not cPickle) a somewhat complicated pure-python instance several times. It's doing nothing with the actual instance--it just happens to take the right amount of time to trigger the segfault. It's still not perfect--this trimmed-down version segfaults only sporadically, while the original python script segfaults reliably. -Mike From pandyacus at gmail.com Thu Oct 19 02:36:43 2006 From: pandyacus at gmail.com (Chetan Pandya) Date: Wed, 18 Oct 2006 17:36:43 -0700 Subject: [Python-Dev] Python-Dev Digest, Vol 39, Issue 54 In-Reply-To: References: Message-ID: I got up in the middle of the night and wrote the email - and it shows. Apologies for creating confusion. My comments below. -Chetan On 10/18/06, python-dev-request at python.org > > Date: Wed, 18 Oct 2006 13:04:14 -0700 > From: Larry Hastings > Subject: Re: [Python-Dev] PATCH submitted: Speed up + for string Re: > PATCHsubmitted: Speed up + for string concatenation, now as fast > as > "".join(x) idiom > To: python-dev at python.org > Message-ID: <453688BE.2060509 at hastings.org> > Content-Type: text/plain; charset=ISO-8859-1; format=flowed > > Chetan Pandya wrote: > > The deallocation code needs to be robust for a complex tree - it is > > currently not recursive, but needs to be, like the concatenation code. > It is already both those things. > > Deallocation is definitely recursive. See Objects/stringobject.c, > function (*ahem*) recursive_dealloc. That Py_DECREF() line is where it > recurses into child string concatenation objects. > > You might have been confused because it is *optimized* for the general > case, where the tree only recurses down the left-hand side. For the > left-hand side it iterates, instead of recursing, which is both slightly > faster and much more robust (unlikely to blow the stack).
Actually, I looked at the setting of ob_sstrings to NULL in recursive_dealloc and thought none of the strings would get destroyed as the list is destroyed. However, it only sets the first element to NULL, which is fine. > Rendering occurs if the string being concatenated is already a > concatenation object created by an earlier assignment. > Nope. Rendering only occurs when somebody asks for the string's value, > not when merely concatenating. If you add nine strings together, the > ninth one fails the "left side has room" test and creates a second object. I don't know what I was thinking. In the whole of string_concat() there is no call to render the string, except for the right recursion case. Try stepping through it. Run Python interactively under the debugger. > Let it get to the prompt. Execute some expression like "print 3", just > so the interpreter creates its concatenated encoding object (I get > "encodings.cp437"). Now, in the debugger, put a breakpoint in the > rendering code in recursiveConcatenate(), and another on the "op = > (PyStringConcatenationObject *)PyObject_MALLOC()" line in > string_concat. Finally, go back to the Python console and concatenate > nine strings with this code: > x = "" > for i in xrange(9): > x += "a" > You won't hit any breakpoints for rendering, and you'll hit the string > concatenation object malloc line twice. (Note that for demonstration > purposes, this code is more illustrative than running x = "a" + "b" ... > + "i" because the peephole optimizer makes a constant folding pass. > It's mostly harmless, but for my code it does mean I create > concatenation objects more often.) I don't have a patch build, since I didn't download the revision used by the patch. However, I did look at values in the debugger and it looked like x in your example above had a reference count of 2 or more within string_concat even when there were no other assignments that would account for it.
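For reference, counts observed while an operation is in flight include the interpreter's own temporaries, so a count of 2 on x inside string_concat need not imply another assignment. The same effect is visible from pure Python via sys.getrefcount (a CPython-specific illustration):

```python
import sys

x = object()               # exactly one name bound to a fresh object
print(sys.getrefcount(x))  # typically 2 on CPython: the name x plus the
                           # temporary reference held by the call itself
```

As the sys docs note, getrefcount reports one more than you might expect for exactly this reason: the argument itself is a temporary reference.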
My idea was to investigate this, which was the whole reason for saying that the concatenation will create new objects. However, I ran on another machine under the debugger and got the reference count as 1, which is what I would expect. I need to find out what has happened to my work machine. In the interests of full disclosure, there is *one* scenario where pure > string concatenation will cause it to render. Rendering or deallocating > a recursive object that's too deep would blow the program stack, so I > limit recursion depth on the right seven slots of the recursion object. > That's what the "right recursion depth" field is used for. If you > attempt to concatenate a string concatenation object that's already at > the depth limit, it renders the deep object first. The depth limit is > 2**14 right now. You can force this to happen by prepending like crazy: > x = "" > for i in xrange(2**15): > x = "a" + x > > Since my code is careful to be only iterative when rendering and > deallocating down the left-hand side of the tree, there is no depth > limit for the left-hand side. The recursion limit seems to be optimistic, given the default stack limit, but of course, I haven't tried it. There is probably a depth limit on the left hand side as well, since recursiveConcatenate is recursive even on the left side. Step before you leap, > > > /larry/ > -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20061018/a3e48cc9/attachment.htm From tim.peters at gmail.com Thu Oct 19 03:02:31 2006 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 18 Oct 2006 21:02:31 -0400 Subject: [Python-Dev] Segfault in python 2.5 In-Reply-To: <3d2ce8cb0610181708i1eeb13b5qaf56488f406d4fc7@mail.gmail.com> References: <3d2ce8cb0610181302w5d87716btfd09da833e525c73@mail.gmail.com> <8764eh2xpx.fsf@starship.python.net> <3d2ce8cb0610181708i1eeb13b5qaf56488f406d4fc7@mail.gmail.com> Message-ID: <1f7befae0610181802y7e0fca9ateb7d43c2ae2cff01@mail.gmail.com> [Michael Hudson] >> I've been reading the bug report with interest, but unless I can >> reproduce it it's mighty hard for me to debug, as I'm sure you know. [Mike Klaas] > Indeed. Note that I just attached a much simpler pure-Python script that fails very quickly, on Windows, using a debug build. Read the new comment to learn why both "Windows" and "debug build" are essential to it failing reliably and quickly ;-) >>> Unfortunately, I've been attempting for hours to reduce the problem to a >>> completely self-contained script, but it is resisting my efforts due to timing >>> problems. Yes, but you did good! This is still just an educated guess on my part, but my education here is hard to match ;-): this new business of generators deciding to "clean up after themselves" if they're left hanging appears to have made it possible for a generator to hold on to a frame whose thread state has been free()'d, after the thread that created the generator has gone away. Then when the generator gets collected as trash, the new exception-based "clean up abandoned generator" gimmick tries to access the generator's frame's thread state, but that's just a raw C struct (not a Python object with reachability-based lifetime), and the thread free()'d that struct when the thread went away. 
The important timing-based vagary here is whether dead-thread cleanup gets performed before the main thread tries to clean up the trash generator. > I've peered at the code, but my knowledge of the python core is > superficial at best. The fact that it is occuring as a result of a > long string of garbage collection/dealloc/etc. and involves threading > lowers my confidence further. That said, I'm beginning to think that > to reproduce this in a standalone script will require understanding > the problem in greater depth regardless... Or upgrade to Windows ;-) >> Are you absolutely sure that the fault does not lie with any extension >> modules you may be using? Memory scribbling bugs have been known to >> cause arbitrarily confusing problems... Unless I've changed the symptom, it's been reduced to minimal pure Python. It does require a thread T, and creating a generator in T, where the generator object's lifetime is controlled by the main thread, and where T vanishes before the generator has exited of its own accord. Offhand I don't know how to repair it. Thread states /aren't/ Python objects, and there's no provision for a thread state to outlive the thread it represents. > I've had sufficient experience being arbitrarily confused to never be > sure about such things, but I am quite confident. The script I posted > in the bug report is all stock python save for the operation in <>'s. > That operation is pickling and unpickling (using pickle, not cPickle) > a somewhat complicated pure-python instance several times. FYI, in my whittled script, your `getdocs()` became simply: def getdocs(): while True: yield None and it's called only once, via self.docIter.next(). In fact, the "while True:" isn't needed there either (given that it's only resumed once now). 
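The failing pattern Tim describes (a generator created and started in a worker thread, then cleaned up by the main thread after the worker has exited) fits in a few lines. This is a modern-Python rendition for illustration; on 2.5 the spelling would be gen.next(), and on the affected builds this pattern was the crash recipe, while fixed interpreters run it cleanly:

```python
import threading

cleanup_ran = []

def getdocs():
    # Tim's whittled generator: left suspended at the yield when abandoned.
    try:
        while True:
            yield None
    finally:
        cleanup_ran.append(True)  # executed by the close() machinery

holder = {}

def worker():
    gen = getdocs()
    next(gen)             # start the generator inside the worker thread
    holder["gen"] = gen   # hand it off to the main thread and exit

t = threading.Thread(target=worker)
t.start()
t.join()                  # the creating thread (and its thread state) is gone

# The main thread now cleans up the abandoned generator, raising GeneratorExit
# at the yield -- the step that dereferenced a freed thread state on 2.5.
holder["gen"].close()
print(cleanup_ran)        # [True]
```

Note the worker's target is a plain function, not the generator itself, so the generator genuinely outlives its creating thread rather than being DECREF'd before the thread terminates.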
From mike.klaas at gmail.com Thu Oct 19 04:26:59 2006 From: mike.klaas at gmail.com (Mike Klaas) Date: Wed, 18 Oct 2006 19:26:59 -0700 Subject: [Python-Dev] Segfault in python 2.5 In-Reply-To: <1f7befae0610181802y7e0fca9ateb7d43c2ae2cff01@mail.gmail.com> References: <3d2ce8cb0610181302w5d87716btfd09da833e525c73@mail.gmail.com> <8764eh2xpx.fsf@starship.python.net> <3d2ce8cb0610181708i1eeb13b5qaf56488f406d4fc7@mail.gmail.com> <1f7befae0610181802y7e0fca9ateb7d43c2ae2cff01@mail.gmail.com> Message-ID: <3d2ce8cb0610181926g2e0915dama208d8839bd1cc5b@mail.gmail.com> On 10/18/06, Tim Peters wrote: > [Mike Klaas] > > Indeed. > > Note that I just attached a much simpler pure-Python script that fails > very quickly, on Windows, using a debug build. Read the new comment > to learn why both "Windows" and "debug build" are essential to it > failing reliably and quickly ;-) Thanks! Next time I find a bug, installing Windows will certainly be my first step . <> > Yes, but you did good! This is still just an educated guess on my > part, but my education here is hard to match ;-): this new business > of generators deciding to "clean up after themselves" if they're left > hanging appears to have made it possible for a generator to hold on to > a frame whose thread state has been free()'d, after the thread that > created the generator has gone away. Then when the generator gets > collected as trash, the new exception-based "clean up abandoned > generator" gimmick tries to access the generator's frame's thread > state, but that's just a raw C struct (not a Python object with > reachability-based lifetime), and the thread free()'d that struct when > the thread went away. The important timing-based vagary here is > whether dead-thread cleanup gets performed before the main thread > tries to clean up the trash generator. Indeed--and normally it doesn't happen that way. 
My/your script never crashes on the first iteration because the thread's target is the generator and thus it gets DECREF'd before the thread terminates. But the exception from the first iteration holds on to a reference to the frame/generator so when it gets cleaned up (in the second iteration, due to a new exception overwriting it) the generator is freed after the thread is destroyed. At least, I think... <> > Offhand I don't know how to repair it. Thread states /aren't/ Python > objects, and there's no provision for a thread state to outlive the > thread it represents. Take this with a grain of salt, but ISTM that the problem can be repaired by resetting the generator's frame threadstate to the current threadstate: (in genobject.c:gen_send_ex():80) Py_XINCREF(tstate->frame); assert(f->f_back == NULL); f->f_back = tstate->frame; + f->f_tstate = tstate; gen->gi_running = 1; result = PyEval_EvalFrameEx(f, exc); gen->gi_running = 0; Shouldn't the thread state generally be the same anyway? (I seem to recall some gloomy warning against resuming generators in separate threads). This solution is surely wrong--if f_tstate != tstate, then the generator _is_ being resumed in another thread and so the generated traceback will be wrong (among other issues which surely occur by fudging a frame's threadstate). Perhaps it could be set conditionally by gen_close before signalling the exception? A lie, but a smaller lie than a segfault. We could advertise that the exception occurring from generator .close() isn't guaranteed to have an accurate traceback in this case. Take all this with a grain of un-core-savvy salt.
Thanks again for investigating this, Tim, -Mike From larry at hastings.org Thu Oct 19 08:03:25 2006 From: larry at hastings.org (Larry Hastings) Date: Wed, 18 Oct 2006 23:03:25 -0700 Subject: [Python-Dev] Python-Dev Digest, Vol 39, Issue 54 In-Reply-To: References: Message-ID: <4537152D.7090900@hastings.org> Chetan Pandya wrote: > I don't have a patch build, since I didn't download the revision used > by the patch. > However, I did look at values in the debugger and it looked like x in > your example above had a reference count of 2 or more within > string_concat even when there were no other assignments that would > account for it. It could be the optimizer. If you concatenate hard-coded strings, the peephole optimizer does constant folding. It says "hey, look, this binary operator is performed on two constant objects". So it evaluates the expression itself and substitutes the result, in this case swapping (pseudotokens here) [PUSH "a" PUSH "b" PLUS] for [PUSH "ab"]. Oddly, it didn't seem to optimize away the whole expression. If you say "a" + "b" + "c" + "d" + "e", I would have expected the peephole optimizer to turn that whole shebang into [PUSH "abcde"]. But when I gave it a cursory glance it seemed to skip every-other; it constant-folded "a" + "b", then + "c" and optimized ("a" + "b" + "c") + "d", resulting ultimately I believe in [PUSH "ab" PUSH "cd" PLUS PUSH "e" PLUS]. But I suspect I missed something; it bears further investigation. But this is all academic, as real-world performance of my patch is not contingent on what the peephole optimizer does to short runs of hard-coded strings in simple test cases. > The recursion limit seems to be optimistic, given the default stack > limit, but of course, I haven't tried it. I've tried it, on exactly one computer (running Windows XP). The depth limit was arrived at experimentally. But it is probably too optimistic and should be winched down. 
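Whether the optimizer folds a whole chain of literals can be checked from Python itself by inspecting the compiled code's constants pool. For what it's worth, on modern CPython (where folding moved into the AST optimizer) the entire chain collapses to one constant, so the every-other folding Larry observed appears to have been a quirk of the 2.x peephole pass:

```python
# Compile a chain of literal concatenations and inspect the constants pool.
code = compile('"a" + "b" + "c" + "d" + "e"', "<demo>", "eval")
print(code.co_consts)         # the whole chain folded to a single constant
assert "abcde" in code.co_consts
print(eval(code) == "abcde")  # True
```

The same inspection against a 2.x interpreter would show the partially folded constants Larry describes, which is a quick way to verify what the peephole pass did without stepping through it in a debugger.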
On the other hand, right now when you do x = "a" + x ten zillion times there are always two references to the concatenation object stored in x: the interpreter holds one, and x itself holds the other. That means I have to build a new concatenation object each time, so it becomes a degenerate tree (one leaf and one subtree) recursing down the right-hand side. I plan to fix that in my next patch. There's already code that says "if the next instruction is a store, and the location we're storing to holds a reference to the left-hand side of the concatenation, make the location drop its reference". That was an optimization for the old-style concat code; when the left side only had one reference it would simply resize it and memcpy() in the right side. I plan to add support for dropping the reference when it's the *right*-hand side of the concatenation, as that would help prepending immensely. Once that's done, I believe it'll prepend ((depth limit) * (number of items in ob_sstrings - 1)) + 1 strings before needing to render. > There is probably a depth limit on the left hand side as well, since > recursiveConcatenate is recursive even on the left side. Let me again stress that recursiveConcatenate is *iterative* down the left side; it is *not* not *not* recursive. The outer loop iterates over "s->ob_sstrings[0]"s. The nested "for" loop iterates backwards, from the highest string used down to "s->ob_sstrings + 1", aka "&s->ob_sstrings[1]", recursing into them. It then sets "s" to "*s->ob_sstrings", aka "s->ob_sstrings[0]" and the outer loop repeats. This is iterative. As a personal favor to me, please step through my code before you tell me again how my code is recursive down the left-hand side. Passing the dutchie, /larry/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/python-dev/attachments/20061018/31d515fc/attachment-0001.htm From anthony at python.org Thu Oct 19 09:42:00 2006 From: anthony at python.org (Anthony Baxter) Date: Thu, 19 Oct 2006 17:42:00 +1000 Subject: [Python-Dev] RELEASED Python 2.4.4, Final. Message-ID: <200610191742.14093.anthony@python.org> On behalf of the Python development team and the Python community, I'm happy to announce the release of Python 2.4.4 (FINAL). Python 2.4.4 is a bug-fix release. While Python 2.5 is the latest version of Python, we're making this release for people who are still running Python 2.4. This is the final planned release from the Python 2.4 series. Future maintenance releases will be in the 2.5 series, beginning with 2.5.1. See the release notes at the website (also available as Misc/NEWS in the source distribution) for details of the more than 80 bugs squished in this release, including a number found by the Coverity and Klocwork static analysis tools. We'd like to offer our thanks to both these firms for making this available for open source projects. * Python 2.4.4 contains a fix for PSF-2006-001, a buffer overrun * * in repr() of unicode strings in wide unicode (UCS-4) builds. * * See http://www.python.org/news/security/PSF-2006-001/ for more. * There's only been one small change since the release candidate - a fix to "configure" to repair cross-compiling of Python under Unix. For more information on Python 2.4.4, including download links for various platforms, release notes, and known issues, please see: http://www.python.org/2.4.4 Highlights of this new release include: - Bug fixes. According to the release notes, at least 80 have been fixed. This includes a fix for PSF-2006-001, a bug in repr() for unicode strings on UCS-4 (wide unicode) builds. 
Enjoy this release, Anthony Anthony Baxter anthony at python.org Python Release Manager (on behalf of the entire python-dev team) -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 191 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061019/5bffd293/attachment.pgp From steve at holdenweb.com Thu Oct 19 10:34:28 2006 From: steve at holdenweb.com (Steve Holden) Date: Thu, 19 Oct 2006 09:34:28 +0100 Subject: [Python-Dev] Segfault in python 2.5 In-Reply-To: <3d2ce8cb0610181926g2e0915dama208d8839bd1cc5b@mail.gmail.com> References: <3d2ce8cb0610181302w5d87716btfd09da833e525c73@mail.gmail.com> <8764eh2xpx.fsf@starship.python.net> <3d2ce8cb0610181708i1eeb13b5qaf56488f406d4fc7@mail.gmail.com> <1f7befae0610181802y7e0fca9ateb7d43c2ae2cff01@mail.gmail.com> <3d2ce8cb0610181926g2e0915dama208d8839bd1cc5b@mail.gmail.com> Message-ID: Mike Klaas wrote: > On 10/18/06, Tim Peters wrote: [...] > Shouldn't the thread state generally be the same anyway? (I seem to > recall some gloomy warning against resuming generators in separate > threads). > Is this an indication that generators aren't thread-safe? regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From anthony at interlink.com.au Thu Oct 19 18:58:10 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri, 20 Oct 2006 02:58:10 +1000 Subject: [Python-Dev] state of the maintenance branches Message-ID: <200610200258.12460.anthony@interlink.com.au> OK - 2.4.4 is done. With that, the release24-maint branch moves into dignified old age, where we get to mostly ignore it, yay! Unless you really feel like it, I don't think there's much point to making the effort to backport fixes to this branch. 
Any future releases from that branch will be of the serious security flaw only variety, and are almost certainly only going to have those critical patches applied. Either this weekend or next week I'll cut a 2.3.6 off the release23-maint branch. As previously discussed, this will be a source-only release - I don't envisage making documentation packages or binaries for it. Although should we maybe have new doc packages with the newer version number, just to prevent confusion? Fred? What do you think? I don't think there's any need to do this for 2.3.6c1, but maybe for 2.3.6 final? For 2.3.6, it's just 2.3.5 plus the email fix and the PSF-2006-001 fix. As I feared, I've had a couple of people asking for a 2.3.6. Oh well. Only one person has (jokingly) suggested a new 2.2 release. That ain't going to happen :-) I don't even want to _think_ about 2.5.1 right now. I can't see us doing this before December at the earliest, and preferably early in 2007. As far as I can see so far, the generator+threads nasty that's popped up isn't going to affect so many people that it needs a rushed out 2.5.1 to cover it - although this may change as the problem and solution becomes better understood. Anyway, all of the above is open to disagreement or other opinions - if you have them, let me know. -- Anthony Baxter It's never too late to have a happy childhood. From p.f.moore at gmail.com Thu Oct 19 20:15:49 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 19 Oct 2006 19:15:49 +0100 Subject: [Python-Dev] state of the maintenance branches In-Reply-To: <200610200258.12460.anthony@interlink.com.au> References: <200610200258.12460.anthony@interlink.com.au> Message-ID: <79990c6b0610191115r2b020d03o2ddec9c6b3cc3b6b@mail.gmail.com> On 10/19/06, Anthony Baxter wrote: > Anyway, all of the above is open to disagreement or other opinions - if you > have them, let me know. My only thought is that you've done a fantastic job pushing through all the recent releases. Thanks! Paul. 
From brett at python.org Thu Oct 19 20:50:54 2006 From: brett at python.org (Brett Cannon) Date: Thu, 19 Oct 2006 11:50:54 -0700 Subject: [Python-Dev] state of the maintenance branches In-Reply-To: <79990c6b0610191115r2b020d03o2ddec9c6b3cc3b6b@mail.gmail.com> References: <200610200258.12460.anthony@interlink.com.au> <79990c6b0610191115r2b020d03o2ddec9c6b3cc3b6b@mail.gmail.com> Message-ID: On 10/19/06, Paul Moore wrote: > > On 10/19/06, Anthony Baxter wrote: > > Anyway, all of the above is open to disagreement or other opinions - if > you > > have them, let me know. > > My only thought is that you've done a fantastic job pushing through > all the recent releases. > > Thanks! Thanks from me as well! You showed great patience putting up with all of us during releases. -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061019/9d493102/attachment.html From rhettinger at ewtllc.com Thu Oct 19 22:07:31 2006 From: rhettinger at ewtllc.com (Raymond Hettinger) Date: Thu, 19 Oct 2006 13:07:31 -0700 Subject: [Python-Dev] Nondeterministic long-to-float coercion Message-ID: <34FE2A7A34BC3544BC3127D023DF3D12128751@EWTEXCH.office.bhtrader.com> My colleague got an odd result today that is reproducible on his build of Python (RedHat's distribution of Py2.4.2) but not any other builds I've checked (including an Ubuntu Py2.4.2 built with a later version of GCC). I hypothesized that this was a bug in the underlying GCC libraries, but the magnitude of the error is so large that that seems implausible. Does anyone have a clue what is going-on? Raymond ------------------------------------------------ Python 2.4.2 (#1, Mar 29 2006, 11:22:09) [GCC 4.0.2 20051125 (Red Hat 4.0.2-8)] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> set(-19400000000 * (1/100.0) for i in range(10000)) set([-194000000.0, -193995904.0, -193994880.0]) From skip at pobox.com Thu Oct 19 22:44:23 2006 From: skip at pobox.com (skip at pobox.com) Date: Thu, 19 Oct 2006 15:44:23 -0500 Subject: [Python-Dev] Nondeterministic long-to-float coercion In-Reply-To: <34FE2A7A34BC3544BC3127D023DF3D12128751@EWTEXCH.office.bhtrader.com> References: <34FE2A7A34BC3544BC3127D023DF3D12128751@EWTEXCH.office.bhtrader.com> Message-ID: <17719.58279.458064.680744@montanaro.dyndns.org> Raymond> My colleague got an odd result today that is reproducible on Raymond> his build of Python (RedHat's distribution of Py2.4.2) but not Raymond> any other builds I've checked (including an Ubuntu Py2.4.2 Raymond> built with a later version of GCC). I hypothesized that this Raymond> was a bug in the underlying GCC libraries, but the magnitude of Raymond> the error is so large that that seems implausible. Does anyone Raymond> have a clue what is going-on? Not off the top of my head (but then I'm not a guts of the implementation or gcc whiz). I noticed that you used both "nondeterministic" and "reproducible" though. Does your colleague always get the same result? If you remove the set constructor do the oddball values always wind up in the same spots on repeated calls? Are the specific values significant (e.g., do you really need range(10000) to demonstrate the problem)? Also, I can never remember exactly, but are even-numbered minor numbers in GCC releases supposed to be development releases (or is that for the Linux kernel)? Just a few questions that come to mind. Skip From grig.gheorghiu at gmail.com Thu Oct 19 23:19:40 2006 From: grig.gheorghiu at gmail.com (Grig Gheorghiu) Date: Thu, 19 Oct 2006 14:19:40 -0700 Subject: [Python-Dev] Python unit tests failing on Pybots farm Message-ID: <3f09d5a00610191419n38b0701akdf5e3485da4820ac@mail.gmail.com> The latest trunk checkin caused almost all Pybots to fail when running the Python unit tests. 
273 tests OK. 12 tests failed: test___all__ test_calendar test_capi test_datetime test_email test_email_renamed test_imaplib test_mailbox test_strftime test_strptime test_time test_xmlrpc Here's the status page: http://www.python.org/dev/buildbot/community/trunk/ Not sure why the official Python buildbot farm is all green and happy....maybe a difference in how the steps are running? Grig -- http://agiletesting.blogspot.com From martin at v.loewis.de Thu Oct 19 23:20:28 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 19 Oct 2006 23:20:28 +0200 Subject: [Python-Dev] Nondeterministic long-to-float coercion In-Reply-To: <34FE2A7A34BC3544BC3127D023DF3D12128751@EWTEXCH.office.bhtrader.com> References: <34FE2A7A34BC3544BC3127D023DF3D12128751@EWTEXCH.office.bhtrader.com> Message-ID: <4537EC1C.1060201@v.loewis.de> Raymond Hettinger schrieb: > My colleague got an odd result today that is reproducible on his build > of Python (RedHat's distribution of Py2.4.2) but not any other builds > I've checked (including an Ubuntu Py2.4.2 built with a later version of > GCC). I hypothesized that this was a bug in the underlying GCC > libraries, but the magnitude of the error is so large that that seems > implausible. Does anyone have a clue what is going-on? I'd say it's memory corruption. Look: r=array.array("d",[-194000000.0, -193995904.0, -193994880.0]).tostring() print map(ord,r[0:8]) print map(ord,r[8:16]) print map(ord,r[16:24]) gives [0, 0, 0, 0, 105, 32, 167, 193] [0, 0, 0, 0, 73, 32, 167, 193] [0, 0, 0, 0, 65, 32, 167, 193] It's only one byte that changes, and then that in only two bits (2**3 and 2**5). Could be faulty hardware, too. 
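The byte-by-byte comparison above can be generalized into a small diagnostic helper (a sketch added for illustration, not code from the thread):

```python
import struct

def bit_diff(a, b):
    """Return (byte_index, xor_mask) for each byte that differs between
    the little-endian IEEE-754 encodings of two doubles."""
    ba = struct.pack("<d", a)
    bb = struct.pack("<d", b)
    return [(i, x ^ y) for i, (x, y) in enumerate(zip(ba, bb)) if x != y]

# The three values from the set() above differ in a single byte, by one
# flipped bit each -- 2**5 and 2**3, matching the analysis above.
print(bit_diff(-194000000.0, -193995904.0))  # [(4, 32)]
print(bit_diff(-193995904.0, -193994880.0))  # [(4, 8)]
```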
Regards, Martin From bos at serpentine.com Thu Oct 19 23:08:33 2006 From: bos at serpentine.com (Bryan O'Sullivan) Date: Thu, 19 Oct 2006 14:08:33 -0700 Subject: [Python-Dev] Nondeterministic long-to-float coercion In-Reply-To: <34FE2A7A34BC3544BC3127D023DF3D12128751@EWTEXCH.office.bhtrader.com> References: <34FE2A7A34BC3544BC3127D023DF3D12128751@EWTEXCH.office.bhtrader.com> Message-ID: <4537E951.1080107@serpentine.com> Raymond Hettinger wrote: > My colleague got an odd result today that is reproducible on his build > of Python (RedHat's distribution of Py2.4.2) but not any other builds > I've checked (including an Ubuntu Py2.4.2 built with a later version of > GCC). I hypothesized that this was a bug in the underlying GCC > libraries, but the magnitude of the error is so large that that seems > implausible. These errors are due to a bit or two being flipped in either the long or double representation of the number. They could be due to a compiler bug, but other potential culprits include bad memory, a bum power supply introducing noise, or cooling problems. Has your colleague run memtest86 or other load tests for a day on their box? References: <34FE2A7A34BC3544BC3127D023DF3D12128751@EWTEXCH.office.bhtrader.com> Message-ID: <1f7befae0610191428t249d2007jdc2fea9ecd8410ee@mail.gmail.com> [Raymond Hettinger] > My colleague got an odd result today that is reproducible on his build > of Python (RedHat's distribution of Py2.4.2) but not any other builds > I've checked (including an Ubuntu Py2.4.2 built with a later version of > GCC). I hypothesized that this was a bug in the underlying GCC > libraries, but the magnitude of the error is so large that that seems > implausible. > > Does anyone have a clue what is going-on? > > Python 2.4.2 (#1, Mar 29 2006, 11:22:09) [GCC 4.0.2 20051125 (Red Hat > 4.0.2-8)] on linux2 Type "help", "copyright", "credits" or "license" for > more information. 
> >>> set(-19400000000 * (1/100.0) for i in range(10000)) > set([-194000000.0, -193995904.0, -193994880.0]) Note that the Hamming distance between -194000000.0 and -193995904.0 is 1, and ditto between -193995904.0 and -193994880.0, when viewed as IEEE-754 doubles. That is, 193995904.0 is "missing a bit" from -194000000.0, and -193994880.0 is missing the same bit plus an additional bit. Maybe clearer, writing a function to show the hex little-endian representation:

>>> def ashex(d):
...     return binascii.hexlify(struct.pack("<d", d))
...
>>> ashex(-194000000)
'000000006920a7c1'
>>> ashex(-193995904) # "the 2 bit" from "6" is missing, leaving 4
'000000004920a7c1'
>>> ashex(-193994880) # and "the 8 bit" from "9" is missing, leaving 1
'000000004120a7c1'

More than anything else that suggests flaky memory, or "weak bits" in a HW register or CPU<->FPU path. IOW, it looks like a hardware problem to me. Note that the missing bits here don't coincide with a "natural" software boundary -- screwing up a bit "in the middle of" a byte isn't something software is prone to do. You could try different inputs and see whether the same bits "go missing", e.g. starting with a double with a lot of 1 bits lit. Might also try using these as keys to a counting dict to see how often they go missing. From rhettinger at ewtllc.com Thu Oct 19 23:22:59 2006 From: rhettinger at ewtllc.com (Raymond Hettinger) Date: Thu, 19 Oct 2006 14:22:59 -0700 Subject: [Python-Dev] Nondeterministic long-to-float coercion Message-ID: <34FE2A7A34BC3544BC3127D023DF3D12128753@EWTEXCH.office.bhtrader.com> > I noticed that you used both "nondeterministic" and > "reproducible" though. LOL. The nondeterministic part is that the same calculation will give different answers and there doesn't appear to be a pattern to which of the several answers will occur. The reproducible part is that it happens from session to session. > Are the specific values significant (e.g., do > you really need range(10000) to demonstrate the problem)?
No, you just need to run the calculation several times at the command line: >>> -19400000000 * (1/100.0) -193994880.0 >>> -19400000000 * (1/100.0) -194000000.0 >>> -19400000000 * (1/100.0) -194000000.0 Raymond -----Original Message----- From: skip at pobox.com [mailto:skip at pobox.com] Sent: Thursday, October 19, 2006 1:44 PM To: Raymond Hettinger Cc: python-dev at python.org Subject: Re: [Python-Dev] Nondeterministic long-to-float coercion Raymond> My colleague got an odd result today that is reproducible on Raymond> his build of Python (RedHat's distribution of Py2.4.2) but not Raymond> any other builds I've checked (including an Ubuntu Py2.4.2 Raymond> built with a later version of GCC). I hypothesized that this Raymond> was a bug in the underlying GCC libraries, but the magnitude of Raymond> the error is so large that that seems implausible. Does anyone Raymond> have a clue what is going-on? Not off the top of my head (but then I'm not a guts of the implementation or gcc whiz). I noticed that you used both "nondeterministic" and "reproducible" though. Does your colleague always get the same result? If you remove the set constructor do the oddball values always wind up in the same spots on repeated calls? Are the specific values significant (e.g., do you really need range(10000) to demonstrate the problem)? Also, I can never remember exactly, but are even-numbered minor numbers in GCC releases supposed to be development releases (or is that for the Linux kernel)? Just a few questions that come to mind. 
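The counting-dict idea suggested above can be sketched with collections.Counter (not code from the thread; on healthy hardware every iteration rounds to the same value):

```python
from collections import Counter

counts = Counter()
for _ in range(10000):
    counts[-19400000000 * (1 / 100.0)] += 1

# On non-faulty hardware there is exactly one key:
print(counts)  # Counter({-194000000.0: 10000})
```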
Skip From facundobatista at gmail.com Thu Oct 19 23:24:46 2006 From: facundobatista at gmail.com (Facundo Batista) Date: Thu, 19 Oct 2006 18:24:46 -0300 Subject: [Python-Dev] Nondeterministic long-to-float coercion In-Reply-To: <34FE2A7A34BC3544BC3127D023DF3D12128751@EWTEXCH.office.bhtrader.com> References: <34FE2A7A34BC3544BC3127D023DF3D12128751@EWTEXCH.office.bhtrader.com> Message-ID: 2006/10/19, Raymond Hettinger : > My colleague got an odd result today that is reproducible on his build > of Python (RedHat's distribution of Py2.4.2) but not any other builds > ... > >>> set(-19400000000 * (1/100.0) for i in range(10000)) > set([-194000000.0, -193995904.0, -193994880.0]) I can't reproduce it on my Ubuntu either, but analyzing the problem... what about this?:

d = {}
for i in range(10000):
    val = -19400000000 * (1/100.0)
    d[val] = d.get(val, 0) + 1

or

d = {}
for i in range(10000):
    val = -19400000000 * (1/100.0)
    d.setdefault(val, []).append(i)

I think it would be interesting to know... - whether the problem still happens with these structures - how many values end up under each key, and which values. Regards, -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From pandyacus at gmail.com Fri Oct 20 00:25:01 2006 From: pandyacus at gmail.com (Chetan Pandya) Date: Thu, 19 Oct 2006 15:25:01 -0700 Subject: [Python-Dev] Python-Dev Digest, Vol 39, Issue 55 In-Reply-To: References: Message-ID: Larry Hastings wrote: > Chetan Pandya wrote: > > I don't have a patch build, since I didn't download the revision used > > by the patch. > > However, I did look at values in the debugger and it looked like x in > > your example above had a reference count of 2 or more within > > string_concat even when there were no other assignments that would > > account for it. > It could be the optimizer. If you concatenate hard-coded strings, the > peephole optimizer does constant folding. It says "hey, look, this > binary operator is performed on two constant objects".
So it evaluates > the expression itself and substitutes the result, in this case swapping > (pseudotokens here) [PUSH "a" PUSH "b" PLUS] for [PUSH "ab"]. > > Oddly, it didn't seem to optimize away the whole expression. If you say > "a" + "b" + "c" + "d" + "e", I would have expected the peephole > optimizer to turn that whole shebang into [PUSH "abcde"]. But when I > gave it a cursory glance it seemed to skip every-other; it > constant-folded "a" + "b", then + "c" and optimized ("a" + "b" + "c") + > "d", resulting ultimately I believe in [PUSH "ab" PUSH "cd" PLUS PUSH > "e" PLUS]. But I suspect I missed something; it bears further > investigation. I looked at the optimizer, but couldn't find any place where it does constant folding for strings. However, I am unable to set breakpoints for some mysterious reason, so investigation is somewhat hard. But I am not bothered about it anymore, since it does not behave the way I originally thought it did. But this is all academic, as real-world performance of my patch is not > contingent on what the peephole optimizer does to short runs of > hard-coded strings in simple test cases. > > > The recursion limit seems to be optimistic, given the default stack > > limit, but of course, I haven't tried it. > I've tried it, on exactly one computer (running Windows XP). The depth > limit was arrived at experimentally. But it is probably too optimistic > and should be winched down. On the other hand, right now when you do x = "a" + x ten zillion times > there are always two references to the concatenation object stored in x: > the interpreter holds one, and x itself holds the other. That means I > have to build a new concatenation object each time, so it becomes a > degenerate tree (one leaf and one subtree) recursing down the right-hand > side. This is the case I was thinking of (but not what I wrote). I plan to fix that in my next patch.
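The constant folding under discussion can be observed directly by inspecting the compiled code object (shown here on a modern CPython, where the AST optimizer folds the whole chain; the 2.5-era peephole pass behaved differently, as Larry notes):

```python
import dis

code = compile("s = 'a' + 'b' + 'c' + 'd' + 'e'", "<test>", "exec")
# On current CPython the entire chain folds to a single constant:
print("abcde" in code.co_consts)  # True
dis.dis(code)  # the chain compiles to one LOAD_CONST 'abcde'
```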
There's already code that says "if > the next instruction is a store, and the location we're storing to holds > a reference to the left-hand side of the concatenation, make the > location drop its reference". That was an optimization for the > old-style concat code; when the left side only had one reference it > would simply resize it and memcpy() in the right side. I plan to add > support for dropping the reference when it's the *right*-hand side of > the concatenation, as that would help prepending immensely. Once that's > done, I believe it'll prepend ((depth limit) * (number of items in > ob_sstrings - 1)) + 1 strings before needing to render. I am confused as to whether you are referring to the LHS or the concatenation operation or the assignment operation. But I haven't looked at how the reference counting optimizations are done yet. In general, there are caveats about removing references, but I plan to look at that later. There is another, possibly complementary way of reducing the recursion depth. While creating a new concatenation object, instead of inserting the two string references, the strings they reference can be inserted in the new object. This can be done if the number of strings they contain is small. In the x = "a" + x case, for example, this will reduce the recursion depth of the string tree (but not reduce the allocations). -Chetan -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20061019/ac0c060f/attachment.htm From brett at python.org Fri Oct 20 00:25:15 2006 From: brett at python.org (Brett Cannon) Date: Thu, 19 Oct 2006 15:25:15 -0700 Subject: [Python-Dev] Python unit tests failing on Pybots farm In-Reply-To: <3f09d5a00610191419n38b0701akdf5e3485da4820ac@mail.gmail.com> References: <3f09d5a00610191419n38b0701akdf5e3485da4820ac@mail.gmail.com> Message-ID: On 10/19/06, Grig Gheorghiu wrote: > > The latest trunk checkin caused almost all Pybots to fail when running > the Python unit tests. > > 273 tests OK. > 12 tests failed: > test___all__ test_calendar test_capi test_datetime test_email > test_email_renamed test_imaplib test_mailbox test_strftime > test_strptime test_time test_xmlrpc > > Here's the status page: > > http://www.python.org/dev/buildbot/community/trunk/ > > Not sure why the official Python buildbot farm is all green and > happy....maybe a difference in how the steps are running? Possibly. If you look at the reason those tests failed it is because time.strftime is missing for some odd reason. But none of recent checkins seem to have anything to do with the 'time' module, let alone with how methods are added to modules (Martin's recent checkins have been for PyArg_ParseTuple). -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061019/8a7688ad/attachment.html From grig.gheorghiu at gmail.com Fri Oct 20 00:30:01 2006 From: grig.gheorghiu at gmail.com (Grig Gheorghiu) Date: Thu, 19 Oct 2006 15:30:01 -0700 Subject: [Python-Dev] Python unit tests failing on Pybots farm In-Reply-To: References: <3f09d5a00610191419n38b0701akdf5e3485da4820ac@mail.gmail.com> Message-ID: <3f09d5a00610191530t7b05d353h851c2791ec2aac87@mail.gmail.com> On 10/19/06, Brett Cannon wrote: > > Possibly. If you look at the reason those tests failed it is because > time.strftime is missing for some odd reason. 
But none of recent checkins > seem to have anything to do with the 'time' module, let alone with how > methods are added to modules (Martin's recent checkins have been for > PyArg_ParseTuple). > > -Brett Could there possibly be a side effect of the PyArg_ParseTuple changes? Grig From brett at python.org Fri Oct 20 00:53:45 2006 From: brett at python.org (Brett Cannon) Date: Thu, 19 Oct 2006 15:53:45 -0700 Subject: [Python-Dev] Python unit tests failing on Pybots farm In-Reply-To: <3f09d5a00610191530t7b05d353h851c2791ec2aac87@mail.gmail.com> References: <3f09d5a00610191419n38b0701akdf5e3485da4820ac@mail.gmail.com> <3f09d5a00610191530t7b05d353h851c2791ec2aac87@mail.gmail.com> Message-ID: On 10/19/06, Grig Gheorghiu wrote: > > On 10/19/06, Brett Cannon wrote: > > > > Possibly. If you look at the reason those tests failed it is because > > time.strftime is missing for some odd reason. But none of recent > checkins > > seem to have anything to do with the 'time' module, let alone with how > > methods are added to modules (Martin's recent checkins have been for > > PyArg_ParseTuple). > > > > -Brett > > Could there possibly be a side effect of the PyArg_ParseTuple changes? I doubt that, especially since I just updated my pristine checkout and test_time passed fine. -Brett -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20061019/3ef71908/attachment.htm From grig.gheorghiu at gmail.com Fri Oct 20 01:48:35 2006 From: grig.gheorghiu at gmail.com (Grig Gheorghiu) Date: Thu, 19 Oct 2006 16:48:35 -0700 Subject: [Python-Dev] Python unit tests failing on Pybots farm In-Reply-To: References: <3f09d5a00610191419n38b0701akdf5e3485da4820ac@mail.gmail.com> <3f09d5a00610191530t7b05d353h851c2791ec2aac87@mail.gmail.com> Message-ID: <3f09d5a00610191648j43343802uf18bc8300f478a8d@mail.gmail.com> On 10/19/06, Brett Cannon wrote: > > > On 10/19/06, Grig Gheorghiu wrote: > > On 10/19/06, Brett Cannon wrote: > > > > > > Possibly. If you look at the reason those tests failed it is because > > > time.strftime is missing for some odd reason. But none of recent > checkins > > > seem to have anything to do with the 'time' module, let alone with how > > > methods are added to modules (Martin's recent checkins have been for > > > PyArg_ParseTuple). > > > > > > -Brett > > > > Could there possible be a side effect of the PyArg_ParseTuple changes? > > I doubt that, especially since I just updated my pristine checkout and > test_time passed fine. > > -Brett > > > > OK, I deleted the checkout directory on one of my buildslaves and re-ran the build steps. The tests passed. So my conclusion is that a full rebuild is needed for the tests to pass after the last checkins (which included files such as configure and configure.in). The Python buildbots are doing full rebuilds every time, that's why they're green and happy, but the Pybots are just doing incremental builds. Maybe the makefiles should be modified so that a full rebuild is triggered when the configure and configure.in files are changed? At this point, I'll have to tell all the Pybots owners to delete their checkout directories and start a new build.
Grig From warner at lothar.com Fri Oct 20 01:59:46 2006 From: warner at lothar.com (Brian Warner) Date: Thu, 19 Oct 2006 16:59:46 -0700 Subject: [Python-Dev] Promoting PCbuild8 In-Reply-To: <45352A4F.2080203@v.loewis.de> (Martin v. =?iso-8859-1?Q?L=F6?= =?iso-8859-1?Q?wis's?= message of "Tue, 17 Oct 2006 21:09:03 +0200") References: <129CEF95A523704B9D46959C922A280002FE9A04@nemesis.central.ccp.cc> <45352A4F.2080203@v.loewis.de> Message-ID: <87vemfc0wd.fsf@lothar.com> "Martin v. Löwis" writes: >> But I agree that >> getting regular builds running would be a good thing. An x64 box >> would be ideal to build both the x86 and x64 versions on. A single >> bot can manage many platforms, right? > > A single machine, and a single buildbot installation, yes. But not > a single build slave, since there can be only one build procedure > per slave. To be precise, you can have as many build procedures per slave as you like, but if the procedure depends upon running on a particular platform, then it is unlikely that a single slave can accommodate multiple platforms. Each Builder object in the buildbot config file is created with a BuildFactory (which defines the sequence of steps it will execute), and a list of buildslaves that it can run on. There is a many-to-many mapping from Builders to buildslaves. For example, you might have an "all-tests" Builder that does a compile and runs the unit-test suite, and a second "build-API-docs" Builder that just runs epydoc or something. Both of these Builders could easily run on the same slave. But if you have an x86 Builder and a PPC Builder, you'd be hard pressed to find a single buildslave that could usefully serve for both. If the x86 and the x64 builds can be run on the same machine, how do you control which kind of build you're doing? The decision about whether to run them in the same buildslave or in two separate buildslaves depends upon how you express this control.
One possibility is that you just pass some different CFLAGS to the configure or compile step.. in that case, putting them both in the same slave is easy, and the CFLAGS settings will appear in your BuildFactories. If instead you have to use a separate chroot environment (or whatever the equivalent is for this issue) for each, then it may be easiest to run two separate buildslaves (and your BuildFactories might be identical). > It's possible to tell the master not to build different branches on a > single slave (i.e. 2.5 has to wait if trunk is building), but it's not > possible to tell it that two slaves reside on the same machine (it might be > possible, but I don't know how to do it). You could create a MasterLock that is shared by just the two Builders which use slaves which share the same machine. That would prohibit the two Builders from running at the same time. (SlaveLocks wouldn't help here, because as you pointed out there is no way to tell the buildmaster that two slaves share a host). cheers, -Brian From brett at python.org Fri Oct 20 06:00:13 2006 From: brett at python.org (Brett Cannon) Date: Thu, 19 Oct 2006 21:00:13 -0700 Subject: [Python-Dev] Python unit tests failing on Pybots farm In-Reply-To: <3f09d5a00610191648j43343802uf18bc8300f478a8d@mail.gmail.com> References: <3f09d5a00610191419n38b0701akdf5e3485da4820ac@mail.gmail.com> <3f09d5a00610191530t7b05d353h851c2791ec2aac87@mail.gmail.com> <3f09d5a00610191648j43343802uf18bc8300f478a8d@mail.gmail.com> Message-ID: On 10/19/06, Grig Gheorghiu wrote: > > On 10/19/06, Brett Cannon wrote: > > > > > > On 10/19/06, Grig Gheorghiu wrote: > > > On 10/19/06, Brett Cannon wrote: > > > > > > > > Possibly. If you look at the reason those tests failed it is > because > > > > time.strftime is missing for some odd reason. 
But none of recent > > checkins > > > > seem to have anything to do with the 'time' module, let alone with > how > > > > methods are added to modules (Martin's recent checkins have been for > > > > PyArg_ParseTuple). > > > > > > > > -Brett > > > > > > Could there possible be a side effect of the PyArg_ParseTuple changes? > > > > I doubt that, especially since I just updated my pristine checkout and > > test_time passed fine. > > > > -Brett > > > > > > OK, I deleted the checkout directory on one of my buidslaves and > re-ran the build steps. The tests passed. So my conclusion is that a > full rebuild is needed for the tests to pass after the last checkins > (which included files such as configure and configure.in). > > The Python buildbots are doing full rebuilds every time, that's why > they're green and happy, but the Pybots are just doing incremental > builds. > > Maybe the makefiles should be modified so that a full rebuild is > triggered when the configure and configure.in files are changed? Maybe, but I don't know how to do that. -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061019/32ce9baf/attachment.html From martin at v.loewis.de Fri Oct 20 07:51:33 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 20 Oct 2006 07:51:33 +0200 Subject: [Python-Dev] Promoting PCbuild8 In-Reply-To: <87vemfc0wd.fsf@lothar.com> References: <129CEF95A523704B9D46959C922A280002FE9A04@nemesis.central.ccp.cc> <45352A4F.2080203@v.loewis.de> <87vemfc0wd.fsf@lothar.com> Message-ID: <453863E5.6060704@v.loewis.de> Brian Warner schrieb: > To be precise, you have have as many build procedures per slave as you like, > but if the procedure depends upon running on a particular platform, then it > is unlikely that a single slave can accomodate multiple platforms. Ah, right, I can have multiple builders per slave. That's good. 
For the case of x86 and AMD64, a single slave can indeed accommodate both platforms. > If the x86 and the x64 builds can be run on the same machine, how do you > control which kind of build you're doing? The decision about whether to run > them in the same buildslave or in two separate buildslaves depends upon how > you express this control. One possibility is that you just pass some > different CFLAGS to the configure or compile step.. in that case, putting > them both in the same slave is easy, and the CFLAGS settings will appear in > your BuildFactories. Most likely, there would be different batch files to run, although using environment variables might also work. So I guess I could use the same slave for both builders. > You could create a MasterLock that is shared by just the two Builders which > use slaves which share the same machine. That would prohibit the two Builders > from running at the same time. (SlaveLocks wouldn't help here, because as you > pointed out there is no way to tell the buildmaster that two slaves share a > host). Ah, ok. Regards, Martin From martin at v.loewis.de Fri Oct 20 07:58:13 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 20 Oct 2006 07:58:13 +0200 Subject: [Python-Dev] Python unit tests failing on Pybots farm In-Reply-To: <3f09d5a00610191648j43343802uf18bc8300f478a8d@mail.gmail.com> References: <3f09d5a00610191419n38b0701akdf5e3485da4820ac@mail.gmail.com> <3f09d5a00610191530t7b05d353h851c2791ec2aac87@mail.gmail.com> <3f09d5a00610191648j43343802uf18bc8300f478a8d@mail.gmail.com> Message-ID: <45386575.3000809@v.loewis.de> Grig Gheorghiu schrieb: > OK, I deleted the checkout directory on one of my buidslaves and > re-ran the build steps. The tests passed. So my conclusion is that a > full rebuild is needed for the tests to pass after the last checkins > (which included files such as configure and configure.in). Indeed, you had to re-run configure. 
There was a bug where -Werror was added to the build flags, causing several configure tests to fail (most notably, it would determine that there's no memmove on Linux).

> Maybe the makefiles should be modified so that a full rebuild is
> triggered when the configure and configure.in files are changed?

The makefiles already do that: if configure changes, a plain "make" will first re-run configure.

> At this point, I'll have to tell all the Pybots owners to delete their
> checkout directories and start a new build.

Not necessarily. You can also ask, at the buildbot GUI, that a non-existing branch is built. This should cause the checkouts to be deleted (and then the build to fail); the next regular build will check out from scratch.

Regards,
Martin

From larry at hastings.org Fri Oct 20 08:45:31 2006
From: larry at hastings.org (Larry Hastings)
Date: Thu, 19 Oct 2006 23:45:31 -0700
Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS?
In-Reply-To:
References: <34FE2A7A34BC3544BC3127D023DF3D12128746@EWTEXCH.office.bhtrader.com> <452C6FD8.8070403@v.loewis.de>
Message-ID: <4538708B.8070406@hastings.org>

Fredrik Lundh wrote:
> a dynamic registration approach would be even better, with a single entry point
> used to register all methods and hooks your C extension has implemented, and
> code on the other side that builds a properly initialized type descriptor from that
> set, using fallback functions and error stubs where needed.

I knocked out a prototype of this last week, emailed Mr. Lundh about it, then forgot about it. Would anyone be interested in taking a peek at it? I only changed one file to use this new-style initialization, sha256module.c.
The resulting init_sha256() looks like this:

    PyMODINIT_FUNC
    init_sha256(void)
    {
        PyObject *m;

        SHA224type = PyType_New("_sha256.sha224", sizeof(SHAobject), NULL);
        if (SHA224type == NULL)
            return;
        PyType_SetPointer(SHA224type, pte_dealloc, &SHA_dealloc);
        PyType_SetPointer(SHA224type, pte_methods, &SHA_methods);
        PyType_SetPointer(SHA224type, pte_members, &SHA_members);
        PyType_SetPointer(SHA224type, pte_getset, &SHA_getseters);
        if (PyType_Ready(SHA224type) < 0)
            return;

        SHA256type = PyType_New("_sha256.sha256", sizeof(SHAobject), NULL);
        if (SHA256type == NULL)
            return;
        PyType_SetPointer(SHA256type, pte_dealloc, &SHA_dealloc);
        PyType_SetPointer(SHA256type, pte_methods, &SHA_methods);
        PyType_SetPointer(SHA256type, pte_members, &SHA_members);
        PyType_SetPointer(SHA256type, pte_getset, &SHA_getseters);
        if (PyType_Ready(SHA256type) < 0)
            return;

        m = Py_InitModule("_sha256", SHA_functions);
        if (m == NULL)
            return;
    }

In a way this wasn't really a good showpiece for my code. The "methods", "members", and "getseters" structs still need to be passed in. However, I did change all four "as_" structures so you can set those directly. For instance, the "concat" as_sequence method for a PyString object would be set using

    PyType_SetPointer(PyString_Type, pte_sequence_concat, string_concat);

(I actually converted the PyString object to my new code, but had chicken-and-egg initialization problems as a result and backed out of it. The code is still in the branch, just commented out.)

Patch available for interested parties,

/larry/

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20061019/531206b1/attachment.htm From theller at ctypes.org Fri Oct 20 09:37:09 2006 From: theller at ctypes.org (Thomas Heller) Date: Fri, 20 Oct 2006 09:37:09 +0200 Subject: [Python-Dev] ctypes and win64 In-Reply-To: References: Message-ID: Thomas Heller schrieb (this was before Python 2.5 had been released): > The _ctypes extension module does currently not even build on Win64. > > I'm (slowly) working on this (for AMD64, not for itanium), but it may > take a good while before it is stable - It is not even fully implemented > currently. > > The win64 msi installer installs the ctypes package anyway, but it cannot be > imported. > > I suggest that it should be removed from the 2.5 win64 msi installers, so that > at least, when it is ready, can be installed as separate package. Then, Martin changed the win64 msi installer to exclude the ctypes package when the _ctypes.pyd extension does not exist because it was not built. In the meantime I have integrated patches (in the trunk) so that _ctypes can be built for win64/AMD64, and does even work. Can these changes be merged into release25-maint? IMO this is low-risk because they contain only small changes to the files in Modules/_ctypes/libffi_msvc, plus *some* changes to support the Windows LP64 model. I would prefer to merge these changes into release25-maint, because I want to also release the standalone ctypes packages from this branch (using it with svn:externals from somewhere else). The official Python 2.5.x win64/AMD64 windows installers should still *not* contain the ctypes package, but they could install it separately. 
Thanks,
Thomas

From grig.gheorghiu at gmail.com Fri Oct 20 17:31:45 2006
From: grig.gheorghiu at gmail.com (Grig Gheorghiu)
Date: Fri, 20 Oct 2006 08:31:45 -0700
Subject: [Python-Dev] Python unit tests failing on Pybots farm
In-Reply-To: <45386575.3000809@v.loewis.de>
References: <3f09d5a00610191419n38b0701akdf5e3485da4820ac@mail.gmail.com> <3f09d5a00610191530t7b05d353h851c2791ec2aac87@mail.gmail.com> <3f09d5a00610191648j43343802uf18bc8300f478a8d@mail.gmail.com> <45386575.3000809@v.loewis.de>
Message-ID: <3f09d5a00610200831na0ec613ya773f69646245742@mail.gmail.com>

On 10/19/06, "Martin v. Löwis" wrote:
> Grig Gheorghiu schrieb:
> > OK, I deleted the checkout directory on one of my buildslaves and
> > re-ran the build steps. The tests passed. So my conclusion is that a
> > full rebuild is needed for the tests to pass after the last checkins
> > (which included files such as configure and configure.in).
>
> Indeed, you had to re-run configure. There was a bug where -Werror was
> added to the build flags, causing several configure tests to fail
> (most notably, it would determine that there's no memmove on Linux).
>
> > Maybe the makefiles should be modified so that a full rebuild is
> > triggered when the configure and configure.in files are changed?
>
> The makefiles already do that: if configure changes, a plain
> "make" will first re-run configure.

Well, that didn't trigger a full rebuild on the Pybots buildslaves though.

> > At this point, I'll have to tell all the Pybots owners to delete their
> > checkout directories and start a new build.
>
> Not necessarily. You can also ask, at the buildbot GUI, that a
> non-existing branch is built. This should cause the checkouts
> to be deleted (and then the build to fail); the next regular
> build will check out from scratch.

OK, I'll try that next time.
Or I can add an extra 'clean checkout dir' step to the buildmaster -- but that would trigger a full rebuild every time, which is not what I want, since some of the buildslaves take a long time to do that. Grig From martin at v.loewis.de Fri Oct 20 19:56:48 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 20 Oct 2006 19:56:48 +0200 Subject: [Python-Dev] Python unit tests failing on Pybots farm In-Reply-To: <3f09d5a00610200831na0ec613ya773f69646245742@mail.gmail.com> References: <3f09d5a00610191419n38b0701akdf5e3485da4820ac@mail.gmail.com> <3f09d5a00610191530t7b05d353h851c2791ec2aac87@mail.gmail.com> <3f09d5a00610191648j43343802uf18bc8300f478a8d@mail.gmail.com> <45386575.3000809@v.loewis.de> <3f09d5a00610200831na0ec613ya773f69646245742@mail.gmail.com> Message-ID: <45390DE0.6050104@v.loewis.de> Grig Gheorghiu schrieb: >> > Maybe the makefiles should be modified so that a full rebuild is >> > triggered when the configure and configure.in files are changed? >> >> The makefiles already do that: if configure changes, a plain >> "make" will first re-run configure. > > Well, that didn't trigger a full rebuild on the Pybots buildslaves though. Can you provide more details? Did it not run configure again, or did that not cause a rebuild? There is an issue with setup.py/distutils not doing the rebuilding properly if header files change; contributions to fix this are welcome (quick-hacked work-arounds are not). Regards, Martin From martin at v.loewis.de Fri Oct 20 20:08:24 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 20 Oct 2006 20:08:24 +0200 Subject: [Python-Dev] ctypes and win64 In-Reply-To: References: Message-ID: <45391098.8020906@v.loewis.de> Thomas Heller schrieb: > I would prefer to merge these changes into release25-maint, because I want to > also release the standalone ctypes packages from this branch (using it with > svn:externals from somewhere else). 
That's not a good reason for back-porting. If you want a "maintenance" branch for ctypes, feel free to create one in the subversion, likewise for tags. OTOH, I can't comment on whether those changes would be acceptable for a backport to the 2.5 maintenance branch - if they don't introduce actual new features, it might be ok. > The official Python 2.5.x win64/AMD64 windows installers should still *not* > contain the ctypes package, but they could install it separately. I don't really understand. Are you planning to back-port PCbuild changes also? If so, how should including those extensions be suppressed? Regards, Martin From grig.gheorghiu at gmail.com Fri Oct 20 20:16:49 2006 From: grig.gheorghiu at gmail.com (Grig Gheorghiu) Date: Fri, 20 Oct 2006 11:16:49 -0700 Subject: [Python-Dev] Python unit tests failing on Pybots farm In-Reply-To: <45390DE0.6050104@v.loewis.de> References: <3f09d5a00610191419n38b0701akdf5e3485da4820ac@mail.gmail.com> <3f09d5a00610191530t7b05d353h851c2791ec2aac87@mail.gmail.com> <3f09d5a00610191648j43343802uf18bc8300f478a8d@mail.gmail.com> <45386575.3000809@v.loewis.de> <3f09d5a00610200831na0ec613ya773f69646245742@mail.gmail.com> <45390DE0.6050104@v.loewis.de> Message-ID: <3f09d5a00610201116we7b418fyb7b56fe88042af5@mail.gmail.com> On 10/20/06, "Martin v. L?wis" wrote: > Grig Gheorghiu schrieb: > >> > Maybe the makefiles should be modified so that a full rebuild is > >> > triggered when the configure and configure.in files are changed? > >> > >> The makefiles already do that: if configure changes, a plain > >> "make" will first re-run configure. > > > > Well, that didn't trigger a full rebuild on the Pybots buildslaves though. > > Can you provide more details? Did it not run configure again, or > did that not cause a rebuild? > > There is an issue with setup.py/distutils not doing the rebuilding > properly if header files change; contributions to fix this are welcome > (quick-hacked work-arounds are not). 
> Here are the steps that led to the unit test failures, after your checkin of configure and configure.in.

svn update:
http://www.python.org/dev/buildbot/community/all/x86%20Ubuntu%20Breezy%20trunk/builds/55/step-svn/0

configure:
http://www.python.org/dev/buildbot/community/all/x86%20Ubuntu%20Breezy%20trunk/builds/55/step-configure/0

compile:
http://www.python.org/dev/buildbot/community/all/x86%20Ubuntu%20Breezy%20trunk/builds/55/step-compile/0

test:
http://www.python.org/dev/buildbot/community/all/x86%20Ubuntu%20Breezy%20trunk/builds/55/step-test/0

HTH,

Grig

From martin at v.loewis.de Fri Oct 20 20:56:40 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 20 Oct 2006 20:56:40 +0200
Subject: [Python-Dev] Python unit tests failing on Pybots farm
In-Reply-To: <3f09d5a00610201116we7b418fyb7b56fe88042af5@mail.gmail.com>
References: <3f09d5a00610191419n38b0701akdf5e3485da4820ac@mail.gmail.com> <3f09d5a00610191530t7b05d353h851c2791ec2aac87@mail.gmail.com> <3f09d5a00610191648j43343802uf18bc8300f478a8d@mail.gmail.com> <45386575.3000809@v.loewis.de> <3f09d5a00610200831na0ec613ya773f69646245742@mail.gmail.com> <45390DE0.6050104@v.loewis.de>
Message-ID: <45391BE8.8090902@v.loewis.de>

Grig Gheorghiu schrieb:
> Here are the steps that led to the unit test failures, after your
> checkin of configure and configure.in.
>
> svn update: http://www.python.org/dev/buildbot/community/all/x86%20Ubuntu%20Breezy%20trunk/builds/55/step-svn/0
>
> configure: http://www.python.org/dev/buildbot/community/all/x86%20Ubuntu%20Breezy%20trunk/builds/55/step-configure/0
>
> compile: http://www.python.org/dev/buildbot/community/all/x86%20Ubuntu%20Breezy%20trunk/builds/55/step-compile/0
>
> test: http://www.python.org/dev/buildbot/community/all/x86%20Ubuntu%20Breezy%20trunk/builds/55/step-test/0

As you can see, it indeed re-ran configure, and it also rebuilt the interpreter.
It then did not rebuild any of the extensions except for pyexpat and elementtree. As I said, contributions to fix that are welcome. Regards, Martin From theller at ctypes.org Fri Oct 20 20:59:23 2006 From: theller at ctypes.org (Thomas Heller) Date: Fri, 20 Oct 2006 20:59:23 +0200 Subject: [Python-Dev] ctypes and win64 In-Reply-To: <45391098.8020906@v.loewis.de> References: <45391098.8020906@v.loewis.de> Message-ID: <45391C8B.1070605@ctypes.org> Martin v. L?wis schrieb: > Thomas Heller schrieb: [I was talking about patches to make ctypes work on 64-bit windows] >> I would prefer to merge these changes into release25-maint, because I want to >> also release the standalone ctypes packages from this branch (using it with >> svn:externals from somewhere else). > > That's not a good reason for back-porting. If you want a "maintenance" > branch for ctypes, feel free to create one in the subversion, likewise > for tags. > > OTOH, I can't comment on whether those changes would be acceptable for > a backport to the 2.5 maintenance branch - if they don't introduce > actual new features, it might be ok. > >> The official Python 2.5.x win64/AMD64 windows installers should still *not* >> contain the ctypes package, but they could install it separately. > > I don't really understand. Are you planning to back-port PCbuild changes > also? If so, how should including those extensions be suppressed? Let me try to put it in different words. The official Python-2.5.amd64.msi does *not* contain ctypes, so the official Python-2.5.x.amd64.msi should also not contain ctypes (I assume). Not many people (I assume again) are running 64-bit windows, and use the 64-bit Python version - but that will probably change soon. I would like to merge the 64-bit windows related ctypes changes in trunk, as soon as I'm sure that they work, back into the release25-maint branch. And also make separate ctypes releases from the release25-maint source code. 
I will only backport these changes if I'm convinced that they do not change the functionality of the current code. This way win64 Python users could install ctypes from the separate release. Also this way the source code for ctypes in the separate and the Python bundled releases are exactly the same, without creating too much work because of the different repositories.

Hope that makes the plan clear,

Thomas

From skip at pobox.com Fri Oct 20 21:39:58 2006
From: skip at pobox.com (skip at pobox.com)
Date: Fri, 20 Oct 2006 14:39:58 -0500
Subject: [Python-Dev] OT: fdopen on Windows question
Message-ID: <17721.9742.690489.77309@montanaro.dyndns.org>

Sorry for the off-topic post. I figured someone here would know the answer and I don't have access to Windows to check experimentally. The ocrad program opens its input like so:

    if ( std::strcmp( infile_name, "-" ) == 0 ) infile = stdin;
    else infile = std::fopen( infile_name, "r" );

(SpamBayes is starting to use ocrad and PIL to extract text from image spam). Ocrad fails on Windows because the input file is opened in text mode. That "r" should be "rb". What's not clear to me is whether we can do anything about stdin. Will this work:

    if ( std::strcmp( infile_name, "-" ) == 0 ) infile = std::fdopen( std::fileno(stdin), "rb" );
    else infile = std::fopen( infile_name, "rb" );

That is, can I change stdin from text to binary this way or is it destined to always be in text mode?

Thx,

Skip

From skip at pobox.com Fri Oct 20 22:04:55 2006
From: skip at pobox.com (skip at pobox.com)
Date: Fri, 20 Oct 2006 15:04:55 -0500
Subject: [Python-Dev] Massive test_sqlite failure on Mac OSX ... sometimes
Message-ID: <17721.11239.499820.841585@montanaro.dyndns.org>

I'm setting up a buildbot slave for sqlalchemy on one of my Macs at home. When it builds and tests Python's test suite the sqlite test fails. When I ran it alone like this:

    ./python.exe Lib/test/test_sqlite.py

and

    ./python.exe Lib/test/regrtest.py test_sqlite

it succeeded.
When I ran the full test suite it failed. I then tried adding -v as the error message suggested. It hung in test_pty waiting for a child process to complete. (Is this a known problem?) I finally redirected stdout and stderr like so:

    ./python.exe Lib/test/regrtest.py -l -v > test.out 2>&1

and it completed. It failed 146 out of 167 tests. Here is a sample of the failure messages:

    ...
    CheckClose (sqlite3.test.dbapi.ConnectionTests) ... ERROR
    CheckCommit (sqlite3.test.dbapi.ConnectionTests) ... ERROR
    CheckCommitAfterNoChanges (sqlite3.test.dbapi.ConnectionTests) ... ERROR
    CheckCursor (sqlite3.test.dbapi.ConnectionTests) ... ERROR
    CheckExceptions (sqlite3.test.dbapi.ConnectionTests) ... ERROR
    CheckFailedOpen (sqlite3.test.dbapi.ConnectionTests) ... ERROR
    CheckRollback (sqlite3.test.dbapi.ConnectionTests) ... ERROR
    CheckRollbackAfterNoChanges (sqlite3.test.dbapi.ConnectionTests) ... ERROR
    CheckArraySize (sqlite3.test.dbapi.CursorTests) ... ERROR
    CheckClose (sqlite3.test.dbapi.CursorTests) ... ERROR
    CheckCursorConnection (sqlite3.test.dbapi.CursorTests) ... ERROR
    CheckCursorWrongClass (sqlite3.test.dbapi.CursorTests) ... ERROR
    CheckExecuteArgFloat (sqlite3.test.dbapi.CursorTests) ... ERROR
    CheckExecuteArgInt (sqlite3.test.dbapi.CursorTests) ... ERROR
    CheckExecuteArgString (sqlite3.test.dbapi.CursorTests) ... ERROR
    CheckExecuteDictMapping (sqlite3.test.dbapi.CursorTests) ... ERROR
    CheckExecuteDictMappingNoArgs (sqlite3.test.dbapi.CursorTests) ... ERROR
    CheckExecuteDictMappingTooLittleArgs (sqlite3.test.dbapi.CursorTests) ... ERROR
    CheckExecuteDictMappingUnnamed (sqlite3.test.dbapi.CursorTests) ... ERROR
    CheckExecuteIllegalSql (sqlite3.test.dbapi.CursorTests) ... ERROR
    CheckExecuteManyGenerator (sqlite3.test.dbapi.CursorTests) ... ERROR
    CheckExecuteManyIterator (sqlite3.test.dbapi.CursorTests) ... ERROR
    CheckExecuteManyNotIterable (sqlite3.test.dbapi.CursorTests) ... ERROR
    ...
A quick check of the tracebacks shows all the errors are of this form (CheckClose is the first failure):

    ======================================================================
    ERROR: CheckClose (sqlite3.test.dbapi.ConnectionTests)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/Library/Buildbot/pybot/trunk.montanaro-g5/build/Lib/sqlite3/test/dbapi.py", line 85, in setUp
        self.cx = sqlite.connect(":memory:")
    ProgrammingError: library routine called out of sequence

That is, they all raise the same exception and all exceptions are raised on sqlite.connect(":memory:") calls. Sometimes there is a second parameter to the call. Anybody seen this before?

Skip

From martin at v.loewis.de Sat Oct 21 00:46:47 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 21 Oct 2006 00:46:47 +0200
Subject: [Python-Dev] svn.python.org down
In-Reply-To: <3f09d5a00610170759y2dac4772t157ebeadb878ae1f@mail.gmail.com>
References: <3f09d5a00610170759y2dac4772t157ebeadb878ae1f@mail.gmail.com>
Message-ID: <453951D7.2000900@v.loewis.de>

Grig Gheorghiu schrieb:
> FYI -- can't do svn checkouts/updates from the trunk at this point.
>
> starting svn operation
> svn update --revision HEAD
> in dir /home/twistbot/pybot/trunk.gheorghiu-x86/build (timeout 1200 secs)
> svn: PROPFIND request failed on '/projects/python/trunk'
> svn: PROPFIND of '/projects/python/trunk': could not connect to server
> (http://svn.python.org)

It turns out that there was a power surge at the colocation site where the machines are, and, due to an unfortunate series of events, power went out for about one second. When power came back, the machine rebooted, but, for some reason, the svn apache server did not.
Regards,
Martin

From larry at hastings.org Sat Oct 21 04:29:01 2006
From: larry at hastings.org (Larry Hastings)
Date: Fri, 20 Oct 2006 19:29:01 -0700
Subject: [Python-Dev] The "lazy strings" patch [was: PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom]
In-Reply-To: <4523F890.9060804@hastings.org>
References: <4523F890.9060804@hastings.org>
Message-ID: <453985ED.7050303@hastings.org>

I've significantly enhanced my string-concatenation patch, to the point where that name is no longer accurate. So I've redubbed it the "lazy strings" patch.

The major new feature is that string *slices* are also represented with a lazy-evaluation placeholder for the actual string, just as concatenated strings were in my original patch. The lazy slice object stores a reference to the original PyStringObject * it is sliced from, and the desired start and stop slice markers. (It only supports step = 1.) Its ob_sval is NULL until the string is rendered--but that rarely happens! Not only does this mean string slices are faster, but I bet this generally reduces overall memory usage for slices too.

Now, one rule of the Python programming API is that "all strings are zero-terminated". That part of the API makes the life of a Python extension author sane--they don't have to deal with some exotic Python string class, they can just assume C-style strings everywhere. Ordinarily, this means a string slice couldn't simply point into the original string; if it did, and you executed

    x = "abcde"
    y = x[1:4]

internally y->ob_sval[3] would not be 0, it would be 'e', breaking the API's rule about strings.

However! When a PyStringObject lives out its life purely within the Python VM, the only code that strenuously examines its internals is stringobject.c. And that code almost never needs the trailing zero*.
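The scheme described above — a placeholder that records the source string plus slice bounds, and fills in its buffer only when "rendered" — can be sketched in pure Python. This is only an illustrative model with invented names; the real patch does the equivalent bookkeeping inside CPython's stringobject.c:

```python
class LazySlice:
    """A slice records (source, start, stop) and defers copying.

    `rendered` stands in for the patch's NULL ob_sval: it stays empty
    until the characters are actually needed.
    """

    def __init__(self, source, start, stop):
        self.source = source   # keeps the original string alive
        self.start = start
        self.stop = stop
        self.rendered = None   # not materialized yet

    def render(self):
        # The one and only copy, made on first real use.
        if self.rendered is None:
            self.rendered = self.source[self.start:self.stop]
        return self.rendered


x = "abcde"
y = LazySlice(x, 1, 4)        # conceptually, y = x[1:4]
assert y.rendered is None     # no characters copied yet
assert y.render() == "bcd"    # rendered on demand
```

In the C patch, `render()` corresponds to filling in ob_sval, and as Larry's pybench statistics below show, the vast majority of slices are never rendered at all.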
So I've added a new static method in stringobject.c:

    char * PyString_AsUnterminatedString(PyStringObject *)

If you call it on a lazy-evaluation slice object, it gives you back a pointer into the original string's ob_sval. The s->ob_size'th element of this *might not* be zero, but if you call this function you're saying that's a-okay, you promise not to look at it. (If the PyStringObject * is any other variety, it calls into PyString_AsString, which renders string concatenation objects then returns ob_sval.)

Again: this behavior is *never* observed by anyone outside of stringobject.c. External users of PyStringObjects call PyString_AS_STRING(), which renders all lazy concatenation and lazy slices so they look just like normal zero-terminated PyStringObjects. With my patch applied, trunk still passes all expected tests.

Of course, lazy slice objects aren't just for literal slices created with [x:y]. There are lots of string methods that return what are effectively string slices, like lstrip() and split().

With this code in place, string slices that aren't examined by modules are very rarely rendered. I ran "pybench -n 2" (two rounds, warp 10 (whatever that means)) while collecting some statistics. When it finished, the interpreter had created a total of 640,041 lazy slices, of which only *19* were ever rendered.

Apart from lazy slices, there's only one more enhancement when compared with v1: string prepending now reuses lazy concatenation objects much more often. There was an optimization in string_concatenate (Python/ceval.c) that said: "if the left-side string has two references, and we're about to overwrite the second reference by storing this concatenation to an object, tell that object to drop its reference". That often meant the reference on the string dropped to 1, which meant PyString_Resize could just resize the left-side string in place and append the right-side. I modified it so it drops the reference to the right-hand operand too.
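The borrow-don't-copy behavior described here — handing out a window into another object's buffer rather than a fresh copy — has a rough analogue in today's Python in memoryview. This is an analogy only, not part of the patch:

```python
data = b"abcde"
view = memoryview(data)[1:4]    # no copy: shares data's storage

# The view sees b"bcd" but owns no buffer of its own...
assert view.tobytes() == b"bcd"

# ...it just keeps the original object alive, the way a lazy slice
# keeps a reference to the PyStringObject it was cut from.
assert view.obj is data
```

Like the lazy slice, the view carries (base object, offset, length) and never guarantees anything about the byte one past its end.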
With this change, even with a reduction in the allowable stack depth for right-hand recursion (so it's less likely to blow the stack), I was able to prepend over 86k strings before it forced a render. (Oh, for the record: I ensure depth limits are enforced when combining lazy slices and lazy concatenations, so you still won't blow your stack when you mix them together.)

Here are the highlights of a single apples-to-apples pybench run, 2.6 trunk revision 52413 ("this") versus that same revision with my patch applied ("other"):

    Test                       minimum run-time          average run-time
                                this   other     diff     this   other     diff
    ---------------------------------------------------------------------------
    ConcatStrings:             204ms    76ms  +168.4%    213ms    77ms  +177.7%
    CreateStringsWithConcat:   159ms   138ms   +15.7%    163ms   142ms   +15.1%
    StringSlicing:             142ms    86ms   +65.5%    145ms    88ms   +64.6%
    ---------------------------------------------------------------------------
    Totals:                   7976ms  7713ms    +3.4%   8257ms  7975ms    +3.5%

I also ran this totally unfair benchmark:

    x = "abcde" * (20000) # 100k characters
    for i in xrange(10000000):
        y = x[1:-1]

and found my patched version to be 9759% faster. (You heard that right, 98x faster.)

I'm ready to post the patch. However, as a result of this work, the description on the original patch page is really no longer accurate:

http://sourceforge.net/tracker/index.php?func=detail&aid=1569040&group_id=5470&atid=305470

Shall I close/delete that patch and submit a new patch with a more modern description? After all, there's not a lot of activity on the old patch page...

Cheers,

/larry/

* As I recall, stringobject.c needs the trailing zero in exactly *one* place: when comparing two zero-length strings. My patch ensures that zero-length slices and concatenations still return nullstring, so this still works as expected.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20061020/0a891b49/attachment.html

From skip at pobox.com Sat Oct 21 06:42:05 2006
From: skip at pobox.com (skip at pobox.com)
Date: Fri, 20 Oct 2006 23:42:05 -0500
Subject: [Python-Dev] Massive test_sqlite failure on Mac OSX ... sometimes
Message-ID: <17721.42269.950166.859760@montanaro.dyndns.org>

Following up on my earlier post... I svn up'd both my g5 and my g4 powerbook (both running OSX 10.4.8, gcc 4.0.0 apple build 5026), built and tested both. The test suite completed fine on my powerbook, failed on the g5. I tried running regrtest.py twice more on the g5 with the -r flag. It failed the first time, succeeded the second. I then made a series of runs with the -f flag (thank you once again for that Señor Peters). I whittled it down to the following reliably failing pair:

    $ ./python.exe Lib/test/regrtest.py -l -f tests
    test_ctypes
    test_sqlite
    test test_sqlite failed -- errors occurred; run in verbose mode for details
    1 test OK.
    1 test failed:
        test_sqlite

For confirmation, this pair works fine on my g4 powerbook. I've gone no further so far. It's bedtime. Maybe someone else can at least try to reproduce what I've come up with so far on other platforms or on another Mac g5.

Skip

From fredrik at pythonware.com Sat Oct 21 08:23:17 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sat, 21 Oct 2006 08:23:17 +0200
Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS?
In-Reply-To: <4538708B.8070406@hastings.org>
References: <34FE2A7A34BC3544BC3127D023DF3D12128746@EWTEXCH.office.bhtrader.com> <452C6FD8.8070403@v.loewis.de> <4538708B.8070406@hastings.org>
Message-ID:

Larry Hastings wrote:
> I knocked out a prototype of this last week, emailed Mr. Lundh about it,
> then forgot about it.

It's on my TODO list, so I haven't forgotten about it, but I've been (as usual) busy with other stuff. I'll get there, sooner or later.
Posting this to the patch tracker and posting a note to the Py3K mailing list could be a good idea. From talin at acm.org Sat Oct 21 08:37:45 2006 From: talin at acm.org (Talin) Date: Fri, 20 Oct 2006 23:37:45 -0700 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <453985ED.7050303@hastings.org> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> Message-ID: <4539C039.90406@acm.org> Interesting - is it possible that the same technique could be used to hide differences in character width? Specifically, if I concatenate an ascii string with a UTF-32 string, can the up-conversion to UTF-32 also be done lazily? If that could be done efficiently, it would resolve some outstanding issues that have come up on the Python-3000 list with regards to str/unicode convergence. Larry Hastings wrote: > > I've significantly enhanced my string-concatenation patch, to the point > where that name is no longer accurate. So I've redubbed it the "lazy > strings" patch. > > The major new feature is that string *slices* are also represented with > a lazy-evaluation placeholder for the actual string, just as > concatenated strings were in my original patch. The lazy slice object > stores a reference to the original PyStringObject * it is sliced from, > and the desired start and stop slice markers. (It only supports step = > 1.) Its ob_sval is NULL until the string is rendered--but that rarely > happens! Not only does this mean string slices are faster, but I bet > this generally reduces overall memory usage for slices too. > > Now, one rule of the Python programming API is that "all strings are > zero-terminated". That part of makes the life of a Python extension > author sane--they don't have to deal with some exotic Python string > class, they can just assume C-style strings everywhere. 
Ordinarily, > this means a string slice couldn't simply point into the original > string; if it did, and you executed > x = "abcde" > y = x[1:4] > internally y->ob_sval[3] would not be 0, it would be 'e', breaking the > API's rule about strings. > > However! When a PyStringObject lives out its life purely within the > Python VM, the only code that strenuously examines its internals is > stringobject.c. And that code almost never needs the trailing zero*. > So I've added a new static method in stringobject.c: > char * PyString_AsUnterminatedString(PyStringObject *) > If you call it on a lazy-evaluation slice object, it gives you back a > pointer into the original string's ob_sval. The s->ob_size'th element > of this *might not* be zero, but if you call this function you're saying > that's a-okay, you promise not to look at it. (If the PyStringObject * > is any other variety, it calls into PyString_AsString, which renders > string concatenation objects then returns ob_sval.) > > Again: this behavior is *never* observed by anyone outside of > stringobject.c. External users of PyStringObjects call > PyString_AS_STRING(), which renders all lazy concatenation and lazy > slices so they look just like normal zero-terminated PyStringObjects. > With my patch applied, trunk still passes all expected tests. > > Of course, lazy slice objects aren't just for literal slices created > with [x:y]. There are lots of string methods that return what are > effectively string slices, like lstrip() and split(). > > With this code in place, string slices that aren't examined by modules > are very rarely rendered. I ran "pybench -n 2" (two rounds, warp 10 > (whatever that means)) while collecting some statistics. When it > finished, the interpreter had created a total of 640,041 lazy slices, of > which only *19* were ever rendered. 
> > > Apart from lazy slices, there's only one more enhancement when compared > with v1: string prepending now reuses lazy concatenation objects much > more often. There was an optimization in string_concatenate > (Python/ceval.c) that said: "if the left-side string has two references, > and we're about to overwrite the second reference by storing this > concatenation to an object, tell that object to drop its reference". > That often meant the reference on the string dropped to 1, which meant > PyString_Resize could just resize the left-side string in place and > append the right-side. I modified it so it drops the reference to the > right-hand operand too. With this change, even with a reduction in the > allowable stack depth for right-hand recursion (so it's less likely to > blow the stack), I was able to prepend over 86k strings before it forced > a render. (Oh, for the record: I ensure depth limits are enforced when > combining lazy slices and lazy concatenations, so you still won't blow > your stack when you mix them together.)
>
> Here are the highlights of a single apples-to-apples pybench run, 2.6 > trunk revision 52413 ("this") versus that same revision with my patch > applied ("other"):
>
> Test                        minimum run-time         average run-time
>                             this    other    diff    this    other    diff
> -------------------------------------------------------------------------------
> ConcatStrings:              204ms    76ms  +168.4%   213ms    77ms  +177.7%
> CreateStringsWithConcat:    159ms   138ms   +15.7%   163ms   142ms   +15.1%
> StringSlicing:              142ms    86ms   +65.5%   145ms    88ms   +64.6%
> -------------------------------------------------------------------------------
> Totals:                    7976ms  7713ms    +3.4%  8257ms  7975ms    +3.5%
>
> I also ran this totally unfair benchmark:
>     x = "abcde" * (20000) # 100k characters
>     for i in xrange(10000000):
>         y = x[1:-1]
> and found my patched version to be 9759% faster. (You heard that right, > 98x faster.)
>
> I'm ready to post the patch.
However, as a result of this work, the > description on the original patch page is really no longer accurate: > > http://sourceforge.net/tracker/index.php?func=detail&aid=1569040&group_id=5470&atid=305470 > > Shall I close/delete that patch and submit a new patch with a more > modern description? After all, there's not a lot of activity on the old > patch page... > > > Cheers, > > > /larry/ > > * As I recall, stringobject.c needs the trailing zero in exactly *one* > place: when comparing two zero-length strings. My patch ensures that > zero-length slices and concatenations still return nullstring, so this > still works as expected. > > > ------------------------------------------------------------------------ > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/talin%40acm.org From fredrik at pythonware.com Sat Oct 21 09:10:19 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat, 21 Oct 2006 09:10:19 +0200 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <4539C039.90406@acm.org> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> <4539C039.90406@acm.org> Message-ID: Talin wrote: > Interesting - is it possible that the same technique could be used to > hide differences in character width? Specifically, if I concatenate an > ascii string with a UTF-32 string, can the up-conversion to UTF-32 also > be done lazily? of course. and if all you do with the result is write it to an UTF-8 stream, it doesn't need to be done at all. this requires a slightly more elaborate C-level API interface than today's PyString_AS_STRING API, though... 
(which is why this whole exercise belongs on the Python 3000 lists, not on python-dev for 2.X) From martin at v.loewis.de Sat Oct 21 09:59:30 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 21 Oct 2006 09:59:30 +0200 Subject: [Python-Dev] The "lazy strings" patch [was: PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom] In-Reply-To: <453985ED.7050303@hastings.org> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> Message-ID: <4539D362.9010909@v.loewis.de> Larry Hastings schrieb: > I've significantly enhanced my string-concatenation patch, to the point > where that name is no longer accurate. So I've redubbed it the "lazy > strings" patch. It's not clear to me what you want to achieve with these patches, in particular, whether you want to see them integrated into Python or not. > The major new feature is that string *slices* are also represented with > a lazy-evaluation placeholder for the actual string, just as > concatenated strings were in my original patch. The lazy slice object > stores a reference to the original PyStringObject * it is sliced from, > and the desired start and stop slice markers. (It only supports step = > 1.) I think this specific approach will find strong resistance. It has been implemented many times, e.g. (apparently) in NextStep's NSString, and in Java's string type (where a string holds a reference to a character array, a start index, and an end index). Most recently, it was discussed under the name "string view" on the Py3k list, see http://mail.python.org/pipermail/python-3000/2006-August/003282.html Traditionally, the biggest objection is that even small strings may consume insane amounts of memory. > Its ob_sval is NULL until the string is rendered--but that rarely > happens! Not only does this mean string slices are faster, but I bet > this generally reduces overall memory usage for slices too. 
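The traditional memory objection is easy to demonstrate with any slice-that-references scheme; a tiny hypothetical View class (invented for this example, not the patch's code) shows a five-character "slice" pinning a ten-megabyte buffer:

```python
class View:
    """Hypothetical slice-as-reference: stores source + bounds, no copy."""
    def __init__(self, source, start, stop):
        self.source, self.start, self.stop = source, start, stop


big = "x" * (10 * 1024 * 1024)   # a ~10 MB string
tiny = View(big, 0, 5)           # logically just five characters
del big                          # the 10 MB buffer is NOT freed:
assert len(tiny.source) == 10 * 1024 * 1024   # the view still pins it
```

As long as `tiny` is alive, the whole original buffer stays alive with it; that is the failure mode the NSString and Java precedents ran into.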
Channeling Guido: what real-world applications did you study with this patch to make such a claim? > I'm ready to post the patch. However, as a result of this work, the > description on the original patch page is really no longer accurate: > > http://sourceforge.net/tracker/index.php?func=detail&aid=1569040&group_id=5470&atid=305470 > Shall I close/delete that patch and submit a new patch with a more > modern description? After all, there's not a lot of activity on the old > patch page... Closing the issue and opening a new one is fine. Regards, Martin From martin at v.loewis.de Sat Oct 21 19:24:58 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 21 Oct 2006 19:24:58 +0200 Subject: [Python-Dev] OT: fdopen on Windows question In-Reply-To: <17721.9742.690489.77309@montanaro.dyndns.org> References: <17721.9742.690489.77309@montanaro.dyndns.org> Message-ID: <453A57EA.4020201@v.loewis.de> skip at pobox.com schrieb: > That is, can I change stdin from text to binary this way or is it destined > to always be in text mode? You can call _setmode on the file descriptor. Regards, Martin From skip at pobox.com Sat Oct 21 20:03:09 2006 From: skip at pobox.com (skip at pobox.com) Date: Sat, 21 Oct 2006 13:03:09 -0500 Subject: [Python-Dev] Massive test_sqlite failure on Mac OSX ... sometimes Message-ID: <17722.24797.806588.338942@montanaro.dyndns.org> Followup #2... Yesterday I whittled my problems with test_sqlite on my OSX g5 to test_ctypes and test_sqlite:

    ./python.exe Lib/test/regrtest.py -l -f tests
    test_ctypes
    test_sqlite
    test test_sqlite failed -- errors occurred; run in verbose mode for details
    1 test OK.
    1 test failed:
        test_sqlite

Today I refined things further. I renamed all the test_*.py files in Lib/ctypes/test/ until all I was left with was test_find.py.
It fails if that's the only ctypes test script run:

    $ ls -l *.py
    -rw------- 1 buildbot buildbot 6870 Oct 20 06:30 __init__.py
    -rw------- 1 buildbot buildbot 624 Oct 20 06:30 runtests.py
    -rw------- 1 buildbot buildbot 3463 Oct 21 12:52 test_find.py
    montanaro:~/pybot/trunk.montanaro-g5/build/Lib/ctypes/test buildbot$ cd -
    /Library/Buildbot/pybot/trunk.montanaro-g5/build
    montanaro:~/pybot/trunk.montanaro-g5/build buildbot$ ./python.exe Lib/test/regrtest.py -l -f tests
    test_ctypes
    test_sqlite
    test test_sqlite failed -- errors occurred; run in verbose mode for details
    1 test OK.
    1 test failed:
        test_sqlite

test_find.py contains checks for three OpenGL libraries on darwin: gl, glu and glut. If I comment out all those tests, test_sqlite succeeds. If any of them are enabled, test_sqlite fails. I've taken this about as far as I can. I submitted a bug report here: http://python.org/sf/1581906 Skip From janssen at parc.com Sat Oct 21 19:58:33 2006 From: janssen at parc.com (Bill Janssen) Date: Sat, 21 Oct 2006 10:58:33 PDT Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: Your message of "Fri, 20 Oct 2006 23:37:45 PDT." <4539C039.90406@acm.org> Message-ID: <06Oct21.105836pdt."58648"@synergy1.parc.xerox.com> See also the Cedar Ropes work: http://www.cs.ubc.ca/local/reading/proceedings/spe91-95/spe/vol25/issue12/spe986.pdf Bill From martin at v.loewis.de Sat Oct 21 20:10:10 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 21 Oct 2006 20:10:10 +0200 Subject: [Python-Dev] ctypes and win64 In-Reply-To: <45391C8B.1070605@ctypes.org> References: <45391098.8020906@v.loewis.de> <45391C8B.1070605@ctypes.org> Message-ID: <453A6282.5090804@v.loewis.de> Thomas Heller schrieb: > The official Python-2.5.amd64.msi does *not* contain ctypes, so > the official Python-2.5.x.amd64.msi should also not contain ctypes (I assume). That would be good, yes.
> Not many people (I assume again) are running 64-bit windows, and use the 64-bit Python > version I also agree. > - but that will probably change soon. It's speculation either way, but I disagree. It will take several years until people widely use Win64. For the foreseeable future, there are too many inconveniences to make it practical. > I would like to merge the 64-bit windows related ctypes changes in trunk, as soon as > I'm sure that they work, back into the release25-maint branch. And also make separate > ctypes releases from the release25-maint source code. I will only backport these changes > if I'm convinced that they do not change the functionality of the current code. I understand this. Still, integrating such changes formally introduces a new feature to the 2.5 branch (even though the feature isn't exposed readily). Whether or not this is ok is for the release manager to decide. What I don't understand is what the "64-bit windows related ctypes changes" are. Do they include changes to the PCbuild directory? Regards, Martin From jcarlson at uci.edu Sun Oct 22 00:02:14 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Sat, 21 Oct 2006 15:02:14 -0700 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <453985ED.7050303@hastings.org> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> Message-ID: <20061021111107.0A5F.JCARLSON@uci.edu> Larry Hastings wrote: > > I've significantly enhanced my string-concatenation patch, to the point > where that name is no longer accurate. So I've redubbed it the "lazy > strings" patch. [snip] Honestly, I don't believe that pure strings should be this complicated. The implementation of the standard string and unicode type should be as simple as possible. The current string and unicode implementations are, in my opinion, as simple as possible given Python's needs.
As such, I don't see a need to go mucking about with the standard string implementation to make it "lazy" so as to increase performance, reduce memory consumption, etc.. However, having written a somewhat "lazy" string slicing/etc operation class I called a "string view", whose discussion and implementation can be found in the py3k list, I do believe that having a related type, perhaps with the tree-based implementation you have written, or a simple pointer + length variant like I have written, would be useful to have available to Python. I also believe that it smells like a Py3k feature, which suggests that you should toss the whole string reliance and switch to unicode, as str and unicode become bytes and text in Py3k, with bytes being mutable. - Josiah From mark at pandapocket.com Sun Oct 22 00:54:03 2006 From: mark at pandapocket.com (=?utf-8?Q?Mark=20Roberts?=) Date: Sat, 21 Oct 2006 22:54:03 +0000 Subject: [Python-Dev] The "lazy strings" patch Message-ID: <20061021225403.4892.qmail@s402.sureserver.com> Hmm, I have not viewed the patch in question, but I'm curious why we wouldn't want to include such a patch if it were transparent to the user (Python based or otherwise). Especially if it increased performance without sacrificing maintainability or elegance. Further considering the common usage of strings in usual programming, I fail to see why an implementation like this would not be desirable? If there's a widely recognized argument against this, a link will likely sate my curiosity. Thanks, Mark > -------Original Message------- > From: Josiah Carlson > Subject: Re: [Python-Dev] The "lazy strings" patch > Sent: 21 Oct '06 22:02 > > > Larry Hastings wrote: > > > > I've significantly enhanced my string-concatenation patch, to the point > > where that name is no longer accurate. So I've redubbed it the "lazy > > strings" patch. > [snip] > > Honestly, I don't believe that pure strings should be this complicated. 
> The implementation of the standard string and unicode type should be as > simple as possible. The current string and unicode implementations are, > in my opinion, as simple as possible given Python's needs. > > As such, I don't see a need to go mucking about with the standard string > implementation to make it "lazy" so as to increase performance, reduce > memory consumption, etc.. However, having written a somewhat "lazy" > string slicing/etc operation class I called a "string view", whose > discussion and implementation can be found in the py3k list, I do > believe that having a related type, perhaps with the tree-based > implementation you have written, or a simple pointer + length variant > like I have written, would be useful to have available to Python. > > I also believe that it smells like a Py3k feature, which suggests that > you should toss the whole string reliance and switch to unicode, as str > and unicode become bytes and text in Py3k, with bytes being mutable. > > > - Josiah > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/mark%40pandapocket.com > From guido at python.org Sun Oct 22 01:50:12 2006 From: guido at python.org (Guido van Rossum) Date: Sat, 21 Oct 2006 16:50:12 -0700 Subject: [Python-Dev] Modulefinder In-Reply-To: References: Message-ID: Could you also prepare a patch for the p3yk branch? It's broken there too... On 10/13/06, Thomas Heller wrote: > I have patched Lib/modulefinder.py to work with absolute and relative imports. > It also is faster now, and has basic unittests in Lib/test/test_modulefinder.py. > > The work was done in a theller_modulefinder SVN branch. > If nobody objects, I will merge this into trunk, and possibly also into release25-maint, when I have time. 
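For context, the ModuleFinder API Thomas is patching (standard library, unchanged in its essentials since then) can be exercised like this under a modern Python:

```python
import os
import tempfile
from modulefinder import ModuleFinder

# Write a tiny script and ask ModuleFinder which modules it pulls in.
tmpdir = tempfile.mkdtemp()
script = os.path.join(tmpdir, "demo.py")
with open(script, "w") as f:
    f.write("import os\n")

finder = ModuleFinder()
finder.run_script(script)       # scans the bytecode, follows imports
assert "os" in finder.modules   # the absolute import was found
```

After `run_script` returns, `finder.modules` maps every discovered module name to a Module record, which is what tools like py2exe walk to decide what to bundle.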
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From jjl at pobox.com Sun Oct 22 02:04:10 2006 From: jjl at pobox.com (John J Lee) Date: Sun, 22 Oct 2006 00:04:10 +0000 (UTC) Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <20061021225403.4892.qmail@s402.sureserver.com> References: <20061021225403.4892.qmail@s402.sureserver.com> Message-ID: On Sat, 21 Oct 2006, Mark Roberts wrote: [...] > If there's a widely recognized argument against this, a link will likely > sate my curiosity. Quoting from Martin v. Loewis earlier on the same day you posted: """ I think this specific approach will find strong resistance. It has been implemented many times, e.g. (apparently) in NextStep's NSString, and in Java's string type (where a string holds a reference to a character array, a start index, and an end index). Most recently, it was discussed under the name "string view" on the Py3k list, see http://mail.python.org/pipermail/python-3000/2006-August/003282.html Traditionally, the biggest objection is that even small strings may consume insane amounts of memory. """ John From ndunn at ndunn.com Sat Oct 21 16:13:48 2006 From: ndunn at ndunn.com (Neil Dunn) Date: Sat, 21 Oct 2006 15:13:48 +0100 Subject: [Python-Dev] Optional type checking/pluggable type systems for Python Message-ID: Dear All I'm a Master's student at Imperial College London currently selecting a Master's thesis subject. I am exploring the possibility of "optional typing" and "pluggable type systems" (Bracha) for Python. Reading around I see that PEP 246 (object adaption) was dropped for "something better". Is this "something better" currently in production for Python 3000 or just a thinking ground. I'd like to know whether there would be any merit in exploring the project or whether this is something that is going to appear as implementation within the next 6 months (the length of my thesis). 
If you think it is still something worth exploring I'd plan to pick up the idea as a research project and explore implementations, probably in CPython or Jython. Any help with this would be great, could you please reply directly to ndunn at ndunn.com as I haven't subscribed to python-dev for a while now. Thanks, Neil Dunn From aahz at pythoncraft.com Sun Oct 22 05:58:25 2006 From: aahz at pythoncraft.com (Aahz) Date: Sat, 21 Oct 2006 20:58:25 -0700 Subject: [Python-Dev] Optional type checking/pluggable type systems for Python In-Reply-To: References: Message-ID: <20061022035825.GA20602@panix.com> On Sat, Oct 21, 2006, Neil Dunn wrote: > > Any help with this would be great, could you please reply directly to > ndunn at ndunn.com as I haven't subscribed to python-dev for a while now. You should also post this to the python-3000 list; the lists do not all have the same readership. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "If you don't know what your program is supposed to do, you'd better not start writing it." --Dijkstra From tjreedy at udel.edu Sun Oct 22 06:04:34 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 22 Oct 2006 00:04:34 -0400 Subject: [Python-Dev] Optional type checking/pluggable type systems for Python References: Message-ID: "Neil Dunn" wrote in message news:f56dda5c0610210713k7c500637w25483e473ed263bb at mail.gmail.com... > Dear All > > I'm a Master's student at Imperial College London currently selecting > a Master's thesis subject. I am exploring the possibility of "optional > typing" and "pluggable type systems" (Bracha) for Python. Reading > around I see that PEP 246 (object adaption) was dropped for "something > better". Is this "something better" currently in production for Python > 3000 or just a thinking ground. Thinking, as far as I know.
> I'd like to know whether there would be any merit in exploring the > project or whether this is something that is going to appear as > implementation within the next 6 months (the length of my thesis). > > If you think it is still something worth exploring I'd plan to pick up > the idea as a research project and explore implementations, probabaly > in CPython or Jython. > > Any help with this would be great, could you please reply directly to > ndunn at ndunn.com as I haven't subscribed to python-dev for a while now. You can follow both python-dev and py3000 lists as newsgroups via news.gmane.org. It also has archives. From ronaldoussoren at mac.com Sun Oct 22 10:06:05 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Sun, 22 Oct 2006 10:06:05 +0200 Subject: [Python-Dev] Massive test_sqlite failure on Mac OSX ... sometimes In-Reply-To: <17722.24797.806588.338942@montanaro.dyndns.org> References: <17722.24797.806588.338942@montanaro.dyndns.org> Message-ID: <660FA164-9592-44C9-9BF1-7B716366BBBC@mac.com> On Oct 21, 2006, at 8:03 PM, skip at pobox.com wrote: > Followup #2... > > Yesterday I whittled my problems with test_sqlite on my OSX g5 to > test_ctypes and test_sqlite: > > ./python.exe Lib/test/regrtest.py -l -f tests > test_ctypes > test_sqlite > test test_sqlite failed -- errors occurred; run in verbose mode > for details > 1 test OK. > 1 test failed: > test_sqlite > > Today I refined things further. I renamed all the test_*.py files in > Lib/ctypes/test/ until all I was left with was test_find.py. 
It > fails if > that's the only ctypes test script run: > > $ ls -l *.py > -rw------- 1 buildbot buildbot 6870 Oct 20 06:30 __init__.py > -rw------- 1 buildbot buildbot 624 Oct 20 06:30 runtests.py > -rw------- 1 buildbot buildbot 3463 Oct 21 12:52 test_find.py > montanaro:~/pybot/trunk.montanaro-g5/build/Lib/ctypes/test > buildbot$ cd - > /Library/Buildbot/pybot/trunk.montanaro-g5/build > montanaro:~/pybot/trunk.montanaro-g5/build buildbot$ ./ > python.exe Lib/test/regrtest.py -l -f tests > test_ctypes > test_sqlite > test test_sqlite failed -- errors occurred; run in verbose mode > for details > 1 test OK. > 1 test failed: > test_sqlite > > test_find.py contains checks for three OpenGL libraries on darwin: > gl, glu > and glut. If I comment out all those tests, test_sqlite succeeds. > If any > of them are enabled, test_sqlite fails. According to a comment in (IIRC) the pyOpenGL sources GLUT on OSX does a chdir() during initialization, that could be the problem here. Ronald -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3562 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061022/48cacba4/attachment.bin From ronaldoussoren at mac.com Sun Oct 22 12:54:54 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Sun, 22 Oct 2006 12:54:54 +0200 Subject: [Python-Dev] readlink and unicode strings (SF:1580674) Patch http://www.python.org/sf/1580674 fixes readlink's behaviour w.r.t. Unicode strings: without this patch this function uses the system default encoding instead of the filesystem encoding to convert Unicode objects to plain strings. Like os.listdir, os.readlink will now return a Unicode object when the argument is a Unicode object. What I'd like to know is if this can be backported to the 2.5 branch. 
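For what it's worth, the behaviour proposed in this patch is what Python 3 eventually standardised on: os.readlink decodes with the filesystem encoding and mirrors its argument type. A quick POSIX demonstration under a modern Python:

```python
import os
import tempfile

d = tempfile.mkdtemp()
target = os.path.join(d, "target")
link = os.path.join(d, "link")
open(target, "w").close()
os.symlink(target, link)

assert isinstance(os.readlink(link), str)                 # str in -> str out
assert isinstance(os.readlink(os.fsencode(link)), bytes)  # bytes in -> bytes out

# The realpath breakage described above goes away once readlink round-trips:
assert os.path.realpath(link) == os.path.realpath(target)
```

Passing a str path gives a str result decoded with `sys.getfilesystemencoding()`, and passing bytes gives raw bytes, so callers choose which world they live in.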
The first part of this patch (use filesystem encoding instead of the system encoding) is IMHO a bugfix, the second part might break existing applications (that might not expect a unicode result from os.readlink). The reason I did this patch is that os.path.realpath currently breaks when the path is a unicode string with non-ascii characters and at least one element of the path is a symlink. Ronald Message-ID: Patch http://www.python.org/sf/1580674 fixes readlink's behaviour w.r.t. Unicode strings: without this patch this function uses the system default encoding instead of the filesystem encoding to convert Unicode objects to plain strings. Like os.listdir, os.readlink will now return a Unicode object when the argument is a Unicode object. What I'd like to know is if this can be backported to the 2.5 branch. The first part of this patch (use filesystem encoding instead of the system encoding) is IMHO a bugfix, the second part might break existing applications (that might not expect a unicode result from os.readlink). The reason I did this patch is that os.path.realpath currently breaks when the path is a unicode string with non-ascii characters and at least one element of the path is a symlink. Ronald -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3562 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061022/b5aee3b7/attachment.bin From mal at egenix.com Sun Oct 22 13:02:21 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Sun, 22 Oct 2006 13:02:21 +0200 Subject: [Python-Dev] readlink and unicode strings (SF:1580674) In-Reply-To: References: Message-ID: <453B4FBD.2010504@egenix.com> Ronald Oussoren wrote: > Patch http://www.python.org/sf/1580674 fixes readlink's behaviour w.r.t. > Unicode strings: without this patch this function uses the system > default encoding instead of the filesystem encoding to convert Unicode > objects to plain strings. Like os.listdir, os.readlink will now return a > Unicode object when the argument is a Unicode object. > > What I'd like to know is if this can be backported to the 2.5 branch. > The first part of this patch (use filesystem encoding instead of the > system encoding) is IMHO a bugfix, the second part might break existing > applications (that might not expect a unicode result from os.readlink). > > The reason I did this patch is that os.path.realpath currently breaks > when the path is a unicode string with non-ascii characters and at least > one element of the path is a symlink. I don't think that an application that passes a Unicode object to os.readlink() would have problems dealing with a Unicode return value. +1 on backporting it to 2.5. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 22 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ...
http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From ronaldoussoren at mac.com Sun Oct 22 14:16:26 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Sun, 22 Oct 2006 14:16:26 +0200 Subject: [Python-Dev] readlink and unicode strings (SF:1580674) In-Reply-To: References: Message-ID: On Oct 22, 2006, at 12:54 PM, Ronald Oussoren wrote a message with an annoyingly large subject... Sorry about that, I guess it's time to book a course on basic computer usage :-( Ronald -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3562 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061022/c8f8260e/attachment.bin From skip at pobox.com Sun Oct 22 14:51:27 2006 From: skip at pobox.com (skip at pobox.com) Date: Sun, 22 Oct 2006 07:51:27 -0500 Subject: [Python-Dev] Massive test_sqlite failure on Mac OSX ... sometimes In-Reply-To: <660FA164-9592-44C9-9BF1-7B716366BBBC@mac.com> References: <17722.24797.806588.338942@montanaro.dyndns.org> <660FA164-9592-44C9-9BF1-7B716366BBBC@mac.com> Message-ID: <17723.26959.831139.587804@montanaro.dyndns.org> Ronald> According to a comment in (IIRC) the pyOpenGL sources GLUT on Ronald> OSX does a chdir() during initialization, that could be the Ronald> problem here. How would that explain that it fails on my g5 but not on my powerbook? They are at the same revision of the operating system and compiler. The checksums on the libraries are different though the file sizes are the same. The dates on the files are different as well. I suspect the checksum difference is caused by the different upgrade dates of the two machines and the resulting different times the two systems were "optimized". 
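Checksumming the suspect libraries on both machines is easy to script; a small helper along these lines (an illustrative sketch, not code from the thread) would pin down whether the two machines really have different binaries:

```python
import hashlib

def file_digest(path, algo="sha256"):
    """Hex digest of a file's contents, read in 64 KB chunks so large
    dylibs don't have to fit in memory."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()
```

Running `file_digest("/usr/lib/libsqlite3.0.dylib")` on the g5 and the powerbook and comparing the two strings would separate "different binaries" from "different link order" as the cause.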
Is there anyone else with a g5 who can do a vanilla Unix (not framework) build on an up-to-date g5 from an up-to-date Subversion repository? It would be nice if someone else could at least confirm or not confirm this problem. Skip From brett at python.org Sun Oct 22 17:20:17 2006 From: brett at python.org (Brett Cannon) Date: Sun, 22 Oct 2006 08:20:17 -0700 Subject: [Python-Dev] PSF Infrastructure has chosen Roundup as the issue tracker for Python development In-Reply-To: References: Message-ID: Forgot to send this to python-dev. =) ---------- Forwarded message ---------- From: Brett Cannon Date: Oct 20, 2006 1:35 PM Subject: PSF Infrastructure has chosen Roundup as the issue tracker for Python development To: python-list at python.org At the beginning of the month the PSF Infrastructure committee announced that we had reached the decision that JIRA was our recommendation for the next issue tracker for Python development. Realizing, though, that it was a tough call between JIRA and Roundup we said that we would be willing to switch our recommendation to Roundup if enough volunteers stepped forward to help administer the tracker, thus negating Atlassian's offer of free managed hosting. Well, the community stepped up to the challenge and we got plenty of volunteers! In fact, the call for volunteers has led to an offer for professional hosting for Roundup from Upfront Systems. The committee is currently evaluating that offer and will hopefully have a decision made soon. Once a decision has been made we will contact the volunteers as to whom we have selected to help administer the installation (regardless of who hosts the tracker). The administrators and python-dev can then begin working towards deciding what we want from the tracker and its configuration. Once again, thanks to the volunteers for stepping forward to make this happen! -Brett Cannon PSF Infrastructure committee chairman -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/python-dev/attachments/20061022/d560e566/attachment.htm From exarkun at divmod.com Sun Oct 22 17:48:23 2006 From: exarkun at divmod.com (Jean-Paul Calderone) Date: Sun, 22 Oct 2006 11:48:23 -0400 Subject: [Python-Dev] Massive test_sqlite failure on Mac OSX ... sometimes In-Reply-To: <17723.26959.831139.587804@montanaro.dyndns.org> Message-ID: <20061022154823.26151.349628906.divmod.quotient.10044@ohm> On Sun, 22 Oct 2006 07:51:27 -0500, skip at pobox.com wrote: > > Ronald> According to a comment in (IIRC) the pyOpenGL sources GLUT on > Ronald> OSX does a chdir() during initialization, that could be the > Ronald> problem here. > >How would that explain that it fails on my g5 but not on my powerbook? They >are at the same revision of the operating system and compiler. The >checksums on the libraries are different though the file sizes are the same. >The dates on the files are different as well. I suspect the checksum >difference is caused by the different upgrade dates of the two machines and >the resulting different times the two systems were "optimized". > >Is there anyone else with a g5 who can do a vanilla Unix (not framework) >build on an up-to-date g5 from an up-to-date Subversion repository? It >would be nice if someone else could at least confirm or not confirm this >problem. Robert Gravina has seen a problem which bears some resemblance to this one while using PySQLite in a real application on OS X. I've pointed him to this thread; hopefully it's the same issue and a second way of producing the issue will shed some more light on the matter. 
The top of that thread is available here: http://divmod.org/users/mailman.twistd/pipermail/divmod-dev/2006-October/000707.html Jean-Paul From anthony at interlink.com.au Sun Oct 22 18:03:13 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Mon, 23 Oct 2006 02:03:13 +1000 Subject: [Python-Dev] PSF Infrastructure has chosen Roundup as the issue tracker for Python development In-Reply-To: References: Message-ID: <200610230203.16907.anthony@interlink.com.au> Thanks to the folks involved in this process - I'm looking forward to getting the hell away from SF's bug tracker. :-) Anthony From talin at acm.org Sun Oct 22 20:29:26 2006 From: talin at acm.org (Talin) Date: Sun, 22 Oct 2006 11:29:26 -0700 Subject: [Python-Dev] PSF Infrastructure has chosen Roundup as the issue tracker for Python development In-Reply-To: <200610230203.16907.anthony@interlink.com.au> References: <200610230203.16907.anthony@interlink.com.au> Message-ID: <453BB886.4080402@acm.org> Anthony Baxter wrote: > Thanks to the folks involved in this process - I'm looking forward to getting > the hell away from SF's bug tracker. :-) Yes, let us know when the new tracker is up, I want to start using it :) From barry at python.org Mon Oct 23 02:53:55 2006 From: barry at python.org (Barry Warsaw) Date: Sun, 22 Oct 2006 20:53:55 -0400 Subject: [Python-Dev] Massive test_sqlite failure on Mac OSX ... sometimes In-Reply-To: <17723.26959.831139.587804@montanaro.dyndns.org> References: <17722.24797.806588.338942@montanaro.dyndns.org> <660FA164-9592-44C9-9BF1-7B716366BBBC@mac.com> <17723.26959.831139.587804@montanaro.dyndns.org> Message-ID: <39070ED3-0989-43B2-BC36-35199EF67CC8@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Oct 22, 2006, at 8:51 AM, skip at pobox.com wrote: > Is there anyone else with a g5 who can do a vanilla Unix (not > framework) > build on an up-to-date g5 from an up-to-date Subversion > repository?
It > would be nice if someone else could at least confirm or not confirm > this > problem. By "vanilla unix" you mean a pretty simple ./configure; make; make test? Works for me with Python 2.5 on both my G5s and Intel Macs, all running 10.4.8. Note though that I usually build with CPPFLAGS and LDFLAGS pointing to /opt/local in order to pick up DarwinPorts readline, and if you do the same and have a version of sqlite from there, you can have problems. For example, we were seeing some very odd infloops in our sqlite layer. We have our own version of sqlite that we expected to be dynamically linked against, but when I used otool -L to check it, I realized we were dynamically linked against a version of sqlite in DarwinPorts. Getting rid of the unnecessary DarwinPorts version and making sure that we were dynamically linking against our version eliminated the infloops. What do you get when you check _sqlite3?
% otool -L build/lib.macosx-10.3-ppc-2.5/_sqlite3.so
build/lib.macosx-10.3-ppc-2.5/_sqlite3.so:
        /usr/lib/libsqlite3.0.dylib (compatibility version 9.0.0, current version 9.6.0)
        /usr/lib/libmx.A.dylib (compatibility version 1.0.0, current version 92.0.0)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 88.1.7)
Any possibility something like that's going on? - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRTwSqHEjvBPtnXfVAQLvwQP/VuTQwwXwsauiuQt8E3k05scWsykarLaZ YMJyVwq++DH/X8C5RODG9seYhSMQLF8PKMStmhKWLmlQ9mfFPIobMgsFqXBuI+bD njUOh74O6vcJw1RNKXaERdQ6ABb2t79S6w+Psu5hGOP1NDy/e9GQazw05HpJWWvG 7Py+bDt24oE= =9TjL -----END PGP SIGNATURE----- From skip at pobox.com Mon Oct 23 05:24:07 2006 From: skip at pobox.com (skip at pobox.com) Date: Sun, 22 Oct 2006 22:24:07 -0500 Subject: [Python-Dev] Massive test_sqlite failure on Mac OSX ...
sometimes In-Reply-To: <39070ED3-0989-43B2-BC36-35199EF67CC8@python.org> References: <17722.24797.806588.338942@montanaro.dyndns.org> <660FA164-9592-44C9-9BF1-7B716366BBBC@mac.com> <17723.26959.831139.587804@montanaro.dyndns.org> <39070ED3-0989-43B2-BC36-35199EF67CC8@python.org> Message-ID: <17724.13783.912914.43233@montanaro.dyndns.org> Barry> What do you get when you check _sqlite3?
$ otool -L ./build/lib.mac-10.3-ppc-2.6/_sqlite3.so
./build/lib.macosx-10.3-ppc-2.6/_sqlite3.so:
        /usr/local/lib/libsqlite3.0.dylib (compatibility version 9.0.0, current version 9.6.0)
        /usr/lib/libmx.A.dylib (compatibility version 1.0.0, current version 93.0.0)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 88.1.7)
Which I apparently installed on Oct 15 but seem to have forgotten... According to the source in my directory, it's sqlite 3.3.8. On my powerbook it's linked against /usr/lib/libsqlite3.0.dylib... Make clean, run the failing test pair, now it's fine. Otool shows linkage against /usr/lib/libsqlite3.0.dylib...:
$ otool -L ./build/lib.macosx-10.3-ppc-2.6/_sqlite3.so
./build/lib.macosx-10.3-ppc-2.6/_sqlite3.so:
        /usr/lib/libsqlite3.0.dylib (compatibility version 9.0.0, current version 9.6.0)
        /usr/lib/libmx.A.dylib (compatibility version 1.0.0, current version 93.0.0)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 88.1.7)
According to /usr/include/sqlite3.h, what's installed by Apple is 3.1.3. Aside from the possibility that I somehow compiled against /usr/include/sqlite3.h and linked against /usr/local/lib/libsqlite3.0.dylib, what difference should 3.3.8 vs. 3.1.3 have made? Skip From barry at python.org Mon Oct 23 05:52:21 2006 From: barry at python.org (Barry Warsaw) Date: Sun, 22 Oct 2006 23:52:21 -0400 Subject: [Python-Dev] Massive test_sqlite failure on Mac OSX ...
sometimes In-Reply-To: <17724.13783.912914.43233@montanaro.dyndns.org> References: <17722.24797.806588.338942@montanaro.dyndns.org> <660FA164-9592-44C9-9BF1-7B716366BBBC@mac.com> <17723.26959.831139.587804@montanaro.dyndns.org> <39070ED3-0989-43B2-BC36-35199EF67CC8@python.org> Message-ID: <37C9161F-AD7B-44B7-9B81-2E8DE400E6EB@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Oct 22, 2006, at 11:24 PM, skip at pobox.com wrote: > According to /usr/include/sqlite3.h, what's installed by Apple is > 3.1.3. > Aside from the possibility that I somehow compiled against > /usr/include/sqlite3.h and linked against /usr/local/lib/ > libsqlite3.0.dylib, > what difference should 3.3.8 vs. 3.1.3 have made? Dunno, but as much as I love SQLite, I've also found it to be pretty finicky. For example, I once tried to upgrade us from 3.2.1 to 3.2.8 but that caused us a world of hurt, so I reverted back to the last known good version. At some point I'll try to get us on the latest release, but I'm a little gunshy about it. - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRTw8e3EjvBPtnXfVAQJbKgP+MjAz/NfUOaDd+ZEg9haJVr7v5JsKTHEl i9n7pLLFToIE81RX3iGHMZwIZyIGHqT9d3gqan8INrvcAtL7hxVvkqAAFRJTmX2Z XVLAjWLYCp9nY6Q3K+yXls798RDoHhZIWvHnNXZJ7Ya2wwSVQoADFdV1GN0pIB07 PnNHa/S83+Q= =4fX8 -----END PGP SIGNATURE----- From larry at hastings.org Mon Oct 23 05:56:31 2006 From: larry at hastings.org (Larry Hastings) Date: Sun, 22 Oct 2006 20:56:31 -0700 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <4539D362.9010909@v.loewis.de> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> <4539D362.9010909@v.loewis.de> Message-ID: <453C3D6F.4060107@hastings.org> Martin v. Löwis wrote: > It's not clear to me what you want to achieve with these patches, > in particular, whether you want to see them integrated into Python or > not.
> I would be thrilled if they were, but it seems less likely with every passing day. If you have some advice on how I might increase the patch's chances I would be all ears. It was/is my understanding that the early days of a new major revision was the most judicious time to introduce big changes. If I had offered these patches six months ago for 2.5, they would have had zero chance of acceptance. But 2.6 is in its infancy, and so I assumed now was the time to discuss sea-change patches like this. Anyway, it was my intent to post the patch and see what happened. Being a first-timer at this, and not having even read the core development mailing lists for very long, I had no idea what to expect. Though I genuinely didn't expect it to be this brusque. > I think this specific approach will find strong resistance. I'd say the "lazy strings" patch is really two approaches, "lazy concatenation" and "lazy slices". You are right, though, *both* have "found strong resistance". > Most recently, it was discussed under the name "string view" on the Py3k list, see > http://mail.python.org/pipermail/python-3000/2006-August/003282.html > Traditionally, the biggest objection is that even small strings may > consume insane amounts of memory. > Let's be specific: when there is at least one long-lived small lazy slice of a large string, and the large string itself would otherwise have been dereferenced and freed, and this small slice is never examined by code outside of stringobject.c, this approach means the large string becomes long-lived too and thus Python consumes more memory overall. In pathological scenarios this memory usage could be characterized as "insane". True dat. Then again, I could suggest some scenarios where this would save memory (multiple long-lived large slices of a large string), and others where memory use would be a wash (long-lived slices containing all or almost all of a large string, or any scenario where slices are short-lived).
While I think it's clear lazy slices are *faster* on average, its overall effect on memory use in real-world Python is not yet known. Read on. >> I bet this generally reduces overall memory usage for slices too. >> > Channeling Guido: what real-world applications did you study with > this patch to make such a claim? > I didn't; I don't have any. I must admit to being only a small-scale Python user. Memory use remains about the same in pybench, the biggest Python app I have handy. But, then, it was pretty clearly speculation, not a claim. Yes, I *think* it'd use less memory overall. But I wouldn't *claim* anything yet. The "stringview" discussion you cite was largely speculation, and as I recall there were users in both camps ("it'll use more memory overall" vs "no it won't"). And, while I saw a test case with microbenchmarks, and a "proof-of-concept" where a stringview was a separate object from a string, I didn't see any real-word applications tested with this approach. Rather than start in on speculation about it, I have followed that old maxim of "show me the code". I've produced actual code that works with real strings in Python. I see this as an opportunity for Pythonistas to determine the facts for themselves. Now folks can try the patch with these real-world applications you cite and find out how it really behaves. (Although I realize the Python community is under no obligation to do so.) If experimentation is the best thing here, I'd be happy to revise the patch to facilitate it. For instance, I could add command-line arguments letting you tweak the run-time behavior of the patch, like changing the minimum size of a lazy slice. Perhaps add code so there's a tweakable minimum size of a lazy concatenation too. Or a tweakable minimum *ratio* necessary for a lazy slice. I'm open to suggestions. 
Cheers, /larry/ From talin at acm.org Mon Oct 23 06:07:42 2006 From: talin at acm.org (Talin) Date: Sun, 22 Oct 2006 21:07:42 -0700 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <453C3D6F.4060107@hastings.org> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> <4539D362.9010909@v.loewis.de> <453C3D6F.4060107@hastings.org> Message-ID: <453C400E.5070106@acm.org> Larry Hastings wrote: > Martin v. Löwis wrote: > Let's be specific: when there is at least one long-lived small lazy > slice of a large string, and the large string itself would otherwise > have been dereferenced and freed, and this small slice is never examined > by code outside of stringobject.c, this approach means the large string > becomes long-lived too and thus Python consumes more memory overall. In > pathological scenarios this memory usage could be characterized as "insane". > > True dat. Then again, I could suggest some scenarios where this would > save memory (multiple long-lived large slices of a large string), and > others where memory use would be a wash (long-lived slices containing > all or almost all of a large string, or any scenario where slices > are short-lived). While I think it's clear lazy slices are *faster* on > average, its overall effect on memory use in real-world Python is not > yet known. Read on. I wonder - how expensive would it be for the string slice to have a weak reference, and 'normalize' the slice when the big string is collected? Would the overhead of the weak reference swamp the savings?
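The memory pathology Larry describes can be demonstrated today with `memoryview`, which behaves like an explicit lazy slice over a buffer. A sketch, not from the thread (the `Buf` subclass exists only so the base can be weak-referenced):

```python
import weakref

class Buf(bytearray):
    """bytearray itself cannot be weak-referenced; a trivial subclass can."""

big = Buf(b"x" * (1 << 20))      # a 1 MB buffer
small = memoryview(big)[:16]     # a tiny zero-copy "slice"
ref = weakref.ref(big)

del big
# The 16-byte view pins the whole 1 MB buffer -- the pathological
# case discussed above.
print(ref() is not None)         # True

rendered = bytes(small)          # "rendering" copies the data out...
del small                        # ...and dropping the view releases the base
print(ref() is None)             # True on CPython, which frees it immediately
```

This also bears on Talin's weak-reference question: by the time a weakref callback fires, the referent is already unreachable, so the slice data would have to be copied out before the big string dies, not during its collection.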
-- Talin From martin at v.loewis.de Mon Oct 23 06:48:02 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 23 Oct 2006 06:48:02 +0200 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <453C3D6F.4060107@hastings.org> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> <4539D362.9010909@v.loewis.de> <453C3D6F.4060107@hastings.org> Message-ID: <453C4982.80909@v.loewis.de> Larry Hastings wrote: > Anyway, it was my intent to post the patch and see what happened. Being > a first-timer at this, and not having even read the core development > mailing lists for very long, I had no idea what to expect. Though I > genuinely didn't expect it to be this brusque. I could have told you :-) The "problem" really is that you are suggesting a major, significant change to the implementation of Python, and one that doesn't fix an obvious bug. The new code is an order of magnitude more complex than the old one, and the impact that it will have is unknown - but in the worst case, it could have serious negative impact, e.g. when the code is full of errors, and causes Python applications to crash in masses. This is, of course, FUD: it is the fear that this might happen, the uncertainty about the quality of the code and the doubt about the viability of the approach. There are many aspects to such a change, but my experience is that it primarily takes time. Fredrik Lundh suggested you give up on Python 2.6, and target Python 3.0 right away; it may indeed be the case that Python 2.6 is too close for that kind of change to find enough supporters. If your primary goal was to contribute to open source, you might want to look for other areas of Python: there are plenty of open bugs ("real bugs" :-), unreviewed patches, etc. For some time, it is more satisfying to work on these, since the likelihood of success is higher.
Regards, Martin From jcarlson at uci.edu Mon Oct 23 07:00:30 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Sun, 22 Oct 2006 22:00:30 -0700 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <453C3D6F.4060107@hastings.org> References: <4539D362.9010909@v.loewis.de> <453C3D6F.4060107@hastings.org> Message-ID: <20061022214126.0A6E.JCARLSON@uci.edu> Larry Hastings wrote: > It was/is my understanding that the early days of a new major revision > was the most judicious time to introduce big changes. If I had offered > these patches six months ago for 2.5, they would have had zero chance of > acceptance. But 2.6 is in its infancy, and so I assumed now was the > time to discuss sea-change patches like this. It would be a radical change for Python 2.6, and really the 2.x series, likely requiring nontrivial changes to extension modules that deal with strings, and the assumptions about strings that have held for over a decade. I think 2.6 as an option is a non-starter. Think Py3k, and really, think bytes and unicode. > The "stringview" discussion you cite was largely speculation, and as I > recall there were users in both camps ("it'll use more memory overall" > vs "no it won't"). And, while I saw a test case with microbenchmarks, > and a "proof-of-concept" where a stringview was a separate object from a > string, I didn't see any real-word applications tested with this approach. > > Rather than start in on speculation about it, I have followed that old > maxim of "show me the code". I've produced actual code that works with > real strings in Python. I see this as an opportunity for Pythonistas to > determine the facts for themselves. Now folks can try the patch with > these real-world applications you cite and find out how it really > behaves. (Although I realize the Python community is under no > obligation to do so.) One of the big concerns brought up in the stringview discussion was that of users expecting one thing and getting another. 
Slicing a larger string producing a 'view', which then keeps the larger string alive, would be a surprise. By making it a separate object that just *knows* about strings (or really, anything that offers a buffer interface), I was able to make an object that was 1) flexible, 2) usable in any Python, 3) doesn't change the core assumptions about Python, 4) is expandable to beyond just *strings*. Reason #4 was my primary reason for writing it, because str disappears in Py3k, which is closer to happening than most of us realize. > If experimentation is the best thing here, I'd be happy to revise the > patch to facilitate it. For instance, I could add command-line > arguments letting you tweak the run-time behavior of the patch, like > changing the minimum size of a lazy slice. Perhaps add code so there's > a tweakable minimum size of a lazy concatenation too. Or a tweakable > minimum *ratio* necessary for a lazy slice. I'm open to suggestions. I believe that would be a waste of time. The odds of it making it into Python 2.x without significant core developer support are pretty close to None, which in Python 2.x is less than 0. I've been down that road, nothing good lies that way. Want my advice? Aim for Py3k text as your primary target, but as a wrapper, not as the core type (I put the odds at somewhere around 0 for such a core type change). If you are good, and want to make guys like me happy, you could even make it support the buffer interface for non-text (bytes, array, mmap, etc.), unifying (via wrapper) the behavior of bytes and text. 
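A minimal sketch of the separate-view approach described here — the class name and details are illustrative, not Josiah's actual implementation; the point is that holding the base alive becomes a visible choice rather than a hidden side effect of slicing:

```python
class StrView:
    """An explicit view over a sliceable base (a str here, but the same
    shape works for anything offering buffer-like indexing)."""

    def __init__(self, base, start=0, stop=None):
        self.base = base                     # visibly keeps the base alive
        self.start = start
        self.stop = len(base) if stop is None else stop

    def __len__(self):
        return self.stop - self.start

    def __str__(self):
        # Materializing copies the data out of the base.
        return str(self.base[self.start:self.stop])

    def slice(self, start, stop):
        # Sub-views share the same base; nothing is copied until __str__.
        return StrView(self.base, self.start + start, self.start + stop)

v = StrView("hello world", 6, 11)
print(str(v))              # world
print(len(v))              # 5
print(str(v.slice(1, 3)))  # or
```

Because the reference to the base is an ordinary attribute, a user who wants the base freed simply materializes the view with `str()` and drops it — no surprise pinning.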
- Josiah From fredrik at pythonware.com Mon Oct 23 08:03:17 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 23 Oct 2006 08:03:17 +0200 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <20061022214126.0A6E.JCARLSON@uci.edu> References: <4539D362.9010909@v.loewis.de> <453C3D6F.4060107@hastings.org> <20061022214126.0A6E.JCARLSON@uci.edu> Message-ID: Josiah Carlson wrote: > It would be a radical change for Python 2.6, and really the 2.x series, > likely requiring nontrivial changes to extension modules that deal with > strings, and the assumptions about strings that have held for over a > decade. the assumptions hidden in everyone's use of the C-level string API is the main concern here, at least for me; radically changing the internal format is not a new idea, but it's always been held off because we have no idea how people are using the C API. From ncoghlan at gmail.com Mon Oct 23 11:49:50 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 23 Oct 2006 19:49:50 +1000 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <20061022214126.0A6E.JCARLSON@uci.edu> References: <4539D362.9010909@v.loewis.de> <453C3D6F.4060107@hastings.org> <20061022214126.0A6E.JCARLSON@uci.edu> Message-ID: <453C903E.7060608@gmail.com> Josiah Carlson wrote: > Want my advice? Aim for Py3k text as your primary target, but as a > wrapper, not as the core type (I put the odds at somewhere around 0 for > such a core type change). If you are good, and want to make guys like > me happy, you could even make it support the buffer interface for > non-text (bytes, array, mmap, etc.), unifying (via wrapper) the behavior > of bytes and text. This is still my preferred approach, too - for local optimisation of an algorithm, a string view type strikes me as an excellent idea. For the core data type, though, keeping the behaviour comparatively simple and predictable counterbalances the desire for more speed. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From skip at pobox.com Mon Oct 23 14:40:17 2006 From: skip at pobox.com (skip at pobox.com) Date: Mon, 23 Oct 2006 07:40:17 -0500 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <453C4982.80909@v.loewis.de> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> <4539D362.9010909@v.loewis.de> <453C3D6F.4060107@hastings.org> <453C4982.80909@v.loewis.de> Message-ID: <17724.47153.512897.828558@montanaro.dyndns.org> >> Anyway, it was my intent to post the patch and see what happened. >> Being a first-timer at this, and not having even read the core >> development mailing lists for very long, I had no idea what to >> expect. Though I genuinely didn't expect it to be this brusque. Martin> I could have told you :-) The "problem" really is that you are Martin> suggesting a major, significant change to the implementation of Martin> Python, and one that doesn't fix an obvious bug. Come on Martin. Give Larry a break. Lots of changes have been accepted to the Python core which weren't obvious "bug fixes". In fact, I seem to recall a sprint held recently in Reykjavik where the whole point was just to make Python faster. I believe that was exactly Larry's point in posting the patch. The "one obvious way to do" concatenation and slicing for one of the most heavily used types in python appears to be faster. That seems like a win to me.
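The slowness being alluded to is easy to reproduce: building a string with repeated `+=` historically did O(n²) total copying, which is exactly what lazy concatenation attacks. A sketch, not from the thread (note that later CPython releases special-case in-place `str` concatenation, so the measured gap varies by version):

```python
import timeit

def concat_loop(parts):
    s = ""
    for p in parts:
        s += p               # may copy the whole of s on every iteration
    return s

def concat_join(parts):
    return "".join(parts)    # one pass, one allocation

parts = ["x" * 10] * 1000
assert concat_loop(parts) == concat_join(parts)

loop_t = timeit.timeit(lambda: concat_loop(parts), number=50)
join_t = timeit.timeit(lambda: concat_join(parts), number=50)
print("loop: %.4fs  join: %.4fs" % (loop_t, join_t))
```

The lazy-concatenation patch aims to give the `+=` spelling join-like behaviour automatically, by deferring the copy until the result is actually needed.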
Skip From steve at holdenweb.com Mon Oct 23 15:51:35 2006 From: steve at holdenweb.com (Steve Holden) Date: Mon, 23 Oct 2006 14:51:35 +0100 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <17724.47153.512897.828558@montanaro.dyndns.org> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> <4539D362.9010909@v.loewis.de> <453C3D6F.4060107@hastings.org> <453C4982.80909@v.loewis.de> <17724.47153.512897.828558@montanaro.dyndns.org> Message-ID: <453CC8E7.1090000@holdenweb.com> skip at pobox.com wrote: > >> Anyway, it was my intent to post the patch and see what happened. > >> Being a first-timer at this, and not having even read the core > >> development mailing lists for very long, I had no idea what to > >> expect. Though I genuinely didn't expect it to be this brusque. > > Martin> I could have told you :-) The "problem" really is that you are > Martin> suggesting a major, significant change to the implementation of > Martin> Python, and one that doesn't fix an obvious bug. > The "obvious bug" that it fixes is slowness <0.75 wink>. > Come on Martin. Give Larry a break. Lots of changes have been accepted to > to the Python core which weren't obvious "bug fixes". In fact, I seem to > recall a sprint held recently in Reykjavik where the whole point was just to > make Python faster. I believe that was exactly Larry's point in posting the > patch. The "one obvious way to do" concatenation and slicing for one of the > most heavily used types in python appears to be faster. That seems like a > win to me. > I did point out to Larry when he went to c.l.py with the original patch that he would face resistance, so this hasn't blind-sided him. But it seems to me that the only major issue is the inability to provide zero-byte terminators with this new representation. 
Because Larry's proposal for handling this involves the introduction of a new API that can't already be in use in extensions it's obviously the extension writers who would be given most problems by this patch. I can understand resistance on that score, and I could understand resistance if there were other clear disadvantages to its implementation, but in their absence it seems like the extension modules are the killers. If there were any reliable way to make sure these objects never got passed to extension modules then I'd say "go for it". Without that it does seem like a potentially widespread change to the C API that could affect much code outside the interpreter. This is a great shame. I think Larry showed inventiveness and tenacity to get this far, and deserves credit for his achievements no matter whether or not they get into the core. regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden
From larry at hastings.org Mon Oct 23 16:58:25 2006 From: larry at hastings.org (Larry Hastings) Date: Mon, 23 Oct 2006 07:58:25 -0700 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <453CC8E7.1090000@holdenweb.com> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> <4539D362.9010909@v.loewis.de> <453C3D6F.4060107@hastings.org> <453C4982.80909@v.loewis.de> <17724.47153.512897.828558@montanaro.dyndns.org> <453CC8E7.1090000@holdenweb.com> Message-ID: <453CD891.8020003@hastings.org> Steve Holden wrote: > But it seems to me that the only major issue is the inability to provide > zero-byte terminators with this new representation. > I guess I wasn't clear in my description of the patch; sorry about that. Like "lazy concatenation objects", "lazy slices" render when you call PyString_AsString() on them. Before rendering, the lazy slice's ob_sval will be NULL. Afterwards it will point to a proper zero-terminated string, at which point the object behaves exactly like any other PyStringObject. The only function that *might* return a non-terminated char * is PyString_AsUnterminatedString(). This function is static to stringobject.c--and I would be shocked if it were ever otherwise. > If there were any reliable way to make sure these objects never got > passed to extension modules then I'd say "go for it". If external Python extension modules are as well-behaved as the shipping Python source tree, there simply wouldn't be a problem. Python source is delightfully consistent about using the macro PyString_AS_STRING() to get at the creamy char *center of a PyStringObject *. When code religiously uses that macro (or calls PyString_AsString() directly), all it needs is a recompile with the current stringobject.h and it will Just Work.
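The rendering model Larry describes can be sketched in pure Python — illustrative only, since the real patch does this at the C level inside stringobject.c: concatenation just records its operands, and the joined buffer is built once, on first access, then cached (the role `PyString_AsString()` plays in the patch):

```python
class LazyConcat:
    """Toy model of a lazy concatenation: '+' is O(1); the joined
    buffer is materialized ("rendered") once, on first use.
    (Concatenating onto an already-rendered object is not modeled.)"""

    def __init__(self, parts):
        self._parts = list(parts)
        self._rendered = None          # plays the role of a NULL ob_sval

    def __add__(self, other):
        tail = other._parts if isinstance(other, LazyConcat) else [other]
        return LazyConcat(self._parts + tail)

    def render(self):
        # Analogous to PyString_AsString(): after this, the object
        # behaves like an ordinary, fully-built string.
        if self._rendered is None:
            self._rendered = "".join(self._parts)
            self._parts = None         # operand list can now be freed
        return self._rendered

s = LazyConcat(["sea"]) + "-" + "change"
print(s.render())   # sea-change
print(s.render())   # cached: sea-change
```

The cache is why rendering is safe to trigger from any accessor: the cost is paid once, and every later access sees an ordinary zero-copy lookup.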
I genuinely don't know how many external Python extension modules are well-behaved in this regard. But in case it helps: I just checked PIL, NumPy, PyWin32, and SWIG, and all of them were well-behaved. Apart from stringobject.c, there was exactly one spot in the Python source tree which made assumptions about the structure of PyStringObjects (Mac/Modules/macos.c). It's in the block starting with the comment "This is a hack:". Note that this is unfixed in my patch, so just now all code using that self-avowed "hack" will break. Am I correct in understanding that changing the Python minor revision number (2.5 -> 2.6) requires external modules to recompile? (It certainly does on Windows.) If so, I could mitigate the problem by renaming ob_sval. That way, code making explicit reference to it would fail to compile, which I feel is better than silently recompiling unsafe code. Cheers, /larry/ -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061023/a083bf5c/attachment.htm From exarkun at divmod.com Mon Oct 23 17:28:31 2006 From: exarkun at divmod.com (Jean-Paul Calderone) Date: Mon, 23 Oct 2006 11:28:31 -0400 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <453CD891.8020003@hastings.org> Message-ID: <20061023152831.26151.1913366649.divmod.quotient.11082@ohm> On Mon, 23 Oct 2006 07:58:25 -0700, Larry Hastings wrote: > [snip] >If external Python extension modules are as well-behaved as the shipping >Python source tree, there simply wouldn't be a problem. Python source is >delightfully consistent about using the macro PyString_AS_STRING() to get at >the creamy char *center of a PyStringObject *. When code religiously uses >that macro (or calls PyString_AsString() directly), all it needs is a >recompile with the current stringobject.h and it will Just Work. > >I genuinely don't know how many external Python extension modules are well- >behaved in this regard. 
But in case it helps: I just checked PIL, NumPy, >PyWin32, and SWIG, and all of them were well-behaved. FWIW, http://www.google.com/codesearch?q=+ob_sval Jean-Paul From p.f.moore at gmail.com Mon Oct 23 17:42:35 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 23 Oct 2006 16:42:35 +0100 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <453CD891.8020003@hastings.org> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> <4539D362.9010909@v.loewis.de> <453C3D6F.4060107@hastings.org> <453C4982.80909@v.loewis.de> <17724.47153.512897.828558@montanaro.dyndns.org> <453CC8E7.1090000@holdenweb.com> <453CD891.8020003@hastings.org> Message-ID: <79990c6b0610230842k7a0a0facm3b4cc0a9f546b8ef@mail.gmail.com> On 10/23/06, Larry Hastings wrote: > > Steve Holden wrote: > > But it seems to me that the only major issue is the inability to provide > zero-byte terminators with this new representation. > > I guess I wasn't clear in my description of the patch; sorry about that. > > Like "lazy concatenation objects", "lazy slices" render when you call > PyString_AsString() on them. Before rendering, the lazy slice's ob_sval > will be NULL. Afterwards it will point to a proper zero-terminated string, > at which point the object behaves exactly like any other PyStringObject. I had picked up on this comment, and I have to say that I had been a little surprised by the resistance to the change based on the "code would break" argument, when you had made such a thorough attempt to address this. Perhaps others had missed this point, though. > I genuinely don't know how many external Python extension modules are > well-behaved in this regard. But in case it helps: I just checked PIL, > NumPy, PyWin32, and SWIG, and all of them were well-behaved. There's code out there which was written to the Python 1.4 API, and has not been updated since (I know, I wrote some of it!) 
I wouldn't call it "well-behaved" (it writes directly into the string's character buffer) but I don't believe it would fail (it only uses PyString_AsString to get the buffer address).
/* Allocate a Python string object, with uninitialised contents. We
 * must do it this way, so that we can modify the string in place
 * later. See the Python source, Objects/stringobject.c for details.
 */
result = PyString_FromStringAndSize(NULL, len);
if (result == NULL)
    return NULL;
p = PyString_AsString(result);
while (*str) {
    if (*str == '\n')
        *p = '\0';
    else
        *p = *str;
    ++p;
    ++str;
}
> Am I correct in understanding that changing the Python minor revision > number (2.5 -> 2.6) requires external modules to recompile? (It certainly > does on Windows.) If so, I could mitigate the problem by renaming ob_sval. > That way, code making explicit reference to it would fail to compile, which > I feel is better than silently recompiling unsafe code. I think you've covered pretty much all the possible backward compatibility bases. A sufficiently evil extension could blow up, I guess, but that's always going to be true. OTOH, I don't have a comment on the desirability of the patch per se, as (a) I've never been hit by the speed issue, and (b) I'm thoroughly indoctrinated, so I always use ''.join() :-) Paul. From skip at pobox.com Mon Oct 23 17:49:26 2006 From: skip at pobox.com (skip at pobox.com) Date: Mon, 23 Oct 2006 10:49:26 -0500 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <453CD891.8020003@hastings.org> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> <4539D362.9010909@v.loewis.de> <453C3D6F.4060107@hastings.org> <453C4982.80909@v.loewis.de> <17724.47153.512897.828558@montanaro.dyndns.org> <453CC8E7.1090000@holdenweb.com> <453CD891.8020003@hastings.org> Message-ID: <17724.58502.481218.773345@montanaro.dyndns.org> Larry> The only function that *might* return a non-terminated char * is Larry> PyString_AsUnterminatedString().
This function is static to Larry> stringobject.c--and I would be shocked if it were ever otherwise. If it's static to stringobject.c it doesn't need a PyString_ prefix. In fact, I'd argue that it shouldn't have one so that people reading the code won't miss the "static" and think it is part of the published API. Larry> Am I correct in understanding that changing the Python minor Larry> revision number (2.5 -> 2.6) requires external modules to Larry> recompile? Yes, in general, though you can often get away without it if you don't mind Python screaming at you about version mismatches. Skip From fredrik at pythonware.com Mon Oct 23 17:59:32 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 23 Oct 2006 17:59:32 +0200 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <453CD891.8020003@hastings.org> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> <4539D362.9010909@v.loewis.de> <453C3D6F.4060107@hastings.org> <453C4982.80909@v.loewis.de> <17724.47153.512897.828558@montanaro.dyndns.org> <453CC8E7.1090000@holdenweb.com> <453CD891.8020003@hastings.org> Message-ID: Larry Hastings wrote: > Am I correct in understanding that changing the Python minor revision > number (2.5 -> 2.6) requires external modules to recompile? not, in general, on Unix. it's recommended, but things usually work quite well anyway. 
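To make the render-on-demand idea from the lazy strings thread concrete, here is a toy Python sketch. It is purely illustrative and the class name and attributes are invented for this example; the actual patch works in C inside PyStringObject, where the unrendered state is an ob_sval of NULL until PyString_AsString() forces rendering:

```python
# Toy sketch of "render on demand": concatenation is deferred until the
# value is actually needed, then cached. Illustrative only -- not the
# real C implementation discussed in this thread.

class LazyConcat:
    """Concatenation that defers joining until the value is needed."""

    def __init__(self, left, right):
        self._parts = [left, right]   # unrendered state
        self._rendered = None         # plays the role of a NULL ob_sval

    def render(self):
        # The first access pays the cost of the join; later accesses
        # reuse the cached result.
        if self._rendered is None:
            self._rendered = "".join(self._parts)
        return self._rendered

a = LazyConcat("hello, ", "world")
assert a._rendered is None            # no join has happened yet
assert a.render() == "hello, world"   # forcing the value renders it
assert a._rendered == "hello, world"  # and caches it
```

The analogy is loose (the real patch also handles slices and renders into a zero-terminated buffer), but it shows why well-behaved code that always goes through the accessor never observes the unrendered state.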
From jcarlson at uci.edu Mon Oct 23 18:07:51 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Mon, 23 Oct 2006 09:07:51 -0700 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <79990c6b0610230842k7a0a0facm3b4cc0a9f546b8ef@mail.gmail.com> References: <453CD891.8020003@hastings.org> <79990c6b0610230842k7a0a0facm3b4cc0a9f546b8ef@mail.gmail.com> Message-ID: <20061023090040.0A7B.JCARLSON@uci.edu> "Paul Moore" wrote: > I had picked up on this comment, and I have to say that I had been a > little surprised by the resistance to the change based on the "code > would break" argument, when you had made such a thorough attempt to > address this. Perhaps others had missed this point, though. I'm also concerned about future usability. Word in the Py3k list is that Python 2.6 will be just about the last Python in the 2.x series, and by directing his implementation at only Python 2.x strings, he's just about guaranteeing obsolescence. By building with unicode and/or objects with a buffer interface in mind, Larry could build with both 2.x and 3.x in mind, and his code wouldn't be obsolete the moment it was released. - Josiah From exarkun at divmod.com Mon Oct 23 18:31:02 2006 From: exarkun at divmod.com (Jean-Paul Calderone) Date: Mon, 23 Oct 2006 12:31:02 -0400 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <20061023090040.0A7B.JCARLSON@uci.edu> Message-ID: <20061023163102.26151.1529007871.divmod.quotient.11135@ohm> On Mon, 23 Oct 2006 09:07:51 -0700, Josiah Carlson wrote: > >"Paul Moore" wrote: >> I had picked up on this comment, and I have to say that I had been a >> little surprised by the resistance to the change based on the "code >> would break" argument, when you had made such a thorough attempt to >> address this. Perhaps others had missed this point, though. > >I'm also concerned about future usability. Me too (perhaps in a different way though). 
>Word in the Py3k list is >that Python 2.6 will be just about the last Python in the 2.x series, >and by directing his implementation at only Python 2.x strings, he's >just about guaranteeing obsolescence. People will be using 2.x for a long time to come. And in the long run, isn't all software obsolete? :) >By building with unicode and/or >objects with a buffer interface in mind, Larry could build with both 2.x >and 3.x in mind, and his code wouldn't be obsolete the moment it was >released. (I'm not sure what the antecedent of "it" is in the above, I'm going to assume it's Python 3.x.) Supporting unicode strings and objects providing the buffer interface seems like a good idea in general, even disregarding Py3k. Starting with str is reasonable though, since there's still plenty of code that will benefit from this change, if it is indeed a beneficial change. Larry, I'm going to try to do some benchmarks against Twisted using this patch, but given my current time constraints, you may be able to beat me to this :) If you're interested, Twisted trunk at HEAD plus this trial plugin: http://twistedmatrix.com/trac/browser/sandbox/exarkun/merit/trunk will let you do some gross measurements using the Twisted test suite. I can give some more specific pointers if this sounds like something you'd want to mess with. Jean-Paul From martin at v.loewis.de Tue Oct 24 00:11:04 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 24 Oct 2006 00:11:04 +0200 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <17724.47153.512897.828558@montanaro.dyndns.org> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> <4539D362.9010909@v.loewis.de> <453C3D6F.4060107@hastings.org> <453C4982.80909@v.loewis.de> <17724.47153.512897.828558@montanaro.dyndns.org> Message-ID: <453D3DF8.2040304@v.loewis.de> skip at pobox.com schrieb: > >> Anyway, it was my intent to post the patch and see what happened. 
> >> Being a first-timer at this, and not having even read the core > >> development mailing lists for very long, I had no idea what to > >> expect. Though I genuinely didn't expect it to be this brusque. > > Martin> I could have told you :-) The "problem" really is that you are > Martin> suggesting a major, significant change to the implementation of > Martin> Python, and one that doesn't fix an obvious bug. > > Come on Martin. Give Larry a break. I'm seriously not complaining, I'm explaining. > Lots of changes have been accepted to the Python core which weren't obvious "bug fixes". Surely many new features have been implemented over time, but in many cases, they weren't really "big changes", in the sense that you could ignore them if you don't like them. This wouldn't be so in this case: as the string type is very fundamental, people feel a higher interest in its implementation. > In fact, I seem to recall a sprint held recently in Reykjavik where the whole point was just to > make Python faster. That's true. I also recall there were serious complaints about the outcome of this sprint, and the changes to the struct module in particular. Still, the struct module is of lesser importance than the string type, so the concerns were smaller. > I believe that was exactly Larry's point in posting the > patch. The "one obvious way to do" concatenation and slicing for one of the > most heavily used types in python appears to be faster. That seems like a > win to me. Have you reviewed the patch and can vouch for its correctness, even in boundary cases? Have you tested it in a real application and found a real performance improvement? I have done neither, so I can't speak on the advantages of the patch. I didn't actually object to the inclusion of the patch, either. I was merely stating what I think the problems with "that kind of" patch are.
Regards, Martin From martin at v.loewis.de Tue Oct 24 00:36:33 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 24 Oct 2006 00:36:33 +0200 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <453CD891.8020003@hastings.org> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> <4539D362.9010909@v.loewis.de> <453C3D6F.4060107@hastings.org> <453C4982.80909@v.loewis.de> <17724.47153.512897.828558@montanaro.dyndns.org> <453CC8E7.1090000@holdenweb.com> <453CD891.8020003@hastings.org> Message-ID: <453D43F1.8010104@v.loewis.de> Larry Hastings schrieb: > Am I correct in understanding that changing the Python minor revision > number (2.5 -> 2.6) requires external modules to recompile? (It > certainly does on Windows.) There is an ongoing debate on that. The original intent was that you normally *shouldn't* have to recompile modules just because the Python version changes. Instead, you should do so when PYTHON_API_VERSION changes. Of course, such a change would also cause a change to PYTHON_API_VERSION. Then, even if PYTHON_API_VERSION changes, you aren't *required* to recompile your extension modules. Instead, you get a warning that the API version is different and *might* require recompilation: it does require recompilation if the extension module relies on some of the changed API. With this change, people not recompiling their extension modules would likely see Python crash rather quickly after seeing the warning about incompatible APIs. Regards, Martin From anthony at python.org Tue Oct 24 04:57:42 2006 From: anthony at python.org (Anthony Baxter) Date: Tue, 24 Oct 2006 12:57:42 +1000 Subject: [Python-Dev] RELEASED Python 2.3.6, release candidate 1 Message-ID: <200610241257.51761.anthony@python.org> On behalf of the Python development team and the Python community, I'm announcing the release of Python 2.3.6 (release candidate 1). Python 2.3.6 is a security bug-fix release. 
While Python 2.5 is the latest version of Python, we're making this release for people who are still running Python 2.3. Unlike the recently released 2.4.4, this release only contains a small handful of security-related bugfixes. See the website for more. * Python 2.3.6 contains a fix for PSF-2006-001, a buffer overrun * in repr() of unicode strings in wide unicode (UCS-4) builds. * See http://www.python.org/news/security/PSF-2006-001/ for more. This is a **source only** release. The Windows and Mac binaries of 2.3.5 were built with UCS-2 unicode, and are therefore not vulnerable to the problem outlined in PSF-2006-001. The PCRE fix is for a long-deprecated module (you should use the 're' module instead) and the email fix can be obtained by downloading the standalone version of the email package. Most vendors who ship Python should have already released a patched version of 2.3.5 with the above fixes, this release is for people who need or want to build their own release, but don't want to mess around with patch or svn. Assuming no major problems crop up, a final release of Python 2.3.6 will follow in about a week's time. Python 2.3.6 will complete python.org's response to PSF-2006-001. If you're still on Python 2.2 for some reason and need to work with UCS-4 unicode strings, please obtain the patch from the PSF-2006-001 security advisory page. Python 2.4.4 and Python 2.5 have both already been released and contain the fix for this security problem. For more information on Python 2.3.6, including download links for source archives, release notes, and known issues, please see: http://www.python.org/2.3.6 Highlights of this new release include: - A fix for PSF-2006-001, a bug in repr() for unicode strings on UCS-4 (wide unicode) builds. - Two other, less critical, security fixes. 
Enjoy this release, Anthony Anthony Baxter anthony at python.org Python Release Manager (on behalf of the entire python-dev team) -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061024/2a934c64/attachment.pgp From mbk.lists at gmail.com Tue Oct 24 07:22:51 2006 From: mbk.lists at gmail.com (Mike Krell) Date: Mon, 23 Oct 2006 22:22:51 -0700 Subject: [Python-Dev] __str__ bug? Message-ID: Is this a bug? If not, how do I override __str__ on a unicode derived class?

class S(str):
    def __str__(self): return '__str__ overridden'

class U(unicode):
    def __str__(self): return '__str__ overridden'
    def __unicode__(self): return u'__unicode__ overridden'

s = S()
u = U()

print 's:', s
print "str(s):", str(s)
print 's substitued is "%s"\n' % s
print 'u:', u
print "str(u):", str(u)
print 'u substitued is "%s"' % u

-----------------------------------------------------

s: __str__ overridden
str(s): __str__ overridden
s substitued is "__str__ overridden"

u:
str(u): __str__ overridden
u substitued is ""

Results are identical for 2.4.2 and 2.5c2 (running under windows). Mike From Jack.Jansen at cwi.nl Tue Oct 24 11:09:12 2006 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Tue, 24 Oct 2006 11:09:12 +0200 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <453CD891.8020003@hastings.org> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> <4539D362.9010909@v.loewis.de> <453C3D6F.4060107@hastings.org> <453C4982.80909@v.loewis.de> <17724.47153.512897.828558@montanaro.dyndns.org> <453CC8E7.1090000@holdenweb.com> <453CD891.8020003@hastings.org> Message-ID: <95509FCF-D8E8-4102-A5B6-15F087723109@cwi.nl> On 23-Oct-2006, at 16:58 , Larry Hastings wrote: > I genuinely don't know how many external Python extension modules > are well-behaved in this regard.
But in case it helps: I just > checked PIL, NumPy, PyWin32, and SWIG, and all of them were well- > behaved. > > Apart from stringobject.c, there was exactly one spot in the Python > source tree which made assumptions about the structure of > PyStringObjects (Mac/Modules/macos.c). It's in the block starting > with the comment "This is a hack:". Note that this is unfixed in > my patch, so just now all code using that self-avowed "hack" will > break. As the author of that hack, that gives me an idea for where you should look for code that will break: code that tries to expose low-level C interfaces to Python. (That hack replaced an even earlier worse hack, that took the id() of a string in Python and added a fixed number to it to get at the address of the string, to fill it into a structure, blush). Look at packages such as win32, PyObjC, ctypes, bridges between Python and other languages, etc. That's where implementors are tempted to bend the rules of Official APIs for the benefit of serious optimizations. -- Jack Jansen, http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20061024/fe49d742/attachment.html From ronaldoussoren at mac.com Tue Oct 24 11:28:40 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Tue, 24 Oct 2006 11:28:40 +0200 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <95509FCF-D8E8-4102-A5B6-15F087723109@cwi.nl> References: <4523F890.9060804@hastings.org> <453985ED.7050303@hastings.org> <4539D362.9010909@v.loewis.de> <453C3D6F.4060107@hastings.org> <453C4982.80909@v.loewis.de> <17724.47153.512897.828558@montanaro.dyndns.org> <453CC8E7.1090000@holdenweb.com> <453CD891.8020003@hastings.org> <95509FCF-D8E8-4102-A5B6-15F087723109@cwi.nl> Message-ID: <3EF991DD-CDE6-4E1A-99E9-6FC24EAF2DCE@mac.com> On Oct 24, 2006, at 11:09 AM, Jack Jansen wrote: > > Look at packages such as win32, PyObjC, ctypes, bridges between > Python and other languages, etc. That's where implementors are > tempted to bend the rules of Official APIs for the benefit of > serious optimizations. PyObjC should be safe in this regard, I try to conform to the official rules :-) I do use PyString_AS_STRING outside of the GIL in other extensions though, the lazy strings patch would break that. My code is of course bending the rules here and can easily be fixed by introducing a temporary variable. Ronald > -- > Jack Jansen, , http://www.cwi.nl/~jack > If I can't dance I don't want to be part of your revolution -- Emma > Goldman > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ > ronaldoussoren%40mac.com -------------- next part -------------- A non-text attachment was scrubbed... 
Name: smime.p7s Type: application/pkcs7-signature Size: 3562 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061024/5370f4fc/attachment-0001.bin From ncoghlan at gmail.com Tue Oct 24 12:59:31 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 24 Oct 2006 20:59:31 +1000 Subject: [Python-Dev] The "lazy strings" patch In-Reply-To: <20061023152831.26151.1913366649.divmod.quotient.11082@ohm> References: <20061023152831.26151.1913366649.divmod.quotient.11082@ohm> Message-ID: <453DF213.7080300@gmail.com> Jean-Paul Calderone wrote: > On Mon, 23 Oct 2006 07:58:25 -0700, Larry Hastings wrote: >> [snip] >> If external Python extension modules are as well-behaved as the shipping >> Python source tree, there simply wouldn't be a problem. Python source is >> delightfully consistent about using the macro PyString_AS_STRING() to get at >> the creamy char *center of a PyStringObject *. When code religiously uses >> that macro (or calls PyString_AsString() directly), all it needs is a >> recompile with the current stringobject.h and it will Just Work. >> >> I genuinely don't know how many external Python extension modules are well- >> behaved in this regard. But in case it helps: I just checked PIL, NumPy, >> PyWin32, and SWIG, and all of them were well-behaved. > > FWIW, http://www.google.com/codesearch?q=+ob_sval Possibly more enlightening (we *know* string objects play with this field!): http://www.google.com/codesearch?hl=en&lr=&q=ob_sval+-stringobject.%5Bhc%5D&btnG=Search Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From chtaylo3 at gmail.com Tue Oct 24 17:18:08 2006 From: chtaylo3 at gmail.com (Christopher Taylor) Date: Tue, 24 Oct 2006 11:18:08 -0400 Subject: [Python-Dev] Hunting down configure script error Message-ID: <2590773a0610240818q57cac275g3bda9fd4fb7a2b32@mail.gmail.com> Per my conversation with Martin v. Löwis on the python-list, I think I have found a problem with the configure script and Makefile.in. For Python 2.4.4 it seems that the argument --libdir does not change the Makefile. Specifically I need this to change the libdir to /usr/lib64 for RH on a x86_64 machine. I'd like to contribute a fix for this, but I'm relatively new so I would appreciate some guidance. In the Makefile, I tried setting LIBDIR to $(exec_prefix)/lib64 and SCRIPTDIR to $(prefix)/lib64 manually. Unfortunately that created an error when I ran python2.4:

Could not find platform independent libraries <prefix>
Could not find platform dependent libraries <exec_prefix>
Consider setting $PYTHONHOME to <prefix>[:<exec_prefix>]
'import site' failed; use -v for traceback

so I edited my /etc/profile and included: export PYTHONHOME="/usr" and reran python2.4 and now the only error is: 'import site' failed; use -v for traceback I poked around in /Modules/getpath.c and I'm starting to understand how things are coming together. My question is: how does $(prefix) from the configure script make it into PREFIX in the C code? I see on line 106 of /Modules/getpath.c that it checks to see if PREFIX is defined and if not sets it to "/usr/local". So I did a grep on PREFIX from the Python2.4.4 dir level and it didn't return anything that looks like PREFIX is being set based on the input to the configure script. Where might this be happening?
I'm assuming there's also a similar disconnect for LIBDIR (even though it never gets set properly in the Makefile, even when I edited it by hand, those changes don't make it into the code .... but I don't know where it should be changed in the code.) Respectfully, Christopher Taylor From ncoghlan at gmail.com Tue Oct 24 12:04:22 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 24 Oct 2006 20:04:22 +1000 Subject: [Python-Dev] __str__ bug? In-Reply-To: References: Message-ID: <453DE526.4020107@gmail.com> Mike Krell wrote: > Is this a bug? If not, how do I override __str__ on a unicode derived class? Based on the behaviour of str and the fact that overriding unicode.__repr__ works just fine, I'd say file a bug on SF. I think this bit in PyUnicode_Format needs to use PyUnicode_CheckExact instead of PyUnicode_Check:

case 's':
case 'r':
    if (PyUnicode_Check(v) && c == 's') {

The corresponding code in PyString_Format makes a call to _PyObject_Str which deals with subclasses correctly. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From mbk.lists at gmail.com Tue Oct 24 18:02:24 2006 From: mbk.lists at gmail.com (Mike Krell) Date: Tue, 24 Oct 2006 09:02:24 -0700 Subject: [Python-Dev] __str__ bug? In-Reply-To: <453DE526.4020107@gmail.com> References: <453DE526.4020107@gmail.com> Message-ID: > Based on the behaviour of str and the fact that overriding unicode.__repr__ > works just fine, I'd say file a bug on SF. Done. This is item 1583863.
Mike From chtaylo3 at gmail.com Tue Oct 24 19:47:33 2006 From: chtaylo3 at gmail.com (Christopher Taylor) Date: Tue, 24 Oct 2006 13:47:33 -0400 Subject: [Python-Dev] Hunting down configure script error In-Reply-To: <2590773a0610240818q57cac275g3bda9fd4fb7a2b32@mail.gmail.com> References: <2590773a0610240818q57cac275g3bda9fd4fb7a2b32@mail.gmail.com> Message-ID: <2590773a0610241047i12c5661esf997ec9df031e18b@mail.gmail.com> Ok, here's what I found: In addition to the configure script not acting on changes to LIBDIR, Modules/getpath.c uses a hardcoded value for static char lib_python[] = "lib/python" VERSION; which appears on line 134. So even if the configure script changes LIBDIR it won't do much good because the value is hardcoded in (as opposed to using LIBDIR). So I can tell that this would need to be changed ... anyone else know much about this? I'm wondering if my posts are going through?? Respectfully, Christopher Taylor From skip at pobox.com Tue Oct 24 20:10:34 2006 From: skip at pobox.com (skip at pobox.com) Date: Tue, 24 Oct 2006 13:10:34 -0500 Subject: [Python-Dev] Hunting down configure script error In-Reply-To: <2590773a0610241047i12c5661esf997ec9df031e18b@mail.gmail.com> References: <2590773a0610240818q57cac275g3bda9fd4fb7a2b32@mail.gmail.com> <2590773a0610241047i12c5661esf997ec9df031e18b@mail.gmail.com> Message-ID: <17726.22298.840098.330357@montanaro.dyndns.org> Christopher> I'm wondering if my posts are going through?? Yup. Sorry, but I've no useful comments to make on your problems though.
Skip From steve at holdenweb.com Tue Oct 24 20:13:07 2006 From: steve at holdenweb.com (Steve Holden) Date: Tue, 24 Oct 2006 19:13:07 +0100 Subject: [Python-Dev] Hunting down configure script error In-Reply-To: <2590773a0610241047i12c5661esf997ec9df031e18b@mail.gmail.com> References: <2590773a0610240818q57cac275g3bda9fd4fb7a2b32@mail.gmail.com> <2590773a0610241047i12c5661esf997ec9df031e18b@mail.gmail.com> Message-ID: Christopher Taylor wrote: > Ok, here's what I found: In addition to the configure script not > acting on changes to LIBDIR, Modules/getpath.c uses a hardcoded value for > static char lib_python[] = "lib/python" VERSION; > which appears on line 134. So even if the configure script changes > LIBDIR it won't do much good because the value is hardcoded in (as > opposed to using LIBDIR). > > So I can tell that this would need to be changed ... anyone else know > much about this? > > I'm wondering if my posts are going through?? > Your posts are making it. It's just that everyone's ignoring you :) regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From chtaylo3 at gmail.com Tue Oct 24 20:28:05 2006 From: chtaylo3 at gmail.com (Christopher Taylor) Date: Tue, 24 Oct 2006 14:28:05 -0400 Subject: [Python-Dev] Hunting down configure script error In-Reply-To: References: <2590773a0610240818q57cac275g3bda9fd4fb7a2b32@mail.gmail.com> <2590773a0610241047i12c5661esf997ec9df031e18b@mail.gmail.com> Message-ID: <2590773a0610241128mf1b71e0r421986f586470f06@mail.gmail.com> > Your posts are making it. It's just that everyone's ignoring you :) I feel loved ..... Seriously, why would someone ignore this? This is obviously not a pebkac problem.....
Respectfully, Christopher Taylor From skip at pobox.com Tue Oct 24 20:38:45 2006 From: skip at pobox.com (skip at pobox.com) Date: Tue, 24 Oct 2006 13:38:45 -0500 Subject: [Python-Dev] Hunting down configure script error In-Reply-To: <2590773a0610241128mf1b71e0r421986f586470f06@mail.gmail.com> References: <2590773a0610240818q57cac275g3bda9fd4fb7a2b32@mail.gmail.com> <2590773a0610241047i12c5661esf997ec9df031e18b@mail.gmail.com> <2590773a0610241128mf1b71e0r421986f586470f06@mail.gmail.com> Message-ID: <17726.23989.775944.384634@montanaro.dyndns.org> >> Your posts are making it. It's just that everyone's ignoring you :) Christopher> I feel loved ..... Christopher> Seriously, why would somoene ignore this? this is Christopher> obviously not a pebkac problem..... I'm not sure what a "pebkac" problem is. I will attempt to channel the other members of the group (OMMMMMM...) and suggest that folks are either (like me) unfamiliar with the problem domain or too busy at the moment to look into the details. Your Best Bet (tm) would be to file a bug report on SourceForge so it doesn't get completely forgotten. Skip From fredrik at pythonware.com Tue Oct 24 20:46:18 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 24 Oct 2006 20:46:18 +0200 Subject: [Python-Dev] Hunting down configure script error In-Reply-To: <17726.23989.775944.384634@montanaro.dyndns.org> References: <2590773a0610240818q57cac275g3bda9fd4fb7a2b32@mail.gmail.com> <2590773a0610241047i12c5661esf997ec9df031e18b@mail.gmail.com> <2590773a0610241128mf1b71e0r421986f586470f06@mail.gmail.com> <17726.23989.775944.384634@montanaro.dyndns.org> Message-ID: skip at pobox.com wrote: > I'm not sure what a "pebkac" problem is. 
http://en.wikipedia.org/wiki/PEBKAC You'll learn some new nonsense every day ;-) From steve at holdenweb.com Tue Oct 24 21:32:11 2006 From: steve at holdenweb.com (Steve Holden) Date: Tue, 24 Oct 2006 20:32:11 +0100 Subject: [Python-Dev] Hunting down configure script error In-Reply-To: <17726.23989.775944.384634@montanaro.dyndns.org> References: <2590773a0610240818q57cac275g3bda9fd4fb7a2b32@mail.gmail.com> <2590773a0610241047i12c5661esf997ec9df031e18b@mail.gmail.com> <2590773a0610241128mf1b71e0r421986f586470f06@mail.gmail.com> <17726.23989.775944.384634@montanaro.dyndns.org> Message-ID: skip at pobox.com wrote: > >> Your posts are making it. It's just that everyone's ignoring you :) > > Christopher> I feel loved ..... > > Christopher> Seriously, why would someone ignore this? This is > Christopher> obviously not a pebkac problem..... > > I'm not sure what a "pebkac" problem is. Problem Exists Between Chair And Keyboard regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From RD6T-KJYM at asahi-net.or.jp Tue Oct 24 21:41:27 2006 From: RD6T-KJYM at asahi-net.or.jp (Tamito KAJIYAMA) Date: 25 Oct 2006 04:41:27 +0900 Subject: [Python-Dev] __str__ bug? Message-ID: <453E6C67.125074.001@leopold.j.asahi-net.or.jp> I believe you've overridden unicode.__str__ as you expect.

class S(str):
    def __str__(self): return "S.__str__"

class U(unicode):
    def __str__(self): return "U.__str__"

print str(S())
print str(U())

This script prints:

S.__str__
U.__str__

Regards, -- KAJIYAMA, Tamito >Is this a bug? If not, how do I override __str__ on a unicode derived class?
> >class S(str): > def __str__(self): return '__str__ overridden' > >class U(unicode): > def __str__(self): return '__str__ overridden' > def __unicode__(self): return u'__unicode__ overridden' > >s = S() >u = U() > >print 's:', s >print "str(s):", str(s) >print 's substitued is "%s"\n' % s >print 'u:', u >print "str(u):", str(u) >print 'u substitued is "%s"' % u > >----------------------------------------------------- > >s: __str__ overridden >str(s): __str__ overridden >s substitued is "__str__ overridden" > >u: >str(u): __str__ overridden >u substitued is "" > >Results are identical for 2.4.2 and 2.5c2 (running under windows). > > Mike From mbk.lists at gmail.com Tue Oct 24 22:14:22 2006 From: mbk.lists at gmail.com (Mike Krell) Date: Tue, 24 Oct 2006 13:14:22 -0700 Subject: [Python-Dev] __str__ bug? In-Reply-To: <453E6C67.125074.001@leopold.j.asahi-net.or.jp> References: <453E6C67.125074.001@leopold.j.asahi-net.or.jp> Message-ID: > class S(str): > def __str__(self): return "S.__str__" > > class U(unicode): > def __str__(self): return "U.__str__" > > print str(S()) > print str(U()) > > This script prints: > > S.__str__ > U.__str__ Yes, but "print U()" prints nothing, and the explicit str() should not be necessary. Mike From bjourne at gmail.com Wed Oct 25 02:11:48 2006 From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=) Date: Wed, 25 Oct 2006 02:11:48 +0200 Subject: [Python-Dev] PEP 355 status In-Reply-To: References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> Message-ID: <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> On 10/1/06, Guido van Rossum wrote: > On 9/30/06, Giovanni Bajo wrote: > > It would be terrific if you gave us some clue about what is wrong in PEP355, so > > that the next guy does not waste his time. 
For instance, I find PEP355 > > incredibly good for my own path manipulation (much cleaner and concise than the > > awful os.path+os+shutil+stat mix), and I have trouble understanding what is > > *so* wrong with it. > > > > You said "it's an amalgam of unrelated functionality", but you didn't say what > > exactly is "unrelated" for you. > > Sorry, no time. But others in this thread clearly agreed with me, so > they can guide you. I'd like to write a post mortem for PEP 355. But one important question that haven't been answered is if there is a possibility for a path-like PEP to succeed in the future? If so, does the path-object implementation have to prove itself in the wild before it can be included in Python? From earlier posts it seems like you don't like the concept of path objects, which others have found very interesting. If that is the case, then it would be nice to hear it explicitly. :) -- mvh Bj?rn From amk at amk.ca Wed Oct 25 03:02:48 2006 From: amk at amk.ca (A.M. Kuchling) Date: Tue, 24 Oct 2006 21:02:48 -0400 Subject: [Python-Dev] Python 2.4.4 docs? Message-ID: <20061025010248.GA805@Siri.local> Does someone need to unpack the 2.4.4 docs in the right place so that http://www.python.org/doc/2.4.4/ works? --amk From fdrake at acm.org Wed Oct 25 05:24:47 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Tue, 24 Oct 2006 23:24:47 -0400 Subject: [Python-Dev] Python 2.4.4 docs? In-Reply-To: <20061025010248.GA805@Siri.local> References: <20061025010248.GA805@Siri.local> Message-ID: <200610242324.47328.fdrake@acm.org> On Tuesday 24 October 2006 21:02, A.M. Kuchling wrote: > Does someone need to unpack the 2.4.4 docs in the right place so that > http://www.python.org/doc/2.4.4/ works? That would be me, and yes, and done. Sorry for the delay; life's just been busy lately. Time for me to go look at the release PEP again... -Fred -- Fred L. Drake, Jr. 
From talin at acm.org Wed Oct 25 05:42:59 2006 From: talin at acm.org (Talin) Date: Tue, 24 Oct 2006 20:42:59 -0700 Subject: [Python-Dev] PEP 355 status In-Reply-To: <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> Message-ID: <453EDD43.3050609@acm.org> BJörn Lindqvist wrote: > On 10/1/06, Guido van Rossum wrote: >> On 9/30/06, Giovanni Bajo wrote: >>> It would be terrific if you gave us some clue about what is wrong in PEP355, so >>> that the next guy does not waste his time. For instance, I find PEP355 >>> incredibly good for my own path manipulation (much cleaner and concise than the >>> awful os.path+os+shutil+stat mix), and I have trouble understanding what is >>> *so* wrong with it. >>> >>> You said "it's an amalgam of unrelated functionality", but you didn't say what >>> exactly is "unrelated" for you. >> Sorry, no time. But others in this thread clearly agreed with me, so >> they can guide you. > > I'd like to write a post mortem for PEP 355. But one important > question that hasn't been answered is if there is a possibility for a > path-like PEP to succeed in the future? If so, does the path-object > implementation have to prove itself in the wild before it can be > included in Python? From earlier posts it seems like you don't like > the concept of path objects, which others have found very interesting. > If that is the case, then it would be nice to hear it explicitly. :) Let me take a crack at it - I'm always good for spouting off an arrogant opinion :)

Part 1: "Amalgam of Unrelated Functionality"

To me, the Path module felt very much like the "swiss army knife" anti-pattern - a whole lot of functions that had little in common other than the fact that paths were involved.
More specifically, I think it's important to separate the notion of paths as abstract "reference" objects from filesystem manipulators. When I call a function that operates on a path, I want to clearly distinguish between a function that merely does a transformation on the path string, vs. one that actually hits the disk. This goes along with the "principle of least surprise" - it should never be the case that I cause an I/O operation to occur when I wasn't expecting it. For example, a function that computes the parent directory of a path should not IMHO be a sibling of a function which tests for the existence or readability of a file.

I tend to think of paths and filesystems as broken down into 3 distinct domains, which are locators, inodes, and files. I realize that not all file systems on all platforms use the term 'inode', and have somewhat different semantics, but they all have some object which fulfills that role.

-- A locator is an abstract description of how to "get to" a resource. A file path is a "locator" in exactly the sense that a URL is. Locators need not refer to 'real' resources in order to be valid. A locator to a non-existent resource still maintains a consistent structure, and can be manipulated and transformed without ever actually dereferencing it. A locator does not, however, have any properties or attributes - you cannot tell, for example, the creation date of a file by looking at its locator.

-- An inode is a descriptor that points to some actual content. It actually lives on the filesystem, and has attributes (such as creation date, last modified date, permissions, etc.)

-- 'Files' are raw content streams - they are the actual bytes that make up the data within the file. Files do not have 'names' or 'dates' in and of themselves - only the inodes that describe them do.

Now, I don't insist that everyone in the world should classify things the way I do - I'm just describing how I see it.
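Talin's three-way split can be sketched with Python's existing functions. The names below (`parent`, `child`, `read_bytes`, and the sample paths) are hypothetical illustrations of the separation he describes, not a proposed API:

```python
import os
import os.path

# Locator level: pure string transformations; no disk access occurs.
def parent(path):
    return os.path.dirname(path)

def child(path, name):
    return os.path.join(path, name)

# Inode level: metadata queries that actually touch the filesystem.
def exists(path):
    return os.path.exists(path)

def mtime(path):
    return os.stat(path).st_mtime

# File level: raw content streams.
def read_bytes(path):
    with open(path, 'rb') as f:
        return f.read()

# A locator to a nonexistent resource is still perfectly valid to
# manipulate -- no I/O happens until the inode or file level is used.
p = child('/no/such/dir', 'file.txt')
```

The point of the grouping is that only the second and third groups can raise I/O errors; the first group is safe to call on any string.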
Were I to come up with my own path-related APIs, they would most likely be divided into 3 sub-modules corresponding to the 3 subdivisions listed above. I would want to make it clear that when you are operating strictly at the locator level, you aren't touching inodes or files; when you are operating at the inode level, you aren't touching file content.

Part 2: Should paths be objects?

I should mention that while I appreciate the power of OOP, I am also very much against the kind of OOP-absolutism that has been taught in many schools of software engineering in the last two decades. There are a lot of really good, formal, well-thought-out systems of program organization, and OOP is only one of many. A classic example is relational algebra, which forms the basis for relational databases - the basic notion that all operations on tabular data can be "composed" or "chained" in exactly the way that mathematical formulas can be. In relational algebra, you can take a view of a view of a view, or a subquery of a query of a view of a table, and so on. Even single, scalar values - such as the count of the number of results of a query - are of the same data type as a 'relation', and can be operated on as such, or fed as input to a subsequent operation. I bring up the example of relational algebra because it applies to paths as well: There is a kind of "path algebra", where an operation on a path results in another path, which can be operated on further. Now, one way to achieve this kind of path algebra is to make paths an object, and to overload the various functions and operators so that they, too, return paths. However, path algebra can be implemented just as easily in a functional style as in an object style. Properly done, a functional design shouldn't be significantly more bulky or wordy than an object design; the fact that the existing legacy API fails this test has more to do with history than any inherent advantages of OOP vs. functional style.
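The "path algebra" point can be illustrated with the functional API that already exists: each operation maps a path string to another path string, so results compose like formulas, with no objects involved. A small sketch using posixpath (chosen so behavior is identical on every platform; the sample paths are arbitrary):

```python
import posixpath

p = '/usr/local/lib/python/os.py'

# Each intermediate result is itself a path (or path component)
# that can feed directly into the next operation.
stem = posixpath.splitext(posixpath.basename(p))[0]
sibling = posixpath.join(posixpath.dirname(p), 'sys.py')
normalized = posixpath.normpath(posixpath.join(p, '..', 'site.py'))
```

Nothing here requires an object; the closure property Talin describes (path in, path out) is what makes the composition work.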
(Actually, the OOP approach has a slight advantage in terms of the amount of syntactic sugar available, but that is [a] an artifact of the current Python feature set, and [b] not necessarily a good thing if it leads to gratuitous, Perl-ish cleverness.) As a point of comparison, the Java Path API and the C# .Net Path API have similar capabilities; however, the former is object-based whereas the latter is functional and operates on strings. Having used both of them extensively, I find I prefer the C# style, mainly due to the ease of inter-conversion with regular strings - being able to read strings from configuration files, for example, and immediately operate on them without having to convert to path form. I don't find "p.GetParent()" much harder or easier to type than "Path.GetParent( p )"; but I do prefer "Path.GetParent( string )" over "Path( string ).GetParent()". However, this is only a *mild* preference - I could go either way, and wouldn't put up much of a fight about it. (I should note that the Java Path API does *not* follow my scheme of separation between locators and inodes, while the C# API does, which is another reason why I prefer the C# approach.)

Part 3: Does this mean that the current API cannot be improved?

Certainly not! I think everyone (well, almost) agrees that there is much room for improvement in the current APIs. They certainly need to be refactored and recategorized. But I don't think that the solution is to take all of the path-related functions and drop them into a single class, or even a single module. --- Anyway, I hope that (a) this answers your questions, and (b) isn't too divergent from most people's views about Path.
-- Talin From talin at acm.org Wed Oct 25 05:51:02 2006 From: talin at acm.org (Talin) Date: Tue, 24 Oct 2006 20:51:02 -0700 Subject: [Python-Dev] PEP 355 status In-Reply-To: <453EDD43.3050609@acm.org> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> Message-ID: <453EDF26.4040309@acm.org> (one additional postscript - One thing I would be interested in is an approach that unifies file paths and URLs so that there is a consistent locator scheme for any resource, whether they be in a filesystem, on a web server, or stored in a zip file.) -- Talin From stephen at xemacs.org Wed Oct 25 07:33:22 2006 From: stephen at xemacs.org (stephen at xemacs.org) Date: Wed, 25 Oct 2006 14:33:22 +0900 Subject: [Python-Dev] PEP 355 status In-Reply-To: <453EDF26.4040309@acm.org> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453EDF26.4040309@acm.org> Message-ID: <17726.63266.556414.720992@uwakimon.sk.tsukuba.ac.jp> Talin writes: > (one additional postscript - One thing I would be interested in is an > approach that unifies file paths and URLs so that there is a consistent > locator scheme for any resource, whether they be in a filesystem, on a > web server, or stored in a zip file.) +1 But doesn't file:/// do that for files, and couldn't we do something like zipfile:///nantoka.zip#foo/bar/baz.txt? Of course, we'd want to do ziphttp://your.server.net/kantoka.zip#foo/bar/baz.txt, too. That way leads to madness.... 
From scott+python-dev at scottdial.com Wed Oct 25 07:34:12 2006 From: scott+python-dev at scottdial.com (Scott Dial) Date: Wed, 25 Oct 2006 01:34:12 -0400 Subject: [Python-Dev] PEP 355 status In-Reply-To: <17726.63266.556414.720992@uwakimon.sk.tsukuba.ac.jp> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453EDF26.4040309@acm.org> <17726.63266.556414.720992@uwakimon.sk.tsukuba.ac.jp> Message-ID: <453EF754.6040105@scottdial.com> stephen at xemacs.org wrote: > Talin writes: > > (one additional postscript - One thing I would be interested in is an > > approach that unifies file paths and URLs so that there is a consistent > > locator scheme for any resource, whether they be in a filesystem, on a > > web server, or stored in a zip file.) > > +1 > > But doesn't file:/// do that for files, and couldn't we do something > like zipfile:///nantoka.zip#foo/bar/baz.txt? Of course, we'd want to > do ziphttp://your.server.net/kantoka.zip#foo/bar/baz.txt, too. That > way leads to madness.... > It would make more sense to register protocol handlers to this magical unification of resource manipulation. But allow me to perform my first channeling of Guido.. YAGNI. 
-- Scott Dial scott at scottdial.com scodial at cs.indiana.edu From talin at acm.org Wed Oct 25 07:38:46 2006 From: talin at acm.org (Talin) Date: Tue, 24 Oct 2006 22:38:46 -0700 Subject: [Python-Dev] PEP 355 status In-Reply-To: <17726.63266.556414.720992@uwakimon.sk.tsukuba.ac.jp> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453EDF26.4040309@acm.org> <17726.63266.556414.720992@uwakimon.sk.tsukuba.ac.jp> Message-ID: <453EF866.1060503@acm.org> stephen at xemacs.org wrote: > Talin writes: > > (one additional postscript - One thing I would be interested in is an > > approach that unifies file paths and URLs so that there is a consistent > > locator scheme for any resource, whether they be in a filesystem, on a > > web server, or stored in a zip file.) > > +1 > > But doesn't file:/// do that for files, and couldn't we do something > like zipfile:///nantoka.zip#foo/bar/baz.txt? Of course, we'd want to > do ziphttp://your.server.net/kantoka.zip#foo/bar/baz.txt, too. That > way leads to madness.... file:/// does indeed do it, but only the network module understands strings in that format. Ideally, you should be able to pass "file:///..." to a regular "open" function. I wouldn't expect it to be able to understand "http://". But the "file:" protocol should always be supported. In other words, I'm not proposing that the built-in file I/O package suddenly grow an understanding of network scheme types. All I am proposing is a unified namespace.
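For what it's worth, the stdlib already covers the string-level half of this: a "file:" URL can be decomposed and its path component mapped back to a local filename. The sketch below uses the modern `urllib.parse`/`urllib.request` names (in the 2.x line of this era the same functions lived in the `urlparse` and `urllib` modules); the URL itself is a made-up example. This is only the locator-level mapping, not the `open()` integration Talin is proposing:

```python
from urllib.parse import urlparse
from urllib.request import url2pathname

url = 'file:///usr/local/data/config.txt'
parts = urlparse(url)

# The scheme identifies which handler applies; for "file" URLs the
# path component converts back to a platform-native filename.
local = url2pathname(parts.path)
```

So the pieces for a unified namespace exist; what is missing is an `open()` that accepts the URL form directly.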
- Talin From stephen at xemacs.org Wed Oct 25 09:44:59 2006 From: stephen at xemacs.org (stephen at xemacs.org) Date: Wed, 25 Oct 2006 16:44:59 +0900 Subject: [Python-Dev] PEP 355 status In-Reply-To: <453EF754.6040105@scottdial.com> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453EDF26.4040309@acm.org> <17726.63266.556414.720992@uwakimon.sk.tsukuba.ac.jp> <453EF754.6040105@scottdial.com> Message-ID: <17727.5627.756820.797525@uwakimon.sk.tsukuba.ac.jp> Scott Dial writes: > stephen at xemacs.org wrote: > > Talin writes: > > > (one additional postscript - One thing I would be interested in is an > > > approach that unifies file paths and URLs so that there is a consistent > > > locator scheme for any resource, whether they be in a filesystem, on a > > > web server, or stored in a zip file.) > > > > +1 > It would make more sense to register protocol handlers to this magical > unification of resource manipulation. I don't think it's that magical, and it's not manipulation, it's location. The question is, register where and on what? For example on my Mac there are some PDFs I want to open in Preview and others in Acrobat. To the extent that I have some classes which are one or the other, I might want to register the handler to a wildcard path object. > But allow me to perform my first channeling of Guido.. YAGNI. True, but only because when I do need that kind of stuff I'm normally writing Emacs Lisp, not Python. We have a wide variety of functions for manipulating path strings, and they make exactly the distinction between path and inode/content that Talin does (where a path is being manipulated, the function has "filename" in its name, where a file or its metadata is being accessed, the function's name contains "file"). 
Nonetheless there are two or three places where programmers I respect have chosen to invent path classes to handle hairy special cases. These classes are very useful in those special cases. One place where this gets especially hairy is in the TRAMP package, which allows you to construct "remote paths" involving (for example) logging into host A by ssh, from there to host B by ssh, and finally a "relay download" of the content from host C to the local host by scp. The net effect is that you can specify the path in your "open file" dialog, and Emacs does the rest automatically; the only differences the user sees between that and a local file is the length of the path string and the time it takes to actually access the contents. Once you've done that, that process is embedded into Emacs's notion of the "current directory", so you can list the directory containing the resource, or access siblings, very conveniently. I don't expect to reproduce that functionality in Python personally, but such use cases do exist. Whether a general path class can be invented that doesn't accumulate cruft faster than use cases is another issue. 
From talin at acm.org Wed Oct 25 09:50:49 2006 From: talin at acm.org (Talin) Date: Wed, 25 Oct 2006 00:50:49 -0700 Subject: [Python-Dev] PEP 355 status In-Reply-To: <453EF754.6040105@scottdial.com> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453EDF26.4040309@acm.org> <17726.63266.556414.720992@uwakimon.sk.tsukuba.ac.jp> <453EF754.6040105@scottdial.com> Message-ID: <453F1759.5050601@acm.org> Scott Dial wrote: > stephen at xemacs.org wrote: >> Talin writes: >> > (one additional postscript - One thing I would be interested in is >> an > approach that unifies file paths and URLs so that there is a >> consistent > locator scheme for any resource, whether they be in a >> filesystem, on a > web server, or stored in a zip file.) >> >> +1 >> >> But doesn't file:/// do that for files, and couldn't we do something >> like zipfile:///nantoka.zip#foo/bar/baz.txt? Of course, we'd want to >> do ziphttp://your.server.net/kantoka.zip#foo/bar/baz.txt, too. That >> way leads to madness.... >> > > It would make more sense to register protocol handlers to this magical > unification of resource manipulation. But allow me to perform my first > channeling of Guido.. YAGNI. > I'm thinking that it was a tactical error on my part to throw in the whole "unified URL / filename namespace" idea, which really has nothing to do with the topic. Lets drop it, or start another topic, and let this thread focus on critiques of the path module, which is probably more relevant at the moment. -- Talin From mal at egenix.com Wed Oct 25 11:40:06 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 25 Oct 2006 11:40:06 +0200 Subject: [Python-Dev] __str__ bug? 
In-Reply-To: References: <453E6C67.125074.001@leopold.j.asahi-net.or.jp> Message-ID: <453F30F6.9060002@egenix.com> Mike Krell wrote:
>> class S(str):
>>     def __str__(self): return "S.__str__"
>>
>> class U(unicode):
>>     def __str__(self): return "U.__str__"
>>
>> print str(S())
>> print str(U())
>>
>> This script prints:
>>
>> S.__str__
>> U.__str__
>
> Yes, but "print U()" prints nothing, and the explicit str() should not
> be necessary.

The main difference here is that the string object defines a tp_print slot, while Unicode doesn't. As a result, tp_print for the string subtype is called and this does an extra check for subtypes:

    if (! PyString_CheckExact(op)) {
        int ret;
        /* A str subclass may have its own __str__ method. */
        op = (PyStringObject *) PyObject_Str((PyObject *)op);
        if (op == NULL)
            return -1;
        ret = string_print(op, fp, flags);
        Py_DECREF(op);
        return ret;
    }

For Unicode, the PyObject_Print() API defaults to using PyObject_Str() which uses the tp_str slot. This maps directly to a Unicode API that works on the internals and doesn't apply any extra checks to see if it was called on a subtype. Note that this is true for many of the __special__ slot methods you can implement on subtypes of built-in types - they don't always work as you might expect. Now in this rather common case, I guess we could add support to the Unicode object to do extra checks like the string object does. Ditto for the %-formatting. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 25 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free !
:::: From ncoghlan at gmail.com Wed Oct 25 11:47:29 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 25 Oct 2006 19:47:29 +1000 Subject: [Python-Dev] PEP 355 status In-Reply-To: <453EDD43.3050609@acm.org> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> Message-ID: <453F32B1.4030101@gmail.com> Talin wrote: > Part 3: Does this mean that the current API cannot be improved? > > Certainly not! I think everyone (well, almost) agrees that there is much > room for improvement in the current APIs. They certainly need to be > refactored and recategorized. > > But I don't think that the solution is to take all of the path-related > functions and drop them into a single class, or even a single module. +1 from me. (for both the fraction I quoted and everything else you said, including the locator/inode/file distinction - although I'd also add that 'symbolic link' and 'directory' exist at a similar level as 'file'). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From martin at v.loewis.de Wed Oct 25 13:13:30 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 25 Oct 2006 13:13:30 +0200 Subject: [Python-Dev] __str__ bug? In-Reply-To: References: <453DE526.4020107@gmail.com> Message-ID: <453F46DA.4000700@v.loewis.de> Mike Krell schrieb: >> Based on the behaviour of str and the fact that overriding unicode.__repr__ >> works just fine, I'd say file a bug on SF. > > Done. This is item 1583863. Of course, it would be even better if you could also include a patch. 
Regards, Martin From talin at acm.org Wed Oct 25 18:49:44 2006 From: talin at acm.org (Talin) Date: Wed, 25 Oct 2006 09:49:44 -0700 Subject: [Python-Dev] PEP 355 status In-Reply-To: <453F32B1.4030101@gmail.com> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453F32B1.4030101@gmail.com> Message-ID: <453F95A8.5090201@acm.org> Nick Coghlan wrote: > Talin wrote: >> Part 3: Does this mean that the current API cannot be improved? >> >> Certainly not! I think everyone (well, almost) agrees that there is >> much room for improvement in the current APIs. They certainly need to >> be refactored and recategorized. >> >> But I don't think that the solution is to take all of the path-related >> functions and drop them into a single class, or even a single module. > > +1 from me. > > (for both the fraction I quoted and everything else you said, including > the locator/inode/file distinction - although I'd also add that > 'symbolic link' and 'directory' exist at a similar level as 'file'). I would tend towards classifying directory operations as inode-level operations, that you are working at the "filesystem as graph" level, rather than the "stream of bytes" level. When you iterate over a directory, what you are getting back is effectively inodes (well, directory entries are distinct from inodes in the underlying filesystem, but from Python there's no practical distinction.) If I could draw a UML diagram in ASCII, I would have "inode --> points to --> directory or file" and "directory --> contains * --> inode". That would hopefully make things clearer. Symbolic links, I am not so sure about; In some ways, hard links are easier to classify. 
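As a side note from a later vantage point: Python eventually grew an API that matches this classification closely. `os.scandir()` (added in Python 3.5, long after this thread) yields directory entries that sit at the inode level Talin describes - each carries a name, a locator (`entry.path`), and cached metadata, all distinct from the file's contents. The temp-directory setup below is just scaffolding for the illustration:

```python
import os
import tempfile

# Build a small directory to iterate over (names are arbitrary).
d = tempfile.mkdtemp()
open(os.path.join(d, 'a.txt'), 'w').close()
os.mkdir(os.path.join(d, 'sub'))

kinds = {}
for entry in os.scandir(d):
    # entry.path is the locator; is_dir()/stat() are inode-level queries.
    kinds[entry.name] = 'directory' if entry.is_dir() else 'file'
```

Iterating a directory thus hands back inode-like descriptors rather than bare path strings, exactly the "directory contains inodes" picture sketched above.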
--- Having done a path library myself (in C++, for our code base at work), the trickiest part is getting the Windows path manipulations right, and fitting them into a model that allows writing of platform-agnostic code. This is especially vexing when you realize that it's often useful to manipulate unix-style paths even when running under Win32 and vice versa. A prime example is that I have a lot of Python code at work that manipulates Perforce client spec files. The path specifications in these files are platform-agnostic, and use forward slashes regardless of the host platform, so "os.path.normpath" doesn't do the right thing for me. > Cheers, > Nick. From pje at telecommunity.com Wed Oct 25 18:56:37 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 25 Oct 2006 12:56:37 -0400 Subject: [Python-Dev] PEP 355 status In-Reply-To: <453F95A8.5090201@acm.org> References: <453F32B1.4030101@gmail.com> <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453F32B1.4030101@gmail.com> Message-ID: <5.1.1.6.0.20061025125608.0409f138@sparrow.telecommunity.com> At 09:49 AM 10/25/2006 -0700, Talin wrote: >Having done a path library myself (in C++, for our code base at work), >the trickiest part is getting the Windows path manipulations right, and >fitting them into a model that allows writing of platform-agnostic code. >This is especially vexing when you realize that it's often useful to >manipulate unix-style paths even when running under Win32 and vice >versa. A prime example is that I have a lot of Python code at work that >manipulates Perforce client spec files. The path specifications in >these files are platform-agnostic, and use forward slashes regardless of >the host platform, so "os.path.normpath" doesn't do the right thing for me.
You probably want to use the posixpath module directly in that case, though perhaps you've already discovered that. From talin at acm.org Wed Oct 25 19:16:48 2006 From: talin at acm.org (Talin) Date: Wed, 25 Oct 2006 10:16:48 -0700 Subject: [Python-Dev] PEP 355 status In-Reply-To: <5.1.1.6.0.20061025125608.0409f138@sparrow.telecommunity.com> References: <453F32B1.4030101@gmail.com> <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453F32B1.4030101@gmail.com> <5.1.1.6.0.20061025125608.0409f138@sparrow.telecommunity.com> Message-ID: <453F9C00.3090300@acm.org> Phillip J. Eby wrote: > At 09:49 AM 10/25/2006 -0700, Talin wrote: >> Having done a path library myself (in C++, for our code base at work), >> the trickiest part is getting the Windows path manipulations right, and >> fitting them into a model that allows writing of platform-agnostic code. >> This is especially vexing when you realize that it's often useful to >> manipulate unix-style paths even when running under Win32 and vice >> versa. A prime example is that I have a lot of Python code at work that >> manipulates Perforce client spec files. The path specifications in >> these files are platform-agnostic, and use forward slashes regardless of >> the host platform, so "os.path.normpath" doesn't do the right thing >> for me. > > You probably want to use the posixpath module directly in that case, > though perhaps you've already discovered that. Never heard of it. It's not in the standard library, is it? I don't see it in the table of contents or the index. From fdrake at acm.org Wed Oct 25 19:36:31 2006 From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 25 Oct 2006 13:36:31 -0400 Subject: [Python-Dev] PEP 355 status In-Reply-To: <453F9C00.3090300@acm.org> References: <453F32B1.4030101@gmail.com> <5.1.1.6.0.20061025125608.0409f138@sparrow.telecommunity.com> <453F9C00.3090300@acm.org> Message-ID: <200610251336.31398.fdrake@acm.org> On Wednesday 25 October 2006 13:16, Talin wrote: > Never heard of it. It's not in the standard library, is it? I don't see > it in the table of contents or the index. This is a documentation bug. :-( I'd thought they were mentioned *somewhere*, but it looks like I'm wrong. os.path is an alias for one of several different real modules; which one is selected depends on the platform. I see the following: macpath, ntpath, os2emxpath, riscospath. (ntpath is used for all Windows versions, not just NT.) -Fred -- Fred L. Drake, Jr. From pje at telecommunity.com Wed Oct 25 20:19:22 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 25 Oct 2006 14:19:22 -0400 Subject: [Python-Dev] PEP 355 status In-Reply-To: <453F9C00.3090300@acm.org> References: <5.1.1.6.0.20061025125608.0409f138@sparrow.telecommunity.com> <453F32B1.4030101@gmail.com> <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453F32B1.4030101@gmail.com> <5.1.1.6.0.20061025125608.0409f138@sparrow.telecommunity.com> Message-ID: <5.1.1.6.0.20061025141358.027248d8@sparrow.telecommunity.com> At 10:16 AM 10/25/2006 -0700, Talin wrote: >Phillip J. Eby wrote: > > At 09:49 AM 10/25/2006 -0700, Talin wrote: > >> Having done a path library myself (in C++, for our code base at work), > >> the trickiest part is getting the Windows path manipulations right, and > >> fitting them into a model that allows writing of platform-agnostic code.
> >> This is especially vexing when you realize that it's often useful to > >> manipulate unix-style paths even when running under Win32 and vice > >> versa. A prime example is that I have a lot of Python code at work that > >> manipulates Perforce client spec files. The path specifications in > >> these files are platform-agnostic, and use forward slashes regardless of > >> the host platform, so "os.path.normpath" doesn't do the right thing > >> for me. > > > > You probably want to use the posixpath module directly in that case, > > though perhaps you've already discovered that. > >Never heard of it. It's not in the standard library, is it? I don't see >it in the table of contents or the index. posixpath, ntpath, macpath, et al are the platform-specific path manipulation modules that are aliased to os.path. However, each of these modules' string path manipulation functions can be imported and used on any platform. See below:

Linux:

Python 2.3.5 (#1, Aug 25 2005, 09:17:44)
[GCC 3.4.3 20041212 (Red Hat 3.4.3-9.EL4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.path
<module 'posixpath' from '...'>
>>> import ntpath
>>> dir(ntpath)
['__all__', '__builtins__', '__doc__', '__file__', '__name__', 'abspath',
'altsep', 'basename', 'commonprefix', 'curdir', 'defpath', 'dirname',
'exists', 'expanduser', 'expandvars', 'extsep', 'getatime', 'getctime',
'getmtime', 'getsize', 'isabs', 'isdir', 'isfile', 'islink', 'ismount',
'join', 'normcase', 'normpath', 'os', 'pardir', 'pathsep', 'realpath',
'sep', 'split', 'splitdrive', 'splitext', 'splitunc', 'stat',
'supports_unicode_filenames', 'sys', 'walk']

Windows:

Python 2.3.4 (#53, May 25 2004, 21:17:02) [MSC v.1200 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import os
>>> os.path
<module 'ntpath' from '...'>
>>> import posixpath
>>> dir(posixpath)
['__all__', '__builtins__', '__doc__', '__file__', '__name__', '_varprog',
'abspath', 'altsep', 'basename', 'commonprefix', 'curdir', 'defpath',
'dirname', 'exists', 'expanduser', 'expandvars', 'extsep', 'getatime',
'getctime', 'getmtime', 'getsize', 'isabs', 'isdir', 'isfile', 'islink',
'ismount', 'join', 'normcase', 'normpath', 'os', 'pardir', 'pathsep',
'realpath', 'samefile', 'sameopenfile', 'samestat', 'sep', 'split',
'splitdrive', 'splitext', 'stat', 'supports_unicode_filenames', 'sys',
'walk']

Note, therefore, that any "path object" system should also allow you to create and manipulate foreign paths. That is, it should have variants for each path type, rather than being locked to the local platform's path strings. Of course, the most common need for this is manipulating posix paths on non-posix platforms, but sometimes one must deal with Windows paths on Unix, too. From fredrik at pythonware.com Wed Oct 25 21:23:32 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 25 Oct 2006 21:23:32 +0200 Subject: [Python-Dev] PEP 355 status In-Reply-To: <453F9C00.3090300@acm.org> References: <453F32B1.4030101@gmail.com> <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453F32B1.4030101@gmail.com> <5.1.1.6.0.20061025125608.0409f138@sparrow.telecommunity.com> <453F9C00.3090300@acm.org> Message-ID: Talin wrote: >> You probably want to use the posixpath module directly in that case, >> though perhaps you've already discovered that. > > Never heard of it. It's not in the standard library, is it? I don't see > it in the table of contents or the index.
http://effbot.org/librarybook/posixpath.htm From mahs at telcopartners.com Wed Oct 25 23:06:21 2006 From: mahs at telcopartners.com (Michael Spencer) Date: Wed, 25 Oct 2006 14:06:21 -0700 Subject: [Python-Dev] Fwd: Re: ANN compiler2 : Produce bytecode from Python 2.5 AST In-Reply-To: <453d4890$0$22566$9b622d9e@news.freenet.de> References: <453d4890$0$22566$9b622d9e@news.freenet.de> Message-ID: Martin v. Löwis wrote: > Georg Brandl schrieb: >> Perhaps you can bring up a discussion on python-dev about your improvements >> and how they could be integrated into the standard library... > > Let me second this. The compiler package is largely unmaintained and > was known to be broken (and perhaps still is). A replacement > implementation, especially if it comes with a new maintainer, would > be welcome. > > Regards, > Martin Hello python-dev. I use AST-based code inspection and manipulation, and I've been looking forward to using v2.5 ASTs for their increased accuracy, consistency and speed. However, there is as yet no Python-exposed mechanism for compiling v2.5 ASTs to bytecode. So to meet my own need and interest I've been implementing 'compiler2', similar in scope to the stdlib compiler package, but generating code from Python 2.5 _ast.ASTs. The code has evolved considerably from the compiler package: in aggregate the changes amount to a re-write. More about the package and its status below. I'm introducing this project here to discuss whether and how these changes should be integrated with the stdlib. I believe there is a prima facie need to have a builtin/stdlib capability for compiling v2.5 ASTs from Python, and there is some advantage to having that be implemented in Python. There is also a case for deprecating the v2.4 ASTs to ease maintenance and reduce the confusion associated with two different AST formats. If there is interest, I'm willing to make compiler2 stdlib-ready. I'm also open to alternative approaches, including doing nothing.
compiler2 Objectives and Status =============================== My goal is to get compiler2 to produce identical output to __builtin__.compile (at least optionally), while also providing an accessible framework for AST-manipulation, experimental compiler optimizations and customization. compiler2 is not finished - there are some unresolved bugs, and open questions on interface design - but already it produces identical output to __builtin__.compile for all of the stdlib modules and their tests (except for the stackdepth attribute which is different in 12 cases). All but three stdlib modules pass their tests after being compiled using compiler2. More on goals, status, known issues etc... in the project readme.txt at: http://svn.brownspencer.com/pycompiler/branches/new_ast/readme.txt Code is available in Subversion at http://svn.brownspencer.com/pycompiler/branches/new_ast/ The main test script is test/test_compiler.py which compiles all the modules in /Lib and /Lib/test and compares the output with __builtin__.compile. Best regards Michael Spencer From greg.ewing at canterbury.ac.nz Thu Oct 26 01:39:24 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 26 Oct 2006 12:39:24 +1300 Subject: [Python-Dev] PEP 355 status In-Reply-To: <453EDD43.3050609@acm.org> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> Message-ID: <453FF5AC.4060500@canterbury.ac.nz> Talin wrote: > (Actually, the OOP approach has a slight advantage in terms of the > amount of syntactic sugar available, Even if you don't use any operator overloading, there's still the advantage that an object provides a namespace for its methods. Without that, you either have to use fairly verbose function names or keep qualifying them with a module name. 
Code that uses the current path functions tends to contain a lot of os.path.this(os.path.that(...)) stuff which is quite tedious to write and read. Another consideration is that having paths be a distinct data type allows for the possibility of file system references that aren't just strings. In Classic MacOS, for example, the definitive way of referencing a file is by a (vRefNum, dirID, name) tuple, and textual paths aren't guaranteed to be unique or even to exist. > (I should note that the Java Path API does *not* follow my scheme of > separation between locators and inodes, while the C# API does, which is > another reason why I prefer the C# approach.) A compromise might be to have all the "path algebra" operations be methods, and everything else functions which operate on path objects. That would make sense, because the path algebra ought to be a closed set of operations that's tightly coupled to the platform's path semantics. -- Greg From talin at acm.org Thu Oct 26 04:48:32 2006 From: talin at acm.org (Talin) Date: Wed, 25 Oct 2006 19:48:32 -0700 Subject: [Python-Dev] PEP 355 status In-Reply-To: <453FF5AC.4060500@canterbury.ac.nz> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453FF5AC.4060500@canterbury.ac.nz> Message-ID: <45402200.1010308@acm.org> Greg Ewing wrote: > Talin wrote: >> (Actually, the OOP approach has a slight advantage in terms of the >> amount of syntactic sugar available, > > Even if you don't use any operator overloading, there's > still the advantage that an object provides a namespace > for its methods. Without that, you either have to use > fairly verbose function names or keep qualifying them > with a module name.
> Code that uses the current path > functions tends to contain a lot of > os.path.this(os.path.that(...)) stuff which is quite > tedious to write and read. Given the flexibility that Python allows in naming the modules that you import, I'm not sure that this is a valid objection -- you can make the module name as short as you feel comfortable with. > Another consideration is that having paths be a > distinct data type allows for the possibility of file > system references that aren't just strings. In > Classic MacOS, for example, the definitive way of > referencing a file is by a (vRefNum, dirID, name) > tuple, and textual paths aren't guaranteed to be > unique or even to exist. That's true of textual paths in general - i.e. even on unix, textual paths aren't guaranteed to be unique or exist. It's been a while since I used classic MacOS - how do you handle things like configuration files with path names in them? >> (I should note that the Java Path API does *not* follow my scheme of >> separation between locators and inodes, while the C# API does, which >> is another reason why I prefer the C# approach.) > A compromise might be to have all the "path algebra" > operations be methods, and everything else functions > which operate on path objects. That would make sense, > because the path algebra ought to be a closed set > of operations that's tightly coupled to the platform's > path semantics. Personally, this is one of those areas where I am strongly tempted to violate TOOWTDI - I can see use cases where string-based paths would be more convenient and less typing, and other use cases where object-based paths would be more convenient and less typing. If I were designing a path library, I would create a string-based system as the lowest level, and an object based system on top of it (the reason for doing it that way is simply so that people who want to use strings don't have to suffer the cost of creating temporary path objects to do simple things like joins.)
Moreover, I would keep the naming conventions of the two systems similar, if at all possible - thus, the object methods would have the same (short) names as the functions within the module. So for example:

    # Import new, refactored module io.path
    from io import path

    # Case 1 using strings
    path1 = path.join( "/Libraries/Frameworks", "Python.Framework" )
    parent = path.parent( path1 )

    # Case 2 using objects
    pathobj = path.Path( "/Libraries/Frameworks" )
    pathobj += "Python.Framework"
    parent = pathobj.parent()

Let me riff on this just a bit more - don't take this all too seriously though: Refactored organization of path-related modules (under a new name so as not to conflict with existing modules):

    io.path -- path manipulations
    io.dir  -- directory functions, including dirwalk
    io.fs   -- dealing with filesystem objects (inodes, symlinks, etc.)
    io.file -- file read / write streams

    # Import directory module
    import io.dir

    # String based API
    for entry in io.dir.listdir( "/Library/Frameworks" ):
        print entry    # Entry is a string

    # Object based API
    dir = io.dir.Directory( "/Library/Frameworks" )
    for entry in dir:  # Iteration protocol on dir object
        print entry    # entry is an obj, but __str__() returns path text

    # Dealing with various filesystems: pass in a format parameter
    dir = io.dir.Directory( "/Library/Frameworks" )
    print entry.path( format="NT" )  # entry printed in NT format

    # Or you can just use a format specifier for PEP 3101 string format:
    print "Path in local system format is {0}".format( entry )
    print "Path in NT format is {0:NT}".format( entry )
    print "Path in OS X format is {0:OSX}".format( entry )

Anyway, off the top of my head, that's what a refactored path API would look like if I were doing it :) (Yes, the names are bad, can't think of better ATM.)
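A minimal runnable sketch of the two-layer idea above - strings at the bottom, an object layer with the same short names on top. The Path class and its methods are illustrative names only, not a real library; posixpath is used so the behaviour is identical on any host:

```python
import posixpath


class Path(str):
    """Illustrative object layer over string-based path functions.

    Strings remain the lowest level; each method delegates to a
    posixpath function, so string users pay no object-creation
    cost unless they opt in to the object API.
    """

    def join(self, *parts):
        # Delegate to the string-level function, wrap the result.
        return Path(posixpath.join(self, *parts))

    def parent(self):
        return Path(posixpath.dirname(self))


p = Path("/Library/Frameworks").join("Python.framework")
print(p)           # /Library/Frameworks/Python.framework
print(p.parent())  # /Library/Frameworks
```

Because Path subclasses str, the object form still works anywhere a plain path string is expected, which is one way to get both conveniences without two incompatible APIs.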
-- Talin From greg.ewing at canterbury.ac.nz Thu Oct 26 02:52:29 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 26 Oct 2006 13:52:29 +1300 Subject: [Python-Dev] PEP 355 status In-Reply-To: <453EF866.1060503@acm.org> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453EDF26.4040309@acm.org> <17726.63266.556414.720992@uwakimon.sk.tsukuba.ac.jp> <453EF866.1060503@acm.org> Message-ID: <454006CD.7000006@canterbury.ac.nz> Talin wrote: > Ideally, you should be able to pass > "file:///..." to a regular "open" function. I'm not so sure about that. Consider that "file:///foo.bar" is a valid relative pathname on Unix to a file called "foo.bar" in a directory called "file:". That's not to say there shouldn't be a function available that understands it, but I wouldn't want it built into all functions that take pathnames. -- Greg From foom at fuhm.net Thu Oct 26 09:00:57 2006 From: foom at fuhm.net (James Y Knight) Date: Thu, 26 Oct 2006 03:00:57 -0400 Subject: [Python-Dev] PEP 355 status In-Reply-To: <45402200.1010308@acm.org> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453FF5AC.4060500@canterbury.ac.nz> <45402200.1010308@acm.org> Message-ID: On Oct 25, 2006, at 10:48 PM, Talin wrote: > That's true of textual paths in general - i.e. even on unix, textual > paths aren't guaranteed to be unique or exist. > > Its been a while since I used classic MacOS - how do you handle things > like configuration files with path names in them? You aren't supposed to use paths at all. You're supposed to use an Alias whenever you're doing long term storage of a reference to a file. 
This allows the user to move the file around on the disk without breaking the reference, which is nice. The alias is an opaque data structure which contains a bunch of redundant information used to locate the file. In particular, both pathname and (volumeId, dirId, name), as well as some other stuff like file size, etc. to help do fuzzy matching if the original file can't be found via the obvious locators. And for files on a file server, it also contains information on how to reconnect to the server if necessary. Much of the alias infrastructure carries over into OSX, although the strictures against using paths have been somewhat watered down. At least in OSX, you don't have the issue of the user renaming the boot volume and thus breaking every path someone ill-advisedly stored (since volume name was part of the path). For an example of aliases in OSX, open a file in TextEdit, see that it gets into the "recent items" menu. Now, move it somewhere else and rename it, and notice that it's still accessible from the menu. Separately, try deleting the file and renaming another to the same name. Notice that it also succeeds in referencing this new file. Hm, how's this related to python? I'm not quite sure. :) James From greg.ewing at canterbury.ac.nz Thu Oct 26 10:29:43 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 26 Oct 2006 21:29:43 +1300 Subject: [Python-Dev] PEP 355 status In-Reply-To: <45402200.1010308@acm.org> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453FF5AC.4060500@canterbury.ac.nz> <45402200.1010308@acm.org> Message-ID: <454071F7.70104@canterbury.ac.nz> Talin wrote: > That's true of textual paths in general - i.e. even on unix, textual > paths aren't guaranteed to be unique or exist.
What I mean is that it's possible for two different files to have the same pathname (since you can mount two volumes with identical names at the same time), or for a file to exist on disk yet not be accessible via any pathname (because it would exceed 255 characters). I'm not aware of any analogous situations in unix. > Its been a while since I used classic MacOS - how do you handle things > like configuration files with path names in them? True native classic MacOS software generally doesn't use pathnames. Things like textual config files are really a foreign concept to it. If you wanted to store config info, you'd probably store an alias, which points at the moral equivalent of the file's inode number, and use a GUI for editing it. However, all this is probably not very relevant now, since as far as I know, classic MacOS is no longer supported in current Python versions. I'm just pointing out that the flexibility would be there if any similarly offbeat platform needed to be supported in the future. > # Or you can just use a format specifier for PEP 3101 string format: > print "Path in local system format is {0}".format( entry ) > print "Path in NT format is {0:NT}".format( entry ) > print "Path in OS X format is {0:OSX}".format( entry ) I don't think that expressing one platform's pathnames in the format of another is something you can do in general, e.g. going from Windows to Unix, what do you do with the drive letter? You can only really do it if you have some sort of network file system connection, and then you need more information than just the path in order to do the translation.
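For what it's worth, the stdlib already exposes each platform's path syntax as an importable module (posixpath, ntpath), so foreign paths can at least be manipulated on any host; what it cannot do losslessly is translate an absolute path between conventions. A small sketch of both points:

```python
import ntpath
import posixpath

# Manipulating either platform's syntax works on any host:
print(posixpath.join("Library", "Frameworks"))  # Library/Frameworks
print(ntpath.join("Library", "Frameworks"))     # Library\Frameworks

# But an absolute Windows path carries a drive letter with no
# posix counterpart, so a general translation would be lossy:
drive, rest = ntpath.splitdrive(r"C:\Python25\python.exe")
print(drive)  # C:
print(rest)   # \Python25\python.exe
```

This is the legitimate cross-platform case (relative paths in a portable syntax); the drive letter is exactly the piece that has no home on the other side.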
-- Greg From talin at acm.org Thu Oct 26 11:12:15 2006 From: talin at acm.org (Talin) Date: Thu, 26 Oct 2006 02:12:15 -0700 Subject: [Python-Dev] PEP 355 status In-Reply-To: <454071F7.70104@canterbury.ac.nz> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453FF5AC.4060500@canterbury.ac.nz> <45402200.1010308@acm.org> <454071F7.70104@canterbury.ac.nz> Message-ID: <45407BEF.7010204@acm.org> Greg Ewing wrote: > Talin wrote: > >> That's true of textual paths in general - i.e. even on unix, textual >> paths aren't guaranteed to be unique or exist. > > What I mean is that it's possible for two different > files to have the same pathname (since you can mount > two volumes with identical names at the same time, or > for a file to exist on disk yet not be accessible via > any pathname (because it would exceed 255 characters). > I'm not aware of any analogous situations in unix. > >> Its been a while since I used classic MacOS - how do you handle things >> like configuration files with path names in them? > > True native classic MacOS software generally doesn't > use pathnames. Things like textual config files are > really a foreign concept to it. If you wanted to store > config info, you'd probably store an alias, which > points at the moral equivalent of the files inode > number, and use a GUI for editing it. > > However all this is probably not very relevant now, > since as far as I know, classic MacOS is no longer > supported in current Python versions. I'm just > pointing out that the flexibility would be there > if any similarly offbeat platform needed to be > supported in the future. I'm not sure that PEP 355 included any such support - IIRC, the path object was a subclass of string. 
That isn't, however, a defense against what you are saying - just because neither the current system nor the proposed improvement supports the kinds of file references you are speaking of, doesn't mean it shouldn't be done. However, this does kind of suck for a cross-platform scripting language like Python. It means that any cross-platform app which requires access to multiple data files that contain inter-file references essentially has to implement its own virtual file system. (Python module imports being a case in point.) One of the things that I really love about Python programming is that I can sit down and start hacking on a new project without first having to go through an agonizing political decision about what platforms I should support. It used to be that I would spend hours ruminating over things like "Well...if I want any market share at all, I really should implement this as a Windows program...but on the other hand, I won't enjoy writing it nearly as much." Then comes along Python and removes all of that bothersome hacker-angst. Because of this, I am naturally disinclined to incorporate into my programs any concept which doesn't translate to other platforms. I don't mind writing some platform-specific code, as long as it doesn't take over my program. It seems that any Python program that manipulated paths would have to be radically different in the environment that you describe. How about this: In my ontology of path APIs given earlier, I would tend to put the MacOS file reference in the category of "file locator schemes other than paths". In other words, what you are describing isn't IMHO a path at all, but it is like a path in that it describes how to get to a file. (It's almost like an inode or dirent in some ways.) An alternative approach is to try and come up with an encoding scheme that allows you to represent all of that platform-specific semantics in a string.
This leaves you with the unhappy choice of "inventing" a new path syntax for an old platform, however. >> # Or you can just use a format specifier for PEP 3101 string format: >> print "Path in local system format is {0}".format( entry ) >> print "Path in NT format is {0:NT}".format( entry ) >> print "Path in OS X format is {0:OSX}".format( entry ) > I don't think that expressing one platform's pathnames > in the format of another is something you can do in > general, e.g. going from Windows to Unix, what do you > do with the drive letter? Yeah, probably not. See, I told you not to take it too seriously! But I do feel that it's important to be able to manipulate posix-style path syntax on non-posix platforms, given how many cross-platform applications there are that have a cross-platform path syntax. In my own work, I find that drive letters are never explicitly specified in config files. Any application such as a parser, template generator, or resource manager (in other words, any application whose data files are routinely checked in to the source control system or shared across a network) tends to 'see' only relative paths in its input files, and embedding absolute paths is considered an error on the user's part. Of course, those same apps *do* internally convert all those relative paths to absolute, so that they can be compared and resolved with respect to some common base. Then again, in my opinion, the only *really* absolute paths are fully-qualified URLs. So there. :) > You can only really do it if you have some sort of > network file system connection, and then you need > more information than just the path in order to do > the translation.
> > -- > Greg From greg.ewing at canterbury.ac.nz Fri Oct 27 01:12:20 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 27 Oct 2006 12:12:20 +1300 Subject: [Python-Dev] PEP 355 status In-Reply-To: <45407BEF.7010204@acm.org> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <453EDD43.3050609@acm.org> <453FF5AC.4060500@canterbury.ac.nz> <45402200.1010308@acm.org> <454071F7.70104@canterbury.ac.nz> <45407BEF.7010204@acm.org> Message-ID: <454140D4.3040100@canterbury.ac.nz> Talin wrote: > It seems that any Python program that manipulated paths > would have to be radically different in the environment that you describe. I can sympathise with that. The problem is really inherent in the nature of the platforms -- it's just not possible to do everything in a native classic MacOS way and be cross-platform at the same time. There has to be a compromise somewhere. With classic MacOS the compromise was usually to use pathnames and to heck with the consequences. You could get away with it most of the time. > In other words, what you are describing isn't IMHO a > path at all, but it is like a path in that it describes how to get to a > file. Yes, that's true. Calling it a "path" would be something of a historical misnomer. > An alternative approach is to try and come up with an encoding scheme > that allows you to represent all of that platform-specific semantics in > a string. Yes, I thought of that, too. That's what you would have to do under the current scheme if you ever encountered a platform which truly had no textual representation of file locations. But realistically, it seems unlikely that such a platform will be invented in the foreseeable future (even classic MacOS *had* a notion of paths, even if it wasn't the preferred representation). So all this is probably YAGNI. 
-- Greg From kbk at shore.net Fri Oct 27 03:30:00 2006 From: kbk at shore.net (Kurt B. Kaiser) Date: Thu, 26 Oct 2006 21:30:00 -0400 (EDT) Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200610270130.k9R1U0NN007852@bayview.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 434 open ( +3) / 3430 closed ( +5) / 3864 total ( +8) Bugs : 929 open (+13) / 6285 closed (+12) / 7214 total (+25) RFE : 245 open ( +1) / 240 closed ( +0) / 485 total ( +1) New / Reopened Patches ______________________ various datetime methods fail in restricted mode (2006-10-17) http://python.org/sf/1578643 opened by lplatypus PyErr_Format corrections (2006-10-17) http://python.org/sf/1578999 opened by Martin v. Löwis posix.readlink doesn't use filesystemencoding (2006-10-19) http://python.org/sf/1580674 opened by Ronald Oussoren Duplicated declaration of PyCallable_Check (2006-10-20) CLOSED http://python.org/sf/1580872 opened by Matthias Klose Allow textwrap to preserve leading and trailing whitespace (2006-10-20) http://python.org/sf/1581073 opened by Dwayne Bailey tarfile.py: 100-char filenames are truncated (2006-10-24) CLOSED http://python.org/sf/1583506 opened by Lars Gustäbel tarfile.py: better use of TarInfo objects with longnames (2006-10-24) http://python.org/sf/1583880 opened by Lars Gustäbel Tix: subwidget names (bug #1472877) (2006-10-25) http://python.org/sf/1584712 opened by Matthias Kievernagel Patches Closed ______________ patch for building trunk with VC6 (2006-03-24) http://python.org/sf/1457736 closed by loewis a faster Modulefinder (2005-11-11) http://python.org/sf/1353872 closed by theller Duplicated declaration of PyCallable_Check (2006-10-20) http://python.org/sf/1580872 closed by loewis Exec stacks in python 2.5 (2006-09-18) http://python.org/sf/1560695 closed by loewis tarfile.py: 100-char filenames are truncated (2006-10-24) http://python.org/sf/1583506 closed by gbrandl New / Reopened Bugs ___________________ 2.4.4c1 will not build
when cross compiling (2006-10-16) CLOSED http://python.org/sf/1578513 opened by smithj --disable-sunaudiodev --disable-tk does not work (2006-10-17) http://python.org/sf/1579029 opened by ThurnerRupert Segfault provoked by generators and exceptions (2006-10-17) http://python.org/sf/1579370 opened by Mike Klaas Use flush() before os.exevp() (2006-10-18) http://python.org/sf/1579477 opened by Thomas Guettler Wrong syntax for PyDateTime_IMPORT in documentation (2006-10-18) CLOSED http://python.org/sf/1579796 opened by David Faure not configured for tk (2006-10-18) http://python.org/sf/1579931 opened by Carl Wenrich glob.glob("c:\\[ ]\*) doesn't work (2006-10-19) http://python.org/sf/1580472 opened by Koblaid "make install" for Python 2.4.4 not working properly (2006-10-19) http://python.org/sf/1580563 opened by Andreas Jung Configure script does not work for RHEL 4 x86_64 (2006-10-19) http://python.org/sf/1580726 reopened by gbrandl Configure script does not work for RHEL 4 x86_64 (2006-10-19) http://python.org/sf/1580726 reopened by spotvt01 Configure script does not work for RHEL 4 x86_64 (2006-10-19) http://python.org/sf/1580726 opened by Chris httplib hangs reading too much data (2006-10-19) http://python.org/sf/1580738 opened by Dustin J. Mitchell Definition of a "character" is wrong (2006-10-20) http://python.org/sf/1581182 opened by Adam Olsen pickle protocol 2 failure on int subclass (2006-10-20) http://python.org/sf/1581183 opened by Anders J. 
Munch missing __enter__ + __getattr__ forwarding (2006-10-21) http://python.org/sf/1581357 opened by Hirokazu Yamamoto Text search gives bad count if called from variable trace (2006-10-20) http://python.org/sf/1581476 opened by Russell Owen test_sqlite fails on OSX G5 arch if test_ctypes is run (2006-10-21) http://python.org/sf/1581906 opened by Skip Montanaro email.header decode within word (2006-10-22) http://python.org/sf/1582282 opened by Tokio Kikuchi Python is dumping core after the test test_ctypes (2006-10-23) http://python.org/sf/1582742 opened by shashi Bulding source with VC6 fails due to missing files (2006-10-23) CLOSED http://python.org/sf/1582856 opened by Ulrich Hockenbrink class member inconsistancies (2006-10-23) CLOSED http://python.org/sf/1583060 opened by EricDaigno Different behavior when stepping through code w/ pdb (2006-10-24) http://python.org/sf/1583276 opened by John Ehresman tarfile incorrectly handles long filenames (2006-10-24) CLOSED http://python.org/sf/1583537 opened by Mike Looijmans yield+break stops tracing (2006-10-24) http://python.org/sf/1583862 opened by Lukas Lalinsky __str__ cannot be overridden on unicode-derived classes (2006-10-24) http://python.org/sf/1583863 opened by Mike K SSL "issuer" and "server" functions problems - security (2006-10-24) http://python.org/sf/1583946 opened by John Nagle remove() during iteration causes items to be skipped (2006-10-24) CLOSED http://python.org/sf/1584028 opened by Kevin Rabsatt os.tempnam fails on SUSE Linux to accept directory argument (2006-10-25) CLOSED http://python.org/sf/1584723 opened by Andreas Events in list return None not True on wait() (2006-10-26) CLOSED http://python.org/sf/1585135 opened by SpinMess Bugs Closed ___________ from_param and _as_parameter_ truncating 64-bit value (2006-10-12) http://python.org/sf/1575945 closed by theller 2.4.4c1 will not build when cross compiling (2006-10-16) http://python.org/sf/1578513 closed by loewis Error with callback function 
and as_parameter with NumPy ndp (2006-10-10) http://python.org/sf/1574584 closed by theller PyThreadState_Clear() docs incorrect (2003-04-17) http://python.org/sf/723205 deleted by theller Wrong syntax for PyDateTime_IMPORT in documentation (2006-10-18) http://python.org/sf/1579796 closed by akuchling Configure script does not work for RHEL 4 x86_64 (2006-10-19) http://python.org/sf/1580726 closed by loewis Example typo in section 4 of 'Installing Python Modules' (2006-10-12) http://python.org/sf/1576348 closed by akuchling Bulding source with VC6 fails due to missing files (2006-10-23) http://python.org/sf/1582856 closed by loewis class member inconsistancies (2006-10-23) http://python.org/sf/1583060 closed by gbrandl mac installer profile patch vs. .bash_login (2006-09-19) http://python.org/sf/1561243 closed by sf-robot idle in python 2.5c1 freezes on macos 10.3.9 (2006-08-18) http://python.org/sf/1542949 closed by sf-robot Launcher reset to factory button provides bad command-line (2006-10-03) http://python.org/sf/1570284 closed by sf-robot tarfile incorrectly handles long filenames (2006-10-24) http://python.org/sf/1583537 deleted by cdwave remove() during iteration causes items to be skipped (2006-10-24) http://python.org/sf/1584028 closed by rhettinger os.tempnam fails on SUSE Linux to accept directory argument (2006-10-25) http://python.org/sf/1584723 closed by gbrandl Events in list return None not True on wait() (2006-10-26) http://python.org/sf/1585135 closed by gbrandl New / Reopened RFE __________________ Add os.link() and os.symlink() support for Windows (2006-10-16) http://python.org/sf/1578269 opened by M.-A. Lemburg From steven.bethard at gmail.com Fri Oct 27 21:11:27 2006 From: steven.bethard at gmail.com (Steven Bethard) Date: Fri, 27 Oct 2006 13:11:27 -0600 Subject: [Python-Dev] DRAFT: python-dev summary for 2006-09-01 to 2006-09-15 Message-ID: Here's the summary for the first half of September. 
As always, comments and corrections are greatly appreciated! ============= Announcements ============= ---------------------------- QOTF: Quote of the Fortnight ---------------------------- Through a cross-posting slip-up, Jean-Paul Calderone managed to provide us with some inspiring thoughts on mailing-list archives: One could just as easily ask why no one bothers to read mailing list archives to see if their question has been answered before. No one will ever know, it is just one of the mysteries of the universe. Contributing thread: - `[Twisted-Python] Newbie question `__ ------------------------- Monthly Arlington sprints ------------------------- Jeffrey Elkner has arranged for monthly Arlington Python sprints. See the `Arlington sprint wiki`_ for more details. .. _Arlington sprint wiki: http://wiki.python.org/moin/ArlingtonSprint Contributing thread: - `Arlington sprints to occur monthly `__ ========= Summaries ========= ----------------------------------------- Signals, threads and blocking C functions ----------------------------------------- Gustavo Carneiro explained a problem that pygtk was running into. Their main loop function, ``gtk_main()``, blocks forever. If there are threads in the program, they cannot receive signals because Python catches the signal and calls ``Py_AddPendingCall()``, relying on the main thread to call ``Py_MakePendingCalls()``. Since with pygtk, the main thread is blocked calling a C function, it has no way other than polling to decide when ``Py_MakePendingCalls()`` needs to be called. Gustavo was hoping for some sort of API so that his blocking thread could get notified when ``Py_AddPendingCall()`` had been called. There was a long discussion about the feasibility of this and other solutions to his problem. One of the main problems is that almost nothing can safely be done from a signal handler context, so some people felt like having Python invoke arbitrary third-party code was a bad idea. 
Gustavo was reasonably confident that he could write to a pipe within that context, which was all he needed to do to solve his problem, but Nick Maclaren explained in detail some of the problems, e.g. writing proper synchronization primitives that are signal-handler safe. Jan Kanis suggested that threads in a pygtk program should occasionally check the signal handler flags and call PyGTK's callback to wake up the main thread. But Gustavo explained that things like the GnomeVFS library have their own thread pools and know nothing about Python so can't make such a callback. Adam Olsen suggested that Python could create a single non-blocking pipe for all signals. When a signal was handled, the signal number would be written to that pipe as a single byte. Third-party libraries, like pygtk, could poll the appropriate file descriptor, waking up and handing control back to Python when a signal was received. There were some disadvantages to this approach, e.g. if there is a large burst of signals, some of them would be lost, but folks seemed to think that these kinds of things would not cause many real-world problems. Gustavo and Adam then worked out the code in a little more detail. The `Py_signal_pipe patch`_ was posted to SourceForge. .. _Py_signal_pipe patch: http://bugs.python.org/1564547 Contributing thread: - `Signals, threads, blocking C functions `__ ------------------------ API for str.rpartition() ------------------------ Raymond Hettinger pointed out that in cases where the separator was not found, ``str.rpartition()`` was putting the remainder of the string in the wrong spot, e.g. ``str.rpartition()`` worked like::

    'axbxc'.rpartition('x') == ('axb', 'x', 'c')
    'axb'.rpartition('x') == ('a', 'x', 'b')
    'a'.rpartition('x') == ('a', '', '')  # should be ('', '', 'a')

Thus code that used ``str.rpartition()`` in a loop or recursively would likely never terminate.
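With the corrected semantics, the remainder lands in the final slot when the separator is absent, so the head shrinks to '' and a loop peeling pieces off the right terminates. A quick sketch (split_from_right is an illustrative helper, not a stdlib function):

```python
def split_from_right(s, sep):
    """Collect the pieces of s from the right using rpartition."""
    parts = []
    while s:
        head, found, tail = s.rpartition(sep)
        parts.append(tail)
        if not found:   # no separator left: tail held the remainder
            break
        s = head        # head strictly shrinks each iteration
    parts.reverse()
    return parts

print('a'.rpartition('x'))             # ('', '', 'a') -- the corrected result
print(split_from_right('axbxc', 'x'))  # ['a', 'b', 'c']
```

Under the buggy pre-fix behaviour, ``'a'.rpartition('x')`` returned ``('a', '', '')``, so a loop that assigned the head back to ``s`` would spin on ``'a'`` forever.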
Raymond checked in a fix for this, spawning an enormous discussion about how the three bits ``str.rpartition()`` returns should be named. There was widespread disagreement on which side was the "head" and which side was the "tail", and the only unambiguous one seemed to be "left, sep, right". Raymond and others were not as happy with this version because it was no longer suggestive of the use cases, but it looked like this might be the best compromise. Contributing threads: - `Problem withthe API for str.rpartition() `__ - `Fwd: Problem withthe API for str.rpartition() `__ --------------- Unicode Imports --------------- Kristján V. Jónsson submitted a `unicode import patch`_ that would allow unicode paths in sys.path and use the unicode file API on Windows. It got a definite "no" from the Python 2.5 release managers since it was already too late in the release process. Nonetheless there was a long discussion about whether or not it should be considered a bug or a feature. Martin v. Löwis explained that it was definitely a feature because it would break existing introspection tools expecting things like __file__ to be 8-bit strings (not unicode strings as they would be with the patch). .. _unicode import patch: http://bugs.python.org/1552880 Contributing thread: - `Unicode Imports `__ ------------------------- Exception and __unicode__ ------------------------- Marcin 'Qrczak' Kowalczyk reported a `TypeError from unicode()`_ when applied to an Exception class. Brett Cannon explained the source of this: BaseException defined a ``__unicode__`` descriptor which was complaining when it was handed a class, not an instance. The easiest solution seemed to be the best for Python 2.5: simply rip out the ``__unicode__`` method entirely. M.-A. Lemburg suggested that for Python 2.6 this should be fixed by introducing a tp_unicode slot. ..
_TypeError from unicode(): http://bugs.python.org/1551432 Contributing thread: - `2.5 status `__ -------------------------- Slowdown in inspect module -------------------------- Fernando Perez reported an enormous slowdown in Python 2.5's inspect module. Nick Coghlan figured out that this was a result of ``inspect.findsource()`` calling ``os.path.abspath()`` and ``os.path.normpath()`` repeatedly on the module's file name. Nick provided a `patch to speed things up`_ by caching the absolute, normalized file names. .. _patch to speed things up: http://bugs.python.org/1553314 Contributing thread: - `inspect.py very slow under 2.5 `__ -------------------------------- Cross-platform float consistency -------------------------------- Andreas Raab asked about trying to minimize some of the cross-platform differences in floating-point calculations, by using something like fdlibm_. Tim Peters pointed him to a `previous thread on this issue`_ and suggested that the best route was probably to package a Python wrapper for fdlibm_ and see how much interest there was. .. _fdlibm: http://www.netlib.org/fdlibm/ .. _previous thread on this issue: http://mail.python.org/pipermail/python-list/2005-July/290164.html Contributing thread: - `Cross-platform math functions? `__ ----------------------------------- Refcounting and errors in functions ----------------------------------- Mihai Ibanescu pointed out that the refcounting behavior of functions that can fail is generally poorly documented. Greg Ewing explained that refcounting behavior should be independent of whether the call succeeds or fails, but it was clear that this was not always the case. Mihai promised to file a low-severity bug so that this problem wouldn't be lost. Contributing thread: - `Py_BuildValue and decref `__ ------------ Python 2.3.6 ------------ Barry Warsaw offered to push out a Python 2.3.6 if folks were interested in getting some bugfixes out to the platforms which were still running Python 2.3.
After an underwhelming response, he retracted the offer. Contributing threads: - `Interest in a Python 2.3.6? `__ - `Interest in a Python 2.3.6? `__ - `Python 2.4.4 was: Interest in a Python 2.3.6? `__ ----------------------------------- Effbot Python library documentation ----------------------------------- Johann C. Rocholl asked about the status of http://effbot.org/lib/, Fredrik Lundh's alternative format and rendering for the Python library documentation. Fredrik indicated that due to the pushback from some folks on python-dev, they've been working mainly "under the radar" on this. (At least until some inconsiderate soul put them in the summary...) ;-) Contributing threads: - `That library reference, yet again `__ - `That library reference, yet again `__ ================ Deferred Threads ================ - `IronPython and AST branch `__ ================== Previous Summaries ================== - `Py2.5 issue: decimal context manager misimplemented, misdesigned, and misdocumented `__ - `Error while building 2.5rc1 pythoncore_pgo on VC8 `__ - `gcc 4.2 exposes signed integer overflows `__ - `no remaining issues blocking 2.5 release `__ - `new security doc using object-capabilities `__ =============== Skipped Threads =============== - `A test suite for unittest `__ - `Fwd: [Python-checkins] r51674 - python/trunk/Misc/Vim/vimrc `__ - `Weekly Python Patch/Bug Summary `__ - `Windows build slave down until Tuesday-ish `__ - `[Python-checkins] TRUNK IS UNFROZEN, available for 2.6 work if you are so inclined `__ - `Exception message for invalid with statement usage `__ - `buildbot breakage `__ - `Change in file() behavior in 2.5 `__ - `'with' bites Twisted `__ - `What windows tool chain do I need for python 2.5 extensions? `__ - `2.5c2 `__ - `_PyGILState_NoteThreadState should be static or not? 
`__ - `BRANCH FREEZE: release25-maint, 00:00UTC 12 September 2006 `__ - `datetime's strftime implementation: by design or bug `__ - `Subversion 1.4 `__ - `RELEASED Python 2.5 (release candidate 2) `__ - `Maybe we should have a C++ extension for testing... `__ - `.pyc file has different result for value "1.79769313486232e+308" than .py file `__ - `release is done, but release25-maint branch remains near-frozen `__ - `fun threading problem `__ - `Thank you all `__ From theller at ctypes.org Fri Oct 27 21:24:40 2006 From: theller at ctypes.org (Thomas Heller) Date: Fri, 27 Oct 2006 21:24:40 +0200 Subject: [Python-Dev] Modulefinder In-Reply-To: References: Message-ID: <45425CF8.7030606@ctypes.org> > On 10/13/06, Thomas Heller wrote: >> I have patched Lib/modulefinder.py to work with absolute and relative imports. >> It also is faster now, and has basic unittests in Lib/test/test_modulefinder.py. >> >> The work was done in a theller_modulefinder SVN branch. >> If nobody objects, I will merge this into trunk, and possibly also into release25-maint, when I have time. > Guido van Rossum schrieb: > Could you also prepare a patch for the p3yk branch? It's broken there too... > I'm currently looking into this now. IIUC, 'import foo' is an absolute import now - is this the only change to the import machinery? 
Thomas From tjreedy at udel.edu Fri Oct 27 21:45:43 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 27 Oct 2006 15:45:43 -0400 Subject: [Python-Dev] DRAFT: python-dev summary for 2006-09-01 to 2006-09-15 References: Message-ID: > Adam Olsen that Python could create a single non-blocking pipe for a /that/suggested that/ From theller at ctypes.org Fri Oct 27 21:54:53 2006 From: theller at ctypes.org (Thomas Heller) Date: Fri, 27 Oct 2006 21:54:53 +0200 Subject: [Python-Dev] Modulefinder In-Reply-To: References: Message-ID: <4542640D.9030104@ctypes.org> > On 10/13/06, Thomas Heller wrote: >> I have patched Lib/modulefinder.py to work with absolute and relative imports. >> It also is faster now, and has basic unittests in Lib/test/test_modulefinder.py. >> >> The work was done in a theller_modulefinder SVN branch. >> If nobody objects, I will merge this into trunk, and possibly also into release25-maint, when I have time. > Guido van Rossum schrieb: > Could you also prepare a patch for the p3yk branch? It's broken there too... > Patch uploaded, and assigned to you. http://www.python.org/sf/1585966 Oh, and BTW: py3k SVN doesn't compile under windows. Thomas From oliphant.travis at ieee.org Fri Oct 27 22:05:31 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Fri, 27 Oct 2006 14:05:31 -0600 Subject: [Python-Dev] PEP: Adding data-type objects to Python Message-ID: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: pep_dtypes.txt Url: http://mail.python.org/pipermail/python-dev/attachments/20061027/77410d52/attachment-0001.txt From steven.bethard at gmail.com Sat Oct 28 00:23:50 2006 From: steven.bethard at gmail.com (Steven Bethard) Date: Fri, 27 Oct 2006 16:23:50 -0600 Subject: [Python-Dev] DRAFT: python-dev summary for 2006-09-16 to 2006-09-30 Message-ID: Thanks to all of those who have already given me feedback on the last summary. Here's the next one (for the second half of September). 
I found the "OS X universal binaries" and "Finer-grained locking than the GIL" discussions particularly hard to follow, so I'd especially appreciate corrections on those. Thanks! ========= Summaries ========= --------------- Import features --------------- Fabio Zadrozny ran into the `previously reported relative import issues`_ where a ``from . import xxx`` always fails from a top-level module. This is because relative imports rely on the ``__name__`` of a module, so when it is just ``"__main__"``, they can't handle it properly. On the subject of imports, Guido said that one of the missing import features was to be able to say "*this* package lives *here*". Paul Moore whipped up a Python API to an import hook that could do this, but indicated that a full mechanism would need to pay more attention to the environment (e.g. PYTHONPATH and .pth files). There was also some discussion about trying to have a sort of per-module ``sys.path`` so that you could have multiple versions of the same module present, with different modules importing different versions. Phillip J. Eby suggested that this was probably not a very common need, and that implementing it would be quite difficult with things like C extensions only being able to be loaded once. In general, people seemed interested in a pure-Python implementation of the import mechanism so that they could play with some of these approaches. It looked like Brett Cannon would probably be working on that. .. _previously reported relative import issues: http://www.python.org/dev/summary/2006-06-16_2006-06-30/#relative-imports-and-pep-338-executing-modules-as-scripts Contributing thread: - `New relative import issue `__ ---------------------------- Python library documentation ---------------------------- A less-trolly-than-usual post from Xah Lee started a discussion about the Python documentation. 
Greg Ewing and others suggested following the documentation style of the Inside Macintosh series: first an "About this module" narrative explaining the concepts and how they fit together, followed by the extensive API reference. Most people agreed that simply extracting the documentation from the docstrings was a bad idea -- it lacks the high-level overview and gives equal importance to all functions, regardless of their use. Contributing thread: - `Python Doc problems `__ ----------------------- OS X universal binaries ----------------------- Jack Howarth asked about creating universal binaries for OS X that would support 32-bit or 64-bit on both PPC and x86. Ronald Oussoren pointed out that the 32-bit part of this was already supported, but indicated that adding 64-bit support simultaneously might be more difficult. Ronald seemed to think that modifications to pyconfig.h.in might solve the problem, though he was worried that this might cause distutils to detect some architecture features incorrectly. Contributing thread: - `python, lipo and the future? `__ ---------------------------------- Finer-grained locking than the GIL ---------------------------------- Martin Devera was looking into replacing the global interpreter lock (GIL) with finer-grained locking, tuned to minimize locking by assuming that most objects were used only by a single thread. For objects that were shared across multiple threads, this approach would allow non-blocking reads, but require all threads to "come home" before modifications could be made. Phillip J. Eby pointed out that most object accesses in Python are actually modifications too, due to reference counting, so it looked like Martin's proposal wouldn't work well with the current refcounting implementation of Python. After Martin v. Löwis found a bug in the locking algorithm, Martin Devera decided to take his idea back to the drawing board. Contributing thread: - `deja-vu ..
python locking `__ --------------------------- OS X and ssize_t formatting --------------------------- The buildbots spotted an OS X error in the itertools module. After Jack Diederich fixed a bug where ``size_t`` had been used instead of ``ssize_t``, Neal Norwitz noticed some problems with ``%zd`` on OS X. Despite documentation to the contrary in both the man page and the C99 Standard, using that specifier on OS X treats a negative number as an unsigned number. Ronald Oussoren and others reported the bug to Apple. Contributing thread: - `test_itertools fails for trunk on x86 OS X machine `__ ------------------- itertools.flatten() ------------------- Michael Foord asked about including a flatten function that would take a sequence with sub-sequences nested to an arbitrary depth and create a simple non-nested sequence from that. People were strongly opposed to adding this as a builtin, but even as an itertools function, there was disagreement. How should strings, dicts and other arbitrary iterables be flattened? Since there wasn't one clear answer, it looked like the proposal didn't have much of a chance. Contributing thread: - `Suggestion for a new built-in - flatten `__ ------------------------------- Class definition syntax changes ------------------------------- Fabio Zadrozny noted that in Python 2.5, classes can now be declared as:: class C(): ... Some folks wanted the result to be a new-style class, but the presence or absence of ``()`` was deemed too subtle of a cue to make the new-style/old-style distinction. For the Python 2.X series, explicit subclassing of ``object`` will still be necessary. Contributing thread: - `Grammar change in classdef `__ ---------------------- Python 2.5 and GCC 4.2 ---------------------- Armin Rigo found some more signed integer overflows when using GCC 4.2 like the ones `reported earlier`_. 
Because Python 2.5 final was scheduled to be released in 24 hours, and it looked like there wouldn't be too many people affected by these problems, they were deferred until 2.5.1. For the moment at least, the README indicates that GCC 4.1 and 4.2 shouldn't be used to compile Python. .. _reported earlier: http://www.python.org/dev/summary/2006-08-16_2006-08-31/#gcc-4-2-and-integer-overflows Contributing threads: - `Before 2.5 - More signed integer overflows `__ - `GCC 4.x incompatibility `__ ---------------------------------- Discard method for dicts and lists ---------------------------------- Gustavo Niemeyer and Greg Ewing suggested adding ``dict.discard()`` and ``list.discard()`` along the lines of ``set.discard()``. Fred L. Drake, Jr. explained that ``dict.discard(foo)`` is currently supported with ``dict.pop(foo, None)``. There was more debate about the ``list`` version, but most people seemed to think that wrapping ``list.remove()`` with the appropriate if-statement or try-except was fine. Contributing threads: - `dict.discard `__ - `list.discard? (Re: dict.discard) `__ -------------------- weakref enhancements -------------------- tomer filiba offered some additions to the weakref module, weakattr_ and weakmethod_. Raymond Hettinger questioned how frequently these would be useful in the real world, but both tomer and Alex Martelli assured him that they had real-world use-cases for these. However, there didn't generally seem to be enough support for them to include them in the standard library. .. _weakattr: http://sebulba.wikispaces.com/recipe+weakattr ..
_weakmethod: http://sebulba.wikispaces.com/recipe+weakmethod Contributing thread: - `weakref enhancements `__ ------------------------ AST structure guarantees ------------------------ Anthony Baxter asked that the AST structure get the same guarantees as the byte-code format, that is, that it would change as little as possible so that people who wanted to hack it wouldn't have to change their code for each release. Pretty much everyone agreed that this was a good idea. In a related thread, Sanghyeon Seo asked if the AST structure should become part of the Python specification so that other implementations like IronPython_ would use it as well. While most people felt like it would be good if the various specifications had similar AST representations, it seemed like mandating it as part of the implementation would lock things down too much. .. _IronPython: http://www.codeplex.com/Wiki/View.aspx?ProjectName=IronPython Contributing threads: - `IronPython and AST branch `__ - `IronPython and AST branch `__ - `AST structure and maintenance branches `__ ----------------------------- PEP 302: phase 2 import hooks ----------------------------- For his dissertation work, Brett Cannon needed to implement phase 2 of the `PEP 302`_ import hooks. He asked for feedback on whether it would be easier to do this within the current C code, or whether it would be better to rewrite the import mechanisms in Python first. Phillip J. Eby gave some advice on how to restructure things, and suggested that the C code was somewhat delicate and having a Python implementation around would be a Good Thing. Armin Rigo strongly recommended rewriting things in Python. .. _PEP 302: http://www.python.org/dev/peps/pep-0302/ Contributing thread: - `difficulty of implementing phase 2 of PEP 302 in Python source `__ ---------------------------------------------------- Testsuite fails on Windows if a space is in the path ---------------------------------------------------- Martin v. 
Löwis was trying to fix some bugs where spaces in Windows paths caused some of the testsuite to fail. For example, test_popen was getting an error because ``os.popen`` invoked:: cmd.exe /c "c:\Program Files\python25\python.exe" -c "import sys;print sys.version" which failed complaining that c:\Program is not a valid executable. Jean-Paul Calderone and Tim Peters explained that the ``cmd.exe`` part is necessary to force proper cmd.exe-style argument parsing and to allow environment variable substitution. After scrutinizing the MS quoting rules, it seemed like fixing this for Python 2.5 was too likely to introduce incompatibilities, so it was postponed to 2.6. Contributing thread: - `Testsuite fails on Windows if a space is in the path `__ ----------------------------------------- PEP 353: Backwards-compatibility #defines ----------------------------------------- David Abrahams suggested a modification to the suggested backwards-compatibility #define incantation of `PEP 353`_ so that the PY_SSIZE_T_MAX and PY_SSIZE_T_MIN would only ever get defined once. There was some discussion about whether or not this was absolutely necessary, but everyone agreed that the change was probably sensible regardless. .. _PEP 353: http://www.python.org/dev/peps/pep-0353/ Contributing thread: - `Pep 353: Py_ssize_t advice `__ ----------------- Bare-bones Python ----------------- Milan Krcmar asked about what he could drop from Python to make it small enough to fit on a platform with only 2 MiB of flash ROM and 16 MiB of RAM. Giovanni Bajo suggested dropping the CJK codecs (which account for about 800K), though he also noted that after that there weren't any really low-hanging fruit. Martin v. Löwis suggested that he might also get a gain out of dropping support for dynamic loading of extension modules, and linking all necessary modules statically. Gustavo Niemeyer pointed him to `Python for S60`_ and `Python for Maemo`_ which had to undergo similar stripping down. ..
_Python for S60: http://opensource.nokia.com/projects/pythonfors60/ .. _Python for Maemo: http://pymaemo.sf.net Contributing thread: - `Minipython `__ ================ Deferred Threads ================ - `Removing __del__ `__ - `Caching float(0.0) `__ - `PEP 355 status `__ - `PEP 351 - do while `__ ================== Previous Summaries ================== - `Signals, threads, blocking C functions `__ =============== Skipped Threads =============== - `Thank you all `__ - `BRANCH FREEZE/IMMINENT RELEASE: Python 2.5 (final). 2006-09-19, 00:00UTC `__ - `RELEASED Python 2.5 (FINAL) `__ - `release25-maint branch - please keep frozen for a day or two more. `__ - `Download URL typo `__ - `Exceptions and slicing `__ - `Weekly Python Patch/Bug Summary `__ - `release25-maint is UNFROZEN `__ - `Small Py3k task: fix modulefinder.py `__ - `win32 - results from Lib/test - 2.5 release-maint `__ - `Weekly Python Patch/Bug Summary ** REVISED ** `__ - `[Python-checkins] release25-maint is UNFROZEN `__ - `Python network Programmign `__ - `Relative import bug? `__ - `GCC patch for catching errors in PyArg_ParseTuple `__ - `Typo.pl scan of Python 2.5 source code `__ - `Maybe we should have a C++ extension for testing... `__ - `Python 2.5 bug? Changes in behavior of traceback module `__ - `Need help with C - problem in sqlite3 module `__ - `PyErr_CheckSignals error return value `__ - `python-dev summary for 2006-08-01 to 2006-08-15 `__ - `2.4.4c1 October 11, 2.4.4 final October 18 `__ - `[SECUNIA] "buffer overrun in repr() for unicode strings" Potential Vulnerability (fwd) `__ - `List of candidate 2.4.4 bugs? `__ - `openssl - was: 2.4.4c1 October 11, 2.4.4 final October 18 `__ - `Collecting 2.4.4 fixes `__ - `os.unlink() closes file? 
`__ - `Tix not included in 2.5 for Windows `__ - `Possible semantic changes for PEP 352 in 2.6 `__ From martin at v.loewis.de Sat Oct 28 01:44:21 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 28 Oct 2006 01:44:21 +0200 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: Message-ID: <454299D5.4090804@v.loewis.de> Travis E. Oliphant schrieb: > The datatype is an object that specifies how a certain block of > memory should be interpreted as a basic data-type. > > >>> datatype(float) > datatype('float64') I can't speak on the specific merits of this proposal, or whether this kind of functionality is desirable. However, I'm -1 on the addition of a builtin for this functionality (the PEP doesn't actually say that there is another builtin, but the examples suggest so). Instead, putting it into the sys, array, struct, or ctypes modules might be more appropriate, as might be the introduction of another module. Regards, Martin From anthony at interlink.com.au Sat Oct 28 01:48:44 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Sat, 28 Oct 2006 09:48:44 +1000 Subject: [Python-Dev] [Python-checkins] r52482 - in python/branches/release25-maint: Lib/urllib.py Lib/urllib2.py Misc/NEWS In-Reply-To: <20061027171334.28AD01E4003@bag.python.org> References: <20061027171334.28AD01E4003@bag.python.org> Message-ID: <200610280948.45016.anthony@interlink.com.au> On Saturday 28 October 2006 03:13, andrew.kuchling wrote: > 2.4 backport candidate, probably. FWIW, I'm not planning on doing any more "collect all the bugfixes" releases of 2.4. It's now in the same category as 2.3 - that is, only really serious bugs (in particular, security related bugs) will get a new release, and then only with the serious bugfixes applied. One active maintenance branch is quite enough to deal with, IMHO. -- Anthony Baxter It's never too late to have a happy childhood. 
From martin at v.loewis.de Sat Oct 28 01:50:03 2006 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Sat, 28 Oct 2006 01:50:03 +0200 Subject: [Python-Dev] DRAFT: python-dev summary for 2006-09-16 to 2006-09-30 In-Reply-To: References: Message-ID: <45429B2B.6070404@v.loewis.de> Steven Bethard schrieb: > Jack Howarth asked about creating universal binaries for OS X that > would support 32-bit or 64-bit on both PPC and x86. Ronald Oussoren > pointed out that the 32-bit part of this was already supported, but > indicated that adding 64-bit support simultaneously might be more > difficult. Ronald seemed to think that modifications to pyconfig.h.in > might solve the problem, though he was worried that this might cause > distutils to detect some architecture features incorrectly. Ronald can surely speak for himself, but I think the problem is slightly different. There were different strategies discussed for changing pyconfig.h (with an include, or with #ifdefs), and in all cases, distutils would fail to detect the architecture properly. That's not really a problem of pyconfig.h, but of the method that distutils uses to detect bitsizes - which inherently cannot work for universal binaries (i.e. you should look at the running interpreter, not at pyconfig.h). Regards, Martin From greg.ewing at canterbury.ac.nz Sat Oct 28 02:23:26 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 28 Oct 2006 13:23:26 +1300 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: Message-ID: <4542A2FE.9070409@canterbury.ac.nz> Travis E. Oliphant wrote: > PEP: > Title: Adding data-type objects to the standard library Not sure about having 3 different ways to specify the structure -- it smacks of Too Many Ways To Do It to me. Also, what if I want to refer to fields by name but don't want to have to work out all the offsets (which is tedious, error-prone and hostile to modification)?
-- Greg From typo_pl at hotmail.com Sat Oct 28 05:35:48 2006 From: typo_pl at hotmail.com (Johnny Lee) Date: Sat, 28 Oct 2006 03:35:48 +0000 Subject: [Python-Dev] Typo.pl scan of Python 2.5 source code Message-ID: I grabbed the latest Python2.5 code via subversion and ran my typo script on it. Weeding out the obvious false positives and Neal's comments leaves about 129 typos. See http://www.geocities.com/typopl/typoscan.htm Should I enter the typos as bugs in the Python bug db? J > Date: Fri, 22 Sep 2006 21:51:38 -0700> From: nnorwitz at gmail.com> To: typo_pl at hotmail.com> Subject: Re: [Python-Dev] Typo.pl scan of Python 2.5 source code> CC: python-dev at python.org> > On 9/22/06, Johnny Lee wrote:> >> > Hello,> > My name is Johnny Lee. I have developed a *ahem* perl script which scans> > C/C++ source files for typos.> > Hi Johnny.> > Thanks for running your script, even if it is written in Perl and ran> on Windows. :-)> > > The Python 2.5 typos can be classified into 7 types.> >> > 2) realloc overwrite src if NULL, i.e. p = realloc(p, new_size);> > If realloc() fails, it will return NULL. If you assign the return value to> > the same variable you passed into realloc,> > then you've overwritten the variable and possibly leaked the memory that the> > variable pointed to.> > A bunch of these warnings were accurate and a bunch were not. There> were 2 reasons for the false positives. 1) The pointer was aliased,> thus not lost, 2) On failure, we exited (Parser/*.c)> > > 4) if ((X!=0) || (X!=1))> > These 2 cases occurred in binascii. I have no idea if the warning is> wright or the code is.> > > 6) XX;;> > Just being anal here. Two semicolons in a row. Second one is extraneous.> > I already checked in a fix for these on HEAD. Hard for even me to> screw up those fixes. 
:-)> > > 7) extraneous test for non-NULL ptr> > Several memory calls that free memory accept NULL ptrs.> > So testing for NULL before calling them is redundant and wastes code space.> > Now some codepaths may be time-critical, but probably not all, and smaller> > code usually helps.> > I ignored these as I'm not certain all the platforms we run on accept> free(NULL).> > Below is my categorization of the warnings except #7. Hopefully> someone will fix all the real problems in the first batch.> > Thanks again!> > n> --> > # Problems> Objects\fileobject.c (338): realloc overwrite src if NULL; 17:> file->f_setbuf=(char*)PyMem_Realloc(file->f_setbuf,bufsize)> Objects\fileobject.c (342): using PyMem_Realloc result w/no check> 30: setvbuf(file->f_fp, file->f_setbuf, type, bufsize);> [file->f_setbuf]> Objects\listobject.c (2619): using PyMem_MALLOC result w/no check> 30: garbage[i] = selfitems[cur]; [garbage]> Parser\myreadline.c (144): realloc overwrite src if NULL; 17:> p=(char*)PyMem_REALLOC(p,n+incr)> Modules\_csv.c (564): realloc overwrite src if NULL; 17:> self->field=PyMem_Realloc(self->field,self->field_size)> Modules\_localemodule.c (366): realloc overwrite src if NULL; 17:> buf=PyMem_Realloc(buf,n2)> Modules\_randommodule.c (290): realloc overwrite src if NULL; 17:> key=(unsigned#long*)PyMem_Realloc(key,bigger*sizeof(*key))> Modules\arraymodule.c (1675): realloc overwrite src if NULL; 17:> self->ob_item=(char*)PyMem_REALLOC(self->ob_item,itemsize*self->ob_size)> Modules\cPickle.c (536): realloc overwrite src if NULL; 17:> self->buf=(char*)realloc(self->buf,n)> Modules\cPickle.c (592): realloc overwrite src if NULL; 17:> self->buf=(char*)realloc(self->buf,bigger)> Modules\cPickle.c (4369): realloc overwrite src if NULL; 17:> self->marks=(int*)realloc(self->marks,s*sizeof(int))> Modules\cStringIO.c (344): realloc overwrite src if NULL; 17:> self->buf=(char*)realloc(self->buf,self->buf_size)> Modules\cStringIO.c (380): realloc overwrite src if NULL; 17:> 
oself->buf=(char*)realloc(oself->buf,oself->buf_size)> Modules\_ctypes\_ctypes.c (2209): using PyMem_Malloc result w/no> check 30: memset(obj->b_ptr, 0, dict->size); [obj->b_ptr]> Modules\_ctypes\callproc.c (1472): using PyMem_Malloc result w/no> check 30: strcpy(conversion_mode_encoding, coding);> [conversion_mode_encoding]> Modules\_ctypes\callproc.c (1478): using PyMem_Malloc result w/no> check 30: strcpy(conversion_mode_errors, mode);> [conversion_mode_errors]> Modules\_ctypes\stgdict.c (362): using PyMem_Malloc result w/no> check 30: memset(stgdict->ffi_type_pointer.elements, 0,> [stgdict->ffi_type_pointer.elements]> Modules\_ctypes\stgdict.c (376): using PyMem_Malloc result w/no> check 30: memset(stgdict->ffi_type_pointer.elements, 0,> [stgdict->ffi_type_pointer.elements]> > # No idea if the code or tool is right.> Modules\binascii.c (1161)> Modules\binascii.c (1231)> > # Platform specific files. I didn't review and won't fix without testing.> Python\thread_lwp.h (107): using malloc result w/no check 30:> lock->lock_locked = 0; [lock]> Python\thread_os2.h (141): using malloc result w/no check 30:> (long)sem)); [sem]> Python\thread_os2.h (155): using malloc result w/no check 30:> lock->is_set = 0; [lock]> Python\thread_pth.h (133): using malloc result w/no check 30:> memset((void *)lock, '\0', sizeof(pth_lock)); [lock]> Python\thread_solaris.h (48): using malloc result w/no check 30:> funcarg->func = func; [funcarg]> Python\thread_solaris.h (133): using malloc result w/no check 30:> if(mutex_init(lock,USYNC_THREAD,0)) [lock]> > # Who cares about these modules.> Modules\almodule.c:182> Modules\svmodule.c:547> > # Not a problem.> Parser\firstsets.c (76)> Parser\grammar.c (40)> Parser\grammar.c (59)> Parser\grammar.c (83)> Parser\grammar.c (102)> Parser\node.c (95)> Parser\pgen.c (52)> Parser\pgen.c (69)> Parser\pgen.c (126)> Parser\pgen.c (438)> Parser\pgen.c (462)> Parser\tokenizer.c (797)> Parser\tokenizer.c (869)> Modules\_bsddb.c (2633)> Modules\_csv.c 
(1069)> Modules\arraymodule.c (1871)> Modules\gcmodule.c (1363)> Modules\zlib\trees.c (375) -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061028/46c37e2d/attachment.htm From ncoghlan at gmail.com Sat Oct 28 06:31:29 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 28 Oct 2006 14:31:29 +1000 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <4542A2FE.9070409@canterbury.ac.nz> References: <4542A2FE.9070409@canterbury.ac.nz> Message-ID: <4542DD21.6000100@gmail.com> Greg Ewing wrote: > Travis E. Oliphant wrote: >> PEP: >> Title: Adding data-type objects to the standard library I've used 'datatype' below for consistency, but can we please call them something other than data types? Data layouts? Data formats? Binary layouts? Binary formats? 'type' is already a meaningful term in Python, and having to check whether 'data type' meant a type definition or a data format definition could get annoying. > Not sure about having 3 different ways to specify > the structure -- it smacks of Too Many Ways To Do > It to me. There are actually 5 ways, but the different mechanisms all have different use cases (and I'm going to suggest getting rid of the dictionary form). Type-object: Simple conversion of the builtin types (would be good for instances to be able to hook this as with other type conversion functions). 2-tuple: Makes it easy to specify a contiguous C-style array of a given data type. However, rather than doing type-based dispatch here, I would prefer to see this version handled via an optional 'shape' argument, so that all sequences can be handled consistently (more on that below).
>>> datatype(int, 5) # short for datatype([(int, 5)]) datatype('int32', (5,)) # describes a 5*4=20-byte block of memory laid out as # a[0], a[1], a[2], a[3], a[4] String-object: The basic formatting definition (I'd be interested in the differences between this definition scheme and the struct definition scheme - one definite goal for an implementation would be an update to the struct module to accept datatype objects, or at least a conversion mechanism for creating a struct layout description from a datatype definition) List object: As for string object, but permits naming of each of the fields. I don't like treating tuples differently from lists, so I'd prefer for this handling to be applied to all iterables that don't meet one of the other special cases (direct conversion, string, dictionary). I'd also prefer the meta-information to come *after* the name, and for the name to be completely optional (leaving the corresponding field unnamed). So the possible sequence entries would be: datatype (name, datatype) (name, datatype, shape) where name must be a string or 2-tuple, datatype must be acceptable as a constructor argument, and the shape must be an integer or tuple. For example: datatype([(('coords', [1,2]), 'f4'), ('address', 'S30'), ]) datatype([('simple', 'i4'), ('nested', [('name', 'S30'), ('addr', 'S45'), ('amount', 'i4') ] ), ]) >>> datatype(['V8', ('var2', 'i1'), 'V3', ('var3', 'f8')]) datatype([('', '|V8'), ('var2', '|i1'), ('', '|V3'), ('var3', '<f8')]) > Also, what if I want to refer to fields by name > but don't want to have to work out all the offsets > (which is tedious, error-prone and hostile to > modification)? Use the list definition form. In the current PEP, you would need to define names for all of the uninteresting fields. With the changes I've suggested above, you wouldn't even have to name the fields you don't care about - just describe them. Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From ncoghlan at gmail.com Sat Oct 28 06:37:26 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 28 Oct 2006 14:37:26 +1000 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <4542DD21.6000100@gmail.com> References: <4542A2FE.9070409@canterbury.ac.nz> <4542DD21.6000100@gmail.com> Message-ID: <4542DE86.6000706@gmail.com> Nick Coghlan wrote: > There are actually 5 ways, but the different mechanisms all have different use > case (and I'm going to suggest getting rid of the dictionary form). D'oh, I thought I deleted that parenthetical comment... obviously, I changed my mind on this point :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From ncoghlan at gmail.com Sat Oct 28 06:40:17 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 28 Oct 2006 14:40:17 +1000 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <454299D5.4090804@v.loewis.de> References: <454299D5.4090804@v.loewis.de> Message-ID: <4542DF31.5090408@gmail.com> Martin v. Löwis wrote: > Travis E. Oliphant schrieb: >> The datatype is an object that specifies how a certain block of >> memory should be interpreted as a basic data-type. >> >> >>> datatype(float) >> datatype('float64') > > I can't speak on the specific merits of this proposal, or whether this > kind of functionality is desirable. However, I'm -1 on the addition of > a builtin for this functionality (the PEP doesn't actually say that > there is another builtin, but the examples suggest so). Instead, putting > it into the sys, array, struct, or ctypes modules might be more > appropriate, as might be the introduction of another module.
I'd say the answer to where we put it will be dependent on what happens to the idea of adding a NumArray style fixed dimension array type to the standard library. If that gets exposed through the array module as array.dimarray, then it would make sense to expose the associated data layout descriptors as array.datatype. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From oliphant.travis at ieee.org Sat Oct 28 08:45:55 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Sat, 28 Oct 2006 00:45:55 -0600 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <454299D5.4090804@v.loewis.de> References: <454299D5.4090804@v.loewis.de> Message-ID: Martin v. L?wis wrote: > Travis E. Oliphant schrieb: >> The datatype is an object that specifies how a certain block of >> memory should be interpreted as a basic data-type. >> >> >>> datatype(float) >> datatype('float64') > > I can't speak on the specific merits of this proposal, or whether this > kind of functionality is desirable. However, I'm -1 on the addition of > a builtin for this functionality (the PEP doesn't actually say that > there is another builtin, but the examples suggest so). I was intentionally vague. I don't see a need for it to be a built-in, but didn't know where exactly to "put it," I should have made it a question for discussion. -Travis From oliphant.travis at ieee.org Sat Oct 28 08:49:41 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Sat, 28 Oct 2006 00:49:41 -0600 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <4542A2FE.9070409@canterbury.ac.nz> References: <4542A2FE.9070409@canterbury.ac.nz> Message-ID: Greg Ewing wrote: > Travis E. 
Oliphant wrote: >> PEP: >> Title: Adding data-type objects to the standard library > > Not sure about having 3 different ways to specify > the structure -- it smacks of Too Many Ways To Do > It to me. You might be right, but they all have use-cases. I've actually removed most of the multiple ways that NumPy allows for creating data-types. > > Also, what if I want to refer to fields by name > but don't want to have to work out all the offsets I don't know what you mean. You just use the list-style to define a data-format with fields. The offsets are worked out for you. The only use for offsets was the dictionary form. The dictionary form stems from a desire to use the fields dictionary of a data-type as a data-type specification (which it is essentially is). -Travis From ronaldoussoren at mac.com Sat Oct 28 09:50:42 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Sat, 28 Oct 2006 09:50:42 +0200 Subject: [Python-Dev] DRAFT: python-dev summary for 2006-09-16 to 2006-09-30 In-Reply-To: <45429B2B.6070404@v.loewis.de> References: <45429B2B.6070404@v.loewis.de> Message-ID: <9C891890-24DA-43B0-AB0E-A04D3063306E@mac.com> On Oct 28, 2006, at 1:50 AM, Martin v. L?wis wrote: > Steven Bethard schrieb: >> Jack Howarth asked about creating universal binaries for OS X that >> would support 32-bit or 64-bit on both PPC and x86. Ronald Oussoren >> pointed out that the 32-bit part of this was already supported, but >> indicated that adding 64-bit support simultaneously might be more >> difficult. Ronald seemed to think that modifications to pyconfig.h.in >> might solve the problem, though he was worried that this might cause >> distutils to detect some architecture features incorrectly. > > Ronald can surely speak for himself, but I think the problem is > slightly > different. There were different strategies discussed for changing > pyconfig.h (with an include, or with #ifdefs), and in all cases, > distutils would fail to detect the architecture properly. 
> That's not really a problem of pyconfig.h, but of the way that distutils uses > to detect bitsizes - which inherently cannot work for universal > binaries (i.e. you should look at the running interpreter, not > at pyconfig.h). That depends on what you want to do. If you want to use the information about byteorder and bitsizes to drive the build of an extension you're better off using pyconfig.h instead of using the values of the currently running interpreter. If you want to use the information to generate raw data files in the platform byteorder and bitsizes you're better off using the struct module, so there's really no good reason to look at WORDS_BIGENDIAN and the various SIZEOF_ macros through distutils. An example of this was the build of expat: before I merged the universal binary patches setup.py looked at sys.byteorder and then added a define to the build flags for expat. With the universal patches I changed this to an include-file that looks at the value in pyconfig.h and sets the define that expat expects. This is needed because with universal binaries the byteorder and bitsizes are no longer configure-time constants but are compile-time constants. Note that adding support for universal builds for 32-bit architectures was relatively easy because only one variable in pyconfig.h needed to be patched and GCC has explicit support for getting the required information. The patch for 32-bit/64-bit builds will probably require sniffing the current architecture (e.g. "#ifdef __i386__") and setting values that way. The cleanest way to do that is the introduction of an additional include file. It also requires changes to setup.py because all mac-specific modules won't build in 64-bit code in released versions of the OS (because OSX only has a 64-bit unix-layer in 10.4, 10.5 will be 64-bit throughout). Ronald -------------- next part -------------- A non-text attachment was scrubbed...
Name: smime.p7s Type: application/pkcs7-signature Size: 3562 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061028/96560ce2/attachment.bin From mal at egenix.com Sat Oct 28 10:40:01 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Sat, 28 Oct 2006 10:40:01 +0200 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: Message-ID: <45431761.1020401@egenix.com> Travis E. Oliphant wrote: > > > ------------------------------------------------------------------------ > > PEP: > Title: Adding data-type objects to the standard library > Attributes > > kind -- returns the basic "kind" of the data-type. The basic kinds > are: > 't' - bit, > 'b' - bool, > 'i' - signed integer, > 'u' - unsigned integer, > 'f' - floating point, > 'c' - complex floating point, > 'S' - string (fixed-length sequence of char), > 'U' - fixed length sequence of UCS4, Shouldn't this read "fixed length sequence of Unicode" ?! The underlying code unit format (UCS2 and UCS4) depends on the Python version. > 'O' - pointer to PyObject, > 'V' - Void (anything else). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 28 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From oliphant.travis at ieee.org Sat Oct 28 11:32:09 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Sat, 28 Oct 2006 03:32:09 -0600 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <45431761.1020401@egenix.com> References: <45431761.1020401@egenix.com> Message-ID: M.-A. Lemburg wrote: > Travis E. 
Oliphant wrote: >> >> ------------------------------------------------------------------------ >> >> PEP: >> Title: Adding data-type objects to the standard library >> Attributes >> >> kind -- returns the basic "kind" of the data-type. The basic kinds >> are: >> 't' - bit, >> 'b' - bool, >> 'i' - signed integer, >> 'u' - unsigned integer, >> 'f' - floating point, >> 'c' - complex floating point, >> 'S' - string (fixed-length sequence of char), >> 'U' - fixed length sequence of UCS4, > > Shouldn't this read "fixed length sequence of Unicode" ?! > The underlying code unit format (UCS2 and UCS4) depends on the > Python version. Well, in NumPy 'U' always means UCS4. So, I just copied that over. See my questions at the bottom which talk about how to handle this. A data-format does not necessarily have to correspond to something Python represents with an Object. -Travis From g.brandl at gmx.net Sat Oct 28 15:39:09 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 28 Oct 2006 15:39:09 +0200 Subject: [Python-Dev] build bots, log output Message-ID: Hi, I wonder if it's possible that the build bot notification mails that go to python-checkins include the last 10-15 lines from the log. This would make it much easier to decide whether a buildbot failure is an old, esoteric one (e.g. test_wait4 sem_post: Success make: *** [buildbottest] Killed on the hppa one) or a new one, really caused by one's checkin. The alternative would be to fix the tests/buildbots not to have these esoteric failures anymore Georg From arigo at tunes.org Sat Oct 28 15:54:15 2006 From: arigo at tunes.org (Armin Rigo) Date: Sat, 28 Oct 2006 15:54:15 +0200 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: Message-ID: <20061028135415.GA13049@code0.codespeak.net> Hi Travis, On Fri, Oct 27, 2006 at 02:05:31PM -0600, Travis E. 
Oliphant wrote: > This PEP proposes adapting the data-type objects from NumPy for > inclusion in standard Python, to provide a consistent and standard > way to discuss the format of binary data. How does this compare with ctypes? Do we really need yet another, incompatible way to describe C-like data structures in the standard library? A bientot, Armin From mal at egenix.com Sat Oct 28 20:10:31 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Sat, 28 Oct 2006 20:10:31 +0200 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <45431761.1020401@egenix.com> Message-ID: <45439D17.5010306@egenix.com> Travis E. Oliphant wrote: > M.-A. Lemburg wrote: >> Travis E. Oliphant wrote: >>> ------------------------------------------------------------------------ >>> >>> PEP: >>> Title: Adding data-type objects to the standard library >>> Attributes >>> >>> kind -- returns the basic "kind" of the data-type. The basic kinds >>> are: >>> 't' - bit, >>> 'b' - bool, >>> 'i' - signed integer, >>> 'u' - unsigned integer, >>> 'f' - floating point, >>> 'c' - complex floating point, >>> 'S' - string (fixed-length sequence of char), >>> 'U' - fixed length sequence of UCS4, >> Shouldn't this read "fixed length sequence of Unicode" ?! >> The underlying code unit format (UCS2 and UCS4) depends on the >> Python version. > > Well, in NumPy 'U' always means UCS4. So, I just copied that over. See > my questions at the bottom which talk about how to handle this. A > data-format does not necessarily have to correspond to something Python > represents with an Object. Ok, but why are you being specific about UCS4 (which is an internal storage format), while you are not specific about e.g. the internal bit size of the integers (which could be 32 or 64 bit) ? -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 28 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... 
http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From jcarlson at uci.edu Sat Oct 28 20:42:36 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Sat, 28 Oct 2006 11:42:36 -0700 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <45439D17.5010306@egenix.com> References: <45439D17.5010306@egenix.com> Message-ID: <20061028113844.0B08.JCARLSON@uci.edu> "M.-A. Lemburg" wrote: > > Travis E. Oliphant wrote: > > M.-A. Lemburg wrote: > >> Travis E. Oliphant wrote: > >>> ------------------------------------------------------------------------ > >>> > >>> PEP: > >>> Title: Adding data-type objects to the standard library > >>> Attributes > >>> > >>> kind -- returns the basic "kind" of the data-type. The basic kinds > >>> are: > >>> 't' - bit, > >>> 'b' - bool, > >>> 'i' - signed integer, > >>> 'u' - unsigned integer, > >>> 'f' - floating point, > >>> 'c' - complex floating point, > >>> 'S' - string (fixed-length sequence of char), > >>> 'U' - fixed length sequence of UCS4, > >> Shouldn't this read "fixed length sequence of Unicode" ?! > >> The underlying code unit format (UCS2 and UCS4) depends on the > >> Python version. > > > > Well, in NumPy 'U' always means UCS4. So, I just copied that over. See > > my questions at the bottom which talk about how to handle this. A > > data-format does not necessarily have to correspond to something Python > > represents with an Object. > > Ok, but why are you being specific about UCS4 (which is an internal > storage format), while you are not specific about e.g. the > internal bit size of the integers (which could be 32 or 64 bit) ? I think that even on 64 bit platforms, using 'int' or 'long' generally means 32 bit. In order to get 64 bit ints, one needs to use 'long long'. 
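[Editorial note: the claim about 'int' and 'long' sizes is easy to check from Python itself, since the struct module's native mode mirrors the platform C compiler; Fredrik's LP64 correction below applies here. A small sketch, with results that vary by platform:]

```python
import struct

# Native (default '@') sizes come from the platform C compiler, so
# they reflect the data model in use: ILP32, LP64 Unix, or LLP64 Windows.
sizes = {
    'int': struct.calcsize('i'),        # 4 on all common platforms
    'long': struct.calcsize('l'),       # 8 on LP64 Unix, 4 on ILP32 and LLP64
    'long long': struct.calcsize('q'),  # 8 in practice everywhere
    'pointer': struct.calcsize('P'),    # 8 on any 64-bit platform
}
print(sizes)
```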
Sharing some of the codes with the struct module, though arbitrary, doesn't seem like a bad idea to me. Of course offering specifically 32 and 64 bit ints would make sense to me. - Josiah From fredrik at pythonware.com Sat Oct 28 20:42:49 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat, 28 Oct 2006 20:42:49 +0200 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <20061028113844.0B08.JCARLSON@uci.edu> References: <45439D17.5010306@egenix.com> <20061028113844.0B08.JCARLSON@uci.edu> Message-ID: Josiah Carlson wrote: > I think that even on 64 bit platforms, using 'int' or 'long' generally > means 32 bit. In order to get 64 bit ints, one needs to use 'long long'. real 64-bit platforms use the LP64 standard, where long and pointers are both 64 bits: http://www.unix.org/version2/whatsnew/lp64_wp.html From oliphant.travis at ieee.org Sat Oct 28 21:10:49 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Sat, 28 Oct 2006 13:10:49 -0600 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <45439D17.5010306@egenix.com> References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com> Message-ID: M.-A. Lemburg wrote: > Travis E. Oliphant wrote: >> M.-A. Lemburg wrote: >>> Travis E. Oliphant wrote: >>>> ------------------------------------------------------------------------ >>>> >>>> PEP: >>>> Title: Adding data-type objects to the standard library >>>> Attributes >>>> >>>> kind -- returns the basic "kind" of the data-type. The basic kinds >>>> are: >>>> 't' - bit, >>>> 'b' - bool, >>>> 'i' - signed integer, >>>> 'u' - unsigned integer, >>>> 'f' - floating point, >>>> 'c' - complex floating point, >>>> 'S' - string (fixed-length sequence of char), >>>> 'U' - fixed length sequence of UCS4, >>> Shouldn't this read "fixed length sequence of Unicode" ?! >>> The underlying code unit format (UCS2 and UCS4) depends on the >>> Python version. >> Well, in NumPy 'U' always means UCS4. 
So, I just copied that over. See >> my questions at the bottom which talk about how to handle this. A >> data-format does not necessarily have to correspond to something Python >> represents with an Object. > > Ok, but why are you being specific about UCS4 (which is an internal > storage format), while you are not specific about e.g. the > internal bit size of the integers (which could be 32 or 64 bit) ? > The 'kind' does not specify how "big" the data-type (data-format) is. A number is needed to represent the number of bytes. In this case, the 'kind' does not specify how large the data-type is. You can have 'u1', 'u2', 'u4', etc. The same is true with Unicode. You can have 10-character unicode elements, 20-character, etc. But, we have to be clear about what a "character" is in the data-format. -Travis From oliphant.travis at ieee.org Sat Oct 28 21:21:35 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Sat, 28 Oct 2006 13:21:35 -0600 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <20061028135415.GA13049@code0.codespeak.net> References: <20061028135415.GA13049@code0.codespeak.net> Message-ID: Armin Rigo wrote: > Hi Travis, > > On Fri, Oct 27, 2006 at 02:05:31PM -0600, Travis E. Oliphant wrote: >> This PEP proposes adapting the data-type objects from NumPy for >> inclusion in standard Python, to provide a consistent and standard >> way to discuss the format of binary data. > > How does this compare with ctypes? Do we really need yet another, > incompatible way to describe C-like data structures in the standard > library? Part of what the data-type, data-format object is trying to do is bring together all the disparate ways to represent data that *already* exists in the standard library. What is needed is a definitive way to describe data and then have array struct ctypes all be compatible with that same method. That's why I'm proposing the PEP. It's a unification effort not yet-another-method. 
One of the big reasons for it is to move something like the array interface into Python. There are tens to hundreds of people mostly in the scientific computing community that want to see Python grow more support for NumPy-like things. I keep getting requests to "do something" to make Python more aware of arrays. This PEP is part of that effort. In particular, something like the array interface should be available in Python. The easiest way to do this is to extend the buffer protocol to allow objects to share information about shape, strides, and data-format of a block of memory. But, how do you represent data-format in Python? What will the objects pass back and forth to each other to do it? C-types has a solution which creates multiple objects to do it. This is an un-wieldy over-complicated solution for the array interface. The array objects have a solution using the a single object that carries the data-format information. The solution we have for arrays deserves consideration. It could be placed inside the array module if desired, but again, I'm really looking for something that would allow the extend buffer protocol (to be proposed soon) to share data-type information. That could be done with the array-interface objects (strings, lists, and tuples), but then every body who uses the interface will have to write their own "decoders" to process the data-format information. I actually think ctypes would benefit from this data-format specification too. Recognizing all these diverging ways to essentially talk about the same thing is part of what prompted this PEP. -Travis From martin at v.loewis.de Sat Oct 28 21:24:55 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 28 Oct 2006 21:24:55 +0200 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com> Message-ID: <4543AE87.7080909@v.loewis.de> Travis E. 
Oliphant schrieb: > In this case, the 'kind' does not specify how large the data-type is. > You can have 'u1', 'u2', 'u4', etc. > > The same is true with Unicode. You can have 10-character unicode > elements, 20-character, etc. But, we have to be clear about what a > "character" is in the data-format. That is certainly confusing. In u1, u2, u4, the digit seems to indicate the size of a single value (1 byte, 2 bytes, 4 bytes). Right? Yet, in U20, it does *not* indicate the size of a single value but of an array? And then, it's not the size, but the number of elements? Regards, Martin From martin at v.loewis.de Sat Oct 28 21:25:32 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 28 Oct 2006 21:25:32 +0200 Subject: [Python-Dev] build bots, log output In-Reply-To: References: Message-ID: <4543AEAC.7090005@v.loewis.de> Georg Brandl schrieb: > I wonder if it's possible that the build bot notification mails that go > to python-checkins include the last 10-15 lines from the log. This would > make it much easier to decide whether a buildbot failure is an old, > esoteric one (e.g. It should be possible to implement that. To do so, one would have to modify the source of the buildbot master. Regards, Martin From mal at egenix.com Sat Oct 28 21:31:34 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Sat, 28 Oct 2006 21:31:34 +0200 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com> Message-ID: <4543B016.7070002@egenix.com> Travis E. Oliphant wrote: > M.-A. Lemburg wrote: >> Travis E. Oliphant wrote: >>> M.-A. Lemburg wrote: >>>> Travis E. Oliphant wrote: >>>>> ------------------------------------------------------------------------ >>>>> >>>>> PEP: >>>>> Title: Adding data-type objects to the standard library >>>>> Attributes >>>>> >>>>> kind -- returns the basic "kind" of the data-type. 
The basic kinds >>>>> are: >>>>> 't' - bit, >>>>> 'b' - bool, >>>>> 'i' - signed integer, >>>>> 'u' - unsigned integer, >>>>> 'f' - floating point, >>>>> 'c' - complex floating point, >>>>> 'S' - string (fixed-length sequence of char), >>>>> 'U' - fixed length sequence of UCS4, >>>> Shouldn't this read "fixed length sequence of Unicode" ?! >>>> The underlying code unit format (UCS2 and UCS4) depends on the >>>> Python version. >>> Well, in NumPy 'U' always means UCS4. So, I just copied that over. See >>> my questions at the bottom which talk about how to handle this. A >>> data-format does not necessarily have to correspond to something Python >>> represents with an Object. >> Ok, but why are you being specific about UCS4 (which is an internal >> storage format), while you are not specific about e.g. the >> internal bit size of the integers (which could be 32 or 64 bit) ? >> > > The 'kind' does not specify how "big" the data-type (data-format) is. A > number is needed to represent the number of bytes. > > In this case, the 'kind' does not specify how large the data-type is. > You can have 'u1', 'u2', 'u4', etc. > > The same is true with Unicode. You can have 10-character unicode > elements, 20-character, etc. But, we have to be clear about what a > "character" is in the data-format. I understand and that's why I'm asking why you made the range explicit in the definition. The definition should talk about Unicode code points. The number of bytes then determines whether you can only represent the ASCII subset (1 byte), UCS2 (2 bytes, BMP only) or UCS4 (4 bytes, all currently assigned code points). This is similar to the range for integers (ie. ZZ_0), where the number of bytes determines the range of numbers that can be represented. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 28 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... 
http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From talin at acm.org Sat Oct 28 21:34:39 2006 From: talin at acm.org (Talin) Date: Sat, 28 Oct 2006 12:34:39 -0700 Subject: [Python-Dev] PEP 355 status In-Reply-To: <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> Message-ID: <4543B0CF.1000300@acm.org> Björn Lindqvist wrote: > I'd like to write a post mortem for PEP 355. But one important > question that haven't been answered is if there is a possibility for a > path-like PEP to succeed in the future? If so, does the path-object > implementation have to prove itself in the wild before it can be > included in Python? From earlier posts it seems like you don't like > the concept of path objects, which others have found very interesting. > If that is the case, then it would be nice to hear it explicitly. :) So...how's that post mortem coming along? Did you get a sufficient answer to your questions? And the more interesting question is, will the effort to reform Python's path functionality continue? From reading all the responses to your post, I feel that the community is on the whole supportive of the idea of refactoring os.path and friends, but they prefer a different approach, and several of the responses sketch out some suggestions for what that approach might be. So what happens next?
-- Talin From g.brandl at gmx.net Sat Oct 28 22:07:27 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 28 Oct 2006 22:07:27 +0200 Subject: [Python-Dev] build bots, log output In-Reply-To: <4543AEAC.7090005@v.loewis.de> References: <4543AEAC.7090005@v.loewis.de> Message-ID: Martin v. L?wis wrote: > Georg Brandl schrieb: >> I wonder if it's possible that the build bot notification mails that go >> to python-checkins include the last 10-15 lines from the log. This would >> make it much easier to decide whether a buildbot failure is an old, >> esoteric one (e.g. > > It should be possible to implement that. To do so, one would have to > modify the source of the buildbot master. I'd volunteer to do it if I knew where the source of the buildbot master can be found :) Georg From oliphant.travis at ieee.org Sun Oct 29 02:18:04 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Sat, 28 Oct 2006 18:18:04 -0600 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <4543AE87.7080909@v.loewis.de> References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com> <4543AE87.7080909@v.loewis.de> Message-ID: Martin v. L?wis wrote: > Travis E. Oliphant schrieb: >> In this case, the 'kind' does not specify how large the data-type is. >> You can have 'u1', 'u2', 'u4', etc. >> >> The same is true with Unicode. You can have 10-character unicode >> elements, 20-character, etc. But, we have to be clear about what a >> "character" is in the data-format. > > That is certainly confusing. In u1, u2, u4, the digit seems to indicate > the size of a single value (1 byte, 2 bytes, 4 bytes). Right? Yet, > in U20, it does *not* indicate the size of a single value but of an > array? And then, it's not the size, but the number of elements? > Good point. In NumPy, unicode support was added "in parallel" with string arrays where there is not the ambiguity. So, yes, it's true that the unicode case is a special-case. 
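[Editorial note: the alternative Travis goes on to describe — treating 'U1'/'U2'/'U4' as code-unit sizes, with the field length expressed separately — can be sketched with plain struct calls. The helpers below are illustrative only, showing a hypothetical 'U10' UCS-4 field stored as ten 4-byte code points, NUL-padded:]

```python
import struct

def pack_ucs4(text, length):
    # Store 'text' as a fixed-length field of 'length' UCS-4 code points.
    codepoints = [ord(c) for c in text[:length]]
    codepoints += [0] * (length - len(codepoints))  # NUL-pad the tail
    return struct.pack('<%dI' % length, *codepoints)

def unpack_ucs4(data, length):
    values = struct.unpack('<%dI' % length, data)
    return ''.join(chr(v) for v in values).rstrip('\x00')

blob = pack_ucs4('abc', 10)   # a 'U10' field occupies 10 * 4 = 40 bytes
print(len(blob))              # 40
print(unpack_ucs4(blob, 10))  # abc
```

Under this scheme the element size is always code-unit size times count, which avoids the ambiguity Martin pointed out between "size of a single value" and "number of elements".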
The other way to handle it would be to describe the 'code'-point size (i.e. 'U1', 'U2', 'U4' for UCS-1, UCS-2, UCS-4) and then have the length be encoded as an "array" of those types. This was not the direction we took with NumPy (which is what I'm using as a reference) because I wanted Unicode and string arrays to look the same and thought of strings differently. How to handle unicode data-formats could definitely be improved. Suggestions are welcome. -Travis From greg.ewing at canterbury.ac.nz Sun Oct 29 02:15:40 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 29 Oct 2006 13:15:40 +1300 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <4542DD21.6000100@gmail.com> References: <4542A2FE.9070409@canterbury.ac.nz> <4542DD21.6000100@gmail.com> Message-ID: <4543F2AC.4020909@canterbury.ac.nz> Nick Coghlan wrote: > Greg Ewing wrote: >> Also, what if I want to refer to fields by name >> but don't want to have to work out all the offsets > Use the list definition form. With the changes I've > suggested above, you wouldn't even have to name the fields you don't > care about - just describe them. That would be okay. I still don't see a strong justification for having a one-big-string form as well as a list/tuple/dict form, though. -- Greg From greg.ewing at canterbury.ac.nz Sun Oct 29 02:17:46 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 29 Oct 2006 13:17:46 +1300 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <4542DF31.5090408@gmail.com> References: <454299D5.4090804@v.loewis.de> <4542DF31.5090408@gmail.com> Message-ID: <4543F32A.3000705@canterbury.ac.nz> Nick Coghlan wrote: > I'd say the answer to where we put it will be dependent on what happens to the > idea of adding a NumArray style fixed dimension array type to the standard > library. 
If that gets exposed through the array module as array.dimarray, then > it would make sense to expose the associated data layout descriptors as > array.datatype. Seem to me that arrays are a sub-concept of binary data, not the other way around. So maybe both arrays and data types should be in a module called 'binary' or some such. -- Greg From greg.ewing at canterbury.ac.nz Sun Oct 29 02:10:58 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 29 Oct 2006 14:10:58 +1300 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com> Message-ID: <4543FFA2.30002@canterbury.ac.nz> Travis E. Oliphant wrote: > The 'kind' does not specify how "big" the data-type (data-format) is. What exactly does "bit" mean in that context? -- Greg From greg.ewing at canterbury.ac.nz Sun Oct 29 02:25:26 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 29 Oct 2006 14:25:26 +1300 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com> <4543AE87.7080909@v.loewis.de> Message-ID: <45440306.3050805@canterbury.ac.nz> Travis E. Oliphant wrote: > How to handle unicode data-formats could definitely be improved. > Suggestions are welcome. 'U4*10' string of 10 4-byte Unicode chars Then for consistency you'd want 'S*10' rather than just 'S10' (or at least allow it as an alternative). -- Greg From oliphant.travis at ieee.org Sun Oct 29 08:46:39 2006 From: oliphant.travis at ieee.org (Travis E. 
Oliphant) Date: Sun, 29 Oct 2006 01:46:39 -0600 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <4543F32A.3000705@canterbury.ac.nz> References: <454299D5.4090804@v.loewis.de> <4542DF31.5090408@gmail.com> <4543F32A.3000705@canterbury.ac.nz> Message-ID: Greg Ewing wrote: > Nick Coghlan wrote: >> I'd say the answer to where we put it will be dependent on what happens to the >> idea of adding a NumArray style fixed dimension array type to the standard >> library. If that gets exposed through the array module as array.dimarray, then >> it would make sense to expose the associated data layout descriptors as >> array.datatype. > > Seem to me that arrays are a sub-concept of binary data, > not the other way around. So maybe both arrays and data > types should be in a module called 'binary' or some such. Yes, very good point. That's probably one reason I'm proposing the data-type first before the array interface in the extended buffer protocol. -Travis From oliphant.travis at ieee.org Sun Oct 29 08:50:38 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Sun, 29 Oct 2006 01:50:38 -0600 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <4543FFA2.30002@canterbury.ac.nz> References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com> <4543FFA2.30002@canterbury.ac.nz> Message-ID: Greg Ewing wrote: > Travis E. Oliphant wrote: > >> The 'kind' does not specify how "big" the data-type (data-format) is. > > What exactly does "bit" mean in that context? Do you mean "big" ? It's how many bytes the kind is using. So, 'u4' is a 4-byte unsigned integer and 'u2' is a 2-byte unsigned integer. -Travis From oliphant.travis at ieee.org Sun Oct 29 08:48:23 2006 From: oliphant.travis at ieee.org (Travis E. 
Oliphant) Date: Sun, 29 Oct 2006 01:48:23 -0600 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <4543F2AC.4020909@canterbury.ac.nz> References: <4542A2FE.9070409@canterbury.ac.nz> <4542DD21.6000100@gmail.com> <4543F2AC.4020909@canterbury.ac.nz> Message-ID: Greg Ewing wrote: > Nick Coghlan wrote: > >> Greg Ewing wrote: > >>> Also, what if I want to refer to fields by name >>> but don't want to have to work out all the offsets > >> Use the list definition form. With the changes I've >> suggested above, you wouldn't even have to name the fields you don't >> care about - just describe them. > > That would be okay. > > I still don't see a strong justification for having a > one-big-string form as well as a list/tuple/dict form, > though. Compaction of representation is all. It's used quite a bit in numarray, which is where most of the 'kind' names came from as well. When you don't want to name fields it is a really nice feature (but it doesn't nest well). -Travis From martin at v.loewis.de Sun Oct 29 08:57:02 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 29 Oct 2006 08:57:02 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <20061028135415.GA13049@code0.codespeak.net> Message-ID: <45445ECE.9050504@v.loewis.de> Travis E. Oliphant schrieb: > What is needed is a definitive way to describe data and then have > > array > struct > ctypes > > all be compatible with that same method. That's why I'm proposing the > PEP. It's a unification effort not yet-another-method. As a unification mechanism, I think it is insufficient. I doubt it can express all the concepts that ctypes supports. Regards, Martin From oliphant.travis at ieee.org Sun Oct 29 08:49:18 2006 From: oliphant.travis at ieee.org (Travis E.
Oliphant) Date: Sun, 29 Oct 2006 01:49:18 -0600 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <45440306.3050805@canterbury.ac.nz> References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com> <4543AE87.7080909@v.loewis.de> <45440306.3050805@canterbury.ac.nz> Message-ID: Greg Ewing wrote: > Travis E. Oliphant wrote: > >> How to handle unicode data-formats could definitely be improved. >> Suggestions are welcome. > > 'U4*10' string of 10 4-byte Unicode chars > I like that. Thanks. -Travis From robert.kern at gmail.com Sun Oct 29 09:11:47 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 29 Oct 2006 02:11:47 -0600 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <45445ECE.9050504@v.loewis.de> References: <20061028135415.GA13049@code0.codespeak.net> <45445ECE.9050504@v.loewis.de> Message-ID: Martin v. L?wis wrote: > Travis E. Oliphant schrieb: >> What is needed is a definitive way to describe data and then have >> >> array >> struct >> ctypes >> >> all be compatible with that same method. That's why I'm proposing the >> PEP. It's a unification effort not yet-another-method. > > As I unification mechanism, I think it is insufficient. I doubt it > can express all the concepts that ctypes supports. What do you think is missing that can't be added? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From martin at v.loewis.de Sun Oct 29 09:18:12 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 29 Oct 2006 09:18:12 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com> <4543AE87.7080909@v.loewis.de> Message-ID: <454463C4.1080009@v.loewis.de> Travis E. 
Oliphant schrieb: > How to handle unicode data-formats could definitely be improved. As before, I'm doubtful what the actual needs are. For example, is it desired to support generation of ID3v2 tags with such a data format? The tag is specified here: http://www.id3.org/id3v2.4.0-structure.txt In ID3v1, text fields have a specified width, and are supposed to be encoded in Latin-1, and padded with zero bytes. In ID3v2, text fields start with an encoding declaration (say, \x03 for UTF-8), followed by a null-terminated sequence of UTF-8 bytes. Is it the intent of this PEP to support such data structures, and allow the user to fill in a Unicode object, and then the processing is automatic? (i.e. in ID3v1, the string gets automatically Latin-1-encoded and zero-padded, in ID3v2, it gets automatically UTF-8 encoded, and null-terminated) If that is not to be supported, what are the use cases? Regards, Martin From oliphant.travis at ieee.org Sun Oct 29 09:26:38 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Sun, 29 Oct 2006 01:26:38 -0700 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <45445ECE.9050504@v.loewis.de> References: <20061028135415.GA13049@code0.codespeak.net> <45445ECE.9050504@v.loewis.de> Message-ID: Martin v. Löwis wrote: > Travis E. Oliphant schrieb: >> What is needed is a definitive way to describe data and then have >> >> array >> struct >> ctypes >> >> all be compatible with that same method. That's why I'm proposing the >> PEP. It's a unification effort not yet-another-method. > > As a unification mechanism, I think it is insufficient. I doubt it > can express all the concepts that ctypes supports. > Please clarify what you mean. Are you saying that a single object can't carry all the information about binary data that ctypes allows with its multi-object approach? I don't agree with you, if that is the case. Sure, perhaps I've not included certain cases, so give an example.
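(A concrete illustration of the duplication under discussion, added here for clarity and not taken from the original thread: one and the same two-field record described twice with today's standard library, once as a struct format string and once as a ctypes Structure. Neither module can consume the other's description; a shared data-format object is what would bridge them.)

```python
# The same two-field record, described twice by two stdlib modules
# that cannot read each other's description.  (Illustrative sketch,
# not code from the thread or the PEP.)
import struct
import ctypes

fmt = "<id"                      # struct: little-endian int32 + float64
calc = struct.calcsize(fmt)      # 12 bytes, no padding

class Record(ctypes.LittleEndianStructure):   # ctypes: the same layout
    _pack_ = 1                   # suppress padding, matching "<" above
    _fields_ = [("simple", ctypes.c_int32),
                ("amount", ctypes.c_double)]

assert calc == ctypes.sizeof(Record) == 12

# Round-trip one value through both descriptions of the layout.
packed = struct.pack(fmt, 7, 2.5)
rec = Record.from_buffer_copy(packed)
assert rec.simple == 7 and rec.amount == 2.5
```

The two descriptions agree byte for byte, yet each had to be written by hand from the other, which is exactly the translation step the PEP wants to eliminate.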
Besides, I don't think this is the right view of "unification". I'm not saying that ctypes should get rid of its many objects used for interfacing with C-functions. I'm saying we should introduce a single-object mechanism for describing binary data so that the many-object approach of c-types does not become some kind of de-facto standard. C-types can "translate" this object-instance to its internals if and when it needs to. In the mean-time, how are other packages supposed to communicate binary information about data with each other? Remember the context that the data-format object is presented in. Two packages need to share a chunk of memory (the package authors do not know each other and only have Python as a common reference). They both want to describe that the memory they are sharing has some underlying binary structure. How do they do that? Please explain to me how the buffer protocol can be extended so that information about "what is in the memory" can be shared without a data-format object? -Travis From oliphant.travis at ieee.org Sun Oct 29 09:53:09 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Sun, 29 Oct 2006 01:53:09 -0700 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <454463C4.1080009@v.loewis.de> References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com> <4543AE87.7080909@v.loewis.de> <454463C4.1080009@v.loewis.de> Message-ID: Martin v. Löwis wrote: > Travis E. Oliphant schrieb: >> How to handle unicode data-formats could definitely be improved. > > As before, I'm doubtful what the actual needs are. For example, is > it desired to support generation of ID3v2 tags with such a data > format? The tag is specified here: > Perhaps I was not clear enough about what I'm trying to do. For a long time a lot of people have wanted something like Numeric in Python itself. There have been many hurdles to that goal.
After discussions at SciPy 2006 with Guido, we decided that the best way to proceed at this point was to extend the buffer protocol to allow packages to share array-like information with each other. There are several things missing from the buffer protocol that NumPy needs in order to be able to really understand the (fixed-size) memory another package has allocated and is sharing. The most important of these are 1) Shape information 2) Striding information 3) Data-format information (how is each element perceived). Shape and striding information can be shared with a C-array of integers. How is data-format information supposed to be shared? We've come up with a very flexible way to do this in NumPy using a single Python object. This Python object supports describing the layout of any fixed-size chunk of memory (right now in units of bytes --- bit fields could be added, though). I'm proposing to add this object to Python so that the buffer protocol has a fast and efficient way to share #3. That's really all I'm after. It also bothers me that so many ways to describe binary data are being used out there. This is a problem that deserves being solved. And, no, ctypes hasn't solved it (we can't directly use the ctypes solution). Perhaps this PEP doesn't hit all the corners, but a data-format object *is* a useful thing to consider. The array object in Python already has a PyArray_Descr * structure that is a watered-down version of what I'm talking about. In fact, this is what Numeric built from (or vice-versa actually). And NumPy has greatly enhanced this object for any conceivable structure. Guido seemed to think the data-type objects were nice when he saw them at SciPy 2006, and so I'm presenting a PEP. Without the data-format object, I don't know how to extend the buffer protocol to communicate data-format information. Do you have a better idea?
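(For illustration only, here is a toy Python-level sketch of the three pieces listed above, shape, strides, and an element format, travelling together with one raw memory block. All names in it are invented for this sketch; it is not the proposed buffer-protocol extension.)

```python
# Toy sketch: shape, strides, and a data-format string attached to a
# raw block of memory.  Invented names, illustrative only.

class SharedBlock:
    def __init__(self, data, shape, itemformat):
        self.data = bytearray(data)   # the raw, fixed-size memory
        self.shape = shape            # e.g. (rows, cols)
        self.itemformat = itemformat  # e.g. 'u1': one unsigned byte
        itemsize = 1                  # implied by 'u1' in this toy
        # C-contiguous strides, in bytes
        self.strides = (shape[1] * itemsize, itemsize)

    def item(self, i, j):
        # A consumer that knows shape, strides, and format can address
        # any element without copying the block.
        off = i * self.strides[0] + j * self.strides[1]
        return self.data[off]

block = SharedBlock(bytes(range(6)), shape=(2, 3), itemformat='u1')
assert block.item(1, 0) == 3   # row 1 starts 3 bytes into the block
```

The first two attributes are plain integers; the open question in the thread is what object the third attribute, the element description, should be.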
I have no trouble limiting the data-type object to the buffer protocol extension PEP, but I do think it could gain wider use. > > Is it the intent of this PEP to support such data structures, > and allow the user to fill in a Unicode object, and then the > processing is automatic? (i.e. in ID3v1, the string gets > automatically Latin-1-encoded and zero-padded, in ID3v2, it > gets automatically UTF-8 encoded, and null-terminated) > No, the point of the data-format object is to communicate information about data-formats not to encode or decode anything. Users of the data-format object could decide what they wanted to do with that information. We just need a standard way to communicate it through the buffer protocol. -Travis From martin at v.loewis.de Sun Oct 29 11:04:04 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 29 Oct 2006 11:04:04 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com> <4543AE87.7080909@v.loewis.de> <454463C4.1080009@v.loewis.de> Message-ID: <45447C94.2000309@v.loewis.de> Travis E. Oliphant schrieb: > I'm proposing to add this object to Python so that the buffer protcol > has a fast and efficient way to share #3. That's really all I'm after. I admit that I don't understand this objective. Why is it desirable to support such an extended buffer protocol? What specific application would be made possible if it was available and implemented in the relevant modules and data types? What are the relevant modules and data types that should implement it? > It also bothers me that so many ways to describe binary data are being > used out there. This is a problem that deserves being solved. And, no, > ctypes hasn't solved it (we can't directly use the ctypes solution). > Perhaps this PEP doesn't hit all the corners, but a data-format object > *is* a useful thing to consider. 
IMO, it is only useful if it realistically can support all the use cases that it intends to support. If this PEP is about defining the elements of arrays, I doubt it can realistically support everything you can express in ctypes. There is no support for pointers (except for PyObject*), no support for incomplete (recursive) types, no support for function pointers, etc. Vice versa: why exactly can't you use the data type system of ctypes? If I want to say "int[10]", I do py> ctypes.c_long * 10 To rewrite the examples from the PEP: datatype(float) => ctypes.c_double datatype(int) => ctypes.c_long datatype((int, 5)) => ctypes.c_long * 5 datatype((float, (3,2))) => (ctypes.c_double * 3) * 2 struct { int simple; struct nested { char name[30]; char addr[45]; int amount; } nested; } => py> from ctypes import * py> class nested(Structure): ... _fields_ = [("name", c_char*30), ("addr", c_char*45), ("amount", c_long)] ... py> class struct(Structure): ... _fields_ = [("simple", c_int), ("nested", nested)] ... > Guido seemed to think the data-type objects were nice when he saw them > at SciPy 2006, and so I'm presenting a PEP. I have no objection to including NumArray as-is into Python. I just wonder where the rationale for this PEP comes from, i.e. why do you need to exchange this information across different modules? > Without the data-format object, I don't know how to extend the buffer > protocol to communicate data-format information. Do you have a better > idea? See above: I can't understand where the need for an extended buffer protocol comes from. I can see why NumArray needs reflection, and needs to keep information to interpret the bytes in the array. But why is it important that the same information is exposed by other data types? >> Is it the intent of this PEP to support such data structures, >> and allow the user to fill in a Unicode object, and then the >> processing is automatic? (i.e.
in ID3v1, the string gets >> automatically Latin-1-encoded and zero-padded, in ID3v2, it >> gets automatically UTF-8 encoded, and null-terminated) >> > > No, the point of the data-format object is to communicate information > about data-formats not to encode or decode anything. Users of the > data-format object could decide what they wanted to do with that > information. We just need a standard way to communicate it through the > buffer protocol. This was actually a different sub-thread: why do you need to support the 'U' code (or the 'S' code, for that matter)? In what application do you have fixed size Unicode arrays, as opposed to Unicode strings? Regards, Martin From martin at v.loewis.de Sun Oct 29 11:10:22 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 29 Oct 2006 11:10:22 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <20061028135415.GA13049@code0.codespeak.net> <45445ECE.9050504@v.loewis.de> Message-ID: <45447E0E.6050106@v.loewis.de> Travis E. Oliphant schrieb: >> As I unification mechanism, I think it is insufficient. I doubt it >> can express all the concepts that ctypes supports. >> > > Please clarify what you mean. > > Are you saying that a single object can't carry all the information > about binary data that ctypes allows with it's multi-object approach? I'm not sure what you mean by "single object". If I use the tuple syntax, e.g. datatype((float, (3,2)) There are also multiple objects (the float, the 3, and the 2). You get a single "root" object back, but so do you in ctypes. But this isn't really what I meant. Instead, I think the PEP lacks various concepts from C data types, such as pointers, unions, function pointers, alignment/packing. > In the mean-time, how are other packages supposed to communicate binary > information about data with each other? This is my other question. Why should they? > Remember the context that the data-format object is presented in. 
Two > packages need to share a chunk of memory (the package authors do not > know each other and only have and Python as a common reference). They > both want to describe that the memory they are sharing has some > underlying binary structure. Can you please give an example of such two packages, and an application that needs them share data? Regards, Martin From martin at v.loewis.de Sun Oct 29 11:20:08 2006 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Sun, 29 Oct 2006 11:20:08 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <20061028135415.GA13049@code0.codespeak.net> <45445ECE.9050504@v.loewis.de> Message-ID: <45448058.5020700@v.loewis.de> Robert Kern schrieb: >> As I unification mechanism, I think it is insufficient. I doubt it >> can express all the concepts that ctypes supports. > > What do you think is missing that can't be added? I can factually only report what is missing. Whether it can be added, I don't know. As I just wrote in a few other messages: pointers, unions, functions pointers, packed structs, incomplete/recursive types. Also "flexible array members" (i.e. open-ended arrays). While it may be possible to come up with a string syntax to describe all these things (*), I wonder whether it should be done, and whether NumArray can then support this extended data model. Regards, Martin (*) perhaps with the exception of incomplete types: C needs forward references in its own syntax. From anthony at interlink.com.au Sun Oct 29 12:20:12 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Sun, 29 Oct 2006 21:20:12 +1000 Subject: [Python-Dev] build bots, log output In-Reply-To: References: Message-ID: <200610292220.14695.anthony@interlink.com.au> On Saturday 28 October 2006 23:39, Georg Brandl wrote: > Hi, > > I wonder if it's possible that the build bot notification mails that go > to python-checkins include the last 10-15 lines from the log. 
This would > make it much easier to decide whether a buildbot failure is an old, > esoteric one (e.g. A better solution (awaiting sufficient round-tuits) would be to add an option to regrtest that's used by the buildslaves that uses particularly markup around success/fail indications. The buildmaster can pick those up, and keep track of existing pass/fails. Then it could send an email only when one changes. We might also add a daily or every couple of days reminder saying "The following tests are failing on the following platforms, and have been for X days now". Buildmaster code is on dinsdale, in (I think) ~buildbot. It's also in SVN. This solution doesn't require changes to the buildslave code at all - only to the buildmaster and to regrtest. -- Anthony Baxter It's never too late to have a happy childhood. From amk at amk.ca Sun Oct 29 14:47:11 2006 From: amk at amk.ca (A.M. Kuchling) Date: Sun, 29 Oct 2006 08:47:11 -0500 Subject: [Python-Dev] PyCon: proposals due by Tuesday 10/31 Message-ID: <20061029134711.GA15254@rogue.amk.ca> Final reminder: if you want to submit a proposal to PyCon, you should do it by end of Tuesday, October 31st. for more info The deadline for tutorials is November 15th: http://us.pycon.org/TX2007/CallForTutorials PyCon is the Python community conference, held next February 23-25 near Dallas; a tutorial day will be on February 22. See http://us.pycon.org/ for more info. --amk From edcjones at comcast.net Sun Oct 29 17:05:00 2006 From: edcjones at comcast.net (Edward C. Jones) Date: Sun, 29 Oct 2006 11:05:00 -0500 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: Message-ID: <4544D12C.4020209@comcast.net> Travis E. Oliphant wrote: > It also bothers me that so many ways to describe binary data are > being used out there. This is a problem that deserves being solved. Is there a survey paper somewhere about binary formats? What formats are used in particle physics, bio-informatics, astronomy, etc? 
What software is used to read and write binary data? What descriptive languages are used for data (SQL, XML, etc)? From ndbecker2 at gmail.com Sun Oct 29 17:50:57 2006 From: ndbecker2 at gmail.com (Neal Becker) Date: Sun, 29 Oct 2006 11:50:57 -0500 Subject: [Python-Dev] PEP: Adding data-type objects to Python References: Message-ID: I have watched numpy with interest for a long time. My own interest is to possibly use the c-api to wrap c++ algorithms to use from python. One thing that has concerned me, and continues to concern me with this proposal, is that it seems to suffer from a very fat interface. I certainly have not studied the options in any depth, but my gut feeling is that the interface is too fat and too complex. I wonder if it's possible to avoid this. I wonder if this is an example of all the methods sinking to the base class. From p.f.moore at gmail.com Sun Oct 29 18:00:25 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 29 Oct 2006 17:00:25 +0000 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <45447E0E.6050106@v.loewis.de> References: <20061028135415.GA13049@code0.codespeak.net> <45445ECE.9050504@v.loewis.de> <45447E0E.6050106@v.loewis.de> Message-ID: <79990c6b0610290900y66f7756eqbc9000dfab1c010@mail.gmail.com> On 10/29/06, "Martin v. L?wis" wrote: > Travis E. Oliphant schrieb: > > Remember the context that the data-format object is presented in. Two > > packages need to share a chunk of memory (the package authors do not > > know each other and only have and Python as a common reference). They > > both want to describe that the memory they are sharing has some > > underlying binary structure. > > Can you please give an example of such two packages, and an application > that needs them share data? Here's an example. PIL handles images (in various formats) in memory, as blocks of binary image data. NumPy provides methods for manipulating in-memory blocks of data. 
Now, if I want to use NumPy to manipulate that data in place (for example, to cap the red component at 128, and equalise the range of the green component) my code needs to know the format of the memory block that PIL exposes. I am assuming that in-place manipulation is better, because there is no need for repeated copies of the data to be made (this would be true for large images). If PIL could expose a descriptor for its data structure, NumPy code could manipulate it in place without fear of corrupting it. Of course, this can be done by the end user reading the PIL documentation and transcribing the documented format into the NumPy code. But I would argue that it's better if the PIL block is self-describing in a way that avoids the need for a manual transcription of the format. To do this *without* needing the PIL and NumPy developers to co-operate needs an independent standard, which is what I assume this PEP is intended to provide. Paul. From jcarlson at uci.edu Sun Oct 29 18:35:57 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Sun, 29 Oct 2006 10:35:57 -0700 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <79990c6b0610290900y66f7756eqbc9000dfab1c010@mail.gmail.com> References: <45447E0E.6050106@v.loewis.de> <79990c6b0610290900y66f7756eqbc9000dfab1c010@mail.gmail.com> Message-ID: <20061029101933.0B0E.JCARLSON@uci.edu> "Paul Moore" wrote: > On 10/29/06, "Martin v. L?wis" wrote: > > Travis E. Oliphant schrieb: > > > Remember the context that the data-format object is presented in. Two > > > packages need to share a chunk of memory (the package authors do not > > > know each other and only have and Python as a common reference). They > > > both want to describe that the memory they are sharing has some > > > underlying binary structure. > > > > Can you please give an example of such two packages, and an application > > that needs them share data? 
> > To do this *without* needing the PIL and NumPy developers to > co-operate needs an independent standard, which is what I assume this > PEP is intended to provide. One could also toss wxPython, VTK, or any one of the other GUI libraries into the mix for visualizing those images, of which wxPython just acquired no-copy display of PIL images, and being able to manipulate them with numpy (of which some wxPython built in classes use numpy to speed up manipulation) would be very useful. Of all of the intended uses, I'd say that zero-copy sharing of information on the graphics/visualization front is the most immediate 'people will be using it tomorrow' feature. I personally don't have my pulse on the Scientific Python community, so I don't know about other uses, but in regards to Martin's list of missing features: "pointers, unions, function pointers, alignment/packing [, etc.]" I'm going to go out on a limb and say for the majority of those YAGNI, or really, NOHAFIAFACT (no one has asked for it, as far as I can tell). Someone who knows the scipy community, feel free to correct me. - Josiah From martin at v.loewis.de Sun Oct 29 19:48:46 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 29 Oct 2006 19:48:46 +0100 Subject: [Python-Dev] build bots, log output In-Reply-To: <200610292220.14695.anthony@interlink.com.au> References: <200610292220.14695.anthony@interlink.com.au> Message-ID: <4544F78E.8010101@v.loewis.de> Anthony Baxter schrieb: > A better solution (awaiting sufficient round-tuits) would be to add an option > to regrtest that's used by the buildslaves that uses particularly markup > around success/fail indications. The buildmaster can pick those up, and keep > track of existing pass/fails. Then it could send an email only when one > changes. We might also add a daily or every couple of days reminder > saying "The following tests are failing on the following platforms, and have > been for X days now". 
As yet another alternative, we could put the names of the builders on which builds are expected to fail (or the system names of these systems) into the test cases, and then report "expected failures"; regrtest would give a "success" status if all failures are expected. The consequence would be that these systems would appear "green" on the buildbot page, and you'd have to look into the log file to find out which of the expected failures actually happened. This all could work without changes to buildbot at all. Regards, Martin From martin at v.loewis.de Sun Oct 29 20:30:26 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 29 Oct 2006 20:30:26 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <79990c6b0610290900y66f7756eqbc9000dfab1c010@mail.gmail.com> References: <20061028135415.GA13049@code0.codespeak.net> <45445ECE.9050504@v.loewis.de> <45447E0E.6050106@v.loewis.de> <79990c6b0610290900y66f7756eqbc9000dfab1c010@mail.gmail.com> Message-ID: <45450152.5000706@v.loewis.de> Paul Moore schrieb: > Here's an example. PIL handles images (in various formats) in memory, > as blocks of binary image data. NumPy provides methods for > manipulating in-memory blocks of data. Now, if I want to use NumPy to > manipulate that data in place (for example, to cap the red component > at 128, and equalise the range of the green component) my code needs > to know the format of the memory block that PIL exposes. I am assuming > that in-place manipulation is better, because there is no need for > repeated copies of the data to be made (this would be true for large > images). Thanks, that looks like a good example. Is it possible to elaborate that? E.g. what specific image format would I use (could that work for jpeg, even though this format has compression in it), and what specific NumPy routines would I use to implement the capping and equalising? 
What would the datatype description look like that those tools need to exchange? Looking at this in more detail, PIL in-memory images (ImagingCore objects) either have the image8 UINT8**, or the image32 INT32**; they have separate fields for pixelsize and linesize. In the image8 case, there are three options: - each value is an 8-bit integer (IMAGING_TYPE_UINT8) (1) - each value is a 16-bit integer, either little (2) or big endian (3) (IMAGING_TYPE_SPECIAL, mode either I;16 or I;16B) In the image32 case, there are five options: - two 8-bit values per four bytes, namely byte 0 and byte 3 (4) - three 8-bit values (bytes 0, 1, 2) (5) - four 8-bit values (6) - a single 32-bit int (7) - a single 32-bit float (8) Now, what would be the algorithm in NumPy that I could use to implement capping and equalising? > If PIL could expose a descriptor for its data structure, NumPy code > could manipulate it in place without fear of corrupting it. Of course, > this can be done by the end user reading the PIL documentation and > transcribing the documented format into the NumPy code. But I would > argue that it's better if the PIL block is self-describing in a way > that avoids the need for a manual transcription of the format. Without digging further, I think some of the formats simply don't allow for the kind of manipulation you suggest, namely all palette formats (which are the single-valued ones, plus the two-band version with a palette number and an alpha value), and greyscale images. So in any case, the application has to look at the mode of the image to find out whether the operation is even meaningful. And then, the application has to tell NumPy somehow what fields to operate on. > To do this *without* needing the PIL and NumPy developers to > co-operate needs an independent standard, which is what I assume this > PEP is intended to provide. Ok, I now understand the goal, although I still like to understand this usecase better. 
Regards, Martin From bjourne at gmail.com Sun Oct 29 20:33:13 2006 From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=) Date: Sun, 29 Oct 2006 20:33:13 +0100 Subject: [Python-Dev] PEP 355 status In-Reply-To: <4543B0CF.1000300@acm.org> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <4543B0CF.1000300@acm.org> Message-ID: <740c3aec0610291133i225e9d8er2b8f7f8afac03bd5@mail.gmail.com> On 10/28/06, Talin wrote: > BJ?rn Lindqvist wrote: > > I'd like to write a post mortem for PEP 355. But one important > > question that haven't been answered is if there is a possibility for a > > path-like PEP to succeed in the future? If so, does the path-object > > implementation have to prove itself in the wild before it can be > > included in Python? From earlier posts it seems like you don't like > > the concept of path objects, which others have found very interesting. > > If that is the case, then it would be nice to hear it explicitly. :) > > So...how's that post mortem coming along? Did you get a sufficient > answer to your questions? Yes and no. All posts have very exhaustively explained why the implementation in PEP 355 is far from optimal. And I can see why it is. However, what I am uncertain of is Guido's opinion on the background and motivation of the PEP: "Many have felt that the API for manipulating file paths as offered in the os.path module is inadequate." "Currently, Python has a large number of different functions scattered over half a dozen modules for handling paths. This makes it hard for newbies and experienced developers to to choose the right method." IMHO, the current API is very messy. But when it comes to PEPs, it is mostly Guido's opinion that counts. :) Unless he sees a problem with the current situation, then there is no point in writing more PEPs. 
> And the more interesting question is, will the effort to reform Python's > path functionality continue? I certainly hope so. But maybe it is better to target Python 3000, or maybe the Python devs already have ideas for how they want the path APIs to look? > So what happens next? I really hope that Guido will give his input when he has more time. Mvh Björn From martin at v.loewis.de Sun Oct 29 20:37:09 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 29 Oct 2006 20:37:09 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <20061029101933.0B0E.JCARLSON@uci.edu> References: <45447E0E.6050106@v.loewis.de> <79990c6b0610290900y66f7756eqbc9000dfab1c010@mail.gmail.com> <20061029101933.0B0E.JCARLSON@uci.edu> Message-ID: <454502E5.80202@v.loewis.de> Josiah Carlson schrieb: > One could also toss wxPython, VTK, or any one of the other GUI libraries > into the mix for visualizing those images, of which wxPython just > acquired no-copy display of PIL images, and being able to manipulate > them with numpy (of which some wxPython built in classes use numpy to > speed up manipulation) would be very useful. I'm doubtful that this PEP alone would allow zero-copy sharing of images for display. Often, the libraries need the data in a different format. So they need to copy, even if they could understand the other format. However, the PEP won't allow "understanding" the format. If I know I have an array of 4-byte values: which of them is R, G, B, and A?
Regards, Martin From talin at acm.org Sun Oct 29 20:56:19 2006 From: talin at acm.org (Talin) Date: Sun, 29 Oct 2006 11:56:19 -0800 Subject: [Python-Dev] PEP 355 status In-Reply-To: <740c3aec0610291133i225e9d8er2b8f7f8afac03bd5@mail.gmail.com> References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm> <2mk63lfu6j.fsf@starship.python.net> <021c01c6e4de$7b1a6d80$9a4c2a97@bagio> <740c3aec0610241711j30f4beaepf294a7e3772bf70e@mail.gmail.com> <4543B0CF.1000300@acm.org> <740c3aec0610291133i225e9d8er2b8f7f8afac03bd5@mail.gmail.com> Message-ID: <45450763.1030000@acm.org> Björn Lindqvist wrote: > On 10/28/06, Talin wrote: >> Björn Lindqvist wrote: >> > I'd like to write a post mortem for PEP 355. But one important >> > question that hasn't been answered is if there is a possibility for a >> > path-like PEP to succeed in the future? If so, does the path-object >> > implementation have to prove itself in the wild before it can be >> > included in Python? From earlier posts it seems like you don't like >> > the concept of path objects, which others have found very interesting. >> > If that is the case, then it would be nice to hear it explicitly. :) >> >> So...how's that post mortem coming along? Did you get a sufficient >> answer to your questions? > > Yes and no. All posts have very exhaustively explained why the > implementation in PEP 355 is far from optimal. And I can see why it > is. However, what I am uncertain of is Guido's opinion on the > background and motivation of the PEP: > > "Many have felt that the API for manipulating file paths as offered in > the os.path module is inadequate." > > "Currently, Python has a large number of different functions scattered > over half a dozen modules for handling paths. This makes it hard for > newbies and experienced developers to choose the right method." > > IMHO, the current API is very messy. But when it comes to PEPs, it is > mostly Guido's opinion that counts.
:) Unless he sees a problem with > the current situation, then there is no point in writing more PEPs. > >> And the more interesting question is, will the effort to reform Python's >> path functionality continue? > > I certainly hope so. But maybe it is better to target Python 3000, or > maybe the Python devs already have ideas for how they want the path > APIs to look? I think targeting Py3K is a good idea. The whole purpose of Py3K is to "clean up the messes" of past decisions, and to that end, a certain amount of backwards-compatibility breakage will be allowed (although if that can be avoided, so much the better.) And to the second point, having been following the Py3K list, I don't think anyone has expressed any preconceived notions of how they want things to look (well, except I know I do, but I'm not a core dev :) :). >> So what happens next? > > I really hope that Guido will give his input when he has more time. First bit of advice is, don't hold your breath. Second bit of advice is, if you really do want Guido's feedback (or the core python devs), start by creating a (short) list of the outstanding points of controversy to be resolved. Once those issues have been decided, then proceed to the next stage, building consensus by increments. Basically, anything that requires Guido to read more than a page of material isn't going to get done quickly. At least, in my experience :) > Mvh Björn From jcarlson at uci.edu Sun Oct 29 20:13:19 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Sun, 29 Oct 2006 12:13:19 -0700 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <454502E5.80202@v.loewis.de> References: <20061029101933.0B0E.JCARLSON@uci.edu> <454502E5.80202@v.loewis.de> Message-ID: <20061029120129.0B1E.JCARLSON@uci.edu> "Martin v.
Löwis" wrote: > Josiah Carlson schrieb: > > One could also toss wxPython, VTK, or any one of the other GUI libraries > > into the mix for visualizing those images, of which wxPython just > > acquired no-copy display of PIL images, and being able to manipulate > > them with numpy (of which some wxPython built in classes use numpy to > > speed up manipulation) would be very useful. > > I'm doubtful that this PEP alone would allow zero-copy sharing of images > for display. Often, the libraries need the data in a different format. > So they need to copy, even if they could understand the other format. > However, the PEP won't allow "understanding" the format. If I know I > have an array of 4-byte values: which of them is R, G, B, and A? ...in the cases I have seen, which includes BMP, TGA, uncompressed TIFF, a handful of platform-specific bitmap formats, etc., you _always_ get them in RGBA order. If the alpha channel is to be left out, then you get them as RGB. The trick with allowing zero-copy sharing is 1) to understand the format, and 2) to manipulate/display in-place. The former is necessary for the latter, which is what Travis is shooting for. Also, because wxPython has figured out how PIL images are structured, they can do #2, and so far no one has mentioned any examples where the standard RGB/RGBA format hasn't worked for them. In the case of jpegs (as you mentioned in another message), PIL uncompresses all images it understands into some kind of 'natural' format (from what I understand). For 24/32 bit images, that is RGB or RGBA. For palettized images (gif, 8-bit png, 8-bit bmp, etc.) maybe it is a palettized format, or maybe it is RGB/RGBA? I don't know, all of my images are 24/32 bit, but I can just about guarantee it's not an issue for the case that Paul mentioned.
- Josiah From brett at python.org Sun Oct 29 21:58:05 2006 From: brett at python.org (Brett Cannon) Date: Sun, 29 Oct 2006 12:58:05 -0800 Subject: [Python-Dev] Status of new issue tracker Message-ID: The initial admins for the Roundup installation have been chosen: Paul DuBois, Michael Twomey, Stefan Seefeld, and Erik Forsberg. The offer from Upfront Systems (http://www.upfrontsystems.co.za/) has been accepted for professional Roundup hosting. Discussion of how to handle the new tracker (including the design of it, handling the transition, etc.) will take place on the tracker-discuss mailing list (http://mail.python.org/mailman/listinfo/tracker-discuss). If you want to provide input on what you want the new tracker to do, please join the list. Input from members of python-dev will take precedence so please participate if you have any interest. I don't have a timeline on when all of this will happen (talks amongst the four admins have already started on the mailing list and Upfront has started the process of getting us our account). The first step is to get the admins situated with their new server. Then we start worrying about what info we want the tracker to store and how to transition off of SF. -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061029/460f52c4/attachment.html From greg.ewing at canterbury.ac.nz Sun Oct 29 23:58:20 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 30 Oct 2006 11:58:20 +1300 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com> <4543FFA2.30002@canterbury.ac.nz> Message-ID: <4545320C.603@canterbury.ac.nz> Travis E. Oliphant wrote: > Greg Ewing wrote: >>What exactly does "bit" mean in that context? > > Do you mean "big" ?
No, you've got a data type there called "bit", which seems to imply a size, in contradiction to the size-independent nature of the other types. I'm asking what size-independent information it's meant to convey. -- Greg From greg.ewing at canterbury.ac.nz Mon Oct 30 00:37:54 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 30 Oct 2006 12:37:54 +1300 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com> <4543AE87.7080909@v.loewis.de> <454463C4.1080009@v.loewis.de> Message-ID: <45453B52.5030802@canterbury.ac.nz> Travis E. Oliphant wrote: > Martin v. Löwis wrote: > >>Travis E. Oliphant schrieb: >>Is it the intent of this PEP to support such data structures, >>and allow the user to fill in a Unicode object, and then the >>processing is automatic? > No, the point of the data-format object is to communicate information > about data-formats not to encode or decode anything. Well, there's still the issue of how much detail you want to be able to convey, so I think the question is valid. Is the encoding of a Unicode string something we want to be able to communicate via this mechanism, or is that outside its scope? -- Greg From greg.ewing at canterbury.ac.nz Mon Oct 30 00:38:01 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 30 Oct 2006 12:38:01 +1300 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <20061029120129.0B1E.JCARLSON@uci.edu> References: <20061029101933.0B0E.JCARLSON@uci.edu> <454502E5.80202@v.loewis.de> <20061029120129.0B1E.JCARLSON@uci.edu> Message-ID: <45453B59.3070406@canterbury.ac.nz> Josiah Carlson wrote: > ...in the cases I have seen ... you _always_ get > them in RGBA order. Except when you don't. I've had cases where I've had to convert between RGBA and BGRA (for stuffing directly into a frame buffer on Linux, as far as I remember).
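[The conversion Greg describes is mechanically trivial — it is *knowing it is needed* that requires the format description. A sketch added here for illustration; the 4-byte pixel layout is an assumption.]

```python
# With only "an array of 4-byte pixels" you cannot tell RGBA from BGRA;
# converting between them is a byte swap the consumer must know to do.
def rgba_to_bgra(buf):
    """Swap the R and B bytes of every 4-byte pixel, in place."""
    for i in range(0, len(buf), 4):
        buf[i], buf[i + 2] = buf[i + 2], buf[i]

px = bytearray([1, 2, 3, 4])  # one pixel: R=1, G=2, B=3, A=4
rgba_to_bgra(px)              # now the B byte (3) comes first
```

The same swap converts back, so applying it twice is a no-op — which is also why getting the channel order wrong silently produces plausible-looking but wrong images.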
So it may be worth including some features in the standard for describing pixel formats. Pygame seems to have a very detailed and flexible system for doing this, so it might be a good idea to have a look at that. -- Greg From skip at pobox.com Mon Oct 30 01:45:42 2006 From: skip at pobox.com (skip at pobox.com) Date: Sun, 29 Oct 2006 18:45:42 -0600 Subject: [Python-Dev] test_codecs failures Message-ID: <17733.19254.732407.478451@montanaro.dyndns.org> I recently began running a Pybots buildslave for SQLAlchemy. I am still struggling to get that working correctly. Today, Python's test_codecs test began failing: test test_codecs failed -- Traceback (most recent call last): File "/Library/Buildbot/pybot/trunk.montanaro-g5/build/Lib/test/test_codecs.py", line 1165, in test_basics encoder = codecs.getincrementalencoder(encoding)("ignore") File "/Library/Buildbot/pybot/trunk.montanaro-g5/build/Lib/encodings/bz2_codec.py", line 56, in __init__ assert errors == 'strict' AssertionError This failure seems to coincide with some checkins by Georg. Full output here: http://www.python.org/dev/buildbot/community/all/?show=g5%20OSX%20trunk&show=g5%20OSX%202.5 Skip From nnorwitz at gmail.com Mon Oct 30 01:48:51 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Sun, 29 Oct 2006 16:48:51 -0800 Subject: [Python-Dev] test_codecs failures In-Reply-To: <17733.19254.732407.478451@montanaro.dyndns.org> References: <17733.19254.732407.478451@montanaro.dyndns.org> Message-ID: On 10/29/06, skip at pobox.com wrote: > I recently began running a Pybots buildslave for SQLAlchemy. I am still > struggling to get that working correctly. Today, Python's test_codecs test > began failing: I checked in a fix for this that hasn't quite completed yet. (Only finished successfully on one box so far.) So this should be taken care of. I *think* the fix was correct, but I'm not entirely positive. Also the refleak problem is fixed AFAIK. 
n From walter at livinglogic.de Mon Oct 30 01:51:38 2006 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Mon, 30 Oct 2006 01:51:38 +0100 Subject: [Python-Dev] test_codecs failures In-Reply-To: References: <17733.19254.732407.478451@montanaro.dyndns.org> Message-ID: <45454C9A.4040903@livinglogic.de> Neal Norwitz wrote: > On 10/29/06, skip at pobox.com wrote: >> I recently began running a Pybots buildslave for SQLAlchemy. I am still >> struggling to get that working correctly. Today, Python's test_codecs test >> began failing: > > I checked in a fix for this that hasn't quite completed yet. (Only > finished successfully on one box so far.) So this should be taken > care of. I *think* the fix was correct, but I'm not entirely > positive. The fix *is* indeed correct. bz2 didn't get built on my box, so I didn't see the failure. Servus, Walter From ncoghlan at gmail.com Mon Oct 30 11:00:42 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 30 Oct 2006 20:00:42 +1000 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: Message-ID: <4545CD4A.8040302@gmail.com> Neal Becker wrote: > I have watched numpy with interest for a long time. My own interest is to > possibly use the c-api to wrap c++ algorithms to use from python. > > One thing that has concerned me, and continues to concern me with this > proposal, is that it seems to suffer from a very fat interface. I > certainly have not studied the options in any depth, but my gut feeling is > that the interface is too fat and too complex. I wonder if it's possible > to avoid this. I wonder if this is an example of all the methods sinking > to the base class. You've just described my number #1 concern with incorporating NumPy wholesale, and the reason I believe it would be nice to cherry-pick a couple of key components for the standard library, rather than adopting the whole thing. 
Travis has done a lot of work towards that goal (the latest result of which is this pre-PEP for describing the individual array elements in a way that is more flexible than the single character codes of the current array module). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From Jack.Jansen at cwi.nl Mon Oct 30 16:40:19 2006 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Mon, 30 Oct 2006 16:40:19 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: Message-ID: Would it be possible to make the data-type objects subclassable, with the subclasses being able to override the equality test? The range of data types that you've specified in the PEP are good enough for most general use, and probably for NumPy as well, but someone already came up with the example of image formats, which have their whole own range of data formats. I could throw in audio formats (bits per sample, excess-N or signed or ulaw samples, mono/stereo/5.1/ etc, order of the channels), and there's probably a whole slew of other areas that have their own sets of formats. If the datatype objects are subclassable, modules could initially start by adding their own formats. So, the "jackaudio" and "jillaudio" modules would have distinct sets of formats. But then later on it should be fairly easy for them to recognize each others formats. So, jackaudio would recognize the jillaudio format "msdos linear pcm" as being identical to its own "16-bit excess-32768". Hopefully eventually all audio module writers would get together and define a set of standard audio formats. -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/python-dev/attachments/20061030/7ae95590/attachment.htm -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 2255 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20061030/7ae95590/attachment.bin From deets at web.de Mon Oct 30 18:19:19 2006 From: deets at web.de (Diez B. Roggisch) Date: Mon, 30 Oct 2006 18:19:19 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <20061029120129.0B1E.JCARLSON@uci.edu> References: <20061029101933.0B0E.JCARLSON@uci.edu> <454502E5.80202@v.loewis.de> <20061029120129.0B1E.JCARLSON@uci.edu> Message-ID: <200610301819.20117.deets@web.de> > ...in the cases I have seen, which includes BMP, TGA, uncompressed TIFF, > a handful of platform-specific bitmap formats, etc., you _always_ get > them in RGBA order. If the alpha channel is to be left out, then you > get them as RGB. Mac OS X unfortunately uses ARGB. Writing some AltiVec code remedied that for passing it around to the OpenCV library. Just my $.02 Diez From jimjjewett at gmail.com Mon Oct 30 18:56:18 2006 From: jimjjewett at gmail.com (Jim Jewett) Date: Mon, 30 Oct 2006 12:56:18 -0500 Subject: [Python-Dev] PEP: Adding data-type objects to Python Message-ID: Travis E. Oliphant wrote: > Two packages need to share a chunk of memory (the package authors do not > know each other and only have and Python as a common reference). They > both want to describe that the memory they are sharing has some > underlying binary structure. As a quick sanity check, please tell me where I went off track. it sounds to me like you are assuming that: (1) The memory chunk represents a single object (probably an array of some sort) (2) That subchunks can themselves be described by a (single?) repeating C struct. (3) You can't just use the C header, since you want this at run-time.
(4) It would be enough if you could say This is an array of 500 elements that look like struct { int simple; struct nested { char name[30]; char addr[45]; int amount; } } (5) But is it not acceptable to use Martin's suggested ctypes equivalent of (building out from the inside): class nested(Structure): _fields_ = [("name", c_char*30), ("addr", c_char*45), ("amount", c_long)] class struct(Structure): _fields_ = [("simple", c_int), ("nested", nested)] struct * 500 If I misunderstood, could you show me where? If I did understand correctly, could you expand on why (5) is unacceptable, given that ctypes is now in the core? (New and unknown, I would understand -- but that is also true of any datatype proposal, for the people who haven't already used it. I suspect that any differences from Numpy would be a source of pain for those who *have* used Numpy, but following Numpy exactly is ... not much simpler than the above.) Or are you just saying that "anything with a buffer interface should also have a datatype object describing the layout in a standard way"? If so, that makes sense, but I'm inclined to prefer the ctypes way, so that most people won't ever have to worry about things like endianness/strides/Fortran layout. -jJ From oliphant.travis at ieee.org Mon Oct 30 22:26:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon, 30 Oct 2006 14:26:02 -0700 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <454502E5.80202@v.loewis.de> References: <45447E0E.6050106@v.loewis.de> <79990c6b0610290900y66f7756eqbc9000dfab1c010@mail.gmail.com> <20061029101933.0B0E.JCARLSON@uci.edu> <454502E5.80202@v.loewis.de> Message-ID: Martin v.
Löwis wrote: > Josiah Carlson schrieb: > >>One could also toss wxPython, VTK, or any one of the other GUI libraries >>into the mix for visualizing those images, of which wxPython just >>acquired no-copy display of PIL images, and being able to manipulate >>them with numpy (of which some wxPython built in classes use numpy to >>speed up manipulation) would be very useful. > > > I'm doubtful that this PEP alone would allow zero-copy sharing of images > for display. Often, the libraries need the data in a different format. > So they need to copy, even if they could understand the other format. > However, the PEP won't allow "understanding" the format. If I know I > have an array of 4-byte values: which of them is R, G, B, and A? > You give a name to the fields: 'R', 'G', 'B', and 'A'. -Travis From oliphant.travis at ieee.org Mon Oct 30 22:44:22 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon, 30 Oct 2006 14:44:22 -0700 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: Message-ID: Jim Jewett wrote: > Travis E. Oliphant wrote: > > >>Two packages need to share a chunk of memory (the package authors do not >>know each other and only have and Python as a common reference). They >>both want to describe that the memory they are sharing has some >>underlying binary structure. > > > As a quick sanity check, please tell me where I went off track. > > it sounds to me like you are assuming that: > > (1) The memory chunk represents a single object (probably an array of > some sort) > (2) That subchunks can themselves be described by a (single?) > repeating C struct. > (3) You can't just use the C header, since you want this at run-time. > (4) It would be enough if you could say > > This is an array of 500 elements that look like > > struct { > int simple; > struct nested { > char name[30]; > char addr[45]; > int amount; > } > Sure. I think that's pretty much it.
I assume you mean object in the general sense and not as in (Python object). > (5) But is it not acceptable to use Martin's suggested ctypes > equivalent of (building out from the inside): Part of the problem is that ctypes uses a lot of different Python types (that's what I mean by "multi-object" to accomplish its goal). What I'm looking for is a single Python type that can be passed around and explains binary data. Remember the buffer protocol is in compiled code. So, as a result, 1) It's harder to construct a class to pass through the protocol using the multiple-types approach of ctypes. 2) It's harder to interpret the object received through the buffer protocol. Sure, it would be *possible* to use ctypes, but I think it would be very difficult. Think about how you would write the get_data_format C function in the extended buffer protocol for NumPy if you had to import ctypes and then build a class just to describe your data. How would you interpret what you get back? The ctypes "format-description" approach is not as unified as a single Python type object that I'm proposing. In NumPy, we have a very nice, compact description of complicated data already available. Why not use what we've learned? I don't think we should just *use ctypes because it's there* when the way it describes binary data was not constructed with the extended buffer protocol in mind. The other option, of course, which would not introduce a new Python type is to use the array interface specification and pass a list of tuples. But, I think this is also un-necessarily wasteful because the sending object has to construct it and the receiving object has to de-construct it. The whole point of the (extended) buffer protocol is to communicate this information more quickly.
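[Travis's "multi-object" point can be seen directly in the interpreter — an aside added here, reusing the field names from Jim's example; nothing in it is part of any proposed API.]

```python
# In ctypes, every layout description is itself a new Python *type*,
# produced by a ctypes metaclass -- the "multi-object" shape Travis is
# contrasting with instances of one data-format type.
import ctypes

class Nested(ctypes.Structure):
    _fields_ = [("name", ctypes.c_char * 30),
                ("addr", ctypes.c_char * 45),
                ("amount", ctypes.c_long)]

class Record(ctypes.Structure):
    _fields_ = [("simple", ctypes.c_int),
                ("nested", Nested)]

ArrayOf500 = Record * 500  # "an array of 500 elements that look like ..."
```

Several new type objects (plus the char-array types) describe one layout; C code receiving such a description through a buffer-protocol call would have to walk Python type objects rather than read one uniform descriptor.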
-Travis From oliphant.travis at ieee.org Mon Oct 30 22:50:34 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon, 30 Oct 2006 14:50:34 -0700 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <4545320C.603@canterbury.ac.nz> References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com> <4543FFA2.30002@canterbury.ac.nz> <4545320C.603@canterbury.ac.nz> Message-ID: Greg Ewing wrote: > Travis E. Oliphant wrote: > > >>Greg Ewing wrote: > > >>>What exactly does "bit" mean in that context? >> >>Do you mean "big" ? > > No, you've got a data type there called "bit", > which seems to imply a size, in contradiction > to the size-independent nature of the other > types. I'm asking what size-independent > information it's meant to convey. Ah. I see what you were saying now. I guess the 'bit' type is different (we actually don't have that type in NumPy so my understanding of it is limited). The 'bit' type re-interprets the size information to be in units of "bits" and so implies a "bit-field" instead of another data-format. -Travis From oliphant.travis at ieee.org Mon Oct 30 23:00:49 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon, 30 Oct 2006 15:00:49 -0700 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <45448058.5020700@v.loewis.de> References: <20061028135415.GA13049@code0.codespeak.net> <45445ECE.9050504@v.loewis.de> <45448058.5020700@v.loewis.de> Message-ID: Martin v. Löwis wrote: > Robert Kern schrieb: > >>>As a unification mechanism, I think it is insufficient. I doubt it >>>can express all the concepts that ctypes supports. >> >>What do you think is missing that can't be added? > > I can factually only report what is missing. Whether it can be added, I don't know. As I just wrote in a few other messages: pointers, unions, function pointers, packed structs, incomplete/recursive types. Also "flexible array members" (i.e. open-ended arrays).
> I understand function pointers, pointers, and unions. Function pointers are "supported" with the void data-type and could be more specifically supported if it were important. People typically don't use the buffer protocol to send function-pointers around in a way that the void description wouldn't be enough. Pointers are also "supported" with the void data-type. If pointers to other data-types were an important feature to support, then this could be added in many ways (a flag on the data-type object for example is how this is done in NumPy). Unions are actually supported (just define two fields with the same offset). I don't know what you mean by "packed structs" (unless you are talking about alignment issues in which case there is support for it). I'm not sure I understand what you mean by "incomplete / recursive" types unless you are referring to something like a node where an element of the structure is a pointer to another structure of the same kind (like used in linked-lists or trees). If that is the case, then it's easily supported once support for pointers is added. I also don't know what you mean by "open-ended arrays." The data-format is meant to describe a fixed-size chunk of data. String syntax is not needed to support all of these things. What I'm asking for and proposing is a way to construct an instance of a single Python type that communicates this data-format information in a standardized way. -Travis From martin at v.loewis.de Tue Oct 31 00:25:08 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 31 Oct 2006 00:25:08 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <20061028135415.GA13049@code0.codespeak.net> <45445ECE.9050504@v.loewis.de> <45448058.5020700@v.loewis.de> Message-ID: <454689D4.9040109@v.loewis.de> Travis Oliphant schrieb: > Function pointers are "supported" with the void data-type and could be > more specifically supported if it were important.
People typically > don't use the buffer protocol to send function-pointers around in a way > that the void description wouldn't be enough. As I said before, I can't tell whether it's important, as I still don't know what the purpose of this PEP is. If it is to support a unification of memory layout specifications, and if that unification is also to include ctypes, then yes, it is important. If it is to describe array elements in NumArray arrays, then it might not be important. For the usage of ctypes, the PEP void type is insufficient to describe function pointers: you also need a specification of the signature of the function pointer (parameter types and return type), or else you can't use the function pointer (i.e. you can't call the function). > Pointers are also "supported" with the void data-type. If pointers to > other data-types were an important feature to support, then this could > be added in many ways (a flag on the data-type object for example is how > this is done in NumPy). For ctypes, (I think) you need "true" pointers to other layouts, or else you couldn't set up the memory correctly. I don't understand how this could work with some extended buffer protocol, though: would a buffer still have to be a contiguous piece of memory? If you have structures with pointers in them, they rarely point to contiguous memory. > Unions are actually supported (just define two fields with the same > offset). Ah, ok. What's the string syntax for it? > I don't know what you mean by "packed structs" (unless you are talking > about alignment issues in which case there is support for it). Yes, this is indeed about alignment; I missed it. What's the string syntax for it? > I'm not sure I understand what you mean by "incomplete / recursive" > types unless you are referring to something like a node where an element > of the structure is a pointer to another structure of the same kind > (like used in linked-lists or trees).
If that is the case, then it's > easily supported once support for pointers is added. That's what I mean, yes. I'm not sure how it can easily be added, though. Suppose you want to describe struct item{ int key; char* value; struct item *next; }; How would you do that? Something like item = datatype([('key', 'i4'), ('value', 'S*'), ('next', 'what_to_put_here*')]) can't work: item hasn't been assigned, yet, so you can't use it as the field type. > I also don't know what you mean by "open-ended arrays." The data-format > is meant to describe a fixed-size chunk of data. I see. In C (and thus in ctypes), you sometimes have what C99 calls "flexible array member": struct PyString{ Py_ssize_t ob_refcnt; PyObject *ob_type; Py_ssize_t ob_len; char ob_sval[]; }; where the ob_sval field can extend arbitrarily, as it is the last member of the struct. Of course, this will give you dynamically-sized objects (objects in C cannot really be "variable-sized", since the size of a memory block has to be defined at allocation time, and can't really change afterwards). > String syntax is not needed to support all of these things. Ok. That's confusing in the PEP: it's not clear whether all these forms are meant to be equivalent, and, if not, which one is the most generic one, and what aspects are missing in what forms. Also, if you have a datatype which cannot be expressed in the string syntax, what is its "str" attribute? Regards, Martin From greg.ewing at canterbury.ac.nz Tue Oct 31 00:36:46 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 31 Oct 2006 12:36:46 +1300 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: Message-ID: <45468C8E.1000203@canterbury.ac.nz> Travis Oliphant wrote: > Part of the problem is that ctypes uses a lot of different Python types > (that's what I mean by "multi-object" to accomplish its goal). What > I'm looking for is a single Python type that can be passed around and > explains binary data.
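[An aside on Martin's struct item question, added here for illustration: ctypes itself resolves such self-references by declaring the incomplete type first and filling in `_fields_` afterwards. A sketch mirroring his C struct:]

```python
import ctypes

class Item(ctypes.Structure):
    pass  # declare the incomplete type first ...

Item._fields_ = [("key", ctypes.c_int),
                 ("value", ctypes.c_char_p),
                 ("next", ctypes.POINTER(Item))]  # ... then refer to it
```

A single-object descriptor would need some analogous forward-reference spelling, which is exactly the gap Martin is pointing at.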
It's not clear that multi-object is a bad thing in and of itself. It makes sense conceptually -- if you have a datatype object representing a struct, and you ask for a description of one of its fields, which could be another struct or array, you would expect to get another datatype object describing that. Can you elaborate on what would be wrong with this? Also, can you clarify whether your objection is to multi-object or multi-type. They're not the same thing -- you could have a data structure built out of multiple objects that are all of the same Python type, with attributes distinguishing between struct, array, etc. That would be single-type but multi-object. -- Greg From greg.ewing at canterbury.ac.nz Tue Oct 31 00:43:11 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 31 Oct 2006 12:43:11 +1300 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com> <4543FFA2.30002@canterbury.ac.nz> <4545320C.603@canterbury.ac.nz> Message-ID: <45468E0F.80000@canterbury.ac.nz> Travis Oliphant wrote: > The 'bit' type re-interprets the size information to be in units of "bits" > and so implies a "bit-field" instead of another data-format. Hmmm, okay, but now you've got another orthogonality problem, because you can't distinguish between e.g. a 5-bit signed int field and a 5-bit unsigned int field. It might be better not to consider "bit" to be a type at all, and come up with another way of indicating that the size is in bits.
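[For comparison — an aside added here, not part of the thread: ctypes keeps exactly this orthogonal by putting signedness in the base type and the bit width in a separate count, so 5-bit signed and 5-bit unsigned fields are distinct.]

```python
import ctypes

class Bits(ctypes.Structure):
    _fields_ = [("s", ctypes.c_int,  5),   # 5-bit signed field
                ("u", ctypes.c_uint, 5)]   # 5-bit unsigned field

b = Bits()
b.s = -1   # sign-extended on read-back: still -1
b.u = 31   # maximum value of an unsigned 5-bit field
```

(The MSB-to-LSB packing question Greg jokes about is real too: ctypes simply inherits whatever the platform C compiler does.)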
Perhaps

  'i4'   # 4-byte signed int
  'i4b'  # 4-bit signed int
  'u4'   # 4-byte unsigned int
  'u4b'  # 4-bit unsigned int

(Next we can have an argument about whether bit fields should be
packed MSB-to-LSB or vice versa... :-)

-- Greg

From greg.ewing at canterbury.ac.nz  Tue Oct 31 00:49:12 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 31 Oct 2006 12:49:12 +1300
Subject: [Python-Dev] PEP: Adding data-type objects to Python
References: <20061028135415.GA13049@code0.codespeak.net>
	<45445ECE.9050504@v.loewis.de> <45448058.5020700@v.loewis.de>
Message-ID: <45468F78.7050707@canterbury.ac.nz>

Travis Oliphant wrote:

> I'm not sure I understand what you mean by "incomplete / recursive"
> types unless you are referring to something like a node where an element
> of the structure is a pointer to another structure of the same kind
> (like used in linked-lists or trees).

Yes, and more complex arrangements of types that reference each other.

> If that is the case, then it's
> easily supported once support for pointers is added.

But it doesn't fit easily into the single-object model.

-- Greg

From oliphant.travis at ieee.org  Tue Oct 31 02:58:28 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Mon, 30 Oct 2006 18:58:28 -0700
Subject: [Python-Dev] PEP: Adding data-type objects to Python
In-Reply-To: <20061028135415.GA13049@code0.codespeak.net>
References: <20061028135415.GA13049@code0.codespeak.net>

Armin Rigo wrote:

> Hi Travis,
>
> On Fri, Oct 27, 2006 at 02:05:31PM -0600, Travis E. Oliphant wrote:
>
>> This PEP proposes adapting the data-type objects from NumPy for
>> inclusion in standard Python, to provide a consistent and standard
>> way to discuss the format of binary data.
>
> How does this compare with ctypes? Do we really need yet another,
> incompatible way to describe C-like data structures in the standard
> library?
There is a lot of subtlety in the details that IMHO clouds the central
issue, which I will try to clarify here the way I see it.

First of all:

In order to make sense of the data-format object that I'm proposing,
you have to see the need to share information about data-format
through an extended buffer protocol (which I will be proposing soon).
I'm not going to try to argue that right now, because there are a lot
of people who can do that.

So, I'm going to assume that you see the need for it. If you don't,
then just suspend concern about that for the moment. There are a lot
of us who really see the need for it.

Now:

To describe data-formats, ctypes uses a Python type-object defined for
every data-format you might need. In my view this is an over-use of
the type-object and in fact, to be useful, requires the definition of
a meta-type that carries the relevant additions to the type-object
that are needed to describe data (like function pointers to get data
in and out of Python objects).

My view is that it is unnecessary to use a different type object to
describe each different data-type. The route I'm proposing is to
define (in C) a *single* new Python type (called a data-format type)
that carries the information needed to describe a chunk of memory. In
this way *instances* of this new type define data-formats. In ctypes,
*instances* of the "meta-type" (i.e. new types) define data-formats
(actually I'm not sure if all the new c-types are derived from the
same meta-type).

So, the big difference is that I think data-formats should be
*instances* of a single type. There is no need to define a Python
type-object for every single data-type. In fact, not only is there no
need, it makes the extended buffer protocol I'm proposing even more
difficult to use and explain.

Again, my real purpose is the extended buffer protocol. This
data-format type is a means to that end.
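The single-type model described here can be sketched in a few lines of
Python. This is purely illustrative -- the `datatype` name and the
string/list argument forms follow the PEP's examples, but the class
below is a toy stand-in, not the proposed C implementation:

```python
# Illustrative toy only: one Python type, 'datatype', whose *instances*
# describe binary layouts -- as opposed to the ctypes model, in which
# every layout is itself a new Python type.

class datatype:
    """A single type; each instance describes one data-format."""

    def __init__(self, spec):
        if isinstance(spec, str):
            # Simple format string, e.g. 'i4' = signed 4-byte integer.
            self.kind = spec[0]
            self.itemsize = int(spec[1:])
            self.fields = None
        else:
            # Record format, e.g. [('key', 'i4'), ('value', 'S8')].
            self.kind = 'V'  # "void"/record kind, NumPy-style
            self.fields = [(name, datatype(fmt)) for name, fmt in spec]
            self.itemsize = sum(dt.itemsize for _, dt in self.fields)

int32 = datatype('i4')
record = datatype([('key', 'i4'), ('value', 'S8')])

# Every data-format is an *instance* of the same Python type:
assert type(int32) is type(record) is datatype
assert record.itemsize == 12
```

The point of the sketch is only the shape of the design: one type,
many instances, nested records described by nested instances.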
If the consensus is that nobody sees a greater use of the data-format
type beyond the buffer protocol, then I will just write one PEP for
the extended buffer protocol.

-Travis

From oliphant.travis at ieee.org  Tue Oct 31 03:00:59 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Mon, 30 Oct 2006 19:00:59 -0700
Subject: [Python-Dev] PEP: Adding data-type objects to Python
In-Reply-To: <45468C8E.1000203@canterbury.ac.nz>
References: <45468C8E.1000203@canterbury.ac.nz>

Greg Ewing wrote:

> Travis Oliphant wrote:
>
>> Part of the problem is that ctypes uses a lot of different Python types
>> (that's what I mean by "multi-object") to accomplish its goal. What
>> I'm looking for is a single Python type that can be passed around and
>> explains binary data.
>
> It's not clear that multi-object is a bad thing in and of itself. It
> makes sense conceptually -- if you have a datatype object representing
> a struct, and you ask for a description of one of its fields, which
> could be another struct or array, you would expect to get another
> datatype object describing that.
>
> Can you elaborate on what would be wrong with this?
>
> Also, can you clarify whether your objection is to multi-object or
> multi-type? They're not the same thing -- you could have a data
> structure built out of multiple objects that are all of the same
> Python type, with attributes distinguishing between struct, array,
> etc. That would be single-type but multi-object.

I've tried to clarify this in another post. Basically, what I don't
like about the ctypes approach is that it is multi-type (every new
data-format is a Python type). In order to talk about all these Python
types together, they must all share some attribute (or else be derived
from a meta-type in C with a specific function-pointer entry). I think
it is simpler to think of a single Python type whose instances convey
information about data-format.
-Travis

From oliphant.travis at ieee.org  Tue Oct 31 06:10:17 2006
From: oliphant.travis at ieee.org (Travis E. Oliphant)
Date: Mon, 30 Oct 2006 22:10:17 -0700
Subject: [Python-Dev] PEP: Adding data-type objects to Python
References: <45468C8E.1000203@canterbury.ac.nz>

Travis Oliphant wrote:

> Greg Ewing wrote:
>> Travis Oliphant wrote:
>>
>>> Part of the problem is that ctypes uses a lot of different Python types
>>> (that's what I mean by "multi-object") to accomplish its goal. What
>>> I'm looking for is a single Python type that can be passed around and
>>> explains binary data.
>>
>> It's not clear that multi-object is a bad thing in and of itself. It
>> makes sense conceptually -- if you have a datatype object representing
>> a struct, and you ask for a description of one of its fields, which
>> could be another struct or array, you would expect to get another
>> datatype object describing that.

Yes, exactly. This is what the Python type I'm proposing does as well.
So, perhaps we are misunderstanding each other. The difference is that
data-types are instances of the data-type (data-format) object instead
of new Python types (as they are in ctypes).

> I've tried to clarify this in another post. Basically, what I don't
> like about the ctypes approach is that it is multi-type (every new
> data-format is a Python type).

I should clarify that I have no opinion about the ctypes approach for
what ctypes does with it. I like ctypes and have adapted NumPy to make
it easier to work with ctypes.

I'm saying that I don't like the idea of forcing this approach on
everybody else who wants to describe arbitrary binary data just
because ctypes is included. Now, if it is shown that it is indeed
better than the simpler instances-of-a-single-type approach that I'm
basically proposing, then I'll be persuaded.
However, the existence of an alternative strategy using a single
Python type and multiple instances of that type to describe binary
data (which is the NumPy approach and essentially the array module
approach) means that we can't just assume a priori that the way ctypes
did it is the only or best way.

The examples of "missing features" that Martin has exposed are not
show-stoppers. They can all be easily handled within the context of
what is being proposed. I can modify the PEP to show this. But, I
don't have the time to spend if it's just all going to be rejected in
the end. I need some encouragement in order to continue to invest
energy in pushing this forward.

-Travis

From oliphant.travis at ieee.org  Tue Oct 31 06:51:18 2006
From: oliphant.travis at ieee.org (Travis E. Oliphant)
Date: Mon, 30 Oct 2006 22:51:18 -0700
Subject: [Python-Dev] PEP: Adding data-type objects to Python
In-Reply-To: <45468E0F.80000@canterbury.ac.nz>
References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com>
	<4543FFA2.30002@canterbury.ac.nz> <4545320C.603@canterbury.ac.nz>
	<45468E0F.80000@canterbury.ac.nz>

Greg Ewing wrote:

> Travis Oliphant wrote:
>
>> The 'bit' type re-interprets the size information to be in units of
>> "bits" and so implies a "bit-field" instead of another data-format.
>
> Hmmm, okay, but now you've got another orthogonality problem, because
> you can't distinguish between e.g. a 5-bit signed int field and a
> 5-bit unsigned int field.

Good point.

> It might be better not to consider "bit" to be a type at all, and
> come up with another way of indicating that the size is in bits.
> Perhaps
>
>   'i4'   # 4-byte signed int
>   'i4b'  # 4-bit signed int
>   'u4'   # 4-byte unsigned int
>   'u4b'  # 4-bit unsigned int

I like this. Very nice. I think that's the right way to look at it.

> (Next we can have an argument about whether bit fields should be
> packed MSB-to-LSB or vice versa... :-)

I guess we need another flag / attribute to indicate that.
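Either suffix convention is trivial to parse mechanically. A rough
sketch of a parser for the 'i4' / 'i4b' form discussed above
(hypothetical -- this syntax was only a proposal at this point, and
the function name is invented):

```python
# Hypothetical parser for the proposed 'i4' / 'i4b' convention:
# a trailing 'b' reinterprets the size as bits rather than bytes.

def parse_format(code):
    kind = code[0]  # 'i' = signed int, 'u' = unsigned int, ...
    if code.endswith('b'):
        size, unit = int(code[1:-1]), 'bits'
    else:
        size, unit = int(code[1:]), 'bytes'
    return kind, size, unit

assert parse_format('i4') == ('i', 4, 'bytes')
assert parse_format('i4b') == ('i', 4, 'bits')
assert parse_format('u4b') == ('u', 4, 'bits')
```

Gareth McCaughan's later 'ib4' variant would move the 'b' next to the
kind letter, which avoids the lookahead on the final character.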
The other thing that needs to be discussed at some point may be a way
to indicate the floating-point format. I've basically punted on this
and just meant 'f' to mean "platform float". Thus, you can't use the
data-type object to pass information between two platforms that don't
share a common floating-point representation.

-Travis

From oliphant.travis at ieee.org  Tue Oct 31 07:12:48 2006
From: oliphant.travis at ieee.org (Travis E. Oliphant)
Date: Mon, 30 Oct 2006 23:12:48 -0700
Subject: [Python-Dev] PEP: Extending the buffer protocol to share array
	information.

Attached is my PEP for extending the buffer protocol to allow array
data to be shared.

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: pep_buffer.txt
Url: http://mail.python.org/pipermail/python-dev/attachments/20061030/90c68b35/attachment.txt

From oliphant.travis at ieee.org  Tue Oct 31 07:32:47 2006
From: oliphant.travis at ieee.org (Travis E. Oliphant)
Date: Mon, 30 Oct 2006 23:32:47 -0700
Subject: [Python-Dev] PEP: Adding data-type objects to Python
In-Reply-To: <4543B016.7070002@egenix.com>
References: <45431761.1020401@egenix.com> <45439D17.5010306@egenix.com>
	<4543B016.7070002@egenix.com>
Message-ID: <4546EE0F.90000@ieee.org>

M.-A. Lemburg wrote:

> Travis E. Oliphant wrote:
>
> I understand and that's why I'm asking why you made the range
> explicit in the definition.

In the case of NumPy it was so that String and Unicode arrays would
both look like multi-length string "character" arrays and not arrays
of arrays of some character. But, this can change in the data-format
object. I can see that the Unicode description needs to be improved.

> The definition should talk about Unicode code points.
> The number of bytes then determines whether you can only
> represent the ASCII subset (1 byte), UCS2 (2 bytes, BMP only)
> or UCS4 (4 bytes, all currently assigned code points).

Yes, you are correct.
A string of unicode characters should really be represented in the
same way that an array of integers is represented for a data-format
object.

-Travis

From martin at v.loewis.de  Tue Oct 31 08:51:25 2006
From: martin at v.loewis.de (Martin v. Löwis)
Date: Tue, 31 Oct 2006 08:51:25 +0100
Subject: [Python-Dev] PEP: Adding data-type objects to Python
References: <20061028135415.GA13049@code0.codespeak.net>
Message-ID: <4547007D.30404@v.loewis.de>

Travis Oliphant schrieb:

> So, the big difference is that I think data-formats should be
> *instances* of a single type.

This is nearly the case for ctypes as well. All layout descriptions
are instances of the type type. Nearly, because they are instances of
subtypes of the type type:

py> type(ctypes.c_long)
<type '_ctypes.SimpleType'>
py> type(ctypes.c_double)
<type '_ctypes.SimpleType'>
py> type(ctypes.c_double).__bases__
(<type 'type'>,)
py> type(ctypes.Structure)
<type '_ctypes.StructType'>
py> type(ctypes.Array)
<type '_ctypes.ArrayType'>
py> type(ctypes.Structure).__bases__
(<type 'type'>,)
py> type(ctypes.Array).__bases__
(<type 'type'>,)

So if your requirement is "all layout descriptions ought to have the
same type", then this is (nearly) the case: they are instances of type
(rather than datatype, as in your PEP).

Regards,
Martin

From p.f.moore at gmail.com  Tue Oct 31 10:47:08 2006
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 31 Oct 2006 09:47:08 +0000
Subject: [Python-Dev] PEP: Adding data-type objects to Python
References: <20061028135415.GA13049@code0.codespeak.net>
Message-ID: <79990c6b0610310147q74851b19v55e7caab6f87c444@mail.gmail.com>

On 10/31/06, Travis Oliphant wrote:

> In order to make sense of the data-format object that I'm proposing you
> have to see the need to share information about data-format through an
> extended buffer protocol (which I will be proposing soon). I'm not
> going to try to argue that right now because there are a lot of people
> who can do that.
>
> So, I'm going to assume that you see the need for it. If you don't,
> then just suspend concern about that for the moment.
> There are a lot of
> us who really see the need for it.

[...]

> Again, my real purpose is the extended buffer protocol. This
> data-format type is a means to that end. If the consensus is that
> nobody sees a greater use of the data-format type beyond the buffer
> protocol, then I will just write one PEP for the extended buffer
> protocol.

While I don't personally use NumPy, I can see where an extended buffer
protocol like you describe could be advantageous, and so I'm happy to
concede that benefit. I can also vaguely see that a unified "block of
memory description" would be useful.

My interest would be in the area of the struct module (unpacking and
packing data for dumping to byte streams - whether this happens in
place or not is not too important to this use case). However, I cannot
see how your proposal would help here in practice - does it include
the functionality of the struct module (or should it)? If so, then I'd
like to see examples of equivalent constructs. If not, then isn't it
yet another variation on the theme, adding to the problem of multiple
approaches rather than helping?

I can also see the parallels with ctypes. Here I feel a little less
sure that keeping the two approaches is wrong. I don't know why I feel
like that - maybe nothing more than familiarity with ctypes - but I
don't have the same reluctance to have both the ctypes data definition
stuff and the new datatype proposal.

Enough of the abstract. As a concrete example, suppose I have a (byte)
string in my program containing some binary data - an ID3 header, or a
TCP packet, or whatever. It doesn't really matter. Does your proposal
offer anything to me in how I might manipulate that data (assuming I'm
not using NumPy)? (I'm not insisting that it should, I'm just trying
to understand the scope of the PEP).

Paul.
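For comparison, Paul's concrete use case is what the struct module
already handles today, so any datatype-based equivalent would need to
cover at least this much. A small sketch (the 10-byte ID3v2 header
layout is real; decoding the sync-safe size field is deliberately
elided, and the sample bytes are invented):

```python
import struct

# An ID3v2 tag header: 3-byte identifier, two version bytes, one flags
# byte, and a 4-byte big-endian size field (sync-safe decoding elided).
header = b'ID3\x03\x00\x00\x00\x00\x02\x01'

# '>' = big-endian, no padding; '3s' = 3 raw bytes; 'B' = unsigned
# byte; 'I' = unsigned 4-byte int.
tag, major, minor, flags, size = struct.unpack('>3sBBBI', header)
assert tag == b'ID3'
assert (major, minor, flags) == (3, 0, 0)
assert size == 513  # 0x00000201
```

A datatype-style spec such as `[('tag', 'S3'), ('major', 'u1'), ...]`
would carry the same information as the format string `'>3sBBBI'`,
with field names attached.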
From ncoghlan at gmail.com  Tue Oct 31 13:44:26 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 31 Oct 2006 22:44:26 +1000
Subject: [Python-Dev] PEP: Adding data-type objects to Python
References: <45468C8E.1000203@canterbury.ac.nz>
Message-ID: <4547452A.5040501@gmail.com>

Travis E. Oliphant wrote:

> However, the existence of an alternative strategy using a single Python
> type and multiple instances of that type to describe binary data (which
> is the NumPy approach and essentially the array module approach) means
> that we can't just assume a priori that the way ctypes did it is the
> only or best way.

As a hypothetical, what if there was a helper function that translated
a description of a data structure using basic strings and sequences
(along the lines of what you have in your PEP) into a ctypes data
structure?

> The examples of "missing features" that Martin has exposed are not
> show-stoppers. They can all be easily handled within the context of
> what is being proposed. I can modify the PEP to show this. But, I
> don't have the time to spend if it's just all going to be rejected in
> the end. I need some encouragement in order to continue to invest
> energy in pushing this forward.

I think the most important thing in your PEP is the formats for
describing structures in a way that is easy to construct in both C and
Python (specifically, by using strings and sequences), and it is worth
pursuing for that aspect alone.

Whether that datatype is then implemented as a class in its own right
or as a factory function that returns a ctypes data type object is, to
my mind, a relatively minor implementation issue (either way has
questions to be addressed - I'm not sure how you tell ctypes that you
have a 32-bit integer with a non-native endian format, for example).
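Such a helper is straightforward to sketch against the real ctypes
API. The format codes and the helper's name here are invented for
illustration, and endianness and bit fields are deliberately ignored:

```python
import ctypes

# Map a few PEP-style codes onto real ctypes types (illustrative
# subset only; endian handling, strings, bit fields, etc. elided).
_SIMPLE = {'i2': ctypes.c_int16, 'i4': ctypes.c_int32,
           'u4': ctypes.c_uint32, 'f8': ctypes.c_double}

def to_ctypes(spec, name='anon'):
    """Translate [('field', 'code'), ...] into a ctypes Structure."""
    if isinstance(spec, str):
        return _SIMPLE[spec]
    fields = [(fname, to_ctypes(fmt)) for fname, fmt in spec]
    # Build the Structure subclass dynamically with three-arg type().
    return type(name, (ctypes.Structure,), {'_fields_': fields})

Point = to_ctypes([('x', 'i4'), ('y', 'i4')], 'Point')
assert ctypes.sizeof(Point) == 8
p = Point(3, 4)
assert (p.x, p.y) == (3, 4)
```

This direction (strings/sequences in, ctypes types out) leaves the
exchange format itself free of any dependency on ctypes.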
In fact, it may make sense to just use the lists/strings directly as
the data exchange format definitions, and let the various libraries do
their own translation into their private format descriptions instead
of creating a new one-type-to-describe-them-all.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org

From gmccaughan at synaptics-uk.com  Tue Oct 31 11:56:50 2006
From: gmccaughan at synaptics-uk.com (Gareth McCaughan)
Date: Tue, 31 Oct 2006 11:56:50 +0100
Subject: [Python-Dev] PEP: Adding data-type objects to Python
References: <45468E0F.80000@canterbury.ac.nz>
Message-ID: <200610311056.51070.gmccaughan@synaptics-uk.com>

> > It might be better not to consider "bit" to be a type at all, and
> > come up with another way of indicating that the size is in bits.
> > Perhaps
> >
> >   'i4'   # 4-byte signed int
> >   'i4b'  # 4-bit signed int
> >   'u4'   # 4-byte unsigned int
> >   'u4b'  # 4-bit unsigned int
>
> I like this. Very nice. I think that's the right way to look at it.

I remark that 'ib4' and 'ub4' make for marginally easier parsing and
less danger of ambiguity.

-- g

From mcherm at mcherm.com  Tue Oct 31 14:26:35 2006
From: mcherm at mcherm.com (Michael Chermside)
Date: Tue, 31 Oct 2006 05:26:35 -0800
Subject: [Python-Dev] PEP: Adding data-type objects to Python
Message-ID: <20061031052635.315p3rnhb4cg4kws@login.werra.lunarpages.com>

In this email I'm responding to a series of emails from Travis pretty
much in the order I read them:

Travis Oliphant writes:

> I'm saying we should introduce a single-object mechanism for
> describing binary data so that the many-object approach of c-types
> does not become some kind of de-facto standard. C-types can
> "translate" this object-instance to its internals if and when it
> needs to.
> In the mean-time, how are other packages supposed to communicate
> binary information about data with each other?

Here we disagree.

I haven't used C-types. I have no idea whether it is well-designed or
horribly unusable. So if someone wanted to argue that C-types is a
mistake and should be thrown out, I'd be willing to listen. Until
someone tries to make that argument, I'm presuming it's good enough to
be part of the standard library for Python. Given that, I think that
it *SHOULD* become a de-facto standard. I think that the way different
packages should communicate binary information about data with each
other is using C-types. Not because it's wonderful (remember, I've
never used it), but because it's STANDARD. There should be one obvious
way to do things! When there is, it makes interoperability WAY easier,
and interoperability is the main objective when dealing with things
like binary data formats.

Propose using C-types. Or propose *improving* C-types. But don't
propose ignoring it.

In a different message, he writes:

> It also bothers me that so many ways to describe binary data are
> being used out there. This is a problem that deserves being solved.
> And, no, ctypes hasn't solved it (we can't directly use the ctypes
> solution).

Really? Why? Is this a failing in C-types? Can C-types be "fixed"?

Later he explains:

> Remember the buffer protocol is in compiled code. So, as a result,
>
> 1) It's harder to construct a class to pass through the protocol
> using the multiple-types approach of ctypes.
>
> 2) It's harder to interpret the object received through the buffer
> protocol.
>
> Sure, it would be *possible* to use ctypes, but I think it would be
> very difficult. Think about how you would write the get_data_format
> C function in the extended buffer protocol for NumPy if you had to
> import ctypes and then build a class just to describe your data.
> How would you interpret what you get back?

Aha!
So what you REALLY ought to be asking for is a C interface to the
ctypes module. That seems like a very sensible and reasonable request.

> I don't think we should just *use ctypes because it's there* when
> the way it describes binary data was not constructed with the
> extended buffer protocol in mind.

I just disagree. (1) I *DO* think we should "just use ctypes because
it's there". After all, the problem we're trying to solve is one of
COMPATIBILITY - you don't solve such problems by introducing competing
standards. (2) From what I understand of it, I think ctypes is quite
capable of describing data to be accessed via the buffer protocol.

In another email:

> In order to make sense of the data-format object that I'm proposing
> you have to see the need to share information about data-format
> through an extended buffer protocol (which I will be proposing
> soon). I'm not going to try to argue that right now because there
> are a lot of people who can do that.

Actually, no need to convince me... I am already convinced of the
wisdom of this approach.

> My view is that it is unnecessary to use a different type object to
> describe each different data-type.
[...]
> So, the big difference is that I think data-formats should be
> *instances* of a single type.

Why? Who cares? Seriously, if we were proposing to describe the
layouts with a collection of rubber bands and potato chips, I'd say it
was a crazy idea. But we're proposing using data structures in
computer memory. Why does it matter whether those data structures are
of the same "python type" or different "python types"? I care whether
the structure can be created, passed around, and interrogated. I don't
care what Python type they are.

> I'm saying that I don't like the idea of forcing this approach on
> everybody else who wants to describe arbitrary binary data just
> because ctypes is included.

And I'm saying that I *do*.
Hey, if someone proposed getting rid of the current syntax for the
array module (for Py3K) and replacing it with use of ctypes, I'd give
it serious consideration. There should be only one way to describe
binary structures. It should be powerful enough to describe almost any
structure, easy to use, and most of all it should be used consistently
everywhere.

> I need some encouragement in order to continue to invest energy in
> pushing this forward.

Please keep up the good work! Some day I'd like to see NumPy built
into the standard Python distribution. The incremental, PEP-by-PEP
approach you are taking is the best route to getting there. But there
may be some changes along the way -- convergence with ctypes may be
one of those.

-------------

Look, my advice is to try to make ctypes work for you. Not having any
easy way to construct or to interrogate ctypes objects from C is a
legitimate complaint... and if you can define your requirements, it
should be relatively easy to add a C interface to meet those needs.

-- Michael Chermside

From oliphant.travis at ieee.org  Tue Oct 31 16:32:39 2006
From: oliphant.travis at ieee.org (Travis E. Oliphant)
Date: Tue, 31 Oct 2006 08:32:39 -0700
Subject: [Python-Dev] PEP: Adding data-type objects to Python
In-Reply-To: <20061031052635.315p3rnhb4cg4kws@login.werra.lunarpages.com>
References: <20061031052635.315p3rnhb4cg4kws@login.werra.lunarpages.com>

Michael Chermside wrote:

> In this email I'm responding to a series of emails from Travis
> pretty much in the order I read them:
>
>> In the mean-time, how are other packages supposed to communicate
>> binary information about data with each other?
>
> Here we disagree.
>
> I haven't used C-types. I have no idea whether it is well-designed or
> horribly unusable. So if someone wanted to argue that C-types is a
> mistake and should be thrown out, I'd be willing to listen.
> Until > someone tries to make that argument, I'm presuming it's good enough to > be part of the standard library for Python. My problem with this argument is two fold: 1) I'm not sure you really know what your talking about since you apparently haven't used either ctypes or NumPy (I've used both and so forgive me if I claim to understand the strengths of the data-format representations that each uses a bit better). Therefore, it's hard for me to take your opinion seriously. I will try though. I understand you have a preference for not wildly expanding the ways to do similar things. I share that preference with you. 2) You are assuming that because it's good enough for the standard library means that the way they describe data-formats (using a separate Python type for each one) is the *one true way*. When was this discussed? Frankly it's a weak argument because the struct module has been around for a lot longer. Why didn't the ctypes module follow that standard? Or the standard that's in the array module for describing data-types. That's been there for a long time too. Why wasn't ctypes forced to use that approach? The reason it wasn't is because it made sense for ctypes to use a separate type for each data-format object so that you could call C-functions as if they were Python functions. If this is your goal, then it seems like a good idea (though not strictly necessary) to use a separate Python type for each data-format. But, there are distinct disadvantages to this approach compared to what I'm trying to allow. Martin claims that the ctypes approach is *basically* equivalent but this is just not true. It could be made more true if the ctypes objects inherited from a "meta-type" and if Python allowed meta-types to expand their C-structures. But, last I checked this is not possible. A Python type object is a very particular kind of Python-type. 
As far as I can tell, it's not as flexible in terms of the kinds of
things you can do with the "instances" of a type object (i.e. what
ctypes types are) on the C level.

The other disadvantage of what you are describing is: who is going to
write the code? I'm happy to have the data-format object live separate
from ctypes and leave it to the ctypes author(s) to support it if
desired. But, the claim that the extended buffer protocol should jump
through all kinds of hoops to conform to the "ctypes standard", when
that "standard" was designed with a different idea in mind, is not
acceptable.

Ctypes has only been in Python since 2.5, and the array interface was
around before that. Numeric has been around longer than ctypes. The
array module and the struct module in Python have also both been
around longer than ctypes. Where is the discussion that crowned the
ctypes way of doing things as "the one true way"?

> In a different message, he writes:
>
>> It also bothers me that so many ways to describe binary data are
>> being used out there. This is a problem that deserves being solved.
>> And, no, ctypes hasn't solved it (we can't directly use the ctypes
>> solution).
>
> Really? Why? Is this a failing in C-types? Can C-types be "fixed"?

You can't grow C-function pointers onto an existing type object. You
are also carrying around a lot of weight in the Python type object
that is unnecessary if all you are doing is describing data.

> I just disagree. (1) I *DO* think we should "just use ctypes because
> it's there". After all, the problem we're trying to solve is one of
> COMPATIBILITY - you don't solve such problems by introducing
> competing standards. (2) From what I understand of it, I think
> ctypes is quite capable of describing data to be accessed via the
> buffer protocol.

Capable, but not supporting all the things I'm talking about. The
ctypes objects don't have any of the methods or attributes (or
C function pointers) that I've described.
Nor should they necessarily grow them.

> Why? Who cares? Seriously, if we were proposing to describe the
> layouts with a collection of rubber bands and potato chips, I'd say
> it was a crazy idea. But we're proposing using data structures in
> computer memory. Why does it matter whether those data structures
> are of the same "python type" or different "python types"? I care
> whether the structure can be created, passed around, and
> interrogated. I don't care what Python type they are.

Sure, but the flexibility you have with an instance of a Python type
is different than when that instance must itself also be a Python
type. It *is* different. This is quite noticeable in C especially.

>> I'm saying that I don't like the idea of forcing this approach on
>> everybody else who wants to describe arbitrary binary data just
>> because ctypes is included.
>
> And I'm saying that I *do*. Hey, if someone proposed getting rid of
> the current syntax for the array module (for Py3K) and replacing it
> with use of ctypes, I'd give it serious consideration. There should
> be only one way to describe binary structures. It should be powerful
> enough to describe almost any structure, easy to use, and most of
> all it should be used consistently everywhere.

I'm not opposed to convergence, but ctypes must be willing to come to
us too. Its development of a "standard" was not done with the array
interface in mind, so why should it be surprising that it does not
fill the need for us?

-Travis

From oliphant.travis at ieee.org  Tue Oct 31 17:54:12 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Tue, 31 Oct 2006 09:54:12 -0700
Subject: [Python-Dev] PEP: Adding data-type objects to Python
In-Reply-To: <4547007D.30404@v.loewis.de>
References: <20061028135415.GA13049@code0.codespeak.net>
	<4547007D.30404@v.loewis.de>

Martin v.
Löwis wrote:

> Travis Oliphant schrieb:
>
>> So, the big difference is that I think data-formats should be
>> *instances* of a single type.
>
> This is nearly the case for ctypes as well. All layout descriptions
> are instances of the type type. Nearly, because they are instances
> of subtypes of the type type:
>
> py> type(ctypes.c_long)
> <type '_ctypes.SimpleType'>
> py> type(ctypes.c_double)
> <type '_ctypes.SimpleType'>
> py> type(ctypes.c_double).__bases__
> (<type 'type'>,)
> py> type(ctypes.Structure)
> <type '_ctypes.StructType'>
> py> type(ctypes.Array)
> <type '_ctypes.ArrayType'>
> py> type(ctypes.Structure).__bases__
> (<type 'type'>,)
> py> type(ctypes.Array).__bases__
> (<type 'type'>,)
>
> So if your requirement is "all layout descriptions ought to have
> the same type", then this is (nearly) the case: they are instances
> of type (rather than datatype, as in your PEP).

The big difference, however, is that by going this route you are
forced to use the "type object" as your data-format "instance". This
is fitting a square peg into a round hole, in my opinion. To really be
useful, you would need to add the attributes and (most importantly)
C-function pointers and C-structure members to these type objects. I
don't even think that is possible in Python (even if you do create a
meta-type that all the c-type type objects can use that carries the
same information).

There are a few people claiming I should use the ctypes type
hierarchy, but nobody has explained how that would be possible given
the attributes, C-structure members and C-function pointers that I'm
proposing.

In NumPy we also have a Python type for each basic data-format (we
call them array scalars). For a little while they carried the
data-format information on the Python side. This turned out to be not
flexible enough. So, we expanded the PyArray_Descr * structure, which
has always been a part of Numeric (and the array module array type),
into an actual Python type, and a lot of things became possible. It
was clear to me that we were "on to something".
Now, the biggest claim against the gist of what I'm proposing (details we can argue about) seems from my perspective to be a desire to "go backwards" and carry data-type information around with a Python type. The data-type object did not just appear out of thin air one day. It really can be seen as an evolution from the beginnings of Numeric (and the Python array module). So, this is what we came up with in the NumPy world. Ctypes came up with something a bit different. It is not "trivial" to "just use ctypes." I could say the same thing and tell ctypes to just use NumPy's data-type object. It could be done that way, but of course it would take a bit of work on the part of ctypes to make that happen. Having ctypes in the standard library does not mean that any other discussion of how data-format should be represented has been decided on. If I had known that was what it meant to put ctypes in the standard library, I would have been more vocal several months ago. -Travis From oliphant.travis at ieee.org Tue Oct 31 18:13:39 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue, 31 Oct 2006 10:13:39 -0700 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <4547452A.5040501@gmail.com> References: <45468C8E.1000203@canterbury.ac.nz> <4547452A.5040501@gmail.com> Message-ID: Nick Coghlan wrote: > Travis E. Oliphant wrote: > >>However, the existence of an alternative strategy using a single Python >>type and multiple instances of that type to describe binary data (which >>is the NumPy approach and essentially the array module approach) means >>that we can't just a-priori assume that the way ctypes did it is the >>only or best way. > > > As a hypothetical, what if there was a helper function that translated a > description of a data structure using basic strings and sequences (along the > lines of what you have in your PEP) into a ctypes data structure? > That would be fine and useful in fact.
I don't see how it helps the problem of "what to pass through the buffer protocol." I see passing ctypes type objects around on the C-level as an unnecessary and burdensome approach unless the ctypes objects were significantly enhanced. > > In fact, it may make sense to just use the lists/strings directly as the data > exchange format definitions, and let the various libraries do their own > translation into their private format descriptions instead of creating a new > one-type-to-describe-them-all. Yes, I'm open to this possibility. I basically want two things in the object passed through the extended buffer protocol: 1) It's fast on the C-level 2) It covers all the use-cases. If just a particular string or list structure were passed, then I would drop the data-format PEP and just have the dataformat argument of the extended buffer protocol be that thing. Then, something that converts ctypes objects to that special format would be very nice indeed. -Travis From martin at v.loewis.de Tue Oct 31 18:27:18 2006 From: martin at v.loewis.de ("Martin v. Löwis") Date: Tue, 31 Oct 2006 18:27:18 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <20061031052635.315p3rnhb4cg4kws@login.werra.lunarpages.com> Message-ID: <45478776.3040607@v.loewis.de> Travis E. Oliphant schrieb: > But, there are distinct disadvantages to this approach compared to what > I'm trying to allow. Martin claims that the ctypes approach is > *basically* equivalent but this is just not true. I may claim that, but primarily, my goal was to demonstrate that the proposed PEP cannot be used to describe ctypes object layouts (without checking, I can readily believe that the PEP covers everything in the array and struct modules). > It could be made more > true if the ctypes objects inherited from a "meta-type" and if Python > allowed meta-types to expand their C-structures. But, last I checked > this is not possible. That I don't understand.
a) what do you think is not possible? b) why is that an important difference between a datatype and a ctype? If you are suggesting that, given two Python types A and B, and B inheriting from A, that the memory layout of B cannot extend the memory layout of A, then: that is certainly possible in Python, and there are many examples for it. > A Python type object is a very particular kind of Python-type. As far > as I can tell, it's not as flexible in terms of the kinds of things you > can do with the "instances" of a type object (i.e. what ctypes types > are) on the C-level. Ah, you are worried that NumArray objects would have to be *instances* of ctypes types. That wouldn't be necessary at all. Instead, if each NumArray object had a method get_ctype(), which returned a ctypes type, then you would get the same descriptiveness that you get with the PEP's datatype. > I'm happy to have the data-format object live separate from ctypes and > leave it to the ctypes author(s) to support it if desired. But, the > claim that the extended buffer protocol must jump through all kinds of hoops > to conform to the "ctypes standard" when that "standard" was designed > with a different idea in mind is not acceptable. That, of course, is a reasoning I can understand. This is free software, contributors can choose to contribute whatever they want; you can't force anybody to do anything specific you want to get done. Acceptance of any PEP (not just this PEP) should always be contingent on the availability of a patch implementing it. > Where is the discussion that crowned the ctypes way of doing things as > "the one true way"? It hasn't been crowned this way. Me, personally, I just said two things about this PEP and ctypes: a) the PEP does not support all concepts that ctypes needs b) ctypes can express all examples in the PEP in response to your proposal that ctypes should adopt the PEP, and that ctypes is not good enough to be the one true way.
Regards, Martin From andorxor at gmx.de Tue Oct 31 18:31:30 2006 From: andorxor at gmx.de (Stephan Tolksdorf) Date: Tue, 31 Oct 2006 18:31:30 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <454689D4.9040109@v.loewis.de> References: <20061028135415.GA13049@code0.codespeak.net> <45445ECE.9050504@v.loewis.de> <45448058.5020700@v.loewis.de> <454689D4.9040109@v.loewis.de> Message-ID: <45478872.2010906@gmx.de> Martin v. Löwis wrote: > Travis Oliphant schrieb: >> Function pointers are "supported" with the void data-type and could be >> more specifically supported if it were important. People typically >> don't use the buffer protocol to send function-pointers around in a way >> that the void description wouldn't be enough. > > As I said before, I can't tell whether it's important, as I still don't > know what the purpose of this PEP is. If it is to support a unification > of memory layout specifications, and if that unification is also to > include ctypes, then yes, it is important. If it is to describe array > elements in NumArray arrays, then it might not be important. > > For the usage of ctypes, the PEP void type is insufficient to describe > function pointers: you also need a specification of the signature of > the function pointer (parameter types and return type), or else you > can't use the function pointer (i.e. you can't call the function). The buffer protocol is primarily meant for describing the format of (large) contiguous pieces of binary data. In most cases that will be all kinds of numerical data for scientific applications, image and other media data, simple databases and similar kinds of data. There is currently no adequate data format type which sufficiently supports these applications, otherwise Travis wouldn't make this proposal. While Travis' proposal encompasses the data format functionality within the struct module and overlaps with what ctypes has to offer, it does not aim to replace ctypes.
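Stephan's point about the limits of the existing tools can be illustrated with the struct module: a format string can describe a flat record, but it carries none of the field names, nesting, or array shape that the proposed data-format object is meant to convey. A small sketch:

```python
import struct

fmt = '<id'                       # little-endian: int32 followed by float64
buf = struct.pack(fmt, 42, 2.5)
assert struct.calcsize(fmt) == 12
assert struct.unpack(fmt, buf) == (42, 2.5)
# nothing in '<id' says which field is which, or whether the record
# repeats as an array -- that metadata lives outside the format string
```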
I don't think that a basic data format type necessarily should be able to encode all the information a foreign function interface needs to call a code library. From my point of view, that kind of information is one abstraction layer above a basic data format and should be implemented as an extension of or complementary to the basic data format. I also do not understand why the data format type should attempt to fully describe arbitrarily complex data formats, like fragmented (non-continuous) data structures in memory. You'd probably need a full programming language for that anyway. Regards, Stephan From theller at ctypes.org Tue Oct 31 18:38:01 2006 From: theller at ctypes.org (Thomas Heller) Date: Tue, 31 Oct 2006 18:38:01 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <45468C8E.1000203@canterbury.ac.nz> Message-ID: <454789F9.7050808@ctypes.org> Travis Oliphant schrieb: > Greg Ewing wrote: >> Travis Oliphant wrote: >> >> >>>Part of the problem is that ctypes uses a lot of different Python types >>>(that's what I mean by "multi-object" to accomplish it's goal). What >>>I'm looking for is a single Python type that can be passed around and >>>explains binary data. >> >> >> It's not clear that multi-object is a bad thing in and >> of itself. It makes sense conceptually -- if you have >> a datatype object representing a struct, and you ask >> for a description of one of its fields, which could >> be another struct or array, you would expect to get >> another datatype object describing that. >> >> Can you elaborate on what would be wrong with this? >> >> Also, can you clarify whether your objection is to >> multi-object or multi-type. They're not the same thing -- >> you could have a data structure built out of multiple >> objects that are all of the same Python type, with >> attributes distinguishing between struct, array, etc. >> That would be single-type but multi-object. > > I've tried to clarify this in another post. 
Basically, what I don't > like about the ctypes approach is that it is multi-type (every new > data-format is a Python type). > > In order to talk about all these Python types together, they must > all share some attribute (or else be derived from a meta-type in C with > a specific function-pointer entry). (I tried to read the whole thread again, but it is too large already.) There is a (badly named, probably) API to access information about ctypes types and instances of this type. The functions are PyObject_stgdict(obj) and PyType_stgdict(type). Both return a 'StgDictObject' instance or NULL if the function fails. This object is the ctypes type object's __dict__. StgDictObject is a subclass of PyDictObject and has fields that carry information about the C type (alignment requirements, size in bytes, plus some other stuff). Also it contains several pointers to functions that implement (in C) struct-like functionality (packing/unpacking). Of course several of these fields can only be used for ctypes-specific purposes, for example a pointer to the ffi_type which is used when calling foreign functions, or the restype, argtypes, and errcheck fields which are only used when the type describes a function pointer. This mechanism is probably a hack because it's not possible to add C-accessible fields to type objects; on the other hand it is extensible (in principle, at least). Just to describe the implementation. Thomas From martin at v.loewis.de Tue Oct 31 18:48:33 2006 From: martin at v.loewis.de ("Martin v. Löwis") Date: Tue, 31 Oct 2006 18:48:33 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <20061028135415.GA13049@code0.codespeak.net> <4547007D.30404@v.loewis.de> Message-ID: <45478C71.2010600@v.loewis.de> Travis Oliphant schrieb: > The big difference, however, is that by going this route you are forced > to use the "type object" as your data-format "instance".
Since everything is an object (an "instance") in Python, this is not such a big difference. > This is > fitting a square peg into a round hole in my opinion. To really be > useful, you would need to add the attributes and (most importantly) > C-function pointers and C-structure members to these type objects. Can you explain why that is? In the PEP, I see two C functions: setitem and getitem. I think they can be implemented readily with ctypes' GETFUNC and SETFUNC function pointers that it uses all over the place. I don't see a requirement to support C structure members or function pointers in the datatype object. > There are a few people claiming I should use the ctypes type-hierarchy > but nobody has explained how that would be possible given the > attributes, C-structure members and C-function pointers that I'm proposing. Ok, here you go. Remember, I'm still not claiming that this should be done: I'm just explaining how it could be done.

- byteorder/isnative: I think this could be derived from the presence of the _swappedbytes_ field
- itemsize: can be done with ctypes.sizeof
- kind: can be created through a mapping of the _type_ field (I think)
- fields: can be derived from the _fields_ member
- hasobject: compare, recursively, with py_object
- name: use __name__
- base: again, created from _type_ (if _length_ is present)
- shape: recursively look at _length_
- alignment: use ctypes.alignment

> It was clear to me that we were "on to something". Now, the biggest > claim against the gist of what I'm proposing (details we can argue > about), seems from my perspective to be a desire to "go backwards" and > carry data-type information around with a Python type. I, at least, have no such desire. I just explained that the ctypes model of memory layouts is just as expressive as the one in the PEP. Which of these is "better" for what the PEP wants to achieve, I can't say, because I still don't quite understand what the PEP wants to achieve.
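Most of the introspection hooks Martin lists are ordinary ctypes attributes and functions, which can be exercised directly; a small sketch (the concrete sizes assume the usual 1-byte `c_uint8`):

```python
import ctypes

class Pair(ctypes.Structure):
    _fields_ = [('lo', ctypes.c_uint8), ('hi', ctypes.c_uint8)]

Vec = Pair * 3                                   # shape via _length_, base via _type_

assert ctypes.sizeof(Pair) == 2                  # itemsize
assert ctypes.alignment(Pair) == 1               # alignment
assert [name for name, _ in Pair._fields_] == ['lo', 'hi']   # fields
assert (Vec._length_, Vec._type_) == (3, Pair)   # shape and base
assert Pair.__name__ == 'Pair'                   # name
```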
Regards, Martin From martin at v.loewis.de Tue Oct 31 18:58:01 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 31 Oct 2006 18:58:01 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <45478872.2010906@gmx.de> References: <20061028135415.GA13049@code0.codespeak.net> <45445ECE.9050504@v.loewis.de> <45448058.5020700@v.loewis.de> <454689D4.9040109@v.loewis.de> <45478872.2010906@gmx.de> Message-ID: <45478EA9.80302@v.loewis.de> Stephan Tolksdorf schrieb: > While Travis' proposal encompasses the data format functionality within > the struct module and overlaps with what ctypes has to offer, it does > not aim to replace ctypes. This discussion could have been a lot shorter if he had said so. Unfortunately (?) he stated that it was *precisely* a motivation of the PEP to provide a standard data description machinery that can then be adopted by the struct, array, and ctypes modules. > I also do not understand why the data format type should attempt to > fully describe arbitrarily complex data formats, like fragmented > (non-continuous) data structures in memory. You'd probably need a full > programming language for that anyway. For an FFI application, you need to be able to describe arbitrary in-memory formats, since that's what the foreign function will expect. For type safety and reuse, you better separate the description of the layout from the creation of the actual values. Otherwise (i.e. if you have to define the layout on each invocation), creating the parameters for a foreign function becomes very tedious and error-prone, with errors often being catastrophic (i.e. interpreter crashes). 
Regards, Martin From oliphant.travis at ieee.org Tue Oct 31 20:48:28 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue, 31 Oct 2006 12:48:28 -0700 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <45478C71.2010600@v.loewis.de> References: <20061028135415.GA13049@code0.codespeak.net> <4547007D.30404@v.loewis.de> <45478C71.2010600@v.loewis.de> Message-ID: Martin v. Löwis wrote: > Travis Oliphant schrieb: > >>The big difference, however, is that by going this route you are forced >>to use the "type object" as your data-format "instance". > > > Since everything is an object (an "instance") in Python, this is not > such a big difference. > I think it actually is. Perhaps I'm wrong, but a type object is still a special kind of an instance of a meta-type. I once tried to add function pointers to a type object by inheriting from it. But, I was told that Python is not set up to handle that. Maybe I misunderstood. Let me be very clear. The whole reason I make any statements about ctypes is because somebody else brought it up. I'm not trying to replace ctypes and the way it uses type objects to represent data internally. All I'm trying to do is come up with a way to describe data-types through a buffer protocol. The way ctypes does it is "too" bulky, by defining a new Python type for every data-format. While semantically you may talk about the equivalency of types being instances of a "meta-type" and regular objects being instances of a type, my understanding is still that there are practical differences when it comes to implementation --- and certain things that "can't be done". Here's what I mean by the difference. This is akin to what I'm proposing:

struct {
    PyObject_HEAD
    /* whatever you need to represent your instance.
       Quite a bit of flexibility.... */
} PyDataFormatObject;

A Python type object (what every ctypes data-format "type" inherits from) has this C-structure:

struct {
    PyObject_VAR_HEAD
    char *tp_name;
    int tp_basicsize, tp_itemsize;
    /* Methods to implement standard operations */
    destructor tp_dealloc;
    printfunc tp_print;
    getattrfunc tp_getattr;
    setattrfunc tp_setattr;
    cmpfunc tp_compare;
    reprfunc tp_repr;
    ...
    ...
    PyObject *tp_bases;
    PyObject *tp_mro;        /* method resolution order */
    PyObject *tp_cache;
    PyObject *tp_subclasses;
    PyObject *tp_weaklist;
    destructor tp_del;
    ...                      /* + more under certain conditions */
} PyTypeObject;

Why in the world do we need to carry all this extra baggage around in each data-format instance in order to just describe data? I can see why it's useful for ctypes to do it and that's fine. But, the argument that every exchange of data-format information should use this type-object instance is hard to swallow. So, I'm happy to let ctypes continue doing what it's doing, trusting its developers to have done something good. I'd be happy to drop any reference to ctypes. The only reason to have the data-type objects is something to pass as part of the extended buffer protocol. > > > Can you explain why that is? In the PEP, I see two C functions: > setitem and getitem. I think they can be implemented readily with > ctypes' GETFUNC and SETFUNC function pointers that it uses > all over the place. Sure, but where do these function pointers live and where are they stored? In ctypes it's in the CField_object. Now, this is closer to what I'm talking about. But, why is it not the same thing? Why yet another type object to talk about fields of a structure? These are rhetorical questions. I really don't expect or need an answer because I'm not questioning why ctypes did what it did for solving the problem it was solving. I am questioning anyone who claims that we should use this mechanism for describing data-formats in the extended buffer protocol.
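The contrast Travis draws can be shown in a few lines of Python: in ctypes every data format is a distinct type object, while the PEP wants formats to be plain instances of a single descriptor type. The `DataFormat` class below is purely illustrative — a sketch, not the PEP's actual API:

```python
import ctypes

# ctypes route: the format *is* a type, with all of PyTypeObject behind it
fmt_as_type = ctypes.c_double * 3
assert isinstance(fmt_as_type, type)

# PEP route (sketched): the format is an instance of one small class
class DataFormat:
    def __init__(self, kind, itemsize, shape=()):
        self.kind, self.itemsize, self.shape = kind, itemsize, shape

fmt_as_instance = DataFormat('f', 8, shape=(3,))
assert not isinstance(fmt_as_instance, type)
assert (fmt_as_instance.kind, fmt_as_instance.itemsize) == ('f', 8)
```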
> > I don't see a requirement to support C structure members or > function pointers in the datatype object. > > >>There are a few people claiming I should use the ctypes type-hierarchy >>but nobody has explained how that would be possible given the >>attributes, C-structure members and C-function pointers that I'm proposing. > > > Ok, here you go. Remember, I'm still not claiming that this should be > done: I'm just explaining how it could be done. O.K. Thanks for putting in the effort. It doesn't answer my real concerns, though. >>It was clear to me that we were "on to something". Now, the biggest >>claim against the gist of what I'm proposing (details we can argue >>about), seems from my perspective to be a desire to "go backwards" and >>carry data-type information around with a Python type. > > > I, at least, have no such desire. I just explained that the ctypes > model of memory layouts is just as expressive as the one in the > PEP. I agree with this. I'm very aware of what "can" be expressed. I just think it's too awkward and bulky to use in the extended buffer protocol. > Which of these is "better" for what the PEP wants to achieve, > I can't say, because I still don't quite understand what the PEP > wants to achieve. > Are you saying you still don't understand, even after having read the extended buffer protocol PEP? -Travis From oliphant.travis at ieee.org Tue Oct 31 21:04:53 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue, 31 Oct 2006 13:04:53 -0700 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <45478776.3040607@v.loewis.de> References: <20061031052635.315p3rnhb4cg4kws@login.werra.lunarpages.com> <45478776.3040607@v.loewis.de> Message-ID: Martin v. Löwis wrote: > Travis E. Oliphant schrieb: > >>But, there are distinct disadvantages to this approach compared to what >>I'm trying to allow. Martin claims that the ctypes approach is >>*basically* equivalent but this is just not true.
> > > I may claim that, but primarily, my goal was to demonstrate that the > proposed PEP cannot be used to describe ctypes object layouts (without > checking, I can readily believe that the PEP covers everything in > the array and struct modules). > That's a fine argument. You are right in terms of the PEP as it stands. However, I want to make clear that a single Python type object *could* be used to describe data, including all the cases you laid out. It would not be difficult to extend the PEP to cover all the cases you've described --- I'm not sure that's desirable. I'm not trying to replace what ctypes does. I'm just trying to get something that we can use to exchange data-format information through the extended buffer protocol. It really comes down to using Python type-objects as the instances describing data-formats (which ctypes does) or "normal" Python objects as the instances describing data-formats (what the PEP proposes). > >>It could be made more >>true if the ctypes objects inherited from a "meta-type" and if Python >>allowed meta-types to expand their C-structures. But, last I checked >>this is not possible. > > > That I don't understand. a) what do you think is not possible? Extending the C-structure of PyTypeObject and having Python types use that as their "type-object". > b) why is that an important difference between a datatype and a ctype? Because with instances of C-types you are stuck with the PyTypeObject structure. If you want to add anything you have to do it in the dictionary. Instances of a datatype allow adding anything after the PyObject_HEAD structure. > > If you are suggesting that, given two Python types A and B, and > B inheriting from A, that the memory layout of B cannot extend > the memory layout of A, then: that is certainly possible in Python, > and there are many examples for it. > I know this. I've done it for many different objects.
I'm saying it's not quite the same when what you are extending is the PyTypeObject and trying to use it as the type object for some other object. > >>A Python type object is a very particular kind of Python-type. As far >> as I can tell, it's not as flexible in terms of the kinds of things you >>can do with the "instances" of a type object (i.e. what ctypes types >>are) on the C-level. > > > Ah, you are worried that NumArray objects would have to be *instances* > of ctypes types. That wouldn't be necessary at all. Instead, if each > NumArray object had a method get_ctype(), which returned a ctypes type, > then you would get the same descriptiveness that you get with the > PEP's datatype. > No, I'm not worried about that (It's not NumArray by the way, it's NumPy. NumPy replaces both NumArray and Numeric). NumPy actually interfaces with ctypes quite well. This is how I learned anything I might know about ctypes. So, I'm well aware of this. What I am concerned about is using Python type objects (i.e. Python objects that can be cast in C to PyTypeObject *) outside of ctypes to describe data-formats when you don't need it and it just complicates dealing with the data-format description. > >>Where is the discussion that crowned the ctypes way of doing things as >>"the one true way"? > > > It hasn't been crowned this way. Me, personally, I just said two things about this PEP and ctypes: Thanks for clarifying, but I know you didn't say this. Others, however, basically did. > a) the PEP does not support all concepts that ctypes needs It could be extended, but I'm not sure it *needs* to be in its real context. I'm very sorry for contributing to the distraction that ctypes should adopt the PEP. My words were unclear. But, I'm not pushing for that. I really have no opinion how ctypes describes data. > b) ctypes can express all examples in the PEP in response to your proposal that ctypes should adopt the PEP, and that ctypes is not good enough to be the one true way.
> I think it is "good enough" in the semantic sense. But, I think using type objects in this fashion for general-purpose data-description is overkill and will be much harder to extend and deal with. -Travis From oliphant.travis at ieee.org Tue Oct 31 21:13:25 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue, 31 Oct 2006 13:13:25 -0700 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <454789F9.7050808@ctypes.org> References: <45468C8E.1000203@canterbury.ac.nz> <454789F9.7050808@ctypes.org> Message-ID: Thomas Heller wrote: > > (I tried to read the whole thread again, but it is too large already.) > > There is a (badly named, probably) API to access information > about ctypes types and instances of this type. The functions are > PyObject_stgdict(obj) and PyType_stgdict(type). Both return a > 'StgDictObject' instance or NULL if the function fails. This object > is the ctypes type object's __dict__. > > StgDictObject is a subclass of PyDictObject and has fields that > carry information about the C type (alignment requirements, size in bytes, > plus some other stuff). Also it contains several pointers to functions > that implement (in C) struct-like functionality (packing/unpacking). > > Of course several of these fields can only be used for ctypes-specific > purposes, for example a pointer to the ffi_type which is used when > calling foreign functions, or the restype, argtypes, and errcheck fields > which are only used when the type describes a function pointer. > > > This mechanism is probably a hack because it's not possible to add C-accessible > fields to type objects; on the other hand it is extensible (in principle, at least). > Thank you for the description. While I've studied the ctypes code, I still don't understand the purposes behind all the data-structures. Also, I really don't have an opinion about ctypes' implementation.
All my comparisons are simply being resistant to the "unexplained" idea that I'm supposed to use ctypes objects in a way they weren't really designed to be used. For example, I'm pretty sure you were the one who made me aware that you can't just extend the PyTypeObject. Instead you extended the tp_dict of the Python type object to store some of the extra information that is needed to describe a data-type like I'm proposing. So, if I'm just describing data-format information, why do I need all this complexity (that makes ctypes implementation easier/more natural/etc)? What if the StgDictObject is the Python data-format object I'm talking about? It actually looks closer. But, if all I want is the StgDictObject (or something like it), then why should I pass around the whole type object? This is all I'm saying to those that want me to use ctypes to describe data-formats in the extended buffer protocol. I'm not trying to change anything in ctypes. -Travis From theller at ctypes.org Tue Oct 31 21:46:15 2006 From: theller at ctypes.org (Thomas Heller) Date: Tue, 31 Oct 2006 21:46:15 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <45468C8E.1000203@canterbury.ac.nz> <454789F9.7050808@ctypes.org> Message-ID: <4547B617.3050400@ctypes.org> Travis Oliphant schrieb: > For example, I'm pretty sure you were the one who made me aware that you > can't just extend the PyTypeObject. Instead you extended the tp_dict of > the Python type object to store some of the extra information that is > needed to describe a data-type like I'm proposing. > > So, if I'm just describing data-format information, why do I need > all this complexity (that makes ctypes implementation easier/more > natural/etc)? What if the StgDictObject is the Python data-format > object I'm talking about? It actually looks closer. > > But, if all I want is the StgDictObject (or something like it), then why > should I pass around the whole type object?
Maybe you don't need it. ctypes certainly needs the type object because it is also used for constructing instances (while NumPy uses factory functions, IIUC), or for converting 'native' Python objects into foreign function arguments. I know that this doesn't interest you from the NumPy perspective (and I don't want to offend you by saying this). > This is all I'm saying to those that want me to use ctypes to describe > data-formats in the extended buffer protocol. I'm not trying to change > anything in ctypes. I don't want to change anything in NumPy, either, and was not the one who suggested using ctypes objects, although I had thought about whether it would be possible or not. What I like about ctypes, and dislike about Numeric/Numarray/NumPy, is the way C compatible types are defined in ctypes. I find the ctypes way more natural than the numxxx or array module way, but what else would anyone expect from me as the ctypes author... I hope that a useful interface is developed from your proposals, and will be happy to adapt ctypes to use it or interface ctypes with it if this makes sense. Thomas From oliphant.travis at ieee.org Tue Oct 31 21:56:30 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue, 31 Oct 2006 13:56:30 -0700 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <45478EA9.80302@v.loewis.de> References: <20061028135415.GA13049@code0.codespeak.net> <45445ECE.9050504@v.loewis.de> <45448058.5020700@v.loewis.de> <454689D4.9040109@v.loewis.de> <45478872.2010906@gmx.de> <45478EA9.80302@v.loewis.de> Message-ID: Martin v. Löwis wrote: > Stephan Tolksdorf schrieb: > >>While Travis' proposal encompasses the data format functionality within >>the struct module and overlaps with what ctypes has to offer, it does >>not aim to replace ctypes. > > > This discussion could have been a lot shorter if he had said so. > Unfortunately (?)
he stated that it was *precisely* a motivation > of the PEP to provide a standard data description machinery that > can then be adopted by the struct, array, and ctypes modules. Struct and array I was sure about. Ctypes less sure. I'm very sorry for the distraction I caused by mis-stating my objective. My objective is really the extended buffer protocol. The data-type object is a means to that end. I do think ctypes could make use of the data-type object and that there is a real difference between using Python type objects as data-format descriptions and using another Python type for those descriptions. I thought to go the ctypes route (before I even knew what ctypes did) but decided against it for a number of reasons. But, nonetheless those are side issues. The purpose of the PEP is to provide an object that the extended buffer protocol can use to share data-format information. It should be considered primarily in that context. -Travis From martin at v.loewis.de Tue Oct 31 22:12:11 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 31 Oct 2006 22:12:11 +0100 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <20061028135415.GA13049@code0.codespeak.net> <4547007D.30404@v.loewis.de> <45478C71.2010600@v.loewis.de> Message-ID: <4547BC2B.30406@v.loewis.de> Travis Oliphant schrieb: > I think it actually is. Perhaps I'm wrong, but a type-object is still a > special kind of an instance of a meta-type. I once tried to add > function pointers to a type object by inheriting from it. But, I was > told that Python is not set up to handle that. Maybe I misunderstood. I'm not quite sure what the problems are: one "obvious" problem is that the next Python version may also extend the size of type objects. But, AFAICT, even that should "work", in the sense that this new version should check for the presence of a flag to determine whether the additional fields are there. 
The only tricky question is how you can find out whether your own extension is there. If that is a common problem, I think a framework could be added to support extensible type objects (with some kind of registry for additional fields, and a per-type-object indicator whether a certain extension field is present). > Let me be very clear. The whole reason I make any statements about > ctypes is because somebody else brought it up. I'm not trying to > replace ctypes and the way it uses type objects to represent data > internally. Ok. I understood you differently earlier. Regards, Martin From p.f.moore at gmail.com Tue Oct 31 22:12:59 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 31 Oct 2006 21:12:59 +0000 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: References: <20061028135415.GA13049@code0.codespeak.net> <4547007D.30404@v.loewis.de> <45478C71.2010600@v.loewis.de> Message-ID: <79990c6b0610311312y2a749b4bw617f0cf18ae9d660@mail.gmail.com> On 10/31/06, Travis Oliphant wrote: > Martin v. Löwis wrote: > > [...] because I still don't quite understand what the PEP > > wants to achieve. > > > > Are you saying you still don't understand after having read the extended > buffer protocol PEP, yet? I can't speak for Martin, but I don't understand how I, as a Python programmer, might use the data type objects specified in the PEP. I have skimmed the extended buffer protocol PEP, but I'm conscious that no objects I currently use support the extended buffer protocol (and the PEP doesn't mention adding support to existing objects), so I don't see that as too relevant to me. I have also installed numpy, and looked at the help for numpy.dtype, but that doesn't add much to the PEP. The freely available chapters of the numpy book explain how dtypes describe data structures, but not how to use them. The freely available Numeric documentation doesn't refer to dtypes, as far as I can tell.
Is there any documentation on how to use dtypes, independently of other features of numpy? If not, can you clarify where the benefit lies for a Python user of this proposal? (I understand the benefits of a common language for extensions to communicate datatype information, but why expose it to Python? How do Python users use it?) This is probably all self-evident to the numpy community, but I think that as the PEP is aimed at a wider audience it needs a little more background. Paul. From martin at v.loewis.de Tue Oct 31 22:26:30 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 31 Oct 2006 22:26:30 +0100 Subject: [Python-Dev] PEP: Extending the buffer protocol to share array information. In-Reply-To: References: Message-ID: <4547BF86.6070806@v.loewis.de> Travis E. Oliphant schrieb: > Several extensions to Python utilize the buffer protocol to share > the location of a data-buffer that is really an N-dimensional > array. However, there is no standard way to exchange the > additional N-dimensional array information so that the data-buffer > is interpreted correctly. The NumPy project introduced an array > interface (http://numpy.scipy.org/array_interface.shtml) through a > set of attributes on the object itself. While this approach > works, it requires attribute lookups which can be expensive when > sharing many small arrays. Can you please give examples for real-world applications of this interface, preferably examples involving multiple independently-developed libraries? ("this" being the current interface in NumPy - I understand that the PEP's interface isn't implemented, yet) Paul Moore (IIRC) gave the example of equalising the green values and maximizing the red values in a PIL image by passing it to NumPy: Is that a realistic (even though not-yet real-world) example? 
If so, what algorithms of NumPy would I use to perform this image manipulation (and why would I use NumPy for it if I could just write a for loop that does that in pure Python, given PIL's getpixel/setdata)? Regards, Martin From brett at python.org Tue Oct 31 23:40:11 2006 From: brett at python.org (Brett Cannon) Date: Tue, 31 Oct 2006 14:40:11 -0800 Subject: [Python-Dev] PEP: Extending the buffer protocol to share array information. In-Reply-To: References: Message-ID: On 10/30/06, Travis E. Oliphant wrote: > > Attached is my PEP for extending the buffer protocol to allow array data > to be shared. You might want to reference this thread ( http://mail.python.org/pipermail/python-3000/2006-August/003309.html), as Guido mentions that extending the buffer protocol to tell more about the data in the buffer "would offer the numarray folks their 'array interface'". -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20061031/404b5f60/attachment.html From jcarlson at uci.edu Tue Oct 31 23:59:11 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue, 31 Oct 2006 14:59:11 -0800 Subject: [Python-Dev] PEP: Adding data-type objects to Python In-Reply-To: <79990c6b0610311312y2a749b4bw617f0cf18ae9d660@mail.gmail.com> References: <79990c6b0610311312y2a749b4bw617f0cf18ae9d660@mail.gmail.com> Message-ID: <20061031144447.C0D7.JCARLSON@uci.edu> "Paul Moore" wrote: > On 10/31/06, Travis Oliphant wrote: > > Martin v. Löwis wrote: > > > [...] because I still don't quite understand what the PEP > > > wants to achieve. > > > > > > > Are you saying you still don't understand after having read the extended > > buffer protocol PEP, yet? > > I can't speak for Martin, but I don't understand how I, as a Python > programmer, might use the data type objects specified in the PEP.
> I have skimmed the extended buffer protocol PEP, but I'm conscious that > no objects I currently use support the extended buffer protocol (and > the PEP doesn't mention adding support to existing objects), so I > don't see that as too relevant to me. Presumably str in 2.x and bytes in 3.x could be extended to support the 'S' specifier, unicode in 2.x and text in 3.x could be extended to support the 'U' specifier. The various array.array variants could be extended to support all relevant specifiers, etc. > This is probably all self-evident to the numpy community, but I think > that as the PEP is aimed at a wider audience it needs a little more > background. Someone correct me if I am wrong, but it allows things equivalent to the following, which is available in C, to be available in Python...

typedef struct {
    char R;
    char G;
    char B;
    char A;
} pixel_RGBA;

pixel_RGBA image[1024][768];

Or even...

typedef struct {
    long long numerator;
    unsigned long long denominator;
    double approximation;
} rational;

rational ratios[1024];

The real use is that after you have your array of (packed) objects, be it one of the above samples, or otherwise, you don't need to explicitly pass around specifiers (like in struct, or ctypes); numpy and others can talk to each other, pick up the specifier with the extended buffer protocol, and it just works. - Josiah From paul.chiusano at gmail.com Sun Oct 29 16:51:01 2006 From: paul.chiusano at gmail.com (Paul Chiusano) Date: Sun, 29 Oct 2006 10:51:01 -0500 Subject: [Python-Dev] Status of pairing_heap.py? Message-ID: I was looking for a good pairing_heap implementation and came across one that had apparently been checked in a couple years ago (!). Here is the full link: http://svn.python.org/view/sandbox/trunk/collections/pairing_heap.py?rev=40887&view=markup I was just wondering about the status of this implementation.
The api looks pretty good to me -- it's great that the author decided to have the insert method return a node reference which can then be passed to delete and adjust_key. It's a bit of a pain to implement that functionality, but it's extremely useful for a number of applications. If that project is still alive, I have a couple of api suggestions:

* Add a method which nondestructively yields the top K elements of the heap. This would work by popping the top k elements of the heap into a list, then reinserting those elements in reverse order. By reinserting the sorted elements in reverse order, the top of the heap is essentially a sorted linked list, so if the exact operation is repeated again, the removals take constant time rather than amortized logarithmic time.
* So, for example: if we have a min heap, the topK method would pop K elements from the heap, say they are {1, 3, 5, 7}, then do insert(7), followed by insert(5), ... insert(1).
* Even better might be if this operation avoided having to allocate new heap nodes, and just reused the old ones.
* I'm not sure if adjust_key should throw an exception if the key adjustment is in the wrong direction. Perhaps it should just fall back on deleting and reinserting that node?

Paul
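[Editor's note: the nondestructive topK idea described above can be sketched with the stdlib heapq module. This uses a plain list-based binary heap rather than the sandbox pairing heap, whose node-based API differs, and the `top_k` function name is hypothetical:]

```python
import heapq

def top_k(heap, k):
    """Return the k smallest items of `heap` without losing any of them.

    Sketch of the suggested topK operation: pop the k smallest items,
    then push them back in reverse (descending) order so the heap
    invariant is restored and the heap's contents are unchanged.
    """
    popped = [heapq.heappop(heap) for _ in range(min(k, len(heap)))]
    for item in reversed(popped):  # reinsert largest-first, as suggested
        heapq.heappush(heap, item)
    return popped
```

Note that the constant-time-on-repeat property described above is specific to pairing heaps, where reinserting in reverse order leaves a sorted chain at the root; with heapq's array-based binary heap each pop remains O(log n), so this sketch only illustrates the interface, not the performance claim.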