From reinhold-birkenfeld-nospam at wolke7.net Sat Jan 1 03:19:03 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Sat Jan 1 03:22:53 2005 Subject: [Python-Dev] Patch Reviewing Message-ID: Hello, just felt a little bored and tried to review a few (no-brainer) patches. Here are the results: * Patch #1051395 Minor fix in Lib/locale.py: Docs say that function _parse_localename returns a tuple; but for one codepath it returns a list. Patch fixes this by adding tuple(), recommending apply. * Patch #1046831 Minor fix in Lib/distutils/sysconfig.py: it defines a function to retrieve the Python version but does not use it everywhere; Patch fixes this, recommending apply. * Patch #751031 Adds recognizing JPEG-EXIF files (produced by digicams) to imghdr.py. Recommending apply. * Patch #712317 Fixes URL parsing in urlparse for URLs such as http://foo?bar. Splits at '?', so assigns 'foo' to netloc and 'bar' to query instead of assigning 'foo?bar' to netloc. Recommending apply. regards, Reinhold From bac at OCF.Berkeley.EDU Sat Jan 1 04:11:43 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Sat Jan 1 04:11:59 2005 Subject: [Python-Dev] python-dev Summary for 2004-11-16 through 2004-11-30 [draft] Message-ID: <41D614EF.1090804@ocf.berkeley.edu> With school starting up again Monday and New Years being tomorrow I don't plan to send this out until Tuesday. Hope everyone has a good New Years. -Brett ----------------------------------- ===================== Summary Announcements ===================== PyCon_ is coming up! Being held March 23-25 in Washington, DC, registration is now open at http://www.python.org/pycon/2005/register.html for credit card users (you can pay by check as well; see the general info page for the conference). .. _PyCon: http://www.python.org/pycon/2005/ ========= Summaries ========= --------------------------------------------- Would you like the source with your function? 
--------------------------------------------- Would you like all functions and classes to contain a __pycode__ attribute that contains a string of the code used to compile that code object? Well, that very idea was proposed. You would use a command-line switch to turn on the feature in order to remove the memory and any performance overhead for the default case of not needing this feature. Some might ask why this is needed when inspect.getsource and its ilk exist. The perk is that __pycode__ would always exist, while inspect.getsource is a best attempt but cannot guarantee it will have the source. Beyond a suggested name change to __source__, various people have suggested very different uses. Some see it as a convenient way to save interpreter work easily and thus not lose any nice code snippet developed interactively. Others see a more programmatic use (such as AOP "advice" injection). Both are rather different and led to the thread ending on the suggestion that a PEP be written that specifies what the intended use-case is, to make sure that need is properly met. Contributing threads: - `__pycode__ extension <>`__ =============== Skipped Threads =============== - PEP 310 Status - python 2.3.5 release? look for 2.3.5 possibly in January - Current CVS, Cygwin and "make test" - syntactic shortcut - unpack to variably sized list mostly discussed in the `last summary`_ - Python 2.4, MS .NET 1.1 and distutils - Trouble installing 2.4 - Looking for authoritative documentation on packages, import & ihooks no docs exist, but feel free to write some! =) - String literal concatenation & docstrings literal string concatenation only works if the newline separating the strings is not significant to the parser - print "%X" % id(object()) not so nice does 'id' need to return only a positive? No, but it would be nice. 
- Bug in PyLocale_strcoll - Multilib strikes back - File encodings file.write does not work with Unicode strings; have to decode them to ASCII on your own From python at rcn.com Sat Jan 1 03:53:27 2005 From: python at rcn.com (Raymond Hettinger) Date: Sat Jan 1 04:33:09 2005 Subject: [Python-Dev] Patch Reviewing References: Message-ID: <000201c4efb2$2e8b1320$43facc97@oemcomputer> [Reinhold Birkenfeld] > just felt a little bored and tried to review a few (no-brainer) patches. Thanks, please assign to me and I'll apply them. Raymond Hettinger From kbk at shore.net Sat Jan 1 05:55:36 2005 From: kbk at shore.net (Kurt B. Kaiser) Date: Sat Jan 1 05:55:47 2005 Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200501010455.j014taqo000992@bayview.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 261 open ( +4) / 2718 closed ( +3) / 2979 total ( +7) Bugs : 801 open ( -6) / 4733 closed (+16) / 5534 total (+10) RFE : 165 open ( +2) / 139 closed ( +0) / 304 total ( +2) New / Reopened Patches ______________________ Patch for bug 999042. 
(2004-12-23) http://python.org/sf/1090482 opened by Darek Suchojad _AEModule.c patch (2004-12-25) http://python.org/sf/1090958 opened by has py-compile DESTDIR support to compile in correct paths (2004-12-27) CLOSED http://python.org/sf/1091679 opened by Thomas Vander Stichele Refactoring Python/import.c (2004-12-30) http://python.org/sf/1093253 opened by Thomas Heller socket leak in SocketServer (2004-12-30) http://python.org/sf/1093468 opened by Shannon -jj Behrens sanity check for readline remove/replace (2004-12-30) http://python.org/sf/1093585 opened by DSM miscellaneous doc typos (2004-12-31) CLOSED http://python.org/sf/1093896 opened by DSM Patches Closed ______________ Avoid calling tp_compare with different types (2004-07-22) http://python.org/sf/995939 closed by arigo py-compile DESTDIR support to compile in correct paths (2004-12-27) http://python.org/sf/1091679 closed by jafo miscellaneous doc typos (2004-12-31) http://python.org/sf/1093896 closed by rhettinger New / Reopened Bugs ___________________ presentation typo in lib: 6.21.4.2 How callbacks are called (2004-12-23) http://python.org/sf/1090139 reopened by jlgijsbers input from numeric pad always dropped when numlock off (2004-11-27) http://python.org/sf/1074333 reopened by kbk minor bug in what's new > decorators (2004-12-26) CLOSED http://python.org/sf/1091302 opened by vincent wehren A large block of commands after an "if" cannot be (2003-03-28) http://python.org/sf/711268 reopened by facundobatista DESTROOTed frameworkinstall fails (2004-12-26) CLOSED http://python.org/sf/1091468 opened by Jack Jansen No need to fix (2004-12-27) CLOSED http://python.org/sf/1091634 opened by Bertram Scharpf garbage collector still documented as optional (2004-12-27) http://python.org/sf/1091740 opened by Gregory H. 
Ball IDLE hangs due to subprocess (2004-12-28) http://python.org/sf/1092225 opened by ZACK slice [0:] default is len-1 not len (2004-12-28) CLOSED http://python.org/sf/1092240 opened by Robert Phillips Memory leak in socket.py on Mac OS X 10.3 (2004-12-28) http://python.org/sf/1092502 opened by bacchusrx os.remove fails on win32 with read-only file (2004-12-29) http://python.org/sf/1092701 opened by Joshua Weage Make Generators Pickle-able (2004-12-29) http://python.org/sf/1092962 opened by Jayson Vantuyl distutils/tests not installed (2004-12-30) http://python.org/sf/1093173 opened by Armin Rigo mapitags.PROP_TAG() doesn't account for new longs (2004-12-30) http://python.org/sf/1093389 opened by Joe Hildebrand Bugs Closed ___________ presentation typo in lib: 6.21.4.2 How callbacks are called (2004-12-22) http://python.org/sf/1090139 closed by rhettinger Memory leaks? (2004-10-16) http://python.org/sf/1048495 closed by rhettinger _bsddb segfault (2004-07-15) http://python.org/sf/991754 closed by dcjim coercion results used dangerously (2004-06-26) http://python.org/sf/980352 closed by arigo exec scoping problem (2004-12-22) http://python.org/sf/1089978 closed by arigo _DummyThread() objects not freed from threading._active map (2004-12-22) http://python.org/sf/1089632 closed by bcannon Mac Library Modules 1.1.1 Bad Info (2004-12-14) http://python.org/sf/1085300 closed by bcannon minor bug in what's new > decorators (2004-12-26) http://python.org/sf/1091302 closed by montanaro A large block of commands after an "if" cannot be (2003-03-28) http://python.org/sf/711268 closed by bcannon Failed assert in stringobject.c (2003-05-14) http://python.org/sf/737947 closed by facundobatista DESTROOTed frameworkinstall fails (2004-12-26) http://python.org/sf/1091468 closed by jackjansen nturl2path.url2pathname() mishandles /// (2002-12-07) http://python.org/sf/649961 closed by mike_j_brown No need to fix (2004-12-27) http://python.org/sf/1091634 closed by mwh 2.4a3: unhelpful 
error message from distutils (2004-09-03) http://python.org/sf/1021756 closed by effbot BuildApplication includes many unneeded modules (2004-12-01) http://python.org/sf/1076492 closed by jackjansen slice [0:] default is len-1 not len (2004-12-28) http://python.org/sf/1092240 closed by jlgijsbers truncated gzip file triggers zlibmodule segfault (2004-12-10) http://python.org/sf/1083110 closed by akuchling os.ttyname() accepts wrong arguments (2004-12-07) http://python.org/sf/1080713 closed by akuchling New / Reopened RFE __________________ Distutils needs a way *not* to install files (2004-12-28) http://python.org/sf/1092365 opened by Mike Orr From bob at redivi.com Sun Jan 2 04:40:35 2005 From: bob at redivi.com (Bob Ippolito) Date: Sun Jan 2 04:40:41 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Mac/OSX fixapplepython23.py, 1.2, 1.3 In-Reply-To: References: Message-ID: <0FFA1732-5C70-11D9-BBAB-000A9567635C@redivi.com> On Jan 1, 2005, at 5:33 PM, jackjansen@users.sourceforge.net wrote: > Update of /cvsroot/python/python/dist/src/Mac/OSX > In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv14408 > > Modified Files: > fixapplepython23.py > Log Message: > Create the wrapper scripts for gcc/g++ too. > > +SCRIPT="""#!/bin/sh > +export MACOSX_DEPLOYMENT_TARGET=10.3 > +exec %s "${@}" This script should check to see if MACOSX_DEPLOYMENT_TARGET is already set. If I have some reason to set MACOSX_DEPLOYMENT_TARGET=10.4 for compilation (say I'm compiling an extension that requires 10.4 features) then I'm going to have some serious problems with this fix. 
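[The behavior Bob is asking for — honoring a caller's pre-set MACOSX_DEPLOYMENT_TARGET and only falling back to 10.3 — is the standard sh default-parameter idiom. A sketch only, not the committed fix; the `echo` stands in for the real `exec %s "${@}"` compiler line:]

```shell
#!/bin/sh
# Set MACOSX_DEPLOYMENT_TARGET to 10.3 only if the caller has not
# already exported a value of their own.
: "${MACOSX_DEPLOYMENT_TARGET:=10.3}"
export MACOSX_DEPLOYMENT_TARGET
# Stand-in for the real wrapper's: exec %s "${@}"
echo "$MACOSX_DEPLOYMENT_TARGET"
```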
-bob From Jack.Jansen at cwi.nl Sun Jan 2 22:28:22 2005 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Sun Jan 2 22:28:11 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Mac/OSX fixapplepython23.py, 1.2, 1.3 In-Reply-To: <0FFA1732-5C70-11D9-BBAB-000A9567635C@redivi.com> References: <0FFA1732-5C70-11D9-BBAB-000A9567635C@redivi.com> Message-ID: <3A6CE662-5D05-11D9-81BB-000D934FF6B4@cwi.nl> On 2-jan-05, at 4:40, Bob Ippolito wrote: >> +SCRIPT="""#!/bin/sh >> +export MACOSX_DEPLOYMENT_TARGET=10.3 >> +exec %s "${@}" > > This script should check to see if MACOSX_DEPLOYMENT_TARGET is already > set. If I have some reason to set MACOSX_DEPLOYMENT_TARGET=10.4 for > compilation (say I'm compiling an extension that requires 10.4 > features) then I'm going to have some serious problems with this fix. I was going to do that, but then I thought it didn't make any sense, because this script is *only* used in the context of Apple-provided Python 2.3. And setting MACOSX_DEPLOYMENT_TARGET to anything other than 10.3 (be it lower or higher) while compiling an extension for Apple's 2.3 is going to produce disappointing results anyway. But, if I've missed a use case, please enlighten me. 
-- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From bob at redivi.com Sun Jan 2 22:35:16 2005 From: bob at redivi.com (Bob Ippolito) Date: Sun Jan 2 22:35:24 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Mac/OSX fixapplepython23.py, 1.2, 1.3 In-Reply-To: <3A6CE662-5D05-11D9-81BB-000D934FF6B4@cwi.nl> References: <0FFA1732-5C70-11D9-BBAB-000A9567635C@redivi.com> <3A6CE662-5D05-11D9-81BB-000D934FF6B4@cwi.nl> Message-ID: <3192B720-5D06-11D9-8981-000A9567635C@redivi.com> On Jan 2, 2005, at 4:28 PM, Jack Jansen wrote: > > On 2-jan-05, at 4:40, Bob Ippolito wrote: >>> +SCRIPT="""#!/bin/sh >>> +export MACOSX_DEPLOYMENT_TARGET=10.3 >>> +exec %s "${@}" >> >> This script should check to see if MACOSX_DEPLOYMENT_TARGET is >> already set. If I have some reason to set >> MACOSX_DEPLOYMENT_TARGET=10.4 for compilation (say I'm compiling an >> extension that requires 10.4 features) then I'm going to have some >> serious problems with this fix. > > I was going to do that, but then I thought it didn't make any sense, > because this script is *only* used in the context of Apple-provided > Python 2.3. And setting MACOSX_DEPLOYMENT_TARGET to anything other > than 10.3 (be it lower or higher) while compiling an extension for > Apple's 2.3 is going to produce disappointing results anyway. > > But, if I've missed a use case, please enlighten me. You're right, of course. I had realized that I was commenting on the fixpython script after I had replied, but my concern is still applicable to whatever solution is used for Python 2.4.1. Anything lower than 10.3 is of course an error, in either case. 
-bob From Jack.Jansen at cwi.nl Mon Jan 3 00:16:07 2005 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Mon Jan 3 00:16:02 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Mac/OSX fixapplepython23.py, 1.2, 1.3 In-Reply-To: <3192B720-5D06-11D9-8981-000A9567635C@redivi.com> References: <0FFA1732-5C70-11D9-BBAB-000A9567635C@redivi.com> <3A6CE662-5D05-11D9-81BB-000D934FF6B4@cwi.nl> <3192B720-5D06-11D9-8981-000A9567635C@redivi.com> Message-ID: <47FBE1CA-5D14-11D9-81BB-000D934FF6B4@cwi.nl> On 2-jan-05, at 22:35, Bob Ippolito wrote: >> On 2-jan-05, at 4:40, Bob Ippolito wrote: >>>> +SCRIPT="""#!/bin/sh >>>> +export MACOSX_DEPLOYMENT_TARGET=10.3 >>>> +exec %s "${@}" >>> >>> This script should check to see if MACOSX_DEPLOYMENT_TARGET is >>> already set. If I have some reason to set >>> MACOSX_DEPLOYMENT_TARGET=10.4 for compilation (say I'm compiling an >>> extension that requires 10.4 features) then I'm going to have some >>> serious problems with this fix. >> >> I was going to do that, but then I thought it didn't make any sense, >> because this script is *only* used in the context of Apple-provided >> Python 2.3. And setting MACOSX_DEPLOYMENT_TARGET to anything other >> than 10.3 (be it lower or higher) while compiling an extension for >> Apple's 2.3 is going to produce disappointing results anyway. >> >> But, if I've missed a use case, please enlighten me. > > You're right, of course. I had realized that I was commenting on the > fixpython script after I had replied, but my concern is still > applicable to whatever solution is used for Python 2.4.1. Anything > lower than 10.3 is of course an error, in either case. 2.4.1 will install this fix into Apple-installed Python 2.3 (if applicable, i.e. if you're installing 2.4.1 on 10.3), but for its own use it will have the newer distutils, which understands that it needs to pick up MACOSX_DEPLOYMENT_TARGET from the Makefile, so it'll never see these scripts. 
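[The Makefile-derived deployment target Jack mentions can be inspected from Python. A sketch using the modern stdlib `sysconfig` module (shown instead of the era's distutils.sysconfig, which is gone from recent Pythons); on non-Mac platforms the variable is simply absent and the call returns None:]

```python
import sysconfig

# On Mac framework builds the configure-time deployment target is
# recorded in the Makefile's config vars; elsewhere it is absent.
target = sysconfig.get_config_var("MACOSX_DEPLOYMENT_TARGET")
print(target)
```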
-- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From bob at redivi.com Mon Jan 3 03:43:32 2005 From: bob at redivi.com (Bob Ippolito) Date: Mon Jan 3 03:43:43 2005 Subject: [Python-Dev] Darwin's realloc(...) implementation never shrinks allocations Message-ID: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com> Quite a few notable places in the Python sources expect realloc(...) to relinquish some memory if the requested size is smaller than the currently allocated size. This is definitely not true on Darwin, and possibly other platforms. I have tested this on OpenBSD and Linux, and the implementations on these platforms do appear to relinquish memory, but I didn't read the implementation. I haven't been able to find any documentation that states that realloc should make this guarantee, but I figure Darwin does this as an "optimization" and because Darwin probably can't resize mmap'ed memory (at least it can't from Python, but this probably means it doesn't have this capability at all). It is possible to "fix" this for Darwin, because you can ask the default malloc zone how big a particular allocation is, and how big an allocation of a given size will actually be (see: ). The obvious place to put this would be PyObject_Realloc, because this is at least called by _PyString_Resize (which will fix ). Should I write up a patch that "fixes" this? I guess the best thing to do would be to determine whether the fix should be used at runtime, by allocating a meg or so, resizing it to 1 byte, and see if the size of the allocation changes. If the size of the allocation does change, then the system realloc can be trusted to do what Python expects it to do, otherwise realloc should be done "cleanly" by allocating a new block (returning the original on failure, because it's good enough and some places in Python seem to expect that shrink will never fail), memcpy, free, return new block. 
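[The runtime check and copy-on-shrink fallback proposed above can be modeled in pure Python. This is a toy model of the control flow only — `Block`, `darwin_style_realloc`, and `safe_shrink` are names invented here, not CPython or Darwin API:]

```python
class Block:
    """Toy stand-in for a malloc'ed block whose capacity never shrinks,
    mimicking the Darwin behavior described above."""
    def __init__(self, size):
        self.capacity = size          # what malloc_size() would report
        self.data = bytearray(size)

def darwin_style_realloc(block, new_size):
    # Darwin: a non-growing realloc returns the same block unchanged.
    if new_size <= block.capacity:
        return block
    new = Block(new_size)
    new.data[:block.capacity] = block.data
    return new

def safe_shrink(block, new_size):
    """Proposed workaround: if realloc won't give memory back, malloc a
    right-sized block, copy, and free the old one.  On allocation
    failure, return the original, since shrink is assumed never to
    fail."""
    if new_size >= block.capacity:
        return darwin_style_realloc(block, new_size)
    try:
        new = Block(new_size)
    except MemoryError:
        return block
    new.data[:new_size] = block.data[:new_size]
    return new
```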
I wrote up a small hack that does this realloc indirection to CVS trunk, and it doesn't seem to cause any measurable difference in pystone performance. Note that all versions of Darwin that I've looked at (6.x, 7.x, and 8.0b1 corresponding to publicly available WWDC 2004 Tiger code) have this "issue", but it might go away by Mac OS X 10.4 or some later release. This URL points to the sf bug and Darwin 7.7's realloc(...) implementation: http://bob.pythonmac.org/archives/2005/01/01/realloc-doesnt/ -bob From tim.peters at gmail.com Mon Jan 3 06:13:22 2005 From: tim.peters at gmail.com (Tim Peters) Date: Mon Jan 3 06:13:25 2005 Subject: [Python-Dev] Darwin's realloc(...) implementation never shrinks allocations In-Reply-To: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com> References: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com> Message-ID: <1f7befae05010221134a94eccd@mail.gmail.com> [Bob Ippolito] > Quite a few notable places in the Python sources expect realloc(...) to > relinquish some memory if the requested size is smaller than the > currently allocated size. I don't know what "relinquish some memory" means. If it means something like "returns memory to the OS, so that the reported process size shrinks", then no, nothing in Python ever assumes that. That's simply because "returns memory to the OS" and "process size" aren't concepts in the C standard, and so nothing can be said about them in general -- not in theory, and neither in practice, because platforms (OS+libc combos) vary so widely in behavior here. As a pragmatic matter, I *expect* that a production-quality realloc() implementation will at least be able to reuse released memory, provided that the amount released is at least half the amount originally malloc()'ed (and, e.g., reasonable buddy systems may not be able to do better than that). > This is definitely not true on Darwin, and possibly other platforms. 
I have tested > this on OpenBSD and Linux, and the implementations on these platforms do > appear to relinquish memory, As above, don't know what this means. > but I didn't read the implementation. I haven't been able to find any > documentation that states that realloc should make this guarantee, realloc() guarantees very little; it certainly doesn't guarantee anything, e.g., about OS interactions or process sizes. > but I figure Darwin does this as an "optimization" and because Darwin > probably can't resize mmap'ed memory (at least it can't from Python, > but this probably means it doesn't have this capability at all). > > It is possible to "fix" this for Darwin, I don't understand what's "broken". Small objects go thru Python's own allocator, which has its own realloc policies and its own peculiarities (chiefly that pymalloc never free()s any memory allocated for small objects). > because you can ask the default malloc zone how big a particular > allocation is, and how big an allocation of a given size will actually > be (see: ). > The obvious place to put this would be PyObject_Realloc, because this > is at least called by _PyString_Resize (which will fix > ). The diagnosis in the bug report seems to leave it pointing at socket.py's _fileobject.read(), although I suspect the real cause is in socketmodule.c's sock_recv(). We've had other reports of various problems when people pass absurdly large values to socket recv(). A better fix here would probably amount to rewriting sock_recv() to refuse to pass enormous numbers to the platform recv() (it appears that many platform recv() implementations simply don't expect a recv() argument to be much bigger than the native network buffer size, and screw up when that's not so). > Should I write up a patch that "fixes" this? I guess the best thing to > do would be to determine whether the fix should be used at runtime, by > allocating a meg or so, resizing it to 1 byte, and see if the size of > the allocation changes. 
If the size of the allocation does change, > then the system realloc can be trusted to do what Python expects it to > do, otherwise realloc should be done "cleanly" by allocating a new > block (returning the original on failure, because it's good enough and > some places in Python seem to expect that shrink will never fail), Yup, that assumption (that a non-growing realloc can't fail) is all over the place. > memcpy, free, return new block. > > I wrote up a small hack that does this realloc indirection to CVS > trunk, and it doesn't seem to cause any measurable difference in > pystone performance. > > Note that all versions of Darwin that I've looked at (6.x, 7.x, and > 8.0b1 corresponding to publicly available WWDC 2004 Tiger code) have > this "issue", but it might go away by Mac OS X 10.4 or some later > release. > > This URL points to the sf bug and Darwin 7.7's realloc(...) > implementation: > http://bob.pythonmac.org/archives/2005/01/01/realloc-doesnt/ It would be good to rewrite sock_recv() more defensively in any case. Best I can tell, this implementation of realloc() is standard-conforming but uniquely brain dead in its downsize behavior. I don't expect the latter will last (as you say on your page, "probably plenty of other software" also makes the same pragmatic assumptions about realloc downsize behavior), so I'm not keen to gunk up Python to worm around it. From bob at redivi.com Mon Jan 3 07:08:24 2005 From: bob at redivi.com (Bob Ippolito) Date: Mon Jan 3 07:08:37 2005 Subject: [Python-Dev] Darwin's realloc(...) implementation never shrinks allocations In-Reply-To: <1f7befae05010221134a94eccd@mail.gmail.com> References: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com> <1f7befae05010221134a94eccd@mail.gmail.com> Message-ID: On Jan 3, 2005, at 12:13 AM, Tim Peters wrote: > [Bob Ippolito] >> Quite a few notable places in the Python sources expect realloc(...) 
>> to >> relinquish some memory if the requested size is smaller than the >> currently allocated size. > > I don't know what "relinquish some memory" means. If it means > something like "returns memory to the OS, so that the reported process > size shrinks", then no, nothing in Python ever assumes that. That's > simply because "returns memory to the OS" and "process size" aren't > concepts in the C standard, and so nothing can be said about them in > general -- not in theory, and neither in practice, because platforms > (OS+libc combos) vary so widely in behavior here. > > As a pragmatic matter, I *expect* that a production-quality realloc() > implementation will at least be able to reuse released memory, > provided that the amount released is at least half the amount > originally malloc()'ed (and, e.g., reasonable buddy systems may not be > able to do better than that). This is what I meant by relinquish (c/o merriam-webster): a : to stop holding physically : RELEASE b : to give over possession or control of : YIELD Your expectation is not correct for Darwin's memory allocation scheme. It seems that Darwin creates allocations of immutable size. The only way ANY part of an allocation will ever be used by ANYTHING else is if free() is called with that allocation. free() can be called either explicitly, or implicitly by calling realloc() with a size larger than the size of the allocation. In that case, it will create a new allocation of at least the requested size, copy the contents of the original allocation into the new allocation (probably with copy-on-write pages if it's large enough, so it might be cheap), and free() the allocation. In the case where realloc() specifies a size that is not greater than the allocation's size, it will simply return the given allocation and cause no side-effects whatsoever. Was this a good decision? Probably not! 
However, it is our (in the "I know you use Windows but I am not the only one that uses Mac OS X" sense) problem so long as Darwin is a supported platform, because it is highly unlikely that Apple will backport any "fix" to the allocator unless we can prove it has some security implications in software shipped with their OS. I attempted to look for some easy ones by performing a quick audit of Apache, OpenSSH, and OpenSSL. Unfortunately, their developers did not share your expectation. I found one sprintf-like routine in Apache that could be affected by this behavior, and one instance of immutable string creation in Apple's CoreFoundation CFString implementation, but I have yet to find an easy way to exploit this behavior from the outside. I should probably be looking at PHP and Perl instead ;) >> but I figure Darwin does this as an "optimization" and because Darwin >> probably can't resize mmap'ed memory (at least it can't from Python, >> but this probably means it doesn't have this capability at all). >> >> It is possible to "fix" this for Darwin, > > I don't understand what's "broken". Small objects go thru Python's > own allocator, which has its own realloc policies and its own > peculiarities (chiefly that pymalloc never free()s any memory > allocated for small objects). What's broken is that there are several places in Python that seem to assume that you can allocate a large chunk of memory, and make it smaller in some meaningful way with realloc(...). This is not true with Darwin. You are right about small objects. They don't matter because they're small, and because they're handled by Python's allocator. >> because you can ask the default malloc zone how big a particular >> allocation is, and how big an allocation of a given size will actually >> be (see: ). >> The obvious place to put this would be PyObject_Realloc, because this >> is at least called by _PyString_Resize (which will fix >> ). 
> > The diagnosis in the bug report seems to leave it pointing at > socket.py's _fileobject.read(), although I suspect the real cause is > in socketmodule.c's sock_recv(). We've had other reports of various > problems when people pass absurdly large values to socket recv(). A > better fix here would probably amount to rewriting sock_recv() to > refuse to pass enormous numbers to the platform recv() (it appears > that many platform recv() implementations simply don't expect a recv() > argument to be much bigger than the native network buffer size, and > screw up when that's not so). You are correct. The real cause is in sock_recv(), and/or _PyString_Resize(), depending on how you look at it. >> Note that all versions of Darwin that I've looked at (6.x, 7.x, and >> 8.0b1 corresponding to publicly available WWDC 2004 Tiger code) have >> this "issue", but it might go away by Mac OS X 10.4 or some later >> release. > > It would be good to rewrite sock_recv() more defensively in any case. > Best I can tell, this implementation of realloc() is > standard-conforming but uniquely brain dead in its downsize behavior. Presumably this can happen at other places (including third party extensions), so a better place to do this might be _PyString_Resize(). list_resize() is another reasonable place to put this. I'm sure there are other places that use realloc() too, and the majority of them do this through obmalloc. So maybe instead of trying to track down all the places where this can manifest, we should just "gunk up" Python and patch PyObject_Realloc()? Since we are both pretty confident that other allocators aren't like Darwin, this "gunk" can be #ifdef'ed to the __APPLE__ case. > I don't expect the latter will last (as you say on your page, > "probably plenty of other software" also makes the same pragmatic > assumptions about realloc downsize behavior), so I'm not keen to gunk > up Python to worm around it. 
As I said above, I haven't yet found any other software that makes the same kind of realloc() assumptions that Python does. I'm sure I'll find something, but what's important to me is that Python works well on Mac OS X, so something should happen. If we can't prove that Apple's allocation strategy is a security flaw in some service that ships with the OS, any improvements to this strategy are very unlikely to be backported to current versions of Mac OS X. -bob From tim.peters at gmail.com Mon Jan 3 08:16:34 2005 From: tim.peters at gmail.com (Tim Peters) Date: Mon Jan 3 08:16:54 2005 Subject: [Python-Dev] Darwin's realloc(...) implementation never shrinks allocations In-Reply-To: References: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com> <1f7befae05010221134a94eccd@mail.gmail.com> Message-ID: <1f7befae050102231638b0d39d@mail.gmail.com> [Bob Ippolito] > ... > Your expectation is not correct for Darwin's memory allocation scheme. > It seems that Darwin creates allocations of immutable size. The only > way ANY part of an allocation will ever be used by ANYTHING else is if > free() is called with that allocation. Ya, I understood that. My conclusion was that Darwin's realloc() implementation isn't production-quality. So it goes. > free() can be called either explicitly, or implicitly by calling realloc() with > a size larger than the size of the allocation. In that case, it will create a new > allocation of at least the requested size, copy the contents of the > original allocation into the new allocation (probably with > copy-on-write pages if it's large enough, so it might be cheap), and > free() the allocation. Really? Another near-universal "quality of implementation" expectation is that a growing realloc() will strive to extend in-place. Like realloc(malloc(1000000), 1000001). 
For example, the theoretical guarantee that one-at-a-time list.append() has amortized linear time doesn't depend on that, but pragmatically it's greatly helped by a reasonable growing realloc() implementation. > In the case where realloc() specifies a size that is not greater than the > allocation's size, it will simply return the given allocation and cause no side- > effects whatsoever. > > Was this a good decision? Probably not! Sounds more like a bug (or two) to me than "a decision", but I don't know. > However, it is our (in the "I know you use Windows but I am not the only > one that uses Mac OS X sense) problem so long as Darwin is a supported > platform, because it is highly unlikely that Apple will backport any "fix" to > the allocator unless we can prove it has some security implications in > software shipped with their OS. ... Is there any known case where Python performs poorly on this OS, for this reason, other than the "pass giant numbers to recv() and then shrink the string because we didn't get anywhere near that many bytes" case? Claiming rampant performance problems should require evidence too . ... > Presumably this can happen at other places (including third party > extensions), so a better place to do this might be _PyString_Resize(). > list_resize() is another reasonable place to put this. I'm sure there > are other places that use realloc() too, and the majority of them do > this through obmalloc. So maybe instead of trying to track down all > the places where this can manifest, we should just "gunk up" Python and > patch PyObject_Realloc()? There is no "choke point" for allocations in Python -- some places call the system realloc() directly. Maybe the latter matter on Darwin too, but maybe they don't. The scope of this hack spreads if they do. I have no idea how often realloc() is called directly by 3rd-party extension modules. It's called directly a lot in Zope's C code, but AFAICT only to grow vectors, never to shrink them. 
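[The amortized-append point above — one-at-a-time list.append() stays cheap because CPython over-allocates geometrically, so actual resizes are rare — can be observed from Python with sys.getsizeof. A CPython-specific sketch; exact sizes vary by version and platform:]

```python
import sys

# 1000 appends trigger only a few dozen actual resizes, visible as the
# number of distinct getsizeof() values along the way.
lst, sizes = [], []
for i in range(1000):
    lst.append(i)
    sizes.append(sys.getsizeof(lst))

resizes = len(set(sizes))
print(resizes)  # far fewer than 1000
```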
> Since we are both pretty confident that other allocators aren't like Darwin, > this "gunk" can be #ifdef'ed to the __APPLE__ case. #ifdef's are a last resort: they almost never go away, so they complicate the code forever after, and typically stick around for years even after the platform problems they were intended to address have been fixed. For obvious reasons, they're also an endless source of platform-specific bugs. Note that pymalloc already does a memcpy+free when in PyObject_Realloc(p, n) p was obtained from the system malloc or realloc but n is small enough to meet the "small object" threshold (pymalloc "takes over" small blocks that result from a PyObject_Realloc()). That's a reasonable strategy *because* n is always small in such cases. If you're going to extend this strategy to n of arbitrary size, then you may also create new performance problems for some apps on Darwin (copying n bytes can get arbitrarily expensive). > ... > I'm sure I'll find something, but what's important to me is that Python > works well on Mac OS X, so something should happen. I agree the socket-abuse case should be fiddled, and for more reasons than just Darwin's realloc() quirks. I don't know that there are actual problems on Darwin broader than that case (and I'm not challenging you to contrive one, I'm asking whether realloc() quirks are suspected in any other case that's known). Part of what you demonstrated when you said that pystone didn't slow down when you fiddled stuff is that pystone also didn't speed up. I also don't know that the memcpy+free wormaround is actually going to help more than it hurts overall. Yes, in the socket-abuse case, where the program routinely malloc()s strings millions of bytes larger than the socket can deliver, it would obviously help. That's not typical program behavior (however typical it may be of that specific app). 
More typical is shrinking a long list one element at a time, in which case about half the list remaining would get memcpy'd from time to time where such copies never get made today. IOW, there's no straightforward pure win here. From gvanrossum at gmail.com Mon Jan 3 08:17:59 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon Jan 3 08:18:18 2005 Subject: [Python-Dev] Zipfile needs? In-Reply-To: <41D1B0C6.8040208@ocf.berkeley.edu> References: <41D1B0C6.8040208@ocf.berkeley.edu> Message-ID: > Encryption/decryption support. Will most likely require a C extension since > the algorithm relies on ints (or longs, don't remember) wrapping around when > the value becomes too large. You may want to do this in C for speed, but C-style int wrapping is easily done by doing something like "x = x & 0xFFFFFFFFL" at crucial points in the code (for unsigned 32-bit ints) with an additional "if x & 0x80000000L: x -= 0x100000000L" to simulate signed 32-bit ints. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From bob at redivi.com Mon Jan 3 16:48:00 2005 From: bob at redivi.com (Bob Ippolito) Date: Mon Jan 3 16:48:14 2005 Subject: [Python-Dev] Darwin's realloc(...) implementation never shrinks allocations In-Reply-To: <1f7befae050102231638b0d39d@mail.gmail.com> References: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com> <1f7befae05010221134a94eccd@mail.gmail.com> <1f7befae050102231638b0d39d@mail.gmail.com> Message-ID: On Jan 3, 2005, at 2:16 AM, Tim Peters wrote: > [Bob Ippolito] >> ... >> Your expectation is not correct for Darwin's memory allocation scheme. >> It seems that Darwin creates allocations of immutable size. The only >> way ANY part of an allocation will ever be used by ANYTHING else is if >> free() is called with that allocation. > > Ya, I understood that. My conclusion was that Darwin's realloc() > implementation isn't production-quality. So it goes. Whatever that means. 
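Guido's int-wrapping trick from the zipfile reply above translates directly into a couple of helpers (the helper names are mine; the original used Python 2 `L`-suffixed long literals, dropped here):

```python
MASK32 = 0xFFFFFFFF  # Guido's "x = x & 0xFFFFFFFFL" mask, sans the 2.x L suffix

def wrap_u32(x):
    """Simulate unsigned 32-bit wraparound on Python's unbounded ints."""
    return x & MASK32

def wrap_s32(x):
    """Simulate signed 32-bit wraparound: mask, then fold the sign bit."""
    x &= MASK32
    if x & 0x80000000:
        x -= 0x100000000
    return x

print(hex(wrap_u32(0xFFFFFFFF + 1)))  # wraps around to 0x0
print(wrap_s32(0x7FFFFFFF + 1))       # wraps around to -2147483648
```

For a keystream built on C uint32 arithmetic, masking at each step like this is sufficient in pure Python, at some interpreter cost per operation, which is why a C extension was floated for speed.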
>> free() can be called either explicitly, or implicitly by calling >> realloc() with >> a size larger than the size of the allocation. In that case, it will >> create a new >> allocation of at least the requested size, copy the contents of the >> original allocation into the new allocation (probably with >> copy-on-write pages if it's large enough, so it might be cheap), and >> free() the allocation. > > Really? Another near-universal "quality of implementation" > expectation is that a growing realloc() will strive to extend > in-place. Like realloc(malloc(1000000), 1000001). For example, the > theoretical guarantee that one-at-a-time list.append() has amortized > linear time doesn't depend on that, but pragmatically it's greatly > helped by a reasonable growing realloc() implementation. I said that it created allocations of fixed size, not that it created allocations of exactly the size you asked it to. Yes, it will extend in-place for many cases, including the given. >> In the case where realloc() specifies a size that is not greater >> than the >> allocation's size, it will simply return the given allocation and >> cause no side- >> effects whatsoever. >> >> Was this a good decision? Probably not! > > Sounds more like a bug (or two) to me than "a decision", but I don't > know. You said yourself that it is standards compliant ;) I have filed it as a bug, but it is probably unlikely to be backported to current versions of Mac OS X unless a case can be made that it is indeed a security flaw. >> However, it is our (in the "I know you use Windows but I am not the >> only >> one that uses Mac OS X sense) problem so long as Darwin is a supported >> platform, because it is highly unlikely that Apple will backport any >> "fix" to >> the allocator unless we can prove it has some security implications in >> software shipped with their OS. ... 
> > Is there any known case where Python performs poorly on this OS, for > this reason, other than the "pass giant numbers to recv() and then > shrink the string because we didn't get anywhere near that many bytes" > case? Claiming rampant performance problems should require evidence > too . Known case? No. Do I want to search Python application-space to find one? No. >> Presumably this can happen at other places (including third party >> extensions), so a better place to do this might be _PyString_Resize(). >> list_resize() is another reasonable place to put this. I'm sure there >> are other places that use realloc() too, and the majority of them do >> this through obmalloc. So maybe instead of trying to track down all >> the places where this can manifest, we should just "gunk up" Python >> and >> patch PyObject_Realloc()? > > There is no "choke point" for allocations in Python -- some places > call the system realloc() directly. Maybe the latter matter on Darwin > too, but maybe they don't. The scope of this hack spreads if they do. > I have no idea how often realloc() is called directly by 3rd-party > extension modules. It's called directly a lot in Zope's C code, but > AFAICT only to grow vectors, never to shrink them. In the case of Python, "some places" means "nowhere relevant". Four standard library extension modules relevant to the platform use realloc directly: _sre Uses realloc only to grow buffers. cPickle Uses realloc only to grow buffers. cStringIO Uses realloc only to grow buffers. regexpr: Uses realloc only to grow buffers. If Zope doesn't use the allocator that Python gives it, then it can deal with its own problems. I would expect most extensions to use Python's allocator. >> Since we are both pretty confident that other allocators aren't like >> Darwin, >> this "gunk" can be #ifdef'ed to the __APPLE__ case. 
> > #ifdef's are a last resort: they almost never go away, so they > complicate the code forever after, and typically stick around for > years even after the platform problems they intended to address have > been fixed. For obvious reasons, they're also an endless source of > platform-specific bugs. They're also the only good way to deal with platform-specific inconsistencies. In this specific case, it's not even possible to determine if a particular allocator implementation is stupid or not without at least using a platform-allocator-specific function to query the size reserved by a given allocation. > Note that pymalloc already does a memcpy+free when in > PyObject_Realloc(p, n) p was obtained from the system malloc or > realloc but n is small enough to meet the "small object" threshold > (pymalloc "takes over" small blocks that result from a > PyObject_Realloc()). That's a reasonable strategy *because* n is > always small in such cases. If you're going to extend this strategy > to n of arbitrary size, then you may also create new performance > problems for some apps on Darwin (copying n bytes can get arbitrarily > expensive). There's obviously a tradeoff between copying lots of bytes and having lots of memory go to waste. That should be taken into consideration when considering how many pages could be returned to the allocator. Note that we can ask the allocator how much memory an allocation has actually reserved (which is usually somewhat larger than the amount you asked it for) and how much memory an allocation will reserve for a given size. An allocation resize wouldn't even show up as smaller unless at least one page would be freed (for sufficiently large allocations anyway, the minimum granularity is 16 bytes because it guarantees that alignment). Obviously if you have a lot of pages anyway, one page isn't a big deal, so we would probably only resort to free()/memcpy() if some fair percentage of the total pages used by the allocation could be rescued. 
If it does end up causing some real performance problems anyway, there's always deeper hacks like using vm_copy(), a Darwin specific function which will do copy-on-write instead (which only makes sense if the allocation is big enough for this to actually be a performance improvement). >> ... >> I'm sure I'll find something, but what's important to me is that >> Python >> works well on Mac OS X, so something should happen. > > I agree the socket-abuse case should be fiddled, and for more reasons > than just Darwin's realloc() quirks. I don't know that there are > actual problems on Darwin broader than that case (and I'm not > challenging you to contrive one, I'm asking whether realloc() quirks > are suspected in any other case that's known). Part of what you > demonstrated when you said that pystone didn't slow down when you > fiddled stuff is that pystone also didn't speed up. I also don't know > that the memcpy+free wormaround is actually going to help more than it > hurts overall. Yes, in the socket-abuse case, where the program > routinely malloc()s strings millions of bytes larger than the socket > can deliver, it would obviously help. That's not typically program > behavior (however typical it may be of that specific app). More > typical is shrinking a long list one element at a time, in which case > about half the list remaining would get memcpy'd from time to time > where such copies never get made today. I do not yet know of another specific case where Darwin's realloc() implementation causes a problem. The list case would certainly be a loss with current behavior if the list gets extremely large at some point, but then becomes small and stays that way for a long period of time. > IOW, there's no straightforward pure win here. Well at least we have a nice bug to report to Apple, whether or not we do something about it ourselves. 
-bob From gvanrossum at gmail.com Mon Jan 3 17:15:24 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon Jan 3 17:15:27 2005 Subject: [Python-Dev] Darwin's realloc(...) implementation never shrinks allocations In-Reply-To: References: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com> <1f7befae05010221134a94eccd@mail.gmail.com> <1f7befae050102231638b0d39d@mail.gmail.com> Message-ID: Coming late to this thread. I don't see the point of lying awake at night worrying about potential memory losses unless you've heard someone complain about it. As Tim has been trying to explain, there are plenty of other things in Python that we *could* speed up if there was a need; since every speedup uglifies the code somewhat, we'd end up with very ugly code if we did them all. Remember, don't optimize prematurely. Here's one theoretical reason why even with socket.recv() it probably doesn't matter in practice: the overallocated string will usually be freed as soon as the data has been parsed from it, and this will free the overallocation as well! OTOH, if you want to do more research, checking the usage patterns for StringRealloc and TupleRealloc would be useful. I could imagine code in either that makes a copy if the new size is less than some fraction of the old size. Most code that I recall writing using these tends to start with a guaranteed-to-fit overallocation, and a single resize at the end. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From bob at redivi.com Mon Jan 3 17:30:14 2005 From: bob at redivi.com (Bob Ippolito) Date: Mon Jan 3 17:30:26 2005 Subject: [Python-Dev] Darwin's realloc(...) implementation never shrinks allocations In-Reply-To: References: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com> <1f7befae05010221134a94eccd@mail.gmail.com> <1f7befae050102231638b0d39d@mail.gmail.com> Message-ID: On Jan 3, 2005, at 11:15 AM, Guido van Rossum wrote: > Coming late to this thread. 
> > I don't see the point of lying awake at night worrying about potential > memory losses unless you've heard someone complain about it. As Tim > has been trying to explain, here are plenty of other things in Python > that we *could* speed up if there was a need; since every speedup > uglifies the code somewhat, we'd end up with very ugly code if we did > them all. Remember, don't optimize prematurely. We *have* had someone complain about it: http://python.org/sf/1092502 > Here's one theoretical reason why even with socket.recv() it probably > doesn't matter in practice: the overallocated string will usually be > freed as soon as the data has been parsed from it, and this will free > the overallocation as well! That depends on how socket.recv is used. Sometimes, a list of strings is used rather than a cStringIO (or equivalent), which can cause problems (see above referenced bug). -bob From Scott.Daniels at Acm.Org Mon Jan 3 18:07:00 2005 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Mon Jan 3 18:05:36 2005 Subject: [Python-Dev] Re: Zipfile needs? In-Reply-To: <41D1B0C6.8040208@ocf.berkeley.edu> References: <41D1B0C6.8040208@ocf.berkeley.edu> Message-ID: Brett C. wrote: > Scott David Daniels wrote: > >> I'm hoping to add BZIP2 compression to zipfile for 2.5. My primary >> motivation is that Project Gutenberg seems to be starting to use BZIP2 >> compression for some of its zips. What other wish list things do >> people around here have for zipfile? I thought I'd collect input here >> and make a PEP. > Encryption/decryption support. Will most likely require a C extension > since the algorithm relies on ints (or longs, don't remember) wrapping > around when the value becomes too large. I'm trying to use byte-block streams (iterators taking iterables) as the basic structure of getting data in and out. I think the encryption/ decryption can then be plugged in at the right point. 
If it can be set up properly, you can import the encryption separately and connect it to zipfiles with a call. Would this address what you want? I believe there is an issue actually building in the encryption/decryption in terms of redistribution. -- -- Scott David Daniels Scott.Daniels@Acm.Org From bacchusrx at skorga.org Mon Jan 3 21:23:50 2005 From: bacchusrx at skorga.org (bacchusrx) Date: Mon Jan 3 21:23:58 2005 Subject: [Python-Dev] Darwin's realloc(...) implementation never shrinks allocations In-Reply-To: <1f7befae050102231638b0d39d@mail.gmail.com> References: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com> <1f7befae05010221134a94eccd@mail.gmail.com> <1f7befae050102231638b0d39d@mail.gmail.com> Message-ID: <20050103202350.GA17165@skorga.org> On Thu, Jan 01, 1970 at 12:00:00AM +0000, Tim Peters wrote: > Is there any known case where Python performs poorly on this OS, for > this reason, other than the "pass giant numbers to recv() and then > shrink the string because we didn't get anywhere near that many bytes" > case? > > [...] > > I agree the socket-abuse case should be fiddled, and for more reasons > than just Darwin's realloc() quirks. [...] Yes, in the socket-abuse > case, where the program routinely malloc()s strings millions of bytes > larger than the socket can deliver, it would obviously help. That's > not typically program behavior (however typical it may be of that > specific app). Note that, with respect to http://python.org/sf/1092502, the author of the (original) program was using the documented interface to a file object. It's _fileobject.read() that decides to ask for huge numbers of bytes from recv() (specifically, in the max(self._rbufsize, left) condition). Patched to use a fixed recv_size, you of course sidestep the realloc() nastiness in this particular case. bacchusrx. From bob at redivi.com Mon Jan 3 21:55:19 2005 From: bob at redivi.com (Bob Ippolito) Date: Mon Jan 3 21:55:29 2005 Subject: [Python-Dev] Darwin's realloc(...) 
implementation never shrinks allocations In-Reply-To: <20050103202350.GA17165@skorga.org> References: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com> <1f7befae05010221134a94eccd@mail.gmail.com> <1f7befae050102231638b0d39d@mail.gmail.com> <20050103202350.GA17165@skorga.org> Message-ID: On Jan 3, 2005, at 3:23 PM, bacchusrx wrote: > On Thu, Jan 01, 1970 at 12:00:00AM +0000, Tim Peters wrote: >> Is there any known case where Python performs poorly on this OS, for >> this reason, other than the "pass giant numbers to recv() and then >> shrink the string because we didn't get anywhere near that many bytes" >> case? >> >> [...] >> >> I agree the socket-abuse case should be fiddled, and for more reasons >> than just Darwin's realloc() quirks. [...] Yes, in the socket-abuse >> case, where the program routinely malloc()s strings millions of bytes >> larger than the socket can deliver, it would obviously help. That's >> not typically program behavior (however typical it may be of that >> specific app). > > Note that, with respect to http://python.org/sf/1092502, the author of > the (original) program was using the documented interface to a file > object. It's _fileobject.read() that decides to ask for huge numbers > of > bytes from recv() (specifically, in the max(self._rbufsize, left) > condition). Patched to use a fixed recv_size, you of course sidestep > the > realloc() nastiness in this particular case. While using a reasonably sized recv_size is a good idea, using a smaller request size simply means that it's less likely that the strings will be significantly resized. It is still highly likely they *will* be resized and that doesn't solve the problem that over-allocated strings will persist until the entire request is fulfilled. For example, receiving 1 byte chunks (if that's even possible) would exacerbate the issue even for a small request size. 
If you asked for 8 MB with a request size of 1024 bytes, and received it in 1 byte chunks, you would need a minimum of an impossible ~16 GB to satisfy that request (minimum ~8 GB to collect the strings, minimum ~8 GB to concatenate them) as opposed to the Python-optimal case of ~16 MB when always using compact representations. Using cStringIO instead of a list of potentially over-allocated strings would actually have such Python-optimal memory usage characteristics on all platforms. -bob From bsder at mail.allcaps.org Mon Jan 3 22:26:12 2005 From: bsder at mail.allcaps.org (Andrew P. Lentvorski, Jr.) Date: Mon Jan 3 22:26:14 2005 Subject: [Python-Dev] Darwin's realloc(...) implementation never shrinks allocations In-Reply-To: <1f7befae050102231638b0d39d@mail.gmail.com> References: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com> <1f7befae05010221134a94eccd@mail.gmail.com> <1f7befae050102231638b0d39d@mail.gmail.com> Message-ID: <1776760A-5DCE-11D9-A1A7-000A95C874EE@mail.allcaps.org> On Jan 2, 2005, at 11:16 PM, Tim Peters wrote: > [Bob Ippolito] >> However, it is our (in the "I know you use Windows but I am not the >> only >> one that uses Mac OS X sense) problem so long as Darwin is a supported >> platform, because it is highly unlikely that Apple will backport any >> "fix" to >> the allocator unless we can prove it has some security implications in >> software shipped with their OS. ... > > Is there any known case where Python performs poorly on this OS, for > this reason, other than the "pass giant numbers to recv() and then > shrink the string because we didn't get anywhere near that many bytes" > case? Claiming rampant performance problems should require evidence > too . Possibly. When using the stock btdownloadcurses.py from bitconjurer.org, I occasionally see a memory thrash on OS X. Normally I have to be in a mode where I am aggregating lots of small connections (10Kbps or less uploads) into a large download (10Mbps transfer rate on a >500MB file). 
When the file completes, Python sends OS X into a long-lasting spinning ball of death. It will emerge after about 10 minutes or so. I do not see this same behavior on Linux or FreeBSD. I never filed a bug because I can't reliably reproduce it (it is dependent upon the upload characteristics of the torrent swarm). However, it seems to fit the bug and diagnosis. -a From bac at OCF.Berkeley.EDU Mon Jan 3 22:42:38 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Mon Jan 3 22:43:02 2005 Subject: [Python-Dev] Re: Zipfile needs? In-Reply-To: References: <41D1B0C6.8040208@ocf.berkeley.edu> Message-ID: <41D9BC4E.5020709@ocf.berkeley.edu> Scott David Daniels wrote: > Brett C. wrote: > >> Scott David Daniels wrote: >> >>> I'm hoping to add BZIP2 compression to zipfile for 2.5. My primary >>> motivation is that Project Gutenberg seems to be starting to use BZIP2 >>> compression for some of its zips. What other wish list things do >>> people around here have for zipfile? I thought I'd collect input here >>> and make a PEP. >> >> Encryption/decryption support. Will most likely require a C extension >> since the algorithm relies on ints (or longs, don't remember) wrapping >> around when the value becomes too large. > > > I'm trying to use byte-block streams (iterators taking iterables) as > the basic structure of getting data in and out. I think the encryption/ > decryption can then be plugged in at the right point. If it can be set > up properly, you can import the encryption separately and connect it to > zipfiles with a call. Would this address what you want? I believe > there is an issue actually building in the encryption/decryption in > terms of redistribution. > Possibly. Encryption is part of the PKZIP spec so I was just thinking of covering that, not adding external encryption support. 
It really is not overly complex stuff, just will want to do it in C for speed probably as Guido suggested (but, as always, I would profile that first to see if performance is really that bad). -Brett From tim.peters at gmail.com Mon Jan 3 22:49:29 2005 From: tim.peters at gmail.com (Tim Peters) Date: Mon Jan 3 22:49:34 2005 Subject: [Python-Dev] Darwin's realloc(...) implementation never shrinks allocations In-Reply-To: References: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com> <1f7befae05010221134a94eccd@mail.gmail.com> <1f7befae050102231638b0d39d@mail.gmail.com> Message-ID: <1f7befae050103134942ab1696@mail.gmail.com> [Tim Peters] >> Ya, I understood that. My conclusion was that Darwin's realloc() >> implementation isn't production-quality. So it goes. [Bob Ippolito] > Whatever that means. Well, it means what it said. The C standard says nothing about performance metrics of any kind, and a production-quality implementation of C requires very much more than just meeting what the standard requires. The phrase "quality of implementation" is used in the C Rationale (but not in the standard proper) to cover all such issues. realloc() pragmatics are quality-of-implementation issues; the accuracy of fp arithmetic is another (e.g., if you get back -666.0 from the C 1.0 + 2.0, there's nothing in the standard to justify a complaint). >>> free() can be called either explicitly, or implicitly by calling >>> realloc() with a size larger than the size of the allocation. >From later comments feigning outrage , I take it that "the size of the allocation" here does not mean the specific number the user passed to the previous malloc/realloc call, but means whatever amount of address space the implementation decided to use internally. Sorry, but I assumed it meant the former at first. ... >>> Was this a good decision? Probably not! >> Sounds more like a bug (or two) to me than "a decision", but I don't >> know. 
> You said yourself that it is standards compliant ;) I have filed it as > a bug, but it is probably unlikely to be backported to current versions > of Mac OS X unless a case can be made that it is indeed a security > flaw. That's plausible. If you showed me a case where Python's list.sort() took cubic time, I'd certainly consider that to be "a bug", despite that nothing promises better behavior. If I wrote a malloc subsystem and somebody pointed out "did you know that when I malloc 1024**2+1 bytes, and then realloc(1), I lose the other megabyte forever?", I'd consider that to be "a bug" too (because, docs be damned, I wouldn't intentionally design a malloc subsystem with such behavior; and pymalloc does in fact copy bytes on a shrinking realloc in blocks it controls, whenever at least a quarter of the space is given back -- and it didn't at the start, and I considered that to be "a bug" when it was pointed out). > ... > Known case? No. Do I want to search Python application-space to find > one? No. Serious problems on a platform are usually well-known to users on that platform. For example, it was well-known that Python's list-growing strategy as of a few years ago fragmented address space horribly on Win9X. This was a C quality-of-implementation issue specific to that platform. It was eventually resolved by improving the list-growing strategy on all platforms -- although it's still the case that Win9X does worse on list-growing than other platforms, it's no longer a disaster for most list-growing apps on Win9X. If there's a problem with "overallocate then realloc() to cut back" on Darwin that affects many apps, then I'd expect Darwin users to know about that already -- lots of people have used Python on Macs since Python's beginning, "mysterious slowdowns" and "mysterious bloat" get noticed, and Darwin has been around for a while. .. >> There is no "choke point" for allocations in Python -- some places >> call the system realloc() directly. 
Maybe the latter matter on Darwin >> too, but maybe they don't. The scope of this hack spreads if they do. ... > In the case of Python, "some places" means "nowhere relevant". Four > standard library extension modules relevant to the platform use realloc > directly: > > _sre > Uses realloc only to grow buffers. > cPickle > Uses realloc only to grow buffers. > cStringIO > Uses realloc only to grow buffers. > regexpr: > Uses realloc only to grow buffers. Good! > If Zope doesn't use the allocator that Python gives it, then it can > deal with its own problems. I would expect most extensions to use > Python's allocator. I don't know. ... > They're [#ifdef's] also the only good way to deal with platform-specific > inconsistencies. In this specific case, it's not even possible to > determine if a particular allocator implementation is stupid or not > without at least using a platform-allocator-specific function to query > the size reserved by a given allocation. We've had bad experience on several platforms when passing large numbers to recv(). If that were addressed, it's unclear that Darwin realloc() behavior would remain a real issue. OTOH, it is clear that *just* worming around Darwin realloc() behavior won't help other platforms with problems in the same *immediate* area of bug 1092502. Gross over-allocation followed by a shrinking realloc() just isn't common in Python. sock_recv() is an exceptionally bad case. More typical is, e.g., fileobject.c's get_line(), where if "a line" exceed 100 characters the buffer keeps growing by 25% until there's enough room, then it's cut back once at the end. That typical use for shrinking realloc() just isn't going to be implicated in a real problem -- the over-allocation is always minor. > ... > There's obviously a tradeoff between copying lots of bytes and having > lots of memory go to waste. That should be taken into consideration > when considering how many pages could be returned to the allocator. 
> Note that we can ask the allocator how much memory an allocation has > actually reserved (which is usually somewhat larger than the amount you > asked it for) and how much memory an allocation will reserve for a > given size. An allocation resize wouldn't even show up as smaller > unless at least one page would be freed (for sufficiently large > allocations anyway, the minimum granularity is 16 bytes because it > guarantees that alignment). Obviously if you have a lot of pages > anyway, one page isn't a big deal, so we would probably only resort to > free()/memcpy() if some fair percentage of the total pages used by the > allocation could be rescued. > > If it does end up causing some real performance problems anyway, > there's always deeper hacks like using vm_copy(), a Darwin specific > function which will do copy-on-write instead (which only makes sense if > the allocation is big enough for this to actually be a performance > improvement). As above, I'm skeptical that there's a general problem worth addressing here, and am still under the possible illusion that the Mac developers will eventually change their realloc()'s behavior anyway. If you're convinced it's worth the bother, go for it. If you do, I strongly hope that it keys off a new platform-neutral symbol (say, Py_SHRINKING_REALLOC_COPIES) and avoids Darwin-specific implementation code. Then if it turns out that it is a broad problem (across apps or across platforms), everyone can benefit. PyObject_Realloc() seems the best place to put it. Unfortunately, for blocks obtained from the system malloc(), there is no portable way to find out how much excess was allocated in a release-build Python, so "avoids Darwin-specific implementation code" may be impossible to achieve. The more it *can't* be used on any platform other than this flavor of Darwin, the more inclined I am to advise just fixing the immediate problem (sock_recv's potentially unbounded over-allocation). 
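The pymalloc behavior Tim describes above, copying on a shrinking realloc only when at least a quarter of the block is given back, boils down to a simple decision rule. This is a sketch of the policy arithmetic only, not Python's actual allocator code:

```python
def should_copy_on_shrink(old_size, new_size):
    """Copy into a fresh, smaller block only when the shrink would give
    back at least a quarter of the original allocation; otherwise leave
    the block in place and tolerate the slack."""
    if new_size >= old_size:
        return False  # growing or unchanged: not a shrink at all
    return (old_size - new_size) >= old_size // 4

print(should_copy_on_shrink(1024, 100))   # big giveback -> True
print(should_copy_on_shrink(1024, 1000))  # tiny giveback -> False
```

On a platform whose realloc() never shrinks in place, gating a memcpy+free behind a rule like this is the "wormaround" under discussion; a build switch along the lines of the suggested Py_SHRINKING_REALLOC_COPIES could enable it only where it pays.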
From bacchusrx at skorga.org Mon Jan 3 22:52:25 2005 From: bacchusrx at skorga.org (bacchusrx) Date: Mon Jan 3 22:52:37 2005 Subject: [Python-Dev] Darwin's realloc(...) implementation never shrinks allocations In-Reply-To: References: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com> <1f7befae05010221134a94eccd@mail.gmail.com> <1f7befae050102231638b0d39d@mail.gmail.com> <20050103202350.GA17165@skorga.org> Message-ID: <20050103215225.GA17273@skorga.org> On Mon, Jan 03, 2005 at 03:55:19PM -0500, Bob Ippolito wrote: > >Note that, with respect to http://python.org/sf/1092502, the author > >of the (original) program was using the documented interface to a > >file object. It's _fileobject.read() that decides to ask for huge > >numbers of bytes from recv() (specifically, in the > >max(self._rbufsize, left) condition). Patched to use a fixed > >recv_size, you of course sidestep the realloc() nastiness in this > >particular case. > > While using a reasonably sized recv_size is a good idea, using a > smaller request size simply means that it's less likely that the > strings will be significantly resized. It is still highly likely they > *will* be resized and that doesn't solve the problem that > over-allocated strings will persist until the entire request is > fulfilled. You're right. I should have said, "you're more likely to get away with it." The underlying issue still exists. My point is that the problem is not analogous to the guy who tried to read 2GB directly from a socket (as in http://python.org/sf/756104). Googling for MemoryError exceptions, you can find a number of spurious problems on Darwin that are probably due to this bug: SpamBayes for instance, or the thread at http://mail.python.org/pipermail/python-list/2004-November/250625.html bacchusrx. From bob at redivi.com Mon Jan 3 23:40:37 2005 From: bob at redivi.com (Bob Ippolito) Date: Mon Jan 3 23:40:48 2005 Subject: [Python-Dev] Darwin's realloc(...) 
implementation never shrinks allocations In-Reply-To: <1f7befae050103134942ab1696@mail.gmail.com> References: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com> <1f7befae05010221134a94eccd@mail.gmail.com> <1f7befae050102231638b0d39d@mail.gmail.com> <1f7befae050103134942ab1696@mail.gmail.com> Message-ID: <7D3892F0-5DD8-11D9-ACB4-000A9567635C@redivi.com> On Jan 3, 2005, at 4:49 PM, Tim Peters wrote: > [Tim Peters] >>> Ya, I understood that. My conclusion was that Darwin's realloc() >>> implementation isn't production-quality. So it goes. > > [Bob Ippolito] >> Whatever that means. > > Well, it means what it said. The C standard says nothing about > performance metrics of any kind, and a production-quality > implementation of C requires very much more than just meeting what the > standard requires. The phrase "quality of implementation" is used in > the C Rationale (but not in the standard proper) to cover all such > issues. realloc() pragmatics are quality-of-implementation issues; > the accuracy of fp arithmetic is another (e.g., if you get back -666.0 > from the C 1.0 + 2.0, there's nothing in the standard to justify a > complaint). > >>>> free() can be called either explicitly, or implicitly by calling >>>> realloc() with a size larger than the size of the allocation. > > From later comments feigning outrage , I take it that "the size > of the allocation" here does not mean the specific number the user > passed to the previous malloc/realloc call, but means whatever amount > of address space the implementation decided to use internally. Sorry, > but I assumed it meant the former at first. Sorry for the confusion. >>>> Was this a good decision? Probably not! > >>> Sounds more like a bug (or two) to me than "a decision", but I don't >>> know. 
> >> You said yourself that it is standards compliant ;) I have filed it >> as >> a bug, but it is probably unlikely to be backported to current >> versions >> of Mac OS X unless a case can be made that it is indeed a security >> flaw. > > That's plausible. If you showed me a case where Python's list.sort() > took cubic time, I'd certainly consider that to be "a bug", despite > that nothing promises better behavior. If I wrote a malloc subsystem > and somebody pointed out "did you know that when I malloc 1024**2+1 > bytes, and then realloc(1), I lose the other megabyte forever?", I'd > consider that to be "a bug" too (because, docs be damned, I wouldn't > intentionally design a malloc subsystem with such behavior; and > pymalloc does in fact copy bytes on a shrinking realloc in blocks it > controls, whenever at least a quarter of the space is given back -- > and it didn't at the start, and I considered that to be "a bug" when > it was pointed out). I wouldn't equate "until free() is called" with "forever". But yes, I consider it a bug just as you do, and have reported it appropriately. Practically, since it exists in Mac OS X 10.2 and Mac OS X 10.3, and may not ever be fixed, we should at least consider it. >> ... >> Known case? No. Do I want to search Python application-space to find >> one? No. > > Serious problems on a platform are usually well-known to users on that > platform. For example, it was well-known that Python's list-growing > strategy as of a few years ago fragmented address space horribly on > Win9X. This was a C quality-of-implementation issue specific to that > platform. It was eventually resolved by improving the list-growing > strategy on all platforms -- although it's still the case that Win9X > does worse on list-growing than other platforms, it's no longer a > disaster for most list-growing apps on Win9X. It does take a long time to figure such weird behavior out though. 
I would have to guess that most Python users on Darwin have been at it for less than 3 years. The number of people using Python on Darwin who have written or used code that exercised this scenario and are determined enough to track this sort of thing down is probably very small. > If there's a problem with "overallocate then realloc() to cut back" on > Darwin that affects many apps, then I'd expect Darwin users to know > about that already -- lots of people have used Python on Macs since > Python's beginning, "mysterious slowdowns" and "mysterious bloat" get > noticed, and Darwin has been around for a while. Most people on Mac OS X have a lot of memory, and Mac OS X generally does a good job about swapping in and out without causing much of a problem, so I'm personally not very surprised that it could go unnoticed this long. Google says: Results 1 - 10 of about 1,150 for (darwin OR Mac OR "OS X") AND MemoryError AND Python. Results 1 - 10 of about 942 for malloc vm_allocate failed. (0.73 seconds). Of course, in both cases, not all of these can be attributed to realloc()'s implementation, but I'm sure some of them can, especially the Python ones! >> They're [#ifdef's] also the only good way to deal with >> platform-specific >> inconsistencies. In this specific case, it's not even possible to >> determine if a particular allocator implementation is stupid or not >> without at least using a platform-allocator-specific function to query >> the size reserved by a given allocation. > > We've had bad experience on several platforms when passing large > numbers to recv(). If that were addressed, it's unclear that Darwin > realloc() behavior would remain a real issue. OTOH, it is clear that > *just* worming around Darwin realloc() behavior won't help other > platforms with problems in the same *immediate* area of bug 1092502. > Gross over-allocation followed by a shrinking realloc() just isn't > common in Python. sock_recv() is an exceptionally bad case. 
More > typical is, e.g., fileobject.c's get_line(), where if "a line" exceed > 100 characters the buffer keeps growing by 25% until there's enough > room, then it's cut back once at the end. That typical use for > shrinking realloc() just isn't going to be implicated in a real > problem -- the over-allocation is always minor. What about for list objects that are big at some point, then progressively shrink, but happen to stick around for a while? An "event queue" that got clogged for some reason and then became stable? Dictionaries? Of course these potential problems are a lot less likely to happen. >> ... >> There's obviously a tradeoff between copying lots of bytes and having >> lots of memory go to waste. That should be taken into consideration >> when considering how many pages could be returned to the allocator. >> Note that we can ask the allocator how much memory an allocation has >> actually reserved (which is usually somewhat larger than the amount >> you >> asked it for) and how much memory an allocation will reserve for a >> given size. An allocation resize wouldn't even show up as smaller >> unless at least one page would be freed (for sufficiently large >> allocations anyway, the minimum granularity is 16 bytes because it >> guarantees that alignment). Obviously if you have a lot of pages >> anyway, one page isn't a big deal, so we would probably only resort to >> free()/memcpy() if some fair percentage of the total pages used by the >> allocation could be rescued. >> >> If it does end up causing some real performance problems anyway, >> there's always deeper hacks like using vm_copy(), a Darwin specific >> function which will do copy-on-write instead (which only makes sense >> if >> the allocation is big enough for this to actually be a performance >> improvement). 
> > As above, I'm skeptical that there's a general problem worth > addressing here, and am still under the possible illusion that the Mac > developers will eventually change their realloc()'s behavior anyway. > If you're convinced it's worth the bother, go for it. If you do, I > strongly hope that it keys off a new platform-neutral symbol (say, > Py_SHRINKING_REALLOC_COPIES) and avoids Darwin-specific implementation > code. Then if it turns out that it is a broad problem (across apps or > across platforms), everyone can benefit. PyObject_Realloc() seems the > best place to put it. Unfortunately, for blocks obtained from the > system malloc(), there is no portable way to find out how much excess > was allocated in a release-build Python, so "avoids Darwin-specific > implementation code" may be impossible to achieve. The more it > *can't* be used on any platform other than this flavor of Darwin, the > more inclined I am to advise just fixing the immediate problem > (sock_recv's potentially unbounded over-allocation). I'm pretty sure this kind of malloc functionality is very specific to Darwin and does not carry over to any other BSD. In order for an intelligent implementation, an equivalent of malloc_size() and malloc_good_size() is required. Unfortunately, despite the man page, malloc_good_size() is not declared in , however there is another, declared, way to get at that functionality (by poking into the malloc_introspection_t struct of the malloc_default_zone()). -bob From martin at v.loewis.de Mon Jan 3 23:46:52 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon Jan 3 23:46:51 2005 Subject: [Python-Dev] Re: Zipfile needs? In-Reply-To: References: <41D1B0C6.8040208@ocf.berkeley.edu> Message-ID: <41D9CB5C.8090305@v.loewis.de> Scott David Daniels wrote: > I believe > there is an issue actually building in the encryption/decryption in > terms of redistribution. Submitters should not worry about this too much. 
The issue primarily exists in the U.S., and there are now (U.S.) official procedures to deal with them, and the PSF can and does follow these procedures. Regards, Martin From tdelaney at avaya.com Tue Jan 4 00:25:23 2005 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Tue Jan 4 00:25:34 2005 Subject: [Python-Dev] Out-of-date FAQs Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DE721224@au3010avexu1.global.avaya.com> While grabbing the link to the copyright restrictions FAQ (for someone on python-list) I noticed a few out-of-date FAQ entries - specifically, "most stable version" and "Why doesn't list.sort() return the sorted list?". Bug reports have been submitted (and acted on - Raymond, you work too fast ;) I think it's important that the FAQs be up-to-date with the latest idioms, etc, so as I have the time available I intend to review all the existing FAQs that I'm qualified for. As a general rule, when an idiom has changed, do we want to state both the 2.4 idiom as well as the 2.3 idiom? In the case of list.sort(), that would mean having both: for key in sorted(dict.iterkeys()): ...do whatever with dict[key]... and keys = dict.keys() keys.sort() for key in keys: ...do whatever with dict[key]... Tim Delaney From irmen at xs4all.nl Tue Jan 4 00:30:24 2005 From: irmen at xs4all.nl (Irmen de Jong) Date: Tue Jan 4 00:30:27 2005 Subject: [Python-Dev] Small fix for windows.tex Message-ID: <41D9D590.2020006@xs4all.nl> The current cvs docs failed to build for me, because of a small misspelling in the windows.tex file. Here is a patch: Index: Doc/ext/windows.tex =================================================================== RCS file: /cvsroot/python/python/dist/src/Doc/ext/windows.tex,v retrieving revision 1.10 diff -u -r1.10 windows.tex --- Doc/ext/windows.tex 30 Dec 2004 10:44:32 -0000 1.10 +++ Doc/ext/windows.tex 3 Jan 2005 23:28:20 -0000 @@ -163,8 +163,8 @@ click OK. (Inserting them one by one is fine too.) 
Now open the \menuselection{Project \sub spam properties} dialog. - You only need to change a few settings. Make sure \guilable{All - Configurations} is selected from the \guilable{Settings for:} + You only need to change a few settings. Make sure \guilabel{All + Configurations} is selected from the \guilabel{Settings for:} dropdown list. Select the C/\Cpp{} tab. Choose the General category in the popup menu at the top. Type the following text in the entry box labeled \guilabel{Additional Include Directories}: --Irmen From shane.holloway at ieee.org Tue Jan 4 00:30:11 2005 From: shane.holloway at ieee.org (Shane Holloway (IEEE)) Date: Tue Jan 4 00:30:38 2005 Subject: [Python-Dev] Zipfile needs? In-Reply-To: References: Message-ID: <41D9D583.6060400@ieee.org> Scott David Daniels wrote: > What other wish list things do people around here have for zipfile? I thought I'd collect input here > and make a PEP. I was working on a project based around modifying zip files, and found that python just doesn't implement that part. I'd like to see the ability to remove a file in the archive, as well as "write over" a file already in the archive. It's a tall order, but you asked. ;) Thanks, -Shane Holloway From martin at v.loewis.de Tue Jan 4 00:42:54 2005 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue Jan 4 00:42:50 2005 Subject: [Python-Dev] Small fix for windows.tex In-Reply-To: <41D9D590.2020006@xs4all.nl> References: <41D9D590.2020006@xs4all.nl> Message-ID: <41D9D87E.9050501@v.loewis.de> Irmen de Jong wrote: > The current cvs docs failed to build for me, because of a small > misspelling in the windows.tex file. Here is a patch: Thanks, fixed. 
Martin From aahz at pythoncraft.com Tue Jan 4 01:13:14 2005 From: aahz at pythoncraft.com (Aahz) Date: Tue Jan 4 01:13:17 2005 Subject: [Python-Dev] Out-of-date FAQs In-Reply-To: <338366A6D2E2CA4C9DAEAE652E12A1DE721224@au3010avexu1.global.avaya.com> References: <338366A6D2E2CA4C9DAEAE652E12A1DE721224@au3010avexu1.global.avaya.com> Message-ID: <20050104001314.GA17136@panix.com> On Tue, Jan 04, 2005, Delaney, Timothy C (Timothy) wrote: > > As a general rule, when an idiom has changed, do we want to state both > the 2.4 idiom as well as the 2.3 idiom? In the case of list.sort(), that > would mean having both: > > for key in sorted(dict.iterkeys()): > ...do whatever with dict[key]... > > and > > keys = dict.keys() > keys.sort() > for key in keys: > ...do whatever with dict[key]... Yes. Until last July, the company I work for was still using 1.5.2. Our current version is 2.2. I think that the FAQ should be usable for anyone with a "reasonably current" version of Python, say at least two major versions. IOW, answers should continue to work with 2.2 during the lifetime of 2.4. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "19. A language that doesn't affect the way you think about programming, is not worth knowing." --Alan Perlis From tdelaney at avaya.com Tue Jan 4 01:26:17 2005 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Tue Jan 4 01:26:25 2005 Subject: [Python-Dev] Out-of-date FAQs Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DE02520253@au3010avexu1.global.avaya.com> Aahz wrote: > Yes. Until last July, the company I work for was still using 1.5.2. > Our current version is 2.2. I think that the FAQ should be usable for > anyone with a "reasonably current" version of Python, say at least two > major versions. IOW, answers should continue to work with 2.2 during > the lifetime of 2.4. That seems reasonable to me. 
Tim Delaney From tim.peters at gmail.com Tue Jan 4 02:38:08 2005 From: tim.peters at gmail.com (Tim Peters) Date: Tue Jan 4 02:38:11 2005 Subject: [Python-Dev] Darwin's realloc(...) implementation never shrinks allocations In-Reply-To: <7D3892F0-5DD8-11D9-ACB4-000A9567635C@redivi.com> References: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com> <1f7befae05010221134a94eccd@mail.gmail.com> <1f7befae050102231638b0d39d@mail.gmail.com> <1f7befae050103134942ab1696@mail.gmail.com> <7D3892F0-5DD8-11D9-ACB4-000A9567635C@redivi.com> Message-ID: <1f7befae05010317387667d8b1@mail.gmail.com> [Bob Ippolito] > ... > What about for list objects that are big at some point, then > progressively shrink, but happen to stick around for a while? An > "event queue" that got clogged for some reason and then became stable? It's less plausible that we're going to see a lot of these simultaneously alive. It's possible, of course. Note that if we do, fiddling PyObject_Realloc() won't help: list resizing goes thru the PyMem_RESIZE() macro, which calls the platform realloc() directly in a release build (BTW, I suspect that when you were looking for realloc() calls, you were looking for the string "realloc(" -- but that's not the only spelling; we don't even have alphabetical choke points). The list object itself goes thru Python's small-object allocator, which makes sense because a list object has a small fixed size independent of list length. Space for list elements is allocated separately from the list object, and talks to the platform malloc/free/realloc directly (via how the PyMem_XYZ macros resolve in release builds). > Dictionaries? They're not a potential problem here -- dict resizing (whether growing or shrinking) always proceeds by allocating new space for the dict guts, copying over elements from the original space, then freeing the original space. 
This is because the hash slot assigned to a key can change when the table size changes, and keeping collision chains straight is a real bitch if you try to do it in-place. IOW, there are implementation reasons for why CPython dicts will probably never use realloc(). > Of course these potential problems are a lot less likely to happen. I think so. Guido's suggestion to look at PyString_Resize (etc) instead could be a good one, since those methods know both the number of thingies (bytes, list elements, tuple elements, ...) currently allocated and the number of thingies being asked for. That could be exploited by a portable heuristic (like malloc+memcpy+free if the new number of thingies is at least a quarter less than the old number of thingies, else let realloc (however spelled) exercise its own judgment). Since list_resize() doesn't go thru pymalloc, that's the only clear way to worm around realloc() quirks for lists. From gvanrossum at gmail.com Tue Jan 4 02:42:54 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Jan 4 02:42:57 2005 Subject: [Python-Dev] Please help complete the AST branch Message-ID: The AST branch has been "nearly complete" for several Python versions now. The last time a serious effort was made was in May I believe, but it wasn't enough to merge the code back into 2.4, alas. It would be a real shame if this code was abandoned. If we're going to make progress with things like type inferencing, integrating PyChecker, or optional static type checking (see my blog on Artima -- I just finished rambling part II), the AST branch would be a much better starting point than the current mainline bytecode compiler. (Arguably, the compiler package, written in Python, would make an even better start for prototyping, but I don't expect that it will ever be fast enough to be Python's only bytecode compiler.) So, I'm pleading. 
Please, someone, either from the established crew of developers or a new volunteer (or more than one!), try to help out to complete the work on the AST branch and merge it into 2.5. I wish I could do this myself, and I *am* committed to more time for Python than last year, but I think I should try to focus on language design issues more than implementation issues. (Although I haven't heard what Larry Wall has been told -- apparently the Perl developers don't want Larry writing code any more. :-) Please, anyone? Raymond? Neil? Facundo? Brett? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From bac at OCF.Berkeley.EDU Tue Jan 4 03:02:52 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Tue Jan 4 03:03:18 2005 Subject: [Python-Dev] Please help complete the AST branch In-Reply-To: References: Message-ID: <41D9F94C.3020005@ocf.berkeley.edu> Guido van Rossum wrote: > The AST branch has been "nearly complete" for several Python versions > now. The last time a serious effort was made was in May I believe, but > it wasn't enough to merge the code back into 2.4, alas. > > It would be a real shame if this code was abandoned. [SNIP] > So, I'm pleading. Please, someone, either from the established crew of > developers or a new volunteer (or more than one!), try to help out to > complete the work on the AST branch and merge it into 2.5. > [SNIP] > Please, anyone? Raymond? Neil? Facundo? Brett? > Funny you should send this out today. I just did some jiggling with my schedule so I could take the undergrad language back-end course this quarter. This led to me needing to take a grad-level projects class in Spring. And what was the first suggestion my professor had for that course credit in Spring? Finish the AST branch. I am dedicated to finishing the AST branch as soon as my thesis is finished, class credit or no. I just can't delve into that large of a project until I get my school stuff in order. 
But if I get to do it for my class credit I will be able to dedicate 4 units of work to it a week (about 8 hours minimum). Plus there is the running tradition of sprinting on the AST branch at PyCon. I was planning on shedding my bug fixing drive at PyCon this year and sprinting with (hopefully) Jeremy, Neal, Tim, and Neil on the AST branch as a prep for working on it afterwards for my class credit. Although if someone can start sooner, then by all means, go for it! I can find something else to get credit for (such as finishing my monster of a paper comparing Python to Java; 34 single-spaced pages just covering paradigm support and the standard libraries so far). And obviously help would be great since it isn't a puny codebase (4,000 lines so far for the CST->AST and AST->bytecode code). If anyone would like to see the current code, check out ast-branch from CVS (read the dev FAQ on how to check out a branch from CVS). Read Python/compile.txt for an overview of how the thing works and such. It will get done, just don't push for a 2.5 release within a month. =) -Brett From jepler at unpythonic.net Tue Jan 4 03:19:09 2005 From: jepler at unpythonic.net (Jeff Epler) Date: Tue Jan 4 03:19:12 2005 Subject: [Python-Dev] Please help complete the AST branch In-Reply-To: <41D9F94C.3020005@ocf.berkeley.edu> References: <41D9F94C.3020005@ocf.berkeley.edu> Message-ID: <20050104021909.GB11833@unpythonic.net> On Mon, Jan 03, 2005 at 06:02:52PM -0800, Brett C. wrote: > Although if someone can start sooner, then by all means, go for it! > And obviously help would be great since it isn't a puny codebase > (4,000 lines so far for the CST->AST and AST->bytecode code). And obviously knowing a little more about the AST branch would be helpful for those considering helping. Is there any relatively up-to-date document about ast-branch? googling about it turned up some pypy stuff from 2003, and I didn't look much further. I just built the ast-branch for fun, and "make test" mostly worked. 
8 tests failed: test_builtin test_dis test_generators test_inspect test_pep263 test_scope test_symtable test_trace 6 skips unexpected on linux2: test_csv test_hotshot test_bsddb test_parser test_logging test_email I haven't looked at any of the failures in detail, but at least test_bsddb is due to missing development libs on this system. One more thing: The software I work on by day has python scripting. One part of that functionality is a tree display of a script. I'm not actively involved with this part of the software (yet). Any comments on whether ast-branch could be construed as helping make this kind of functionality work better, faster, or easier? The code we use currently is based on a modified version of the parser which includes comment information, so we need to be aware of changes in this area anyhow. (on the other hand, I won't hold my breath for permission to do this on the clock; because of our own release scheduling I have other projects on my plate now, and a version of our software that uses a post-2.3 Python is years away) Jeff -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20050103/b4211f92/attachment-0001.pgp From jhylton at gmail.com Tue Jan 4 05:03:33 2005 From: jhylton at gmail.com (Jeremy Hylton) Date: Tue Jan 4 05:03:36 2005 Subject: [Python-Dev] Please help complete the AST branch In-Reply-To: <41D9F94C.3020005@ocf.berkeley.edu> References: <41D9F94C.3020005@ocf.berkeley.edu> Message-ID: On Mon, 03 Jan 2005 18:02:52 -0800, Brett C. wrote: > Plus there is the running tradition of sprinting on the AST branch at PyCon. I > was planning on shedding my bug fixing drive at PyCon this year and sprinting > with (hopefully) Jeremy, Neal, Tim, and Neil on the AST branch as a prep for > working on it afterwards for my class credit. 
I'd like to sprint on it before PyCon; we'll have to see what my schedule allows. > If anyone would like to see the current code, check out ast-branch from CVS > (read the dev FAQ on how to check out a branch from CVS). Read > Python/compile.txt for an overview of how the thing works and such. > > It will get done, just don't push for a 2.5 release within a month. =) I think the branch is in an awkward state, because of the new features added to Python 2.4 after the AST branch work ceased. The ast branch doesn't handle generator expressions or decorators; extending the ast to support them would be a good first step. There are also the simple logistical questions of integrating changes. Since most of the AST branch changes are confined to a few files, I suspect the best thing to do is to merge all the changes from the head except for compile.c. I haven't done a major CVS branch integrate in at least nine months; if someone feels more comfortable with that, it would also be a good step. Perhaps interested parties should take up the discussion on the compiler-sig. I think we can recover the state of last May's effort pretty quickly, and I can help outline the remaining work even if I can't help much. (Although I hope I can help, too.) Jeremy From t-meyer at ihug.co.nz Tue Jan 4 05:17:03 2005 From: t-meyer at ihug.co.nz (Tony Meyer) Date: Tue Jan 4 05:17:40 2005 Subject: [Python-Dev] Please help complete the AST branch In-Reply-To: Message-ID: > Perhaps interested parties should take up the discussion on > the compiler-sig. This isn't listed in the 'currently active' SIGs list on - is it still active, or will it now be? If so, perhaps it should be added to the list? By 'discussion on', do you mean via the wiki at ? 
=Tony.Meyer From theller at python.net Tue Jan 4 11:00:15 2005 From: theller at python.net (Thomas Heller) Date: Tue Jan 4 10:58:51 2005 Subject: [Python-Dev] Mac questions Message-ID: I'm working on refactoring Python/import.c, currently the case_ok() function. I was wondering about these lines: /* new-fangled macintosh (macosx) */ #elif defined(__MACH__) && defined(__APPLE__) && defined(HAVE_DIRENT_H) Is this for Mac OSX? Does the Mac have a case insensitive file system (my experiments on the SF compile farm say no)? And finally: Is there any other way to find the true spelling of a file except than a linear search with opendir()/readdir()/closedir() ? Thomas From bob at redivi.com Tue Jan 4 11:41:03 2005 From: bob at redivi.com (Bob Ippolito) Date: Tue Jan 4 11:41:12 2005 Subject: [Python-Dev] Mac questions In-Reply-To: References: Message-ID: <21E1D0D2-5E3D-11D9-A787-000A9567635C@redivi.com> On Jan 4, 2005, at 5:00 AM, Thomas Heller wrote: > I'm working on refactoring Python/import.c, currently the case_ok() > function. > > I was wondering about these lines: > /* new-fangled macintosh (macosx) */ > #elif defined(__MACH__) && defined(__APPLE__) && > defined(HAVE_DIRENT_H) > > Is this for Mac OSX? Does the Mac have a case insensitive file system > (my experiments on the SF compile farm say no)? Yes, this tests positive for Mac OS X (and probably other variants of Darwin). Yes, Mac OS X uses a case preserving but insensitive file system by default (HFS+), but has case sensitive file systems (UFS, and a case sensitive version of HFS+, NFS, etc.). The SF compile farm may use one of these alternative file systems, probably NFS if anything. > And finally: Is there any other way to find the true spelling of a file > except than a linear search with opendir()/readdir()/closedir() ? Yes, definitely. I'm positive you can do this with CoreServices, but I'm not sure it's portable to Darwin (not Mac OS X). 
I'm sure there is some Darwin-compatible way of doing it, but I don't know it off the top of my head. I'll try to remember to look into it if nobody else finds it first. -bob From Jack.Jansen at cwi.nl Tue Jan 4 11:56:09 2005 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Tue Jan 4 11:54:47 2005 Subject: [Python-Dev] Darwin's realloc(...) implementation never shrinks allocations In-Reply-To: <7D3892F0-5DD8-11D9-ACB4-000A9567635C@redivi.com> References: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com> <1f7befae05010221134a94eccd@mail.gmail.com> <1f7befae050102231638b0d39d@mail.gmail.com> <1f7befae050103134942ab1696@mail.gmail.com> <7D3892F0-5DD8-11D9-ACB4-000A9567635C@redivi.com> Message-ID: <3DA96FE4-5E3F-11D9-A0C3-000A958D1666@cwi.nl> On 3 Jan 2005, at 23:40, Bob Ippolito wrote: > Most people on Mac OS X have a lot of memory, and Mac OS X generally > does a good job about swapping in and out without causing much of a > problem, so I'm personally not very surprised that it could go > unnoticed this long. *Except* when you're low on free disk space. 10.2 and before were really bad with this, usually hanging the machine, 10.3 is better but it's still pretty bad when compared to other unixen. It probably has something to do with the way OSX overcommits memory and swapspace, for which it apparently uses a different algorithm than FreeBSD or Linux. I wouldn't be surprised if the bittorrent problem report in this thread was due to being low on diskspace. And that could also be true for the original error report that sparked this discussion. -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From bob at redivi.com Tue Jan 4 12:25:46 2005 From: bob at redivi.com (Bob Ippolito) Date: Tue Jan 4 12:25:54 2005 Subject: [Pythonmac-SIG] Re: [Python-Dev] Darwin's realloc(...) 
implementation never shrinks allocations In-Reply-To: <3DA96FE4-5E3F-11D9-A0C3-000A958D1666@cwi.nl> References: <41D823FF-5D31-11D9-8981-000A9567635C@redivi.com> <1f7befae05010221134a94eccd@mail.gmail.com> <1f7befae050102231638b0d39d@mail.gmail.com> <1f7befae050103134942ab1696@mail.gmail.com> <7D3892F0-5DD8-11D9-ACB4-000A9567635C@redivi.com> <3DA96FE4-5E3F-11D9-A0C3-000A958D1666@cwi.nl> Message-ID: <60DD4D4B-5E43-11D9-A787-000A9567635C@redivi.com> On Jan 4, 2005, at 5:56 AM, Jack Jansen wrote: > On 3 Jan 2005, at 23:40, Bob Ippolito wrote: >> Most people on Mac OS X have a lot of memory, and Mac OS X generally >> does a good job about swapping in and out without causing much of a >> problem, so I'm personally not very surprised that it could go >> unnoticed this long. > > *Except* when you're low on free disk space. 10.2 and before were > really bad with this, usually hanging the machine, 10.3 is better but > it's still pretty bad when compared to other unixen. It probably has > something to do with the way OSX overcommits memory and swapspace, for > which it apparently uses a different algorithm than FreeBSD or Linux. > > I wouldn't be surprised if the bittorrent problem report in this > thread was due to being low on diskspace. And that could also be true > for the original error report that sparked this discussion. I was able to trigger this bug with a considerable amount of free disk space using a laptop that has 1GB of RAM, although I did have to increase the buffer size from the given example quite a bit to get it to fail. After all, a 32-bit process can't have more than 4 GB of addressable memory. I am pretty sure that OS X is never supposed to overcommit memory. The disk thrashing probably has a lot to do with the fact that Mac OS X will grow and shrink its swap based on demand, rather than having a fixed size swap partition as is common on other unixen. I've never seen the problem myself, though. 
From what I remember about Linux, its malloc implementation merely increases the address space of a process. The actual allocation will happen when you try and access the memory, and if it's overcommitted things will fail in a bad way. -bob From Jack.Jansen at cwi.nl Tue Jan 4 13:42:26 2005 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Tue Jan 4 13:42:49 2005 Subject: [Python-Dev] Mac questions In-Reply-To: <21E1D0D2-5E3D-11D9-A787-000A9567635C@redivi.com> References: <21E1D0D2-5E3D-11D9-A787-000A9567635C@redivi.com> Message-ID: <16D0A778-5E4E-11D9-8F0D-000A958D1666@cwi.nl> On 4 Jan 2005, at 11:41, Bob Ippolito wrote: >> And finally: Is there any other way to find the true spelling of a >> file >> except than a linear search with opendir()/readdir()/closedir() ? > > Yes, definitely. I'm positive you can do this with CoreServices, but > I'm not sure it's portable to Darwin (not Mac OS X). I'm sure there > is some Darwin-compatible way of doing it, but I don't know it off the > top of my head. I'll try to remember to look into it if nobody else > finds it first. I haven't used pure darwin, but I assume it has support for FSRefs, right? Then you could use FSPathMakeRef() to turn the filename into an FSRef, and then FSGetCatalogInfo() to get the true filename. -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From barry at python.org Tue Jan 4 13:43:28 2005 From: barry at python.org (Barry Warsaw) Date: Tue Jan 4 13:43:32 2005 Subject: [Python-Dev] Please help complete the AST branch In-Reply-To: References: Message-ID: <1104842608.3227.60.camel@presto.wooz.org> On Mon, 2005-01-03 at 23:17, Tony Meyer wrote: > > Perhaps interested parties should take up the discussion on > > the compiler-sig. > > This isn't listed in the 'currently active' SIGs list on > - is it still active, or will it now be? If so, > perhaps it should be added to the list? > > By 'discussion on', do you mean via the wiki at > ? 
If compiler-sig is where ASTers want to hang out, I'd be happy to resurrect it. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050104/720764ad/attachment.pgp From bob at redivi.com Tue Jan 4 13:56:34 2005 From: bob at redivi.com (Bob Ippolito) Date: Tue Jan 4 13:56:43 2005 Subject: [Python-Dev] Mac questions In-Reply-To: <16D0A778-5E4E-11D9-8F0D-000A958D1666@cwi.nl> References: <21E1D0D2-5E3D-11D9-A787-000A9567635C@redivi.com> <16D0A778-5E4E-11D9-8F0D-000A958D1666@cwi.nl> Message-ID: <0FE43B78-5E50-11D9-A950-000A9567635C@redivi.com> On Jan 4, 2005, at 7:42 AM, Jack Jansen wrote: > > On 4 Jan 2005, at 11:41, Bob Ippolito wrote: >>> And finally: Is there any other way to find the true spelling of a >>> file >>> except than a linear search with opendir()/readdir()/closedir() ? >> >> Yes, definitely. I'm positive you can do this with CoreServices, but >> I'm not sure it's portable to Darwin (not Mac OS X). I'm sure there >> is some Darwin-compatible way of doing it, but I don't know it off >> the top of my head. I'll try to remember to look into it if nobody >> else finds it first. > > I haven't used pure darwin, but I assume it has support for FSRefs, > right? Then you could use FSPathMakeRef() to turn the filename into an > FSRef, and then FSGetCatalogInfo() to get the true filename. I believe your assumption is wrong. 
CoreServices is not open source, and this looks like it confirms my suspicion: (from ) #if !defined(DARWIN) struct FSRef; CF_EXPORT CFURLRef CFURLCreateFromFSRef(CFAllocatorRef allocator, const struct FSRef *fsRef); CF_EXPORT Boolean CFURLGetFSRef(CFURLRef url, struct FSRef *fsRef); #endif /* !DARWIN */ -bob From jhylton at gmail.com Tue Jan 4 14:25:18 2005 From: jhylton at gmail.com (Jeremy Hylton) Date: Tue Jan 4 14:25:21 2005 Subject: [Python-Dev] Please help complete the AST branch In-Reply-To: <1104842608.3227.60.camel@presto.wooz.org> References: <1104842608.3227.60.camel@presto.wooz.org> Message-ID: The list archives look like they are mostly full of spam, but it's also the only list we've used to discuss the ast work. I haven't really worried whether the sig was "active," as long as the list was around. I don't mind if you want to resurrect it. Is there some way to delete the spam from the archives? By "discussion on" I meant a discussion of the remaining work. I'm not sure why you quoted just that part. I was suggesting that there is an ongoing discussion that should continue on the compiler-sig. Jeremy On Tue, 04 Jan 2005 07:43:28 -0500, Barry Warsaw wrote: > On Mon, 2005-01-03 at 23:17, Tony Meyer wrote: > > > Perhaps interested parties should take up the discussion on > > > the compiler-sig. > > > > This isn't listed in the 'currently active' SIGs list on > > - is it still active, or will it now be? If so, > > perhaps it should be added to the list? > > > > By 'discussion on', do you mean via the wiki at > > ? > > If compiler-sig is where ASTers want to hang out, I'd be happy to > resurrect it. 
> > -Barry > > > From gvanrossum at gmail.com Tue Jan 4 16:31:30 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Jan 4 16:57:21 2005 Subject: [Python-Dev] Please help complete the AST branch In-Reply-To: References: <1104842608.3227.60.camel@presto.wooz.org> Message-ID: >I was suggesting that there > is an ongoing discussion that should continue on the compiler-sig. I'd be fine with keeping this on python-dev too. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jhylton at gmail.com Tue Jan 4 17:17:33 2005 From: jhylton at gmail.com (Jeremy Hylton) Date: Tue Jan 4 17:17:36 2005 Subject: [Python-Dev] Please help complete the AST branch In-Reply-To: References: <1104842608.3227.60.camel@presto.wooz.org> Message-ID: That's fine with me. We had taken it to the compiler-sig when it wasn't clear there was interest in the ast branch :-). Jeremy On Tue, 4 Jan 2005 07:31:30 -0800, Guido van Rossum wrote: > >I was suggesting that there > > is an ongoing discussion that should continue on the compiler-sig. > > I'd be fine with keeping this on python-dev too. > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > From barry at python.org Tue Jan 4 19:13:57 2005 From: barry at python.org (Barry Warsaw) Date: Tue Jan 4 19:14:07 2005 Subject: [Python-Dev] Please help complete the AST branch In-Reply-To: References: <1104842608.3227.60.camel@presto.wooz.org> Message-ID: <1104862406.12499.6.camel@geddy.wooz.org> On Tue, 2005-01-04 at 11:17, Jeremy Hylton wrote: > That's fine with me. We had taken it to the compiler-sig when it > wasn't clear there was interest in the ast branch :-). Ok, then I'll leave compiler-sig where it is. -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050104/d634ed53/attachment.pgp From gvanrossum at gmail.com Tue Jan 4 19:17:06 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Jan 4 19:17:09 2005 Subject: [Python-Dev] Fwd: Thank You! :) In-Reply-To: <011f01c4f289$e6b41760$3a0a000a@entereza.com> References: <011f01c4f289$e6b41760$3a0a000a@entereza.com> Message-ID: This really goes to all python-dev folks! ---------- Forwarded message ---------- From: Erik Johnson Date: Tue, 4 Jan 2005 11:19:15 -0700 Subject: Thank You! :) To: guido@python.org You probably get a number of messages like this, but here is mine... My name is Erik Johnson, and I work in Albuquerque, NM for a small company called WellKeeper, Inc. We do remote oil & gas well monitoring, and I am using Python as a replacement for Perl & PHP, both for supporting dynamic web pages as well as driving a number of non-web-based servers and data processors. I just wanted to take a moment and say "Thank you!" to you, Guido, and your team for developing Python and then so generously sharing it with the world. I know it must be a pretty thankless job sometimes. I am still a neophyte Python hacker (Pythonista?), but I have been pretty impressed with Python so far, and am looking forward to learning Python better and accomplishing more with it in the near and not-too-distant future. So... thanks again, Happy New Year, and best wishes to you, your family, and your Python team for 2005! (I hope you will pass these good wishes along to your team.)
Sincerely, Erik Johnson -- --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Tue Jan 4 19:23:12 2005 From: skip at pobox.com (Skip Montanaro) Date: Tue Jan 4 19:23:04 2005 Subject: [Python-Dev] Please help complete the AST branch In-Reply-To: References: <1104842608.3227.60.camel@presto.wooz.org> Message-ID: <16858.57104.116033.996927@montanaro.dyndns.org> >> I was suggesting that there is an ongoing discussion that should >> continue on the compiler-sig. Guido> I'd be fine with keeping this on python-dev too. +1 for a number of reasons: * It's more visible and would potentially get more people interested in what's happening (and maybe participate) * The python-dev list archives are searched regularly by a number of people not on the list (more external visibility/involvement) * Brett would probably include progress reports in his python-dev summary (again, more external visibility/involvement) * Who really feels the need to subscribe to yet another mailing list? Skip From bob at redivi.com Tue Jan 4 19:25:03 2005 From: bob at redivi.com (Bob Ippolito) Date: Tue Jan 4 19:25:09 2005 Subject: [Python-Dev] Please help complete the AST branch In-Reply-To: <16858.57104.116033.996927@montanaro.dyndns.org> References: <1104842608.3227.60.camel@presto.wooz.org> <16858.57104.116033.996927@montanaro.dyndns.org> Message-ID: On Jan 4, 2005, at 1:23 PM, Skip Montanaro wrote: > >>> I was suggesting that there is an ongoing discussion that should >>> continue on the compiler-sig. > > Guido> I'd be fine with keeping this on python-dev too. 
> > +1 for a number of reasons:
> >
> > * It's more visible and would potentially get more people interested in
> >   what's happening (and maybe participate)
> >
> > * The python-dev list archives are searched regularly by a number of
> >   people not on the list (more external visibility/involvement)
> >
> > * Brett would probably include progress reports in his python-dev
> >   summary (again, more external visibility/involvement)
> >
> > * Who really feels the need to subscribe to yet another mailing list?

+1 for the same reasons

-bob

From gvanrossum at gmail.com Tue Jan 4 19:28:03 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Tue Jan 4 19:28:06 2005
Subject: [Python-Dev] Let's get rid of unbound methods
Message-ID:

In my blog I wrote:

Let's get rid of unbound methods. When class C defines a method f, C.f
should just return the function object, not an unbound method that
behaves almost, but not quite, the same as that function object. The
extra type checking on the first argument that unbound methods are
supposed to provide is not useful in practice (I can't remember that
it ever caught a bug in my code) and sometimes you have to work around
it; it complicates function attribute access; and the overloading of
unbound and bound methods on the same object type is confusing. Also,
the type checking offered is wrong, because it checks for subclassing
rather than for duck typing.

This is a really simple change to begin with:

*** funcobject.c	28 Oct 2004 16:32:00 -0000	2.67
--- funcobject.c	4 Jan 2005 18:23:42 -0000
***************
*** 564,571 ****
  static PyObject *
  func_descr_get(PyObject *func, PyObject *obj, PyObject *type)
  {
! 	if (obj == Py_None)
! 		obj = NULL;
  	return PyMethod_New(func, obj, type);
  }

--- 564,573 ----
  static PyObject *
  func_descr_get(PyObject *func, PyObject *obj, PyObject *type)
  {
! 	if (obj == NULL || obj == Py_None) {
! 		Py_INCREF(func);
! 		return func;
! 	}
  	return PyMethod_New(func, obj, type);
  }

There are some test suite failures but I suspect they all have to do
with checking this behavior.

Of course, more changes would be needed: docs, the test suite, and
some simplifications to the instance method object implementation in
classobject.c.

Does anyone think this is a bad idea? Anyone want to run with it?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jim at zope.com Tue Jan 4 19:36:03 2005
From: jim at zope.com (Jim Fulton)
Date: Tue Jan 4 19:36:07 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: References: Message-ID: <41DAE213.9070906@zope.com>

Guido van Rossum wrote:
> In my blog I wrote:
>
> Let's get rid of unbound methods. When class C defines a method f, C.f
> should just return the function object, not an unbound method that
> behaves almost, but not quite, the same as that function object. The
> extra type checking on the first argument that unbound methods are
> supposed to provide is not useful in practice (I can't remember that
> it ever caught a bug in my code) and sometimes you have to work around
> it; it complicates function attribute access;

I think this is probably a good thing as it potentially avoids some
unintentional aliasing.

> and the overloading of
> unbound and bound methods on the same object type is confusing. Also,
> the type checking offered is wrong, because it checks for subclassing
> rather than for duck typing.

duck typing?

> This is a really simple change to begin with:
>
> *** funcobject.c	28 Oct 2004 16:32:00 -0000	2.67
> --- funcobject.c	4 Jan 2005 18:23:42 -0000
> ***************
> *** 564,571 ****
>   static PyObject *
>   func_descr_get(PyObject *func, PyObject *obj, PyObject *type)
>   {
> ! 	if (obj == Py_None)
> ! 		obj = NULL;
>   	return PyMethod_New(func, obj, type);
>   }
>
> --- 564,573 ----
>   static PyObject *
>   func_descr_get(PyObject *func, PyObject *obj, PyObject *type)
>   {
> ! 	if (obj == NULL || obj == Py_None) {
> ! 		Py_INCREF(func);
> ! 		return func;
> ! 	}
>   	return PyMethod_New(func, obj, type);
>   }
>
> There are some test suite failures but I suspect they all have to do
> with checking this behavior.
>
> Of course, more changes would be needed: docs, the test suite, and
> some simplifications to the instance method object implementation in
> classobject.c.
>
> Does anyone think this is a bad idea?

It *feels* very disruptive to me, but I'm probably wrong.

We'll still need unbound builtin methods, so the concept won't
go away. In fact, the change would mean that the behavior between
builtin methods and python methods would become more inconsistent.

Jim

-- 
Jim Fulton mailto:jim@zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org

From bob at redivi.com Tue Jan 4 19:39:44 2005
From: bob at redivi.com (Bob Ippolito)
Date: Tue Jan 4 19:39:58 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: References: Message-ID: <00C9B1BE-5E80-11D9-A950-000A9567635C@redivi.com>

On Jan 4, 2005, at 1:28 PM, Guido van Rossum wrote:

> Let's get rid of unbound methods. When class C defines a method f, C.f
> should just return the function object, not an unbound method that
> behaves almost, but not quite, the same as that function object. The
> extra type checking on the first argument that unbound methods are
> supposed to provide is not useful in practice (I can't remember that
> it ever caught a bug in my code) and sometimes you have to work around
> it; it complicates function attribute access; and the overloading of
> unbound and bound methods on the same object type is confusing. Also,
> the type checking offered is wrong, because it checks for subclassing
> rather than for duck typing.

+1

I like this idea. It may have some effect on current versions of PyObjC
though, because we really do care about what self is in order to prevent
crashes.
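[For concreteness, a sketch that is not part of the original thread: the semantics Guido proposes are what CPython 3 ultimately shipped, where they can be observed directly. The class attribute is a plain function, and the first argument is duck typed, with no subclass check.]

```python
class C:
    def f(self):
        # No isinstance(self, C) check is performed: any object with
        # the right "shape" works (duck typing rather than subclassing).
        return type(self).__name__

class D:
    """Unrelated class, deliberately not a subclass of C."""

# C.f is a plain function, not an "unbound method" wrapper.
print(type(C.f).__name__)  # -> function

# The duck-typed first argument is accepted without any type check.
print(C.f(D()))            # -> D
```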
This is not a discouragement; we are already using custom descriptors and a
metaclass, so it won't be a problem to do this ourselves if we are not doing
it already.

I'll try and find some time later in the week to play with this patch to
see if it does break PyObjC or not. If it breaks PyObjC, I can make sure
that PyObjC 1.3 will be compatible with such a runtime change, as we're due
for a refactoring in that area anyway.

-bob

From jack at performancedrivers.com Tue Jan 4 19:42:17 2005
From: jack at performancedrivers.com (Jack Diederich)
Date: Tue Jan 4 19:42:21 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: References: Message-ID: <20050104184217.GJ1404@performancedrivers.com>

On Tue, Jan 04, 2005 at 10:28:03AM -0800, Guido van Rossum wrote:
> In my blog I wrote:
>
> Let's get rid of unbound methods. When class C defines a method f, C.f
> should just return the function object, not an unbound method that
> behaves almost, but not quite, the same as that function object. The
> extra type checking on the first argument that unbound methods are
> supposed to provide is not useful in practice (I can't remember that
> it ever caught a bug in my code) and sometimes you have to work around
> it; it complicates function attribute access; and the overloading of
> unbound and bound methods on the same object type is confusing. Also,
> the type checking offered is wrong, because it checks for subclassing
> rather than for duck typing.
>
> Does anyone think this is a bad idea? Anyone want to run with it?
>

I like the idea, it means I can get rid of this[1]

    func = getattr(cls, 'do_command', None)
    setattr(cls, 'do_command', staticmethod(func.im_func))
    # don't let anyone on c.l.py see this ..
or at least change the comment *grin*, -Jack [1] http://cvs.sourceforge.net/viewcvs.py/lyntin/lyntin40/sandbox/leantin/mudcommands.py?view=auto From aahz at pythoncraft.com Tue Jan 4 19:47:06 2005 From: aahz at pythoncraft.com (Aahz) Date: Tue Jan 4 19:47:08 2005 Subject: [Python-Dev] Let's get rid of unbound methods In-Reply-To: <41DAE213.9070906@zope.com> References: <41DAE213.9070906@zope.com> Message-ID: <20050104184706.GA3466@panix.com> On Tue, Jan 04, 2005, Jim Fulton wrote: > Guido van Rossum wrote: >> >> and the overloading of >>unbound and bound methods on the same object type is confusing. Also, >>the type checking offered is wrong, because it checks for subclassing >>rather than for duck typing. > > duck typing? "If it looks like a duck and quacks like a duck, it must be a duck." Python is often referred to as having duck typing because even without formal interface declarations, good practice mostly depends on conformant interfaces rather than subclassing to determine an object's type. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "19. A language that doesn't affect the way you think about programming, is not worth knowing." --Alan Perlis From pje at telecommunity.com Tue Jan 4 19:48:24 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Jan 4 19:48:18 2005 Subject: [Python-Dev] Let's get rid of unbound methods In-Reply-To: Message-ID: <5.1.1.6.0.20050104133943.02b69d60@mail.telecommunity.com> At 10:28 AM 1/4/05 -0800, Guido van Rossum wrote: >Of course, more changes would be needed: docs, the test suite, and >some simplifications to the instance method object implementation in >classobject.c. > >Does anyone think this is a bad idea? Code that currently does 'aClass.aMethod.im_func' in order to access the function object would break, as would code that inspects 'im_self' to determine whether a method is a class or instance method. (Although code of the latter sort would already break with static methods, I suppose.) 
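[As a hypothetical illustration, not from the thread, of the breakage Phillip describes: on an interpreter without unbound methods (which is what Python 3 eventually shipped), the `im_func` unwrapping dance disappears, because the class attribute already is the function; only bound methods keep a wrapper, spelled `__func__` in Python 3.]

```python
class A:
    def m(self):
        return "hi"

# With unbound methods gone, A.m *is* the function object;
# there is no .im_func attribute to unwrap any more.
assert not hasattr(A.m, "im_func")
print(A.m is A.__dict__["m"])   # -> True

# Bound methods still exist and still expose the underlying function:
bound = A().m
print(bound.__func__ is A.m)    # -> True
```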
Cursory skimming of the first 100 Google hits for 'im_func' seems to show at least half a dozen instances of the first type of code, though. Such code would also be in the difficult position of having to do things two ways in order to be both forward and backward compatible. Also, I seem to recall once having relied on the behavior of a dynamically-created unbound method (via new.instancemethod) in order to create a descriptor of some sort. But I don't remember where or when I did it or whether I still care. :) From aleax at aleax.it Tue Jan 4 19:48:49 2005 From: aleax at aleax.it (Alex Martelli) Date: Tue Jan 4 19:48:54 2005 Subject: [Python-Dev] Please help complete the AST branch In-Reply-To: References: <1104842608.3227.60.camel@presto.wooz.org> Message-ID: <45D8D02A-5E81-11D9-ADA4-000A95EFAE9E@aleax.it> On 2005 Jan 04, at 17:17, Jeremy Hylton wrote: > That's fine with me. We had taken it to the compiler-sig when it > wasn't clear there was interest in the ast branch :-). Speaking for myself, I have a burning interest in the AST branch (though I can't seem to get it correctly downloaded so far, I guess it's just my usual CVS-clumsiness and I'll soon find out what I'm doing wrong & fix it), and if I could follow the discussion right here on python-dev that would sure be convenient (now that I've finally put the 2nd ed of the Cookbook to bed and am finally reading python-dev again after all these months -- almost caught up with recent traffic too;-)... Alex From pje at telecommunity.com Tue Jan 4 19:51:42 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Jan 4 19:51:37 2005 Subject: [Python-Dev] Let's get rid of unbound methods In-Reply-To: <41DAE213.9070906@zope.com> References: Message-ID: <5.1.1.6.0.20050104134847.02b6bec0@mail.telecommunity.com> At 01:36 PM 1/4/05 -0500, Jim Fulton wrote: >duck typing? AKA latent typing or, "if it walks like a duck and quacks like a duck, it must be a duck." 
Or, more pythonically:

    if hasattr(ob, "quack") and hasattr(ob, "duckwalk"):
        # it's a duck

This is as distinct from both 'if isinstance(ob,Duck)' and 'if
implements(ob,IDuck)'. That is, "duck typing" is determining an object's
type by inspection of its method/attribute signature rather than by
explicit relationship to some type object.

From olsongt at verizon.net Tue Jan 4 17:30:22 2005
From: olsongt at verizon.net (olsongt@verizon.net)
Date: Tue Jan 4 19:55:11 2005
Subject: [Python-Dev] Will ASTbranch compile on windows yet?
Message-ID: <20050104163022.RWCC10436.out012.verizon.net@outgoing.verizon.net>

I submitted patch "[ 742621 ] ast-branch: msvc project sync" in the VC6.0
days. There were some required changes to headers as well as the project
files. It had discouraged me in the past when Jeremy made calls for help on
the astbranch and I wasn't even sure if the source was in a compilable
state when I checked it out. I'm sure it has discouraged other windows
programmers as well.

-Grant

From tim.peters at gmail.com Tue Jan 4 19:57:10 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Tue Jan 4 19:57:13 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: References: Message-ID: <1f7befae05010410576effd024@mail.gmail.com>

[Guido]
> In my blog I wrote:
>
> Let's get rid of unbound methods. When class C defines a method
> f, C.f should just return the function object, not an unbound
> method that behaves almost, but not quite, the same as that
> function object. The extra type checking on the first argument that
> unbound methods are supposed to provide is not useful in practice
> (I can't remember that it ever caught a bug in my code)

Really? Unbound methods are used most often (IME) to call a
base-class method from a subclass, like my_base.the_method(self, ...).
It's especially easy to forget to write `self, ` there, and the
exception msg then is quite focused because of that extra bit of type
checking.
Otherwise I expect we'd see a more-mysterious AttributeError or TypeError when the base method got around to trying to do something with the bogus `self` passed to it. I could live with that, though. > and sometimes you have to work around it; For me, 0 times in ... what? ... about 14 years . > it complicates function attribute access; and the overloading of > unbound and bound methods on the same object type is > confusing. Yup, it is a complication, without a compelling use case I know of. Across the Python, Zope2 and Zope3 code bases, types.UnboundMethodType is defined once and used once (believe it or not, in unittest.py). From tim.peters at gmail.com Tue Jan 4 20:08:34 2005 From: tim.peters at gmail.com (Tim Peters) Date: Tue Jan 4 20:08:38 2005 Subject: [Python-Dev] Will ASTbranch compile on windows yet? In-Reply-To: <20050104163022.RWCC10436.out012.verizon.net@outgoing.verizon.net> References: <20050104163022.RWCC10436.out012.verizon.net@outgoing.verizon.net> Message-ID: <1f7befae05010411082bd35aab@mail.gmail.com> [olsongt@verizon.net] > I submitted patch "[ 742621 ] ast-branch: msvc project sync" in > the VC6.0 days. There were some required changes to headers > as well as the project files. It had discouraged me in the past > when Jeremy made calls for help on the astbranch and I wasn't > even sure if the source was in a compilable state when I checked > it out. I'm sure it has discouraged other windows programmers > as well. I'd be surprised if it compiled on Windows now, as I don't think any Windows users have been working on that branch. At the last (2004) PyCon, I was going to participate in the annual AST sprint again, but it was so far from working on Windows then I gave up (and joined the close-bugs/patches sprint instead). I don't have time to join the current crusade. 
If there's pent-up interest among Windows users, it would be good to say which compiler(s) you can use, since I expect not everyone can deal with VC 7.1 (e.g., I think Raymond Hettinger is limited to VC 6; and you said you worked up a VC 6 patch, but didn't say whether you could use 7.1 now). From jim at zope.com Tue Jan 4 20:12:39 2005 From: jim at zope.com (Jim Fulton) Date: Tue Jan 4 20:12:43 2005 Subject: [Python-Dev] Let's get rid of unbound methods In-Reply-To: <5.1.1.6.0.20050104133943.02b69d60@mail.telecommunity.com> References: <5.1.1.6.0.20050104133943.02b69d60@mail.telecommunity.com> Message-ID: <41DAEAA7.2040706@zope.com> Phillip J. Eby wrote: > At 10:28 AM 1/4/05 -0800, Guido van Rossum wrote: > >> Of course, more changes would be needed: docs, the test suite, and >> some simplifications to the instance method object implementation in >> classobject.c. >> >> Does anyone think this is a bad idea? > > > Code that currently does 'aClass.aMethod.im_func' in order to access the > function object would break, as would code that inspects 'im_self' to > determine whether a method is a class or instance method. (Although > code of the latter sort would already break with static methods, I > suppose.) Code of the latter sort wouldn't break with the change. We'd still have bound methods. Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From gvanrossum at gmail.com Tue Jan 4 20:40:30 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Jan 4 20:40:33 2005 Subject: [Python-Dev] Let's get rid of unbound methods In-Reply-To: <41DAE213.9070906@zope.com> References: <41DAE213.9070906@zope.com> Message-ID: [Jim] > We'll still need unbound builtin methods, so the concept won't > go away. In fact, the change would mean that the behavior between > builtin methods and python methods would become more inconsistent. 
Actually, unbound builtin methods are a different type than bound
builtin methods:

>>> type(list.append)
<type 'method_descriptor'>
>>> type([].append)
<type 'builtin_function_or_method'>
>>>

Compare this to the same thing for a method on a user-defined class:

>>> type(C.foo)
<type 'instancemethod'>
>>> type(C().foo)
<type 'instancemethod'>

(The 'instancemethod' type knows whether it is a bound or unbound method
by checking whether im_self is set.)

[Phillip]
> Code that currently does 'aClass.aMethod.im_func' in order to access the
> function object would break, as would code that inspects 'im_self' to
> determine whether a method is a class or instance method. (Although code
> of the latter sort would already break with static methods, I suppose.)

Right. (But I think you're using the terminology in a confused way --
im_self distinguishes between bound and unbound methods. Class methods
are a different beast.)

I guess for backwards compatibility, function objects could implement
dummy im_func and im_self attributes (im_func returning itself and
im_self returning None), while issuing a warning that this is a
deprecated feature.

[Tim]
> Really? Unbound methods are used most often (IME) to call a
> base-class method from a subclass, like my_base.the_method(self, ...).
> It's especially easy to forget to write `self, ` there, and the
> exception msg then is quite focused because of that extra bit of type
> checking. Otherwise I expect we'd see a more-mysterious
> AttributeError or TypeError when the base method got around to trying
> to do something with the bogus `self` passed to it.

Hm, I hadn't thought of this.

> I could live with that, though.

Most cases would be complaints about argument counts (it gets hairier
when there are default args so the arg count is variable). Ironically, I
get those all the time these days due to the reverse error: using
super() but forgetting *not* to pass self!

> Across the Python, Zope2 and Zope3 code bases, types.UnboundMethodType
> is defined once and used once (believe it or not, in unittest.py).
But that might be because BoundMethodType is the same type object... -- --Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Tue Jan 4 20:38:29 2005 From: python at rcn.com (Raymond Hettinger) Date: Tue Jan 4 20:42:15 2005 Subject: [Python-Dev] Let's get rid of unbound methods In-Reply-To: <41DAE213.9070906@zope.com> Message-ID: <003901c4f294$ff8e1640$e841fea9@oemcomputer> [Guido van Rossum] > > Let's get rid of unbound methods. +1 [Jim Fulton] > duck typing? Requiring a specific interface instead of a specific type. [Guido] > > Does anyone think this is a bad idea? [Jim] > It *feels* very disruptive to me, but I'm probably wrong. > We'll still need unbound builtin methods, so the concept won't > go away. In fact, the change would mean that the behavior between > builtin methods and python methods would become more inconsistent. The type change would be disruptive and guaranteed to break some code. Also, it would partially breakdown the distinction between functions and methods. The behavior, on the other hand, would remain essentially the same (sans type checking). Raymond From jim at zope.com Tue Jan 4 20:44:43 2005 From: jim at zope.com (Jim Fulton) Date: Tue Jan 4 20:44:47 2005 Subject: [Python-Dev] Let's get rid of unbound methods In-Reply-To: References: <41DAE213.9070906@zope.com> Message-ID: <41DAF22B.6030605@zope.com> Guido van Rossum wrote: > [Jim] > >>We'll still need unbound builtin methods, so the concept won't >>go away. In fact, the change would mean that the behavior between >>builtin methods and python methods would become more inconsistent. > > > Actually, unbound builtin methods are a different type than bound > builtin methods: Of course, but conceptually they are similar. You would still encounter the concept if you got an unbound builtin method. Jim -- Jim Fulton mailto:jim@zope.com Python Powered! 
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org

From exarkun at divmod.com Tue Jan 4 21:02:06 2005
From: exarkun at divmod.com (Jp Calderone)
Date: Tue Jan 4 21:02:09 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: Message-ID: <20050104200206.25734.345337731.divmod.quotient.921@ohm>

On Tue, 4 Jan 2005 10:28:03 -0800, Guido van Rossum wrote:
>In my blog I wrote:
>
> Let's get rid of unbound methods. When class C defines a method f, C.f
> should just return the function object, not an unbound method that
> behaves almost, but not quite, the same as that function object. The
> extra type checking on the first argument that unbound methods are
> supposed to provide is not useful in practice (I can't remember that
> it ever caught a bug in my code) and sometimes you have to work around
> it; it complicates function attribute access; and the overloading of
> unbound and bound methods on the same object type is confusing. Also,
> the type checking offered is wrong, because it checks for subclassing
> rather than for duck typing.
>

This would make pickling (or any serialization mechanism) of
`Class.method' based on name next to impossible. Right now, with
the appropriate support, this works:

>>> import pickle
>>> class Foo:
...     def bar(self): pass
...
>>> pickle.loads(pickle.dumps(Foo.bar))
<unbound method Foo.bar>
>>>

I don't see how it could if Foo.bar were just a function object.

Jp

From exarkun at divmod.com Tue Jan 4 21:15:00 2005
From: exarkun at divmod.com (Jp Calderone)
Date: Tue Jan 4 21:15:05 2005
Subject: [Python-Dev] Let's get rid of unbound methods
In-Reply-To: <20050104200206.25734.345337731.divmod.quotient.921@ohm>
Message-ID: <20050104201500.25734.946201879.divmod.quotient.934@ohm>

On Tue, 04 Jan 2005 20:02:06 GMT, Jp Calderone wrote:
>On Tue, 4 Jan 2005 10:28:03 -0800, Guido van Rossum wrote:
> >In my blog I wrote:
> >
> > Let's get rid of unbound methods.
When class C defines a method f, C.f > > should just return the function object, not an unbound method that > > behaves almost, but not quite, the same as that function object. The > > extra type checking on the first argument that unbound methods are > > supposed to provide is not useful in practice (I can't remember that > > it ever caught a bug in my code) and sometimes you have to work around > > it; it complicates function attribute access; and the overloading of > > unbound and bound methods on the same object type is confusing. Also, > > the type checking offered is wrong, because it checks for subclassing > > rather than for duck typing. > > > > This would make pickling (or any serialization mechanism) of > `Class.method' based on name next to impossible. Right now, with > the appropriate support, this works: It occurs to me that perhaps I was not clear enough here. What I mean is that it is possible to serialize unbound methods currently, because they refer to their own name, the name of their class object, and thus indirectly to the module in which they are defined. If looking up a method on a class object instead returns a function, then the class is no longer knowable, and most likely the function will not have a unique name which can be used to allow a reference to it to be serialized. In particular, I don't see how one will be able to write something equivalent to this:

import new, copy_reg, types

def pickleMethod(method):
    return unpickleMethod, (method.im_func.__name__,
                            method.im_self,
                            method.im_class)

def unpickleMethod(im_name, im_self, im_class):
    unbound = getattr(im_class, im_name)
    if im_self is None:
        return unbound
    return new.instancemethod(unbound.im_func, im_self, im_class)

copy_reg.pickle(types.MethodType, pickleMethod, unpickleMethod)

But perhaps I am just overlooking the obvious.
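[A runnable sketch of the same reducer idea, using the later spellings of these APIs -- copyreg in place of copy_reg, types.MethodType in place of new.instancemethod. The helper names here are invented, and only the bound-method half is shown, since the unbound half is the part proposed for removal:]

```python
import copyreg
import pickle
import types

class Foo:
    def bar(self):
        return "bar called"

def pickle_method(method):
    # Serialize a bound method by name plus its instance, in the same
    # spirit as the pickleMethod handler quoted above.
    return unpickle_method, (method.__func__.__name__, method.__self__)

def unpickle_method(name, obj):
    # Re-bind by looking the name up on the instance's class.
    return types.MethodType(getattr(type(obj), name), obj)

copyreg.pickle(types.MethodType, pickle_method)

f = Foo()
m = pickle.loads(pickle.dumps(f.bar))
print(m())  # prints: bar called
```

The copyreg registration takes precedence over pickle's default handling of bound methods, so the round trip goes through the named lookup much as the handlers quoted above do.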
Jp From gvanrossum at gmail.com Tue Jan 4 21:18:15 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Jan 4 21:18:18 2005 Subject: [Python-Dev] Let's get rid of unbound methods In-Reply-To: <20050104200206.25734.345337731.divmod.quotient.921@ohm> References: <20050104200206.25734.345337731.divmod.quotient.921@ohm> Message-ID: [me] > > Actually, unbound builtin methods are a different type than bound > > builtin methods: [Jim] > Of course, but conceptually they are similar. You would still > encounter the concept if you got an unbound builtin method. Well, these are all just implementation details. They really are all just callables. [Jp] > This would make pickling (or any serialization mechanism) of > `Class.method' based on name next to impossible. Right now, with > the appropriate support, this works: > > >>> import pickle > >>> class Foo: > ... def bar(self): pass > ... > >>> pickle.loads(pickle.dumps(Foo.bar)) > <unbound method Foo.bar> > >>> > > I don't see how it could if Foo.bar were just a function object. Is this a purely theoretical objection or are you actually aware of anyone doing this? Anyway, that approach is pretty limited -- how would you do it for static and class methods, or methods wrapped by other decorators? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From exarkun at divmod.com Tue Jan 4 21:27:37 2005 From: exarkun at divmod.com (Jp Calderone) Date: Tue Jan 4 21:27:41 2005 Subject: [Python-Dev] Let's get rid of unbound methods In-Reply-To: Message-ID: <20050104202737.25734.1950245396.divmod.quotient.945@ohm> On Tue, 4 Jan 2005 12:18:15 -0800, Guido van Rossum wrote: >[me] > > > Actually, unbound builtin methods are a different type than bound > > > builtin methods: > > [Jim] > > Of course, but conceptually they are similar. You would still > > encounter the concept if you got an unbound builtin method. > > Well, these are all just implementation details. They really are all > just callables.
> > [Jp] > > This would make pickling (or any serialization mechanism) of > > `Class.method' based on name next to impossible. Right now, with > > the appropriate support, this works: > > > > >>> import pickle > > >>> class Foo: > > ... def bar(self): pass > > ... > > >>> pickle.loads(pickle.dumps(Foo.bar)) > > <unbound method Foo.bar> > > >>> > > > > I don't see how it could if Foo.bar were just a function object. > > Is this a purely theoretical objection or are you actually aware of > anyone doing this? Anyway, that approach is pretty limited -- how > would you do it for static and class methods, or methods wrapped by > other decorators? It's not a feature I often depend on; however, I have made use of it on occasion. Twisted supports serializing unbound methods this way, primarily to enhance the usability of tap files (a feature whereby an application is configured by constructing a Python object graph which is then pickled to a file to later be loaded and run). "Objection" may be too strong a word for my stance here; I just wanted to point out another potentially incompatible behavior change. I can't think of any software which I am currently developing or maintaining which benefits from this feature; it just seems unfortunate to further complicate the already unpleasant business of serialization. Jp From pje at telecommunity.com Tue Jan 4 21:31:57 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Jan 4 21:31:54 2005 Subject: [Python-Dev] Let's get rid of unbound methods In-Reply-To: References: <41DAE213.9070906@zope.com> <41DAE213.9070906@zope.com> Message-ID: <5.1.1.6.0.20050104152023.02db7cc0@mail.telecommunity.com> At 11:40 AM 1/4/05 -0800, Guido van Rossum wrote: >[Jim] > > We'll still need unbound builtin methods, so the concept won't > > go away. In fact, the change would mean that the behavior between > > builtin methods and python methods would become more inconsistent.
> >Actually, unbound builtin methods are a different type than bound >builtin methods: > > >>> type(list.append) > <type 'method_descriptor'> > > >>> type([].append) > <type 'builtin_function_or_method'> > > >>> > >Compare this to the same thing for a method on a user-defined class: > > >>> type(C.foo) > <type 'instancemethod'> > > >>> type(C().foo) > <type 'instancemethod'> > > >(The 'instancemethod' type knows whether it is a bound or unbound >method by checking whether im_self is set.) > >[Phillip] > > Code that currently does 'aClass.aMethod.im_func' in order to access the > > function object would break, as would code that inspects 'im_self' to > > determine whether a method is a class or instance method. (Although code > > of the latter sort would already break with static methods, I suppose.) > >Right. (But I think you're using the terminology in a confused way -- >im_self distinguishes between bound and unbound methods. Class methods >are a different beast.) IIUC, when you do 'SomeClass.aMethod', if 'aMethod' is a classmethod, then you will receive a bound method with an im_self of 'SomeClass'. So, if you are introspecting items listed in 'dir(SomeClass)', this will be your only clue that 'aMethod' is a class method. Similarly, the fact that you get an unbound method object if 'aMethod' is an instance method allows you to distinguish it from a static method (if the object is a function). That is, I'm saying that code that looks at the type and attributes of 'aMethod' as retrieved from 'SomeClass' will now not be able to distinguish between a static method and an instance method, because both will return a function instance. However, the 'inspect' module uses __dict__ rather than getattr to get at least some attributes, so it doesn't rely on this property. >I guess for backwards compatibility, function objects could implement >dummy im_func and im_self attributes (im_func returning itself and >im_self returning None), while issuing a warning that this is a >deprecated feature. +1 on this part if the proposal goes through.
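[The introspection clues Phillip describes can be checked directly; a small sketch using the later attribute spelling __self__ in place of im_self, with invented class and method names:]

```python
import types

class C:
    def imeth(self):
        pass

    @classmethod
    def cmeth(cls):
        pass

    @staticmethod
    def smeth():
        pass

# Via getattr, a classmethod comes back already bound to the class
# itself -- the "only clue" that it is a class method.
assert C.cmeth.__self__ is C

# Via __dict__ (the route the 'inspect' module takes), the raw
# descriptors are visible and the kind is unambiguous.
assert isinstance(C.__dict__["cmeth"], classmethod)
assert isinstance(C.__dict__["smeth"], staticmethod)
assert isinstance(C.__dict__["imeth"], types.FunctionType)
```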
On the proposal as a whole, I'm -0, as I'm not quite clear on what this is going to simplify enough to justify the various semantic impacts such as upcalls, pickling, etc. Method objects will still have to exist, so ISTM that this is only going to streamline the "__get__(None,type)" branch of functions' descriptor code, and the check for "im_self is None" in the __call__ of method objects. (And maybe some eval loop shortcuts for calling methods?) From bac at OCF.Berkeley.EDU Tue Jan 4 22:11:54 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Tue Jan 4 22:12:16 2005 Subject: [Python-Dev] Please help complete the AST branch In-Reply-To: <45D8D02A-5E81-11D9-ADA4-000A95EFAE9E@aleax.it> References: <1104842608.3227.60.camel@presto.wooz.org> <45D8D02A-5E81-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: <41DB069A.8030406@ocf.berkeley.edu> Alex Martelli wrote: > > On 2005 Jan 04, at 17:17, Jeremy Hylton wrote: > >> That's fine with me. We had taken it to the compiler-sig when it >> wasn't clear there was interest in the ast branch :-). > > > Speaking for myself, I have a burning interest in the AST branch (though > I can't seem to get it correctly downloaded so far, I guess it's just my > usual CVS-clumsiness and I'll soon find out what I'm doing wrong & fix > it) See http://www.python.org/dev/devfaq.html#how-can-i-check-out-a-tagged-branch on how to do a checkout of a tagged branch. -Brett From bac at OCF.Berkeley.EDU Tue Jan 4 22:50:28 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Tue Jan 4 22:50:39 2005 Subject: [Python-Dev] Please help complete the AST branch In-Reply-To: <20050104021909.GB11833@unpythonic.net> References: <41D9F94C.3020005@ocf.berkeley.edu> <20050104021909.GB11833@unpythonic.net> Message-ID: <41DB0FA4.3070405@ocf.berkeley.edu> Jeff Epler wrote: > On Mon, Jan 03, 2005 at 06:02:52PM -0800, Brett C. wrote: > >>Although if someone can start sooner, then by all means, go for it!
>>And obviously help would be great since it isn't a puny codebase >>(4,000 lines so far for the CST->AST and AST->bytecode code). > > > And obviously knowing a little more about the AST branch would be > helpful for those considering helping. > > Is there any relatively up-to-date document about ast-branch? googling > about it turned up some pypy stuff from 2003, and I didn't look much > further. > Beyond the text file Python/compile.txt in CVS, nope. I have tried to flesh that doc out as well as I could to explain how it all works. If it doesn't answer all your questions then just ask here on python-dev (as the rest of this thread has seemed to agree upon). I will do my best to make sure any info that needs to work its way back into the doc gets checked in. > I just built the ast-branch for fun, and "make test" mostly worked. > 8 tests failed: > test_builtin test_dis test_generators test_inspect test_pep263 > test_scope test_symtable test_trace > 6 skips unexpected on linux2: > test_csv test_hotshot test_bsddb test_parser test_logging > test_email > I haven't looked at any of the failures in detail, but at least > test_bsddb is due to missing development libs on this system > > One more thing: The software I work on by day has python scripting. > One part of that functionality is a tree display of a script. I'm not > actively involved with this part of the software (yet). Any comments on > whether ast-branch could be construed as helping make this kind of > functionality work better, faster, or easier? The code we use currently > is based on a modified version of the parser which includes comment > information, so we need to be aware of changes in this area anyhow. > If by tree you mean execution paths, then yes, eventually. When the back-end is finished the hope is to be able to export the AST to Python objects and thus have it usable in Python. You could use the AST representation to display your tree.
-Brett From jhylton at gmail.com Tue Jan 4 22:54:28 2005 From: jhylton at gmail.com (Jeremy Hylton) Date: Tue Jan 4 22:54:31 2005 Subject: [Python-Dev] Please help complete the AST branch In-Reply-To: <41DB0FA4.3070405@ocf.berkeley.edu> References: <41D9F94C.3020005@ocf.berkeley.edu> <20050104021909.GB11833@unpythonic.net> <41DB0FA4.3070405@ocf.berkeley.edu> Message-ID: Does anyone want to volunteer to integrate the current head to the branch? I think that's a pretty important near-term step. Jeremy From Jack.Jansen at cwi.nl Wed Jan 5 00:01:34 2005 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Wed Jan 5 00:01:25 2005 Subject: [Python-Dev] Let's get rid of unbound methods In-Reply-To: References: Message-ID: <94D0198B-5EA4-11D9-BB20-000D934FF6B4@cwi.nl> On 4-jan-05, at 19:28, Guido van Rossum wrote: > The > extra type checking on the first argument that unbound methods are > supposed to provide is not useful in practice (I can't remember that > it ever caught a bug in my code) It caught bugs for me a couple of times. If I remember correctly I was calling methods of something that was supposed to be a mixin class but I forgot to actually list the mixin as a base. But I don't think that's a serious enough issue alone to keep the unbound method type. But I'm more worried about losing the other information in an unbound method, specifically im_class. I would guess that info is useful to class browsers and such, or are there other ways to get at that? -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From jack at uitdesloot.nl Tue Jan 4 23:34:58 2005 From: jack at uitdesloot.nl (Jack Jansen) Date: Wed Jan 5 00:04:54 2005 Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in Message-ID: First question: what is the Python 2.3.5 release schedule and who is responsible? 
Second question: I thought this info was in a PEP somewhere, but I could only find PEPs on major releases; should I have found this info somewhere? And now the question that matters: there's some stuff I'd really like to get into 2.3.5, but it involves changes to configure, the Makefile and distutils, so because it's fairly extensive I thought I'd ask before just committing it. The problem we're trying to solve is that due to the way Apple's framework architecture works, newer versions of frameworks are preferred (at link time, and sometimes even at runtime) over older ones. That's fine for most uses of frameworks, but not when linking a Python extension against the Python framework: if you build an extension with Python 2.3 to later load it into 2.3, you don't want that framework to be linked against 2.4. Now there's a way around this, from MacOSX 10.3 onwards, and that is not to link against the framework at all, but link with "-undefined dynamic_lookup". This will link the extension in a way similar to what other Unix systems do: any undefined externals are looked up when the extension is dynamically loaded. But because this feature only works with the dynamic loader from 10.3 or later, you must have the environment variable MACOSX_DEPLOYMENT_TARGET set to 10.3 or higher when you build the extension; otherwise the linker will complain. We've solved this issue for the trunk and we can solve it for 2.4.1: if MACOSX_DEPLOYMENT_TARGET isn't set and we're on 10.3, we force it to 10.3. Moreover, when it is 10.3 or higher (possibly after being forced) we use the dynamic_lookup way of linking extensions. We also record the value of MACOSX_DEPLOYMENT_TARGET in the Makefile, and distutils picks it up later and sets the environment variable again. We even have a hack to fix Apple-installed Python 2.3 in place by mucking with lib/config/Makefile, which we can do because Apple-installed Python 2.3 will obviously only be run on 10.3.
And we check whether this hack is needed when you install a later Python version on 10.3. That leaves Python 2.3.5 itself. The best fix here would be to backport the 2.4.1 solution: configure.in 1.456 and 1.478, distutils/sysconfig.py 1.59 and 1.62, Makefile.pre.in 1.144. Note that though the build procedure for extensions will change it doesn't affect binary compatibility: both types of extensions are loadable by both types of interpreters. I think this is all safe, and these patches shouldn't affect any system other than MacOSX, but I'm a bit reluctant to fiddle with the build procedure for a micro-release, so that's why I'm asking. -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From bob at redivi.com Wed Jan 5 00:26:27 2005 From: bob at redivi.com (Bob Ippolito) Date: Wed Jan 5 00:26:49 2005 Subject: [Python-Dev] Let's get rid of unbound methods In-Reply-To: <94D0198B-5EA4-11D9-BB20-000D934FF6B4@cwi.nl> References: <94D0198B-5EA4-11D9-BB20-000D934FF6B4@cwi.nl> Message-ID: <0EB97E30-5EA8-11D9-96B0-000A9567635C@redivi.com> On Jan 4, 2005, at 6:01 PM, Jack Jansen wrote: > > On 4-jan-05, at 19:28, Guido van Rossum wrote: >> The >> extra type checking on the first argument that unbound methods are >> supposed to provide is not useful in practice (I can't remember that >> it ever caught a bug in my code) > > It caught bugs for me a couple of times. If I remember correctly I was > calling methods of something that was supposed to be a mixin class but > I forgot to actually list the mixin as a base. But I don't think > that's a serious enough issue alone to keep the unbound method type. > > But I'm more worried about losing the other information in an unbound > method, specifically im_class. I would guess that info is useful to > class browsers and such, or are there other ways to get at that? For a class browser, presumably, you would start at the class and then find the methods. Starting from some class and walking the mro, you can inspect the dicts along the way and you'll find everything and know where it came from. -bob From martin at v.loewis.de Wed Jan 5 00:54:02 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed Jan 5 00:53:56 2005 Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in In-Reply-To: References: Message-ID: <41DB2C9A.4070800@v.loewis.de> Jack Jansen wrote: > First question: what is the Python 2.3.5 release schedule and who is > responsible? Last I heard it is going to be released "in January", and Anthony Baxter is the release manager.
> Second question: I thought this info was in a PEP somewhere, but I could > only find PEPs on major releases; should I have found this info somewhere? By following python-dev, or in a python-dev summary, e.g. http://www.python.org/dev/summary/2004-11-01_2004-11-15.html > The problem we're trying to solve is that due to the way Apple's > framework architecture works newer versions of frameworks are preferred > (at link time, and sometimes even at runtime) over older ones. Can you elaborate on that somewhat? According to http://developer.apple.com/documentation/MacOSX/Conceptual/BPFrameworks/Concepts/VersionInformation.html there are major and minor versions of frameworks. I would think that every Python minor (2.x) release should produce a new major framework version of the Python framework. Then, there would be no problem. Why does this not work? > I think this is all safe, and these patches shouldn't affect any system > other than MacOSX, but I'm a bit reluctant to fiddle with the build > procedure for a micro-release, so that's why I'm asking. This is ultimately for the release manager to decide. My personal feeling is that it is ok to fiddle with the build procedure. I'm more concerned that the approach taken might be "wrong", in the sense that it uses a stack of hacks and work-arounds for problems which Apple envisions to be solved differently. That would be bad, because it might make an implementation of the "proper" solution more difficult. Regards, Martin From bob at redivi.com Wed Jan 5 01:08:54 2005 From: bob at redivi.com (Bob Ippolito) Date: Wed Jan 5 01:09:05 2005 Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in In-Reply-To: <41DB2C9A.4070800@v.loewis.de> References: <41DB2C9A.4070800@v.loewis.de> Message-ID: On Jan 4, 2005, at 6:54 PM, Martin v. Löwis wrote: > Jack Jansen wrote: >> First question: what is the Python 2.3.5 release schedule and who is >> responsible?
> > Last I heard it is going to be released "in January", and Anthony > Baxter > is the release manager. > >> Second question: I thought this info was in a PEP somewhere, but I >> could only find PEPs on major releases, should I have found this info >> somewhere? > > By following python-dev, or in a python-dev summary, e.g. > > http://www.python.org/dev/summary/2004-11-01_2004-11-15.html > >> The problem we're trying to solve is that due to the way Apple's >> framework architecture works newer versions of frameworks are >> preferred (at link time, and sometimes even at runtime) over older >> ones. > > Can you elaborate on that somewhat? According to > > http://developer.apple.com/documentation/MacOSX/Conceptual/ > BPFrameworks/Concepts/VersionInformation.html > > there are major and minor versions of frameworks. I would think that > every Python minor (2.x) release should produce a new major framework > version of the Python framework. Then, there would be no problem. > > Why does this not work? It doesn't for reasons I care not to explain in depth, again. Search the pythonmac-sig archives for longer explanations. The gist is that you specifically do not want to link directly to the framework at all when building extensions. These patches are required to do that correctly. >> I think this is all safe, and these patches shouldn't affect any >> system other than MacOSX, but I'm a bit reluctant to fiddle with the >> build procedure for a micro-release, so that's why I'm asking. > > This is ultimately for the release manager to decide. My personal > feeling is that it is ok to fiddle with the build procedure. I'm > more concerned that the approach taken might be "wrong", in the > sense that it uses a stack of hacks and work-arounds for problems > which Apple envisions to be solved differently. That would be bad, > because it might make an implementation of the "proper" solution > more difficult. This is not the wrong way to do it. 
-bob From kbk at shore.net Wed Jan 5 01:57:28 2005 From: kbk at shore.net (Kurt B. Kaiser) Date: Wed Jan 5 01:58:07 2005 Subject: [Python-Dev] Please help complete the AST branch In-Reply-To: (Guido van Rossum's message of "Tue, 4 Jan 2005 07:31:30 -0800") References: <1104842608.3227.60.camel@presto.wooz.org> Message-ID: <87hdlwae13.fsf@hydra.bayview.thirdcreek.com> Guido van Rossum writes: > I'd be fine with keeping this on python-dev too. Maybe tag the Subject: with [AST] when starting a thread? -- KBK From jcarlson at uci.edu Wed Jan 5 02:18:30 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed Jan 5 02:27:55 2005 Subject: [Python-Dev] Let's get rid of unbound methods In-Reply-To: <1f7befae05010410576effd024@mail.gmail.com> References: <1f7befae05010410576effd024@mail.gmail.com> Message-ID: <20050104154707.927B.JCARLSON@uci.edu> Tim Peters wrote: > Guido wrote: > > Let's get rid of unbound methods. When class C defines a method [snip] > Really? Unbound methods are used most often (IME) to call a > base-class method from a subclass, like my_base.the_method(self, ...). > It's especially easy to forget to write `self, ` there, and the > exception msg then is quite focused because of that extra bit of type > checking. Otherwise I expect we'd see a more-mysterious > AttributeError or TypeError when the base method got around to trying > to do something with the bogus `self` passed to it. Agreed. While it seems that super() is the 'modern paradigm' for this, I have been using base.method(self, ...) for years now, and have been quite happy with it. After attempting to convert my code to use the super() paradigm, and having difficulty, I discovered James Knight's "Python's Super Considered Harmful" (available at http://www.ai.mit.edu/people/jknight/super-harmful/ ), wherein I discovered how super really worked (I should have read the documentation in the first place), and reverted my changes to the base.method version. > I could live with that, though.
I could live with it too, but I would probably use an equivalent of the following (with actual type checking):

def mysuper(typ, obj):
    lm = list(obj.__class__.__mro__)
    indx = lm.index(typ)
    if indx == 0:
        return obj
    return super(lm[indx-1], obj)

All in all, I'm -0. I don't desire to replace all of my base.method with mysuper(base, obj).method, but if I must sacrifice convenience for the sake of making Python 2.5's implementation simpler, I guess I'll deal with it. My familiarity with grep's regular expressions leaves something to be desired, so I don't know how often base.method(self,...) is or is not used in the standard library. - Josiah From gvanrossum at gmail.com Wed Jan 5 03:02:17 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Jan 5 03:02:20 2005 Subject: [Python-Dev] super() harmful? In-Reply-To: <20050104154707.927B.JCARLSON@uci.edu> References: <1f7befae05010410576effd024@mail.gmail.com> <20050104154707.927B.JCARLSON@uci.edu> Message-ID: [Josiah] > Agreed. While it seems that super() is the 'modern paradigm' for this, > I have been using base.method(self, ...) for years now, and have been > quite happy with it. After attempting to convert my code to use the > super() paradigm, and having difficulty, I discovered James Knight's > "Python's Super Considered Harmful" (available at > http://www.ai.mit.edu/people/jknight/super-harmful/ ), wherein I > discovered how super really worked (I should have read the documentation > in the first place), and reverted my changes to the base.method version. I think that James Y Knight's page misrepresents the issue. Quoting: """ Note that the __init__ method is not special -- the same thing happens with any method, I just use __init__ because it is the method that most often needs to be overridden in many classes in the hierarchy. """ But __init__ *is* special, in that it is okay for a subclass __init__ (or __new__) to have a different signature than the base class __init__; this is not true for other methods.
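[The cooperative keyword-argument style this implies can be sketched as follows; argument-less super() is a later-Python spelling, and the class names are invented:]

```python
class Base:
    def __init__(self, **kw):
        # Cooperative: pass everything not consumed here along the MRO.
        super().__init__(**kw)
        self.base_ready = True

class Colored:
    def __init__(self, color="red", **kw):
        super().__init__(**kw)
        self.color = color

class Widget(Colored, Base):
    def __init__(self, size=1, **kw):
        super().__init__(**kw)
        self.size = size

# Each __init__ consumes its own keywords and forwards the rest,
# so differing signatures coexist along the MRO.
w = Widget(size=3, color="blue")
```

Every class in the chain forwards the keywords it does not recognize, which is exactly the "always pass all keywords on to super" practice quoted from the page's conclusion.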
If you change a regular method's signature, you would break Liskov substitutability (i.e., your subclass instance wouldn't be acceptable where a base class instance would be acceptable). Super is intended for classes that are designed with method cooperation in mind, so I agree with the best practices in James's Conclusion: """

* Use it consistently, and document that you use it, as it is part of the external interface for your class, like it or not.

* Never call super with anything but the exact arguments you received, unless you really know what you're doing.

* When you use it on methods whose acceptable arguments can be altered on a subclass via addition of more optional arguments, always accept *args, **kw, and call super like "super(MyClass, self).currentmethod(alltheargsideclared, *args, **kwargs)". If you don't do this, forbid addition of optional arguments in subclasses.

* Never use positional arguments in __init__ or __new__. Always use keyword args, and always call them as keywords, and always pass all keywords on to super.

""" But that's not the same as calling it harmful. :-( -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.peters at gmail.com Wed Jan 5 03:07:31 2005 From: tim.peters at gmail.com (Tim Peters) Date: Wed Jan 5 03:07:35 2005 Subject: [Python-Dev] Let's get rid of unbound methods In-Reply-To: <20050104154707.927B.JCARLSON@uci.edu> References: <1f7befae05010410576effd024@mail.gmail.com> <20050104154707.927B.JCARLSON@uci.edu> Message-ID: <1f7befae050104180711743ebd@mail.gmail.com> [Tim Peters] >> ... Unbound methods are used most often (IME) to call a >> base-class method from a subclass, like >> my_base.the_method(self, ...). >> It's especially easy to forget to write `self, ` there, and the >> exception msg then is quite focused because of that extra bit of >> type checking.
Otherwise I expect we'd see a more-mysterious >> AttributeError or TypeError when the base method got around to >> trying to do something with the bogus `self` passed to it. [Josiah Carlson] > Agreed. Well, it's not that easy to agree with. Guido replied that most such cases would raise an argument-count-mismatch exception instead. I expect that's because he stopped working on Zope code, so actually thinks it's odd again to see a gazillion methods like:

class Registerer(my_base):
    def register(*args, **kws):
        my_base.register(*args, **kws)

I bet he even presumes that if you chase such chains long enough, you'll eventually find a register() method *somewhere* that actually uses its arguments <wink>. > While it seems that super() is the 'modern paradigm' for this, > I have been using base.method(self, ...) for years now, and have > been quite happy with it. After attempting to convert my code to > use the super() paradigm, and having difficulty, I discovered James > Knight's "Python's Super Considered Harmful" (available at > http://www.ai.mit.edu/people/jknight/super-harmful/ ), wherein I > discovered how super really worked (I should have read the > documentation in the first place), and reverted my changes to the > base.method version. How did super() get into this discussion? I don't think I've ever used it myself, but I avoid fancy inheritance graphs in "my own" code, so can live with anything. > I could live with it too, but I would probably use an equivalent of the > following (with actual type checking): > > def mysuper(typ, obj): > lm = list(obj.__class__.__mro__) > indx = lm.index(typ) > if indx == 0: > return obj > return super(lm[indx-1], obj) > > All in all, I'm -0. I don't desire to replace all of my base.method > with mysuper(base, obj).method, but if I must sacrifice > convenience for the sake of making Python 2.5's implementation > simpler, I guess I'll deal with it.
My familiarity with grep's regular > expressions leaves something to be desired, so I don't know how > often base.method(self,...) is or is not used in the standard library. I think there may be a misunderstanding here. Guido isn't proposing that base.method(self, ...) would stop working -- it would still work fine. The result of base.method would still be a callable object: it would no longer be of an "unbound method" type (it would just be a function), and wouldn't do special checking on the first argument passed to it anymore, but base.method(self, ...) would still invoke the base class method. You wouldn't need to rewrite anything (unless you're doing heavy-magic introspection, picking callables apart). From bob at redivi.com Wed Jan 5 04:12:59 2005 From: bob at redivi.com (Bob Ippolito) Date: Wed Jan 5 04:13:11 2005 Subject: [Python-Dev] Let's get rid of unbound methods In-Reply-To: <20050104154707.927B.JCARLSON@uci.edu> References: <1f7befae05010410576effd024@mail.gmail.com> <20050104154707.927B.JCARLSON@uci.edu> Message-ID: On Jan 4, 2005, at 8:18 PM, Josiah Carlson wrote: > > Tim Peters wrote: >> Guido wrote: >>> Let's get rid of unbound methods. When class C defines a method > [snip] >> Really? Unbound methods are used most often (IME) to call a >> base-class method from a subclass, like my_base.the_method(self, ...). >> It's especially easy to forget to write `self, ` there, and the >> exception msg then is quite focused because of that extra bit of type >> checking. Otherwise I expect we'd see a more-mysterious >> AttributeError or TypeError when the base method got around to trying >> to do something with the bogus `self` passed to it. > > Agreed. While it seems that super() is the 'modern paradigm' for this, > I have been using base.method(self, ...) for years now, and have been > quite happy with it. 
After attempting to convert my code to use the > super() paradigm, and having difficulty, I discovered James Knight's > "Python's Super Considered Harmful" (available at > http://www.ai.mit.edu/people/jknight/super-harmful/ ), wherein I > discovered how super really worked (I should have read the documentation > in the first place), and reverted my changes to the base.method > version. How does removing the difference between unbound methods and base.method(self, ...) break anything at all if it was correct code in the first place? As far as I can tell, all it does is remove any restriction on what "self" is allowed to be. On another note - I don't agree with the "super considered harmful" rant at all. Yes, when you're using __init__ and __new__ of varying signatures in a complex class hierarchy, initialization is going to be one hell of a problem -- no matter which syntax you use. All super is doing is taking the responsibility of calculating the MRO away from you, and it works awfully well for the general case where a method of a given name has the same signature and the class hierarchies are not insane. If you have a class hierarchy where this is a problem, it's probably pretty fragile to begin with, and you should think about making it simpler. -bob From barry at python.org Wed Jan 5 04:42:43 2005 From: barry at python.org (Barry Warsaw) Date: Wed Jan 5 04:42:47 2005 Subject: [Python-Dev] Let's get rid of unbound methods In-Reply-To: <94D0198B-5EA4-11D9-BB20-000D934FF6B4@cwi.nl> References: <94D0198B-5EA4-11D9-BB20-000D934FF6B4@cwi.nl> Message-ID: <1104896563.16766.19.camel@geddy.wooz.org> On Tue, 2005-01-04 at 18:01, Jack Jansen wrote: > But I'm more worried about losing the other information in an unbound > method, specifically im_class. I would guess that info is useful to > class browsers and such, or are there other ways to get at that? That would be my worry too.
OTOH, we have function attributes now, so why couldn't we just stuff the class on the function's im_class attribute? Who'd be the wiser? (Could the same be done for im_self and im_func for backwards compatibility?) quack-quack-ly y'rs, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050104/70559f60/attachment.pgp From jcarlson at uci.edu Wed Jan 5 07:28:37 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed Jan 5 07:37:05 2005 Subject: [Python-Dev] Let's get rid of unbound methods In-Reply-To: <1f7befae050104180711743ebd@mail.gmail.com> References: <20050104154707.927B.JCARLSON@uci.edu> <1f7befae050104180711743ebd@mail.gmail.com> Message-ID: <20050104220744.927E.JCARLSON@uci.edu> Tim Peters wrote: > > [Tim Peters] > >> ... Unbound methods are used most often (IME) to call a > >> base-class method from a subclass, like > >> my_base.the_method(self, ...). > >> It's especially easy to forget to write `self, ` there, and the > >> exception msg then is quite focused because of that extra bit of > >> type checking. Otherwise I expect we'd see a more-mysterious > >> AttributeError or TypeError when the base method got around to > >> trying to do something with the bogus `self` passed to it. > > [Josiah Carlson] > > Agreed. > > Well, it's not that easy to agree with. Guido replied that most such > cases would raise an argument-count-mismatch exception instead. I > expect that's because he stopped working on Zope code, so actually > thinks it's odd again to see a gazillion methods like: > > class Registerer(my_base): > def register(*args, **kws): > my_base.register(*args, **kws) > > I bet he even presumes that if you chase such chains long enough, > you'll eventually find a register() method *somewhere* that actually > uses its arguments . 
If type checking is important, one can always add it using decorators. Then again, I would be willing to wager that most people wouldn't add it due to laziness, until it bites them for more than a few hours worth of debugging time. > > While it seems that super() is the 'modern paradigm' for this, > > I have been using base.method(self, ...) for years now, and have > > been quite happy with it. After attempting to convert my code to > > use the super() paradigm, and having difficulty, I discovered James > > Knight's "Python's Super Considered Harmful" (available at > > http://www.ai.mit.edu/people/jknight/super-harmful/ ), wherein I > > discovered how super really worked (I should have read the > > documentation in the first place), and reverted my changes to the > > base.method version. > > How did super() get into this discussion? I don't think I've ever > used it myself, but I avoid fancy inheritance graphs in "my own" code, > so can live with anything. It was my misunderstanding of your statement in regards to base.method. I had thought that base.method(self, ...) would stop working, and attempted to discover how one would be able to get the equivalent back, regardless of the inheritance graph. > > I could live with it too, but I would probably use an equivalent of the > > following (with actual type checking): > > > > def mysuper(typ, obj): > > lm = list(obj.__class__.__mro__) > > indx = lm.index(typ) > > if indx == 0: > > return obj > > return super(lm[indx-1], obj) > > > > All in all, I'm -0. I don't desire to replace all of my base.method > > with mysuper(base, obj).method, but if I must sacrifice > > convenience for the sake of making Python 2.5's implementation > > simpler, I guess I'll deal with it. My familiarity with grep's regular > > expressions leaves something to be desired, so I don't know how > > often base.method(self,...) is or is not used in the standard library. > > I think there may be a misunderstanding here.
Guido isn't proposing > that base.method(self, ...) would stop working -- it would still work > fine. The result of base.method would still be a callable object: it > would no longer be of an "unbound method" type (it would just be a > function), and wouldn't do special checking on the first argument > passed to it anymore, but base.method(self, ...) would still invoke > the base class method. You wouldn't need to rewrite anything (unless > you're doing heavy-magic introspection, picking callables apart). Indeed, there was a misunderstanding on my part. I misunderstood your discussion of base.method(self, ...) to mean that such things would stop working. My apologies. - Josiah From andrewm at object-craft.com.au Wed Jan 5 08:06:43 2005 From: andrewm at object-craft.com.au (Andrew McNamara) Date: Wed Jan 5 08:06:41 2005 Subject: [Python-Dev] csv module TODO list Message-ID: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> There's a bunch of jobs we (CSV module maintainers) have been putting off - attached is a list (in no particular order): * unicode support (this will probably uglify the code considerably). * 8 bit transparency (specifically, allow \0 characters in source string and as delimiters, etc). * Reader and universal newlines don't interact well, reader doesn't honour Dialect's lineterminator setting. All outstanding bug id's (789519, 944890, 967934 and 1072404) are related to this - it's a difficult problem and further discussion is needed. * compare PEP-305 and library reference manual to the module as implemented and either document the differences or correct them. * Address or document Francis Avila's issues as mentioned in this posting: http://www.google.com.au/groups?selm=vsb89q1d3n5qb1%40corp.supernews.com * Several blogs complain that the CSV module is no good for parsing strings. 
Suggest making it clearer in the documentation that the reader accepts an iterable, rather than a file, and document why an iterable (as opposed to a string) is necessary (multi-line records with embedded newlines). We could also provide an interface that parses a single string (or the old Object Craft interface) for those that really feel the need. See: http://radio.weblogs.com/0124960/2003/09/12.html http://zephyrfalcon.org/weblog/arch_d7_2003_09_06.html#e335 * Compatibility API for old Object Craft CSV module? http://mechanicalcat.net/cgi-bin/log/2003/08/18 For example: "from csv.legacy import reader" or something. * Pure python implementation? * Some CSV-like formats consider a quoted field a string, and an unquoted field a number - consider supporting this in the Reader and Writer. See: http://radio.weblogs.com/0124960/2004/04/23.html * Add line number and record number counters to reader object? * it's possible to get the csv parser to suck the whole source file into memory with an unmatched quote character. Need to limit size of internal buffer. Also, review comments from Neal Norwitz, 22 Mar 2003 (some of these should already have been addressed): * remove TODO comment at top of file--it's empty * is CSV going to be maintained outside the python tree? If not, remove the 2.2 compatibility macros for: PyDoc_STR, PyDoc_STRVAR, PyMODINIT_FUNC, etc. * inline the following functions since they are used only in one place get_string, set_string, get_nullchar_as_None, set_nullchar_as_None, join_reset (maybe) * rather than use PyErr_BadArgument, should you use assert? (first example, Dialect_set_quoting, line 218) * is it necessary to have Dialect_methods, can you use 0 for tp_methods? * remove commented out code (PyMem_DEL) on line 261 Have you used valgrind on the test to find memory overwrites/leaks?
* PyString_AsString()[0] on line 331 could return NULL in which case you are dereferencing a NULL pointer * not sure why there are casts on 0 pointers lines 383-393, 733-743, 1144-1154, 1164-1165 * Reader_getiter() can be removed and use PyObject_SelfIter() * I think you need PyErr_NoMemory() before returning on line 768, 1178 * is PyString_AsString(self->dialect->lineterminator) on line 994 guaranteed not to return NULL? If not, it could crash by passing to memmove. * PyString_AsString() can return NULL on line 1048 and 1063, the result is passed to join_append() * iteratable should be iterable? (line 1088) * why doesn't csv_writerows() have a docstring? csv_writerow does * any PyUnicode_* methods should be protected with #ifdef Py_USING_UNICODE * csv_unregister_dialect, csv_get_dialect could use METH_O so you don't need to use PyArg_ParseTuple * in init_csv, recommend using PyModule_AddIntConstant and PyModule_AddStringConstant where appropriate Also, review comments from Jeremy Hylton, 10 Apr 2003: I've been reviewing extension modules looking for C types that should participate in garbage collection. I think the csv ReaderObj and WriterObj should participate. The ReaderObj contains a reference to input_iter that could be an arbitrary Python object. The iterator object could well participate in a cycle that refers to the ReaderObj. The WriterObj has a reference to a writeline callable, which could well be a method of an object that also points to the WriterObj. The Dialect object appears to be safe, because the only PyObject * it refers to should be a string. Safe until someone creates an insane string subclass <0.4 wink>. Also, an unrelated comment about the code, the lineterminator of the Dialect is managed by a collection of little helper functions like get_string, set_string, etc. This code appears to be excessively general; since they're called only once, it seems clearer to inline the logic directly in the get/set methods for the lineterminator.
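To illustrate the string-parsing item in the TODO list above: the reader already accepts any iterable of lines, not just a file, and the iterable interface is exactly what lets it handle multi-line records. A quick sketch of the existing behaviour (not a proposed API):

```python
import csv

# The reader takes any iterable of lines: a file, a list, a generator.
# An embedded newline inside a quoted field spans two items of the
# iterable, which is why the reader wants an iterable rather than being
# handed pre-split single lines.
data = 'name,value\r\nspam,"1,\n2"\r\n'
rows = list(csv.reader(data.splitlines(True)))
print(rows)  # [['name', 'value'], ['spam', '1,\n2']]
```

Note the quoted field keeps its embedded newline intact even though splitlines() broke the record across two items.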
-- Andrew McNamara, Senior Developer, Object Craft http://www.object-craft.com.au/ From skip at pobox.com Wed Jan 5 08:33:04 2005 From: skip at pobox.com (Skip Montanaro) Date: Wed Jan 5 08:33:17 2005 Subject: [Python-Dev] Re: [Csv] csv module TODO list In-Reply-To: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> Message-ID: <16859.38960.9935.682429@montanaro.dyndns.org> Andrew> There's a bunch of jobs we (CSV module maintainers) have been Andrew> putting off - attached is a list (in no particular order): ... In addition, it occurred to me this evening that there's functionality in the csv module I don't think anybody uses. For example, you can register CSV dialects by name, then pass in the string name instead of the dialect class. I'd be in favor of scrapping list_dialects, register_dialect and unregister_dialect altogether. While they are probably trivial little functions I don't think they add much if anything to the implementation and just complicate the _csv extension module slightly. I'm also not aware that anyone really uses the Sniffer class, though it does provide some useful functionality should you need to analyze random CSV files. Skip From olsongt at verizon.net Wed Jan 5 08:22:32 2005 From: olsongt at verizon.net (olsongt@verizon.net) Date: Wed Jan 5 08:34:09 2005 Subject: [Python-Dev] Will ASTbranch compile on windows yet? Message-ID: <20050105072232.VHEV24088.out009.verizon.net@outgoing.verizon.net> [TIM] > > I don't have time to join the current crusade. If there's pent-up > interest among Windows users, it would be good to say which > compiler(s) you can use, since I expect not everyone can deal with VC > 7.1 (e.g., I think Raymond Hettinger is limited to VC 6; and you said > you worked up a VC 6 patch, but didn't say whether you could use 7.1 > now). > I've attached an updated patch that gets things working against current cvs. 
This also includes some fixes for typos that appear to have slipped through gcc and may have caused obscure bugs in *nix as well. I'll gladly fix the MSVC 7.1 project files after someone with commit privileges merges changes from HEAD as Jeremy requested. Any Windows users building based on this patch would also need to run the 'asdl_c.py' utility manually right now before compiling. Something like: C:\Src\ast-branch\dist\src\Parser>asdl_c.py -h ..\Include -c ..\Python Python.asdl I'll get a proper fix in for MSVC 7.1, but don't feel like dealing with it for the obsolete 6.0 project files. -Grant From andrewm at object-craft.com.au Wed Jan 5 08:55:06 2005 From: andrewm at object-craft.com.au (Andrew McNamara) Date: Wed Jan 5 08:55:01 2005 Subject: [Python-Dev] Re: [Csv] csv module TODO list In-Reply-To: <16859.38960.9935.682429@montanaro.dyndns.org> References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> <16859.38960.9935.682429@montanaro.dyndns.org> Message-ID: <20050105075506.314C93C8E5@coffee.object-craft.com.au> > Andrew> There's a bunch of jobs we (CSV module maintainers) have been > Andrew> putting off - attached is a list (in no particular order): > ... > >In addition, it occurred to me this evening that there's functionality in >the csv module I don't think anybody uses. It's very difficult to say for sure that nobody is using it once it's released to the world. >For example, you can register CSV dialects by name, then pass in the >string name instead of the dialect class. I'd be in favor of scrapping >list_dialects, register_dialect and unregister_dialect altogether. While >they are probably trivial little functions I don't think they add much if >anything to the implementation and just complicate the _csv extension >module slightly. Yes, in hindsight, they're not really necessary, although I'm sure we had some motivation for them initially. That said, they're there now, and they shouldn't require much maintenance.
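For anyone following along, the machinery in question boils down to this (a sketch using an explicit Dialect subclass, which is the spelling the module accepts):

```python
import csv

# Register a named dialect once; readers and writers can then refer to
# it by the string name instead of passing the class around.
class Pipes(csv.Dialect):
    delimiter = '|'
    quotechar = '"'
    doublequote = True
    skipinitialspace = False
    lineterminator = '\r\n'
    quoting = csv.QUOTE_MINIMAL

csv.register_dialect('pipes', Pipes)
registered = 'pipes' in csv.list_dialects()

rows = list(csv.reader(['a|b|"c|d"'], dialect='pipes'))
print(registered, rows)  # True [['a', 'b', 'c|d']]

csv.unregister_dialect('pipes')
```

The string-name indirection is the only thing these three functions buy; everything else works identically if you pass the Dialect class directly.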
>I'm also not aware that anyone really uses the Sniffer class, though it >does provide some useful functionality should you need to analyze random >CSV files. The comment I get repeatedly is that they don't use it because it's "too magic/scary". That's as it should be. But if it didn't exist, then someone would be requesting we add it... 8-) -- Andrew McNamara, Senior Developer, Object Craft http://www.object-craft.com.au/ From martin at v.loewis.de Wed Jan 5 09:33:13 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed Jan 5 09:33:09 2005 Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in In-Reply-To: References: <41DB2C9A.4070800@v.loewis.de> Message-ID: <41DBA649.3080008@v.loewis.de> Bob Ippolito wrote: > It doesn't for reasons I care not to explain in depth, again. Search > the pythonmac-sig archives for longer explanations. The gist is that > you specifically do not want to link directly to the framework at all > when building extensions. Because an Apple-built extension then may pick up a user-installed Python? Why can this problem not be solved by adding -F options, as Jack Jansen proposed? > This is not the wrong way to do it. I'm not convinced. Regards, Martin From martin at v.loewis.de Wed Jan 5 09:39:44 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed Jan 5 09:39:37 2005 Subject: [Python-Dev] csv module TODO list In-Reply-To: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> Message-ID: <41DBA7D0.80101@v.loewis.de> Andrew McNamara wrote: > There's a bunch of jobs we (CSV module maintainers) have been putting > off - attached is a list (in no particular order): > > * unicode support (this will probably uglify the code considerably). Can you please elaborate on that? What needs to be done, and how is that going to be done? It might be possible to avoid considerable uglification. 
Regards, Martin From mal at egenix.com Wed Jan 5 10:10:30 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Wed Jan 5 10:10:33 2005 Subject: [Python-Dev] csv module TODO list In-Reply-To: <41DBA7D0.80101@v.loewis.de> References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> <41DBA7D0.80101@v.loewis.de> Message-ID: <41DBAF06.6020401@egenix.com> Martin v. Löwis wrote: > Andrew McNamara wrote: > >> There's a bunch of jobs we (CSV module maintainers) have been putting >> off - attached is a list (in no particular order): >> * unicode support (this will probably uglify the code considerably). > > > Can you please elaborate on that? What needs to be done, and how is > that going to be done? It might be possible to avoid considerable > uglification. Indeed. The trick is to convert to Unicode early and to use Unicode literals instead of string literals in the code. Note that the only real-life Unicode format in use is UTF-16 (with BOM mark) written by Excel. Note that there's no standard for specifying the encoding in CSV files, so this is also the only feasible format. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 05 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From ronaldoussoren at mac.com Wed Jan 5 10:19:09 2005 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Wed Jan 5 10:19:12 2005 Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in In-Reply-To: <41DBA649.3080008@v.loewis.de> References: <41DB2C9A.4070800@v.loewis.de> <41DBA649.3080008@v.loewis.de> Message-ID: On 5-jan-05, at 9:33, Martin v.
Löwis wrote: > Bob Ippolito wrote: >> It doesn't for reasons I care not to explain in depth, again. Search >> the pythonmac-sig archives for longer explanations. The gist is >> that you specifically do not want to link directly to the framework >> at all when building extensions. > > Because an Apple-built extension then may pick up a user-installed > Python? Why can this problem not be solved by adding -F options, > as Jack Jansen proposed? It gets worse when you have a user-installed python 2.3 and a user-installed python 2.4. Those will both be installed as /Library/Frameworks/Python.framework. This means that you cannot use the -F flag to select which one you want to link to, '-framework Python' will only link to the python that was installed the latest. This is an issue on Mac OS X 10.2. Ronald From andrewm at object-craft.com.au Wed Jan 5 10:34:14 2005 From: andrewm at object-craft.com.au (Andrew McNamara) Date: Wed Jan 5 10:34:11 2005 Subject: [Python-Dev] csv module TODO list In-Reply-To: <41DBAF06.6020401@egenix.com> References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> <41DBA7D0.80101@v.loewis.de> <41DBAF06.6020401@egenix.com> Message-ID: <20050105093414.00DFF3C8E5@coffee.object-craft.com.au> >> Andrew McNamara wrote: >>> There's a bunch of jobs we (CSV module maintainers) have been putting >>> off - attached is a list (in no particular order): >>> * unicode support (this will probably uglify the code considerably). >> >Martin v. Löwis wrote: >> Can you please elaborate on that? What needs to be done, and how is >> that going to be done? It might be possible to avoid considerable >> uglification. I'm not altogether sure there. The parsing state machine is all written in C, and deals with signed chars - I expect we'll need two versions of that (or one version that's compiled twice using pre-processor macros). Quite a large job. Suggestions gratefully received. M.-A. Lemburg wrote: >Indeed.
The trick is to convert to Unicode early and to use Unicode >literals instead of string literals in the code. Yes, although it would be nice to also retain the 8-bit versions as well. >Note that the only real-life Unicode format in use is UTF-16 >(with BOM mark) written by Excel. Note that there's no standard >for specifying the encoding in CSV files, so this is also the only >feasible format. Yes - that's part of the problem I hadn't really thought about yet - the csv module currently interacts directly with files as iterators, but it's clear that we'll need to decode as we go. -- Andrew McNamara, Senior Developer, Object Craft http://www.object-craft.com.au/ From mal at egenix.com Wed Jan 5 10:44:40 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Wed Jan 5 10:44:43 2005 Subject: [Python-Dev] csv module TODO list In-Reply-To: <20050105093414.00DFF3C8E5@coffee.object-craft.com.au> References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> <41DBA7D0.80101@v.loewis.de> <41DBAF06.6020401@egenix.com> <20050105093414.00DFF3C8E5@coffee.object-craft.com.au> Message-ID: <41DBB708.5030501@egenix.com> Andrew McNamara wrote: >>>Andrew McNamara wrote: >>> >>>>There's a bunch of jobs we (CSV module maintainers) have been putting >>>>off - attached is a list (in no particular order): >>>>* unicode support (this will probably uglify the code considerably). >>> >>Martin v. Löwis wrote: >> >>>Can you please elaborate on that? What needs to be done, and how is >>>that going to be done? It might be possible to avoid considerable >>>uglification. > > > I'm not altogether sure there. The parsing state machine is all written in > C, and deals with signed chars - I expect we'll need two versions of that > (or one version that's compiled twice using pre-processor macros). Quite > a large job. Suggestions gratefully received. > > M.-A. Lemburg wrote: > >>Indeed. The trick is to convert to Unicode early and to use Unicode >>literals instead of string literals in the code.
> > > Yes, although it would be nice to also retain the 8-bit versions as well. You can do so by using latin-1 as default encoding. Works great ! >>Note that the only real-life Unicode format in use is UTF-16 >>(with BOM mark) written by Excel. Note that there's no standard >>for specifying the encoding in CSV files, so this is also the only >>feasible format. > > Yes - that's part of the problem I hadn't really thought about yet - the > csv module currently interacts directly with files as iterators, but it's > clear that we'll need to decode as we go. Depends on your needs: CSV files tend to be small enough to do the decoding in one call in memory. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 05 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From andrewm at object-craft.com.au Wed Jan 5 11:03:25 2005 From: andrewm at object-craft.com.au (Andrew McNamara) Date: Wed Jan 5 11:03:20 2005 Subject: [Python-Dev] csv module TODO list In-Reply-To: <41DBB708.5030501@egenix.com> References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> <41DBA7D0.80101@v.loewis.de> <41DBAF06.6020401@egenix.com> <20050105093414.00DFF3C8E5@coffee.object-craft.com.au> <41DBB708.5030501@egenix.com> Message-ID: <20050105100325.A220D3C8E5@coffee.object-craft.com.au> >> Yes, although it would be nice to also retain the 8-bit versions as well. > >You can do so by using latin-1 as default encoding. Works great ! Yep, although that means we wear the cost of decoding and encoding for all 8 bit input. What does the _sre.c code do? >Depends on your needs: CSV files tend to be small enough >to do the decoding in one call in memory.
We are routinely dealing with multi-gigabyte csv files - which is why the original 2001 vintage csv module was written as a C state machine. -- Andrew McNamara, Senior Developer, Object Craft http://www.object-craft.com.au/ From mal at egenix.com Wed Jan 5 11:16:50 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Wed Jan 5 11:16:54 2005 Subject: [Python-Dev] csv module TODO list In-Reply-To: <20050105100325.A220D3C8E5@coffee.object-craft.com.au> References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> <41DBA7D0.80101@v.loewis.de> <41DBAF06.6020401@egenix.com> <20050105093414.00DFF3C8E5@coffee.object-craft.com.au> <41DBB708.5030501@egenix.com> <20050105100325.A220D3C8E5@coffee.object-craft.com.au> Message-ID: <41DBBE92.4070106@egenix.com> Andrew McNamara wrote: >>>Yes, although it would be nice to also retain the 8-bit versions as well. >> >>You can do so by using latin-1 as default encoding. Works great ! > > Yep, although that means we wear the cost of decoding and encoding for > all 8 bit input. Right, but it makes the code very clean and straight forward. Again, it depends on what you need. If performance is critical then you probably need a C version written using the same trick as _sre.c... > What does the _sre.c code do? It comes in two versions: one for 8-bit the other for Unicode. >>Depends on your needs: CSV files tend to be small enough >>to do the decoding in one call in memory. > > We are routinely dealing with multi-gigabyte csv files - which is why the > original 2001 vintage csv module was written as a C state machine. I see, but are you sure that the typical Python user will have the same requirements to make it worth the effort (and complexity) ? I've written a few CSV parsers and writers myself over the years and the requirements were different every time, in terms of being flexible in the parsing phase, the interfaces and the performance needs. 
Haven't yet found a one fits all solution and don't really expect to any more :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 05 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From andrewm at object-craft.com.au Wed Jan 5 11:33:05 2005 From: andrewm at object-craft.com.au (Andrew McNamara) Date: Wed Jan 5 11:33:00 2005 Subject: [Python-Dev] csv module TODO list In-Reply-To: <41DBBE92.4070106@egenix.com> References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> <41DBA7D0.80101@v.loewis.de> <41DBAF06.6020401@egenix.com> <20050105093414.00DFF3C8E5@coffee.object-craft.com.au> <41DBB708.5030501@egenix.com> <20050105100325.A220D3C8E5@coffee.object-craft.com.au> <41DBBE92.4070106@egenix.com> Message-ID: <20050105103305.AD80B3C8E5@coffee.object-craft.com.au> >> Yep, although that means we wear the cost of decoding and encoding for >> all 8 bit input. > >Right, but it makes the code very clean and straight forward. I agree it makes for a very clean solution, and 99% of the time I'd chose that option. >Again, it depends on what you need. If performance is critical >then you probably need a C version written using the same trick >as _sre.c... > >> What does the _sre.c code do? > >It comes in two versions: one for 8-bit the other for Unicode. That's what I thought. I think the motivations here are similar to those that drove the _sre developers. >> We are routinely dealing with multi-gigabyte csv files - which is why the >> original 2001 vintage csv module was written as a C state machine. > >I see, but are you sure that the typical Python user will have >the same requirements to make it worth the effort (and >complexity) ? 
This is open source, so I scratch my own itch (and that of my employers) - we need fast csv parsing more than we need unicode... 8-) Okay, assuming we go the "produce two versions via evil macro tricks" path, it's still not quite the same situation as _sre.c, which only has to deal with the internal unicode representation. One way to approach this would be to add an "encoding" keyword argument to the readers and writers. If given, the parser would decode the input stream to the internal representation before passing it through the unicode state machine, which would yield tuples of unicode objects. That leaves us with a bit of a problem where the source is already unicode (eg, a list of unicode strings)... hmm. -- Andrew McNamara, Senior Developer, Object Craft http://www.object-craft.com.au/ From andrewm at object-craft.com.au Wed Jan 5 12:08:49 2005 From: andrewm at object-craft.com.au (Andrew McNamara) Date: Wed Jan 5 12:08:43 2005 Subject: [Python-Dev] Re: [Csv] csv module TODO list In-Reply-To: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> Message-ID: <20050105110849.CBA843C8E5@coffee.object-craft.com.au> >Also, review comments from Neal Norwitz, 22 Mar 2003 (some of these should >already have been addressed): I should apologise to Neal here for not replying to him at the time. Okay, going through the issues Neal raised... >* remove TODO comment at top of file--it's empty Was fixed. >* is CSV going to be maintained outside the python tree? > If not, remove the 2.2 compatibility macros for: > PyDoc_STR, PyDoc_STRVAR, PyMODINIT_FUNC, etc. Does anyone think we should continue to maintain this 2.2 compatibility? >* inline the following functions since they are used only in one place > get_string, set_string, get_nullchar_as_None, set_nullchar_as_None, > join_reset (maybe) It was done that way as I felt we would be adding more getters and setters to the dialect object in future.
>* rather than use PyErr_BadArgument, should you use assert? > (first example, Dialect_set_quoting, line 218) You mean C assert()? I don't think I'm really following you here - where would the type of the object be checked in a way the user could recover from? >* is it necessary to have Dialect_methods, can you use 0 for tp_methods? I was assuming I would need to add methods at some point (in fact, I did have methods, but removed them). >* remove commented out code (PyMem_DEL) on line 261 > Have you used valgrind on the test to find memory overwrites/leaks? No, valgrind wasn't used. >* PyString_AsString()[0] on line 331 could return NULL in which case > you are dereferencing a NULL pointer Was fixed. >* not sure why there are casts on 0 pointers > lines 383-393, 733-743, 1144-1154, 1164-1165 To make it easier when the time comes to add one of those members. >* Reader_getiter() can be removed and use PyObject_SelfIter() Okay, wasn't aware of PyObject_SelfIter - will fix. >* I think you need PyErr_NoMemory() before returning on line 768, 1178 The examples I looked at in the Python core didn't do this - are you sure? (now lines 832 and 1280). >* is PyString_AsString(self->dialect->lineterminator) on line 994 > guaranteed not to return NULL? If not, it could crash by > passing to memmove. >* PyString_AsString() can return NULL on line 1048 and 1063, > the result is passed to join_append() Looking at the PyString_AsString implementation, it looks safe (we ensure it's really a string elsewhere)? >* iteratable should be iterable? (line 1088) Sorry, I don't know what you're getting at here? (now line 1162). >* why doesn't csv_writerows() have a docstring? csv_writerow does Was fixed. >* any PyUnicode_* methods should be protected with #ifdef Py_USING_UNICODE Was fixed. >* csv_unregister_dialect, csv_get_dialect could use METH_O > so you don't need to use PyArg_ParseTuple Was fixed.
>* in init_csv, recommend using > PyModule_AddIntConstant and PyModule_AddStringConstant > where appropriate Was fixed. -- Andrew McNamara, Senior Developer, Object Craft http://www.object-craft.com.au/ From aleax at aleax.it Wed Jan 5 12:11:37 2005 From: aleax at aleax.it (Alex Martelli) Date: Wed Jan 5 12:11:42 2005 Subject: [Python-Dev] Let's get rid of unbound methods In-Reply-To: <1104896563.16766.19.camel@geddy.wooz.org> References: <94D0198B-5EA4-11D9-BB20-000D934FF6B4@cwi.nl> <1104896563.16766.19.camel@geddy.wooz.org> Message-ID: <91280215-5F0A-11D9-ADA4-000A95EFAE9E@aleax.it> On 2005 Jan 05, at 04:42, Barry Warsaw wrote: > On Tue, 2005-01-04 at 18:01, Jack Jansen wrote: > >> But I'm more worried about losing the other information in an unbound >> method, specifically im_class. I would guess that info is useful to >> class browsers and such, or are there other ways to get at that? > > That would be my worry too. OTOH, we have function attributes now, so > why couldn't we just stuff the class on the function's im_class > attribute? Who'd be the wiser? (Could the same be done for im_self > and > im_func for backwards compatibility?) Hmmm, seems to me we'd need copies of the function object for this purpose:

def f(*a): pass

class C(object): pass
class D(object): pass

C.f = D.f = f

If now we want C.f.im_class to differ from D.f.im_class then we need f to get copied implicitly when it's assigned to C.f (or, of course, when C.f is accessed... but THAT might be substantial overhead). OK, I guess, as long as we don't expect any further attribute setting on f to affect C.f or D.f (and I don't know of any real use case where that would be needed).
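[For context from later Pythons: the proposal as eventually adopted in Python 3, where unbound methods were removed, sidesteps the copying question entirely. Class attribute access just hands back the function object itself, so there is no per-class im_class to keep consistent. A sketch of Alex's example under those semantics:]

```python
def f(*a):
    pass

class C(object):
    pass

class D(object):
    pass

C.f = D.f = f

# No implicit copy is made: attribute access on either class returns
# the very same function object that was assigned.
print(C.f is D.f is f)   # True
```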
Alex From aleax at aleax.it Wed Jan 5 12:28:39 2005 From: aleax at aleax.it (Alex Martelli) Date: Wed Jan 5 12:28:45 2005 Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in In-Reply-To: <443EF94C-5EA5-11D9-BB20-000D934FF6B4@cwi.nl> References: <443EF94C-5EA5-11D9-BB20-000D934FF6B4@cwi.nl> Message-ID: On 2005 Jan 05, at 00:06, Jack Jansen wrote: ... > We've solved this issue for the trunk and we can solve it for 2.4.1: > if MACOSX_DEPLOYMENT_TARGET isn't set and we're on 10.3 we force it to > 10.3. Moreover, when it is 10.3 or higher (possibly after being > forced) we use the dynamic_lookup way of linking Not having followed Python/Mac developments closely (my fault, sigh), I would like to understand what this would imply for the forthcoming 10.4 ("Tiger") release of MacOS -- and that in turn depends, I assume, on what Python release will come with it. Anybody who's under nondisclosure should of course keep mum, but can somebody help e.g. by telling me what Python is included in the current "development previews" versions of Tiger? I'm not gonna spend $500 to become a highly-ranked enough "apple developer" to get those previews. Considering Apple's habitual timings, I'm sort of resigned to us being stuck with 2.3 for Tiger, but I would at least hope they'd get as late a 2.3.* as they can. So, assuming Tiger's Python is going to be, say, 2.3.4 or 2.3.5, would the change you're proposing make it MORE attractive to Apple to go for 2.3.5, LESS so, or is it indifferent from their POV...? Thanks in advance for any help in getting the tradeoffs about this clearer in my mind! 
Alex From ronaldoussoren at mac.com Wed Jan 5 12:40:07 2005 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Wed Jan 5 12:40:13 2005 Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in In-Reply-To: References: <443EF94C-5EA5-11D9-BB20-000D934FF6B4@cwi.nl> Message-ID: <8C4DEDD2-5F0E-11D9-85EE-000D93AD379E@mac.com> On 5-jan-05, at 12:28, Alex Martelli wrote: > > On 2005 Jan 05, at 00:06, Jack Jansen wrote: > ... >> We've solved this issue for the trunk and we can solve it for 2.4.1: >> if MACOSX_DEPLOYMENT_TARGET isn't set and we're on 10.3 we force it >> to 10.3. Moreover, when it is 10.3 or higher (possibly after being >> forced) we use the dynamic_lookup way of linking > > Not having followed Python/Mac developments closely (my fault, sigh), > I would like to understand what this would imply for the forthcoming > 10.4 ("Tiger") release of MacOS -- and that in turn depends, I assume, > on what Python release will come with it. Anybody who's under > nondisclosure should of course keep mum, but can somebody help e.g. by > telling me what Python is included in the current "development > previews" versions of Tiger? The Tiger that was released at WWDC included a patched version of Python 2.3.3. See: http://www.opensource.apple.com/darwinsource/WWDC2004/. Ronald From aleax at aleax.it Wed Jan 5 13:02:53 2005 From: aleax at aleax.it (Alex Martelli) Date: Wed Jan 5 13:02:59 2005 Subject: [Python-Dev] Please help complete the AST branch In-Reply-To: <41DB069A.8030406@ocf.berkeley.edu> References: <1104842608.3227.60.camel@presto.wooz.org> <45D8D02A-5E81-11D9-ADA4-000A95EFAE9E@aleax.it> <41DB069A.8030406@ocf.berkeley.edu> Message-ID: On 2005 Jan 04, at 22:11, Brett C. wrote: ... 
>> Speaking for myself, I have a burning interest in the AST branch >> (though I can't seem to get it correctly downloaded so far, I guess >> it's just my usual CVS-clumsiness and I'll soon find out what I'm >> doing wrong & fix it) > > See > http://www.python.org/dev/devfaq.html#how-can-i-check-out-a-tagged- > branch on how to do a checkout of a tagged branch. Done! Believe it or not, I _had_ already tried following those very instructions -- and I kept omitting the word 'python' at the end and/or misspelling the tag as ast_branch (while it wants a dash, NOT an underscore...). I guess having the instructions recommended to me again prompted me to double-check character by character extra carefully, so, thanks!-) Alex From andrewm at object-craft.com.au Wed Jan 5 13:29:11 2005 From: andrewm at object-craft.com.au (Andrew McNamara) Date: Wed Jan 5 13:29:05 2005 Subject: [Python-Dev] Re: csv module TODO list In-Reply-To: <20050105121921.GB24030@idi.ntnu.no> References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> <16859.38960.9935.682429@montanaro.dyndns.org> <20050105075506.314C93C8E5@coffee.object-craft.com.au> <20050105121921.GB24030@idi.ntnu.no> Message-ID: <20050105122911.83EE93C8E5@coffee.object-craft.com.au> >Quite a while ago I posted some material to the csv-list about >problems using the csv module on Unix-style colon-separated files -- >it just doesn't deal properly with backslash escaping and is quite >useless for this kind of file. I seem to recall the general view was >that it wasn't intended for this kind of thing -- only the sort of csv >that Microsoft Excel outputs/inputs, but if I am mistaken about this, >perhaps fixing this issue might be put on the TODO-list? I'll be happy >to re-send or summarize the relevant emails, if needed.
I think a related issue was included in my TODO list: >* Address or document Francis Avila's issues as mentioned in this posting: > > http://www.google.com.au/groups?selm=vsb89q1d3n5qb1%40corp.supernews.com -- Andrew McNamara, Senior Developer, Object Craft http://www.object-craft.com.au/ From aleax at aleax.it Wed Jan 5 13:37:50 2005 From: aleax at aleax.it (Alex Martelli) Date: Wed Jan 5 13:37:56 2005 Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in In-Reply-To: <8C4DEDD2-5F0E-11D9-85EE-000D93AD379E@mac.com> References: <443EF94C-5EA5-11D9-BB20-000D934FF6B4@cwi.nl> <8C4DEDD2-5F0E-11D9-85EE-000D93AD379E@mac.com> Message-ID: <9CD489F6-5F16-11D9-ADA4-000A95EFAE9E@aleax.it> On 2005 Jan 05, at 12:40, Ronald Oussoren wrote: ... > The Tiger that was released at WWDC included a patched version of > Python 2.3.3. See: > http://www.opensource.apple.com/darwinsource/WWDC2004/. Thanks! So, since WWDC was on June 28 and 2.3.4 had been released on May 27, we get some first sense of the speed or lack thereof of 2.3.x releases' entrance in Tiger's previews... Alex From mwh at python.net Wed Jan 5 14:49:05 2005 From: mwh at python.net (Michael Hudson) Date: Wed Jan 5 14:49:07 2005 Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in In-Reply-To: <41DBA649.3080008@v.loewis.de> ( =?iso-8859-1?q?Martin_v._L=F6wis's_message_of?= "Wed, 05 Jan 2005 09:33:13 +0100") References: <41DB2C9A.4070800@v.loewis.de> <41DBA649.3080008@v.loewis.de> Message-ID: <2mzmzogf5a.fsf@starship.python.net> "Martin v. L?wis" writes: > Bob Ippolito wrote: >> It doesn't for reasons I care not to explain in depth, again. >> Search the pythonmac-sig archives for longer explanations. The >> gist is that you specifically do not want to link directly to the >> framework at all when building extensions. > > Because an Apple-built extension then may pick up a user-installed > Python? Why can this problem not be solved by adding -F options, > as Jack Jansen proposed? 
> >> This is not the wrong way to do it. > > I'm not convinced. Martin, can you please believe that Jack, Bob, Ronald et al know what they are talking about here? Cheers, mwh -- Q: Isn't it okay to just read Slashdot for the links? A: No. Reading Slashdot for the links is like having "just one hit" off the crack pipe. -- http://www.cs.washington.edu/homes/klee/misc/slashdot.html#faq From bob at redivi.com Wed Jan 5 16:18:08 2005 From: bob at redivi.com (Bob Ippolito) Date: Wed Jan 5 16:18:18 2005 Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in In-Reply-To: <41DBA649.3080008@v.loewis.de> References: <41DB2C9A.4070800@v.loewis.de> <41DBA649.3080008@v.loewis.de> Message-ID: <019B7FCB-5F2D-11D9-9DC0-000A9567635C@redivi.com> On Jan 5, 2005, at 3:33 AM, Martin v. L?wis wrote: > Bob Ippolito wrote: >> It doesn't for reasons I care not to explain in depth, again. Search >> the pythonmac-sig archives for longer explanations. The gist is >> that you specifically do not want to link directly to the framework >> at all when building extensions. > > Because an Apple-built extension then may pick up a user-installed > Python? Why can this problem not be solved by adding -F options, > as Jack Jansen proposed? > >> This is not the wrong way to do it. > > I'm not convinced. Then you haven't done the appropriate research by searching pythonmac-sig. Do you even own a Mac? -bob From glyph at divmod.com Wed Jan 5 16:37:16 2005 From: glyph at divmod.com (Glyph Lefkowitz) Date: Wed Jan 5 16:35:24 2005 Subject: [Python-Dev] Let's get rid of unbound methods In-Reply-To: References: <1f7befae05010410576effd024@mail.gmail.com> <20050104154707.927B.JCARLSON@uci.edu> Message-ID: <1104939436.5854.25.camel@localhost> On Tue, 2005-01-04 at 22:12 -0500, Bob Ippolito wrote: > If you have a class hierarchy where this is a problem, it's probably > pretty fragile to begin with, and you should think about making it > simpler. 
I agree with James's rant almost entirely, but I like super() anyway. I think it is an indication not of a new weakness of super(), but of a long-standing weakness of __init__. One approach I have taken in order to avoid copiously over-documenting every super() using class is to decouple different phases of initialization by making __init__ as simple as possible (setting a few attributes, resisting the temptation to calculate things), and then providing class methods like '.fromString' or '.forUnserialize' that create instances that have been completely constructed for a particular purpose. That way the signatures are much more likely to line up across inheritance hierarchies. Perhaps this should be a suggested "best practice" when using super() as well? From glyph at divmod.com Wed Jan 5 16:41:30 2005 From: glyph at divmod.com (Glyph Lefkowitz) Date: Wed Jan 5 16:39:37 2005 Subject: [Python-Dev] Let's get rid of unbound methods In-Reply-To: <91280215-5F0A-11D9-ADA4-000A95EFAE9E@aleax.it> References: <94D0198B-5EA4-11D9-BB20-000D934FF6B4@cwi.nl> <1104896563.16766.19.camel@geddy.wooz.org> <91280215-5F0A-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: <1104939690.5854.30.camel@localhost> On Wed, 2005-01-05 at 12:11 +0100, Alex Martelli wrote: > Hmmm, seems to me we'd need copies of the function object for this > purpose: For the stated use-case of serialization, only one copy would be necessary, and besides - even *I* don't use idioms as weird as the one you are suggesting very often ;). I think it would be reasonable to assign im_class only to functions defined in class scope. The only serialization that would break in that case is if your example had a 'del f' at the end. 
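[A minimal sketch of the decoupled-construction pattern Glyph describes earlier in this message. Only the '.fromString' name comes from his mail; the Point/LabeledPoint classes are invented for illustration, written in modern syntax:]

```python
class Point(object):
    def __init__(self, x, y):
        # __init__ stays trivial: set a few attributes, resist the
        # temptation to calculate things.
        self.x = x
        self.y = y

    @classmethod
    def fromString(cls, text):
        # Purpose-specific construction lives in a named alternate
        # constructor, so __init__ signatures stay aligned across
        # the inheritance hierarchy.
        x, y = (float(part) for part in text.split(","))
        return cls(x, y)

class LabeledPoint(Point):
    def __init__(self, x, y, label=""):
        super(LabeledPoint, self).__init__(x, y)  # signature lines up
        self.label = label

p = LabeledPoint.fromString("1.5,2.0")
```

[Because fromString calls cls(...), the alternate constructor works unchanged for the subclass.]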
From arigo at tunes.org Wed Jan 5 17:10:45 2005 From: arigo at tunes.org (Armin Rigo) Date: Wed Jan 5 17:21:42 2005 Subject: [Python-Dev] Let's get rid of unbound methods In-Reply-To: <41DAF22B.6030605@zope.com> References: <41DAE213.9070906@zope.com> <41DAF22B.6030605@zope.com> Message-ID: <20050105161045.GA19431@vicky.ecs.soton.ac.uk> Hi Jim, On Tue, Jan 04, 2005 at 02:44:43PM -0500, Jim Fulton wrote: > >Actually, unbound builtin methods are a different type than bound > >builtin methods: > > Of course, but conceptually they are similar. You would still > encounter the concept if you got an unbound builtin method. There are no such things as unbound builtin methods: >>> list.append is list.__dict__['append'] True In other words 'list.append' just returns exactly the same object as stored in the list type's dict. Guido's proposal is to make Python methods behave in the same way. Armin From seojiwon at gmail.com Wed Jan 5 17:32:44 2005 From: seojiwon at gmail.com (Jiwon Seo) Date: Wed Jan 5 17:32:47 2005 Subject: [Python-Dev] Please help complete the AST branch In-Reply-To: References: <41D9F94C.3020005@ocf.berkeley.edu> Message-ID: I'd like to help here on the AST branch, if it's not too late. (Especially I'm interested with the generator expression part.) If I want to volunteer, do I just begin to work with it? Or do I need to read something or discuss with someone? Thanks. Jiwon. On Mon, 3 Jan 2005 23:03:33 -0500, Jeremy Hylton wrote: > On Mon, 03 Jan 2005 18:02:52 -0800, Brett C. wrote: > > Plus there is the running tradition of sprinting on the AST branch at PyCon. I > > was planning on shedding my bug fixing drive at PyCon this year and sprinting > > with (hopefully) Jeremy, Neal, Tim, and Neil on the AST branch as a prep for > > working on it afterwards for my class credit. > > I'd like to sprint on it before PyCon; we'll have to see what my > schedule allows. 
> > > If anyone would like to see the current code, check out ast-branch from CVS > > (read the dev FAQ on how to check out a branch from CVS). Read > > Python/compile.txt for an overview of how the thing works and such. > > > > It will get done, just don't push for a 2.5 release within a month. =) > > I think the branch is in an awkward state, because of the new features > added to Python 2.4 after the AST branch work ceased. The ast branch > doesn't handle generator expressions or decorators; extending the ast > to support them would be a good first step. > > There are also the simple logistical questions of integrating changes. > Since most of the AST branch changes are confined to a few files, I > suspect the best thing to do is to merge all the changes from the head > except for compile.c. I haven't done a major CVS branch integrate in > at least nine months; if someone feels more comfortable with that, it > would also be a good step. > > Perhaps interested parties should take up the discussion on the > compiler-sig. I think we can recover the state of last May's effort > pretty quickly, and I can help outline the remaining work even if I > can't help much. (Although I hope I can help, too.) > > Jeremy > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/seojiwon%40gmail.com > From arigo at tunes.org Wed Jan 5 17:30:06 2005 From: arigo at tunes.org (Armin Rigo) Date: Wed Jan 5 17:40:57 2005 Subject: [Python-Dev] Let's get rid of unbound methods In-Reply-To: References: Message-ID: <20050105163006.GB19431@vicky.ecs.soton.ac.uk> Hi Guido, On Tue, Jan 04, 2005 at 10:28:03AM -0800, Guido van Rossum wrote: > Let's get rid of unbound methods. Is there any other use case for 'C.x' not returning the same as 'appropriate_super_class_of_C.__dict__["x"]' ? 
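[One answer that comes up later in the thread is classmethod: any descriptor other than a plain function still transforms on attribute access. A quick sketch of the difference, in modern Python:]

```python
class C(object):
    @classmethod
    def x(cls):
        return cls.__name__

# Attribute access goes through the descriptor protocol, so C.x is a
# bound method wrapping the underlying function -- not the classmethod
# object stored in the class dict.
print(C.x is C.__dict__["x"])                     # False
print(isinstance(C.__dict__["x"], classmethod))   # True
```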
I guess it's too late now but it would have been nice if user-defined __get__() methods had the more obvious signature (self, instance) instead of (self, instance_or_None, cls=None). Given the amount of potential breakage people already pointed out I guess it is not reasonable to change that. Armin From jhylton at gmail.com Wed Jan 5 17:42:57 2005 From: jhylton at gmail.com (Jeremy Hylton) Date: Wed Jan 5 17:43:00 2005 Subject: [Python-Dev] Please help complete the AST branch In-Reply-To: References: <41D9F94C.3020005@ocf.berkeley.edu> Message-ID: On Thu, 6 Jan 2005 01:32:44 +0900, Jiwon Seo wrote: > I'd like to help here on the AST branch, if it's not too late. > (Especially I'm interested with the generator expression part.) Great! It's not too late. > If I want to volunteer, do I just begin to work with it? Or do I need > to read something or discuss with someone? The file Python/compile.txt on the ast-branch has a brief overview of the project: http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Python/Attic/compile.txt?rev=1.1.2.8&only_with_tag=ast-branch&view=auto Jeremy From shane.holloway at ieee.org Wed Jan 5 17:44:31 2005 From: shane.holloway at ieee.org (Shane Holloway (IEEE)) Date: Wed Jan 5 17:45:07 2005 Subject: [Python-Dev] Let's get rid of unbound methods In-Reply-To: <91280215-5F0A-11D9-ADA4-000A95EFAE9E@aleax.it> References: <94D0198B-5EA4-11D9-BB20-000D934FF6B4@cwi.nl> <1104896563.16766.19.camel@geddy.wooz.org> <91280215-5F0A-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: <41DC196F.8070400@ieee.org> Alex Martelli wrote: > def f(*a): pass class C(object): pass class D(object): pass C.f = D.f > = f > > If now we want C.f.im_class to differ from D.f.im_class then we need > f to get copied implicitly when it's assigned to C.f (or, of course, > when C.f is accessed... but THAT might be substantial overhead). 
> OK, I guess, as long as we don't expect any further attribute setting on > f to affect C.f or D.f (and I don't know of any real use case where > that would be needed). You'd have to do a copy anyway, because f() is still a module-level callable entity. I also agree with Glyph that im_class should only really be set in the case of methods defined within the class block. Also, interestingly, removing unbound methods makes another thing possible.

class A(object):
    def foo(self): pass

class B(object):
    foo = A.foo

class C(object): pass
C.foo = A.foo

I'd really like to avoid making copies of functions for the sake of reload() and edit-and-continue functionality. Currently we can track down everything that has a reference to foo, and replace it with newfoo. With copies, this would be more difficult. Thanks, -Shane From jhylton at gmail.com Wed Jan 5 17:49:05 2005 From: jhylton at gmail.com (Jeremy Hylton) Date: Wed Jan 5 17:49:08 2005 Subject: [Python-Dev] ast branch pragmatics Message-ID: The existing ast-branch mostly works, but it does not include most of the new features of Python 2.4. There is a substantial integration effort, perhaps easy for someone who does a lot of CVS branch merges. (In particular, the head has already been merged to this branch once.) I think it would be easier to create a new branch from the current head, integrate the small number of changed files from ast-branch, and work with that branch instead. The idea is that it's an end-run around doing an automatic CVS merge and relying on someone to manually merge the changes. At the same time, since there is a groundswell of support for finishing the AST work, I'd like to propose that we stop making compiler / bytecode changes until it is done. Every change to compile.c or the bytecode ends up creating a new incompatibility that needs to be merged. If these two plans sound good, I'll get started on the new branch.
Jeremy From foom at fuhm.net Wed Jan 5 17:55:54 2005 From: foom at fuhm.net (James Y Knight) Date: Wed Jan 5 17:56:01 2005 Subject: [Python-Dev] Re: super() harmful? In-Reply-To: References: <1f7befae05010410576effd024@mail.gmail.com> <20050104154707.927B.JCARLSON@uci.edu> Message-ID: I'm not sure why super got dragged into this, but... On Jan 4, 2005, at 9:02 PM, Guido van Rossum wrote: > I think that James Y Knight's page misrepresents the issue. Quoting: > But __init__ *is* special, in that it is okay for a subclass __init__ > (or __new__) to have a different signature than the base class > __init__; this is not true for other methods. If you change a regular > method's signature, you would break Liskov substitutability (i.e., > your subclass instance wouldn't be acceptable where a base class > instance would be acceptable). You're right, some issues do apply to __init__ alone. However, two important ones do not: The issue of mixing super() and explicit calls to the superclass's method occur with any method. (Thus making it difficult/impossible for a framework to convert to using super without breaking client code that subclasses). Adding optional arguments to one branch of the inheritance tree, but not another, or adding different optional args in both branches. (breaks unless you always pass optional args as keyword args, and all methods take **kwargs and pass that on to super). > Super is intended for uses that are designed with method cooperation in > mind, so I agree with the best practices in James's Conclusion: > [[omitted]] > But that's not the same as calling it harmful. :-( The 'harmfulness' comes from people being confused by, and misusing super, because it is so very very easy to do so, and so very hard to use correctly. From what I can tell, it is mostly used incorrectly. *Especially* uses in __init__ or __new__.
Many people seem to use super in their __init__ methods thinking that it'll magically improve something (like perhaps making multiple inheritance trees that include their class work better), only to just cause a different set of problems for multiple inheritance trees, instead, because they don't realize they need to follow those recommendations. Here's another page that says much the same thing, but from the viewpoint of recommending the use of super and showing you all the hoops to use it right: http://wiki.osafoundation.org/bin/view/Chandler/UsingSuper James PS, I wrote that page last PyCon but never got around to finishing it up and therefore never really publicly announced it. But I told some people about it and then they kept asking me for the URL so I linked to it, and well, then google found it of course, so I guess it's public now. ;) From barry at python.org Wed Jan 5 18:26:38 2005 From: barry at python.org (Barry Warsaw) Date: Wed Jan 5 18:26:56 2005 Subject: [Python-Dev] Let's get rid of unbound methods In-Reply-To: <1104939436.5854.25.camel@localhost> References: <1f7befae05010410576effd024@mail.gmail.com> <20050104154707.927B.JCARLSON@uci.edu> <1104939436.5854.25.camel@localhost> Message-ID: <1104945997.32311.8.camel@geddy.wooz.org> On Wed, 2005-01-05 at 10:37, Glyph Lefkowitz wrote: > One approach I have taken in order to avoid copiously over-documenting > every super() using class is to decouple different phases of > initialization by making __init__ as simple as possible (setting a few > attributes, resisting the temptation to calculate things), and then > providing class methods like '.fromString' or '.forUnserialize' that > create instances that have been completely constructed for a particular > purpose. That way the signatures are much more likely to line up across > inheritance hierarchies. Perhaps this should be a suggested "best > practice" when using super() as well? Yep, I've done the same thing. It's definitely a good practice.
-Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050105/f78f1142/attachment.pgp From barry at python.org Wed Jan 5 18:29:01 2005 From: barry at python.org (Barry Warsaw) Date: Wed Jan 5 18:29:10 2005 Subject: [Python-Dev] Let's get rid of unbound methods In-Reply-To: <1104939690.5854.30.camel@localhost> References: <94D0198B-5EA4-11D9-BB20-000D934FF6B4@cwi.nl> <1104896563.16766.19.camel@geddy.wooz.org> <91280215-5F0A-11D9-ADA4-000A95EFAE9E@aleax.it> <1104939690.5854.30.camel@localhost> Message-ID: <1104946141.32311.12.camel@geddy.wooz.org> On Wed, 2005-01-05 at 10:41, Glyph Lefkowitz wrote: > I think it would be reasonable to assign im_class only to functions > defined in class scope. The only serialization that would break in that > case is if your example had a 'del f' at the end. +1. If you're doing something funkier, then you can set that attribute yourself. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050105/7f041dd9/attachment-0001.pgp From ark-mlist at att.net Wed Jan 5 18:33:09 2005 From: ark-mlist at att.net (Andrew Koenig) Date: Wed Jan 5 18:32:53 2005 Subject: [Python-Dev] Let's get rid of unbound methods In-Reply-To: <41DAE213.9070906@zope.com> Message-ID: <001d01c4f34c$a02f8770$6402a8c0@arkdesktop> > duck typing? That's the Australian pronunciation of "duct taping". 
From fumanchu at amor.org Wed Jan 5 18:38:52 2005 From: fumanchu at amor.org (Robert Brewer) Date: Wed Jan 5 18:41:36 2005 Subject: [Python-Dev] Re: [Csv] csv module TODO list Message-ID: <3A81C87DC164034AA4E2DDFE11D258E33980EE@exchange.hqamor.amorhq.net> Skip Montanaro wrote: > Andrew> There's a bunch of jobs we (CSV module > maintainers) have been > Andrew> putting off - attached is a list (in no particular order): > > ... > > In addition, it occurred to me this evening that there's > functionality in the csv module I don't think anybody uses. > ... > I'm also not aware that anyone really uses the Sniffer class, > though it does provide some useful functionality should you > need to analyze random CSV files. I used Sniffer quite heavily for my last contract. The client had multiple multigig csv's which needed deduplicating, but they were all from different sources and therefore in different formats. It would have cost me many more hours without the Sniffer. Please keep it. <:) Robert Brewer MIS Amor Ministries fumanchu@amor.org From jim at zope.com Wed Jan 5 18:53:19 2005 From: jim at zope.com (Jim Fulton) Date: Wed Jan 5 18:53:25 2005 Subject: [Python-Dev] Let's get rid of unbound methods In-Reply-To: <20050105161045.GA19431@vicky.ecs.soton.ac.uk> References: <41DAE213.9070906@zope.com> <41DAF22B.6030605@zope.com> <20050105161045.GA19431@vicky.ecs.soton.ac.uk> Message-ID: <41DC298F.6040802@zope.com> Armin Rigo wrote: > Hi Jim, > > On Tue, Jan 04, 2005 at 02:44:43PM -0500, Jim Fulton wrote: > >>>Actually, unbound builtin methods are a different type than bound >>>builtin methods: >> >>Of course, but conceptually they are similar. You would still >>encounter the concept if you got an unbound builtin method. > > > There are no such things as unbound builtin methods: > > >>>>list.append is list.__dict__['append'] > > True > > In other words 'list.append' just returns exactly the same object as stored in > the list type's dict. 
Guido's proposal is to make Python methods behave in > the same way. OK, interesting. I'm sold then. Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From pje at telecommunity.com Wed Jan 5 19:03:42 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed Jan 5 19:03:58 2005 Subject: [Python-Dev] Let's get rid of unbound methods In-Reply-To: <20050105163006.GB19431@vicky.ecs.soton.ac.uk> References: Message-ID: <5.1.1.6.0.20050105130143.02abfec0@mail.telecommunity.com> At 04:30 PM 1/5/05 +0000, Armin Rigo wrote: >Hi Guido, > >On Tue, Jan 04, 2005 at 10:28:03AM -0800, Guido van Rossum wrote: > > Let's get rid of unbound methods. > >Is there any other use case for 'C.x' not returning the same as >'appropriate_super_class_of_C.__dict__["x"]' ? Er, classmethod would be one; a rather important one at that. From tjreedy at udel.edu Wed Jan 5 19:04:07 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Wed Jan 5 19:04:16 2005 Subject: [Python-Dev] Re: Please help complete the AST branch References: <41D9F94C.3020005@ocf.berkeley.edu> Message-ID: "Jeremy Hylton" wrote in message news:e8bf7a530501050842affb328@mail.gmail.com... > The file Python/compile.txt on the ast-branch has a brief overview of > the project: http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Python/Attic/compile.txt?rev=1.1.2.8&only_with_tag=ast-branch&view=auto Clicking on the above gave me: (502) Bad Gateway The proxy server received an invalid response from an upstream server ??? Perhaps it is a temporary glitch on SF's backend cvs server. Terry J. Reedy From pje at telecommunity.com Wed Jan 5 19:04:35 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Wed Jan 5 19:04:52 2005 Subject: [Python-Dev] Let's get rid of unbound methods In-Reply-To: <1104946141.32311.12.camel@geddy.wooz.org> References: <1104939690.5854.30.camel@localhost> <94D0198B-5EA4-11D9-BB20-000D934FF6B4@cwi.nl> <1104896563.16766.19.camel@geddy.wooz.org> <91280215-5F0A-11D9-ADA4-000A95EFAE9E@aleax.it> <1104939690.5854.30.camel@localhost> Message-ID: <5.1.1.6.0.20050105130351.02ac2e60@mail.telecommunity.com> At 12:29 PM 1/5/05 -0500, Barry Warsaw wrote: >On Wed, 2005-01-05 at 10:41, Glyph Lefkowitz wrote: > > > I think it would be reasonable to assign im_class only to functions > > defined in class scope. The only serialization that would break in that > > case is if your example had a 'del f' at the end. > >+1. If you're doing something funkier, then you can set that attribute >yourself. > >-Barry Um, isn't all this stuff going to be more complicated and spread out over more of the code than just leaving unbound methods in place? From gvanrossum at gmail.com Wed Jan 5 19:10:32 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Jan 5 19:10:36 2005 Subject: [Python-Dev] Let's get rid of unbound methods In-Reply-To: <5.1.1.6.0.20050105130351.02ac2e60@mail.telecommunity.com> References: <94D0198B-5EA4-11D9-BB20-000D934FF6B4@cwi.nl> <1104896563.16766.19.camel@geddy.wooz.org> <91280215-5F0A-11D9-ADA4-000A95EFAE9E@aleax.it> <1104939690.5854.30.camel@localhost> <1104946141.32311.12.camel@geddy.wooz.org> <5.1.1.6.0.20050105130351.02ac2e60@mail.telecommunity.com> Message-ID: > Um, isn't all this stuff going to be more complicated and spread out over > more of the code than just leaving unbound methods in place? Well, in an early version of Python it was as simple as I'd like it to be again: the instancemethod type was only used for bound methods (hence the name) and C.f would return the same function object as C.__dict__["f"].
Apart from backwards compatibility with all the code that has grown cruft to deal with the fact that C.f is not a function object, I still see no reason why the current state of affairs is better. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From kbk at shore.net Wed Jan 5 19:20:51 2005 From: kbk at shore.net (Kurt B. Kaiser) Date: Wed Jan 5 19:21:11 2005 Subject: [Python-Dev] Please help complete the AST branch In-Reply-To: (Jeremy Hylton's message of "Tue, 4 Jan 2005 16:54:28 -0500") References: <41D9F94C.3020005@ocf.berkeley.edu> <20050104021909.GB11833@unpythonic.net> <41DB0FA4.3070405@ocf.berkeley.edu> Message-ID: <87d5wjagak.fsf@hydra.bayview.thirdcreek.com> Jeremy Hylton writes: > Does anyone want to volunteer to integrate the current head to the > branch? I think that's a pretty important near-term step. I'll take a shot at it. I see the following:

2216 changes:
  1428 modifications w/o conflict
  399 adds
  360 removes
  29 conflicts

Major conflict:
  Python/compile.c (Probably not merged during 1st merge)
  Lib/test/test_compile.c (ditto)
  Lib/test/test_os.py (AST?)
  Lib/test/test_re.py (AST?)

Major conflict probably not AST related:
  Lib/test/test_bool.py
  Lib/test/test_urllib.py
  Lib/test/output/test_profile
  Python/pythonrun.c (check brackets!)

Other issues: need local -kk to avoid another 80 conflicts due to the priceless keyword expansion, have to watch out for binary files like IDLE icons. ViewCVS is down, slows things up. I'm going to tag the trunk: mrg_to_ast-branch_05JAN05 -- KBK From gvanrossum at gmail.com Wed Jan 5 19:23:01 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Jan 5 19:23:04 2005 Subject: [Python-Dev] Re: super() harmful? In-Reply-To: References: <1f7befae05010410576effd024@mail.gmail.com> <20050104154707.927B.JCARLSON@uci.edu> Message-ID:

> The issue of mixing super() and explicit calls to the superclass's
> method occur with any method.
> (Thus making it difficult/impossible for
> a framework to convert to using super without breaking client code that
> subclasses).

Well, client classes which are leaves of the class tree can still safely use BaseClass.thisMethod(self, args) -- it's only classes that are written to be extended that must all be converted to using super(). So I'm not sure how you think your clients are breaking.

> Adding optional arguments to one branch of the inheritance tree, but
> not another, or adding different optional args in both branches.
> (breaks unless you always pass optional args as keyword args, and all
> methods take **kwargs and pass that on to super).

But that breaks anyway; I don't see how using the old Base.method(self, args) approach makes this easier, *unless* you are using single inheritance. If you're expecting single inheritance anyway, why bother with super()?

> > Super is intended for use in classes that are designed with method cooperation in
> > mind, so I agree with the best practices in James's Conclusion:
> > [[omitted]]
> > But that's not the same as calling it harmful. :-(
>
> The 'harmfulness' comes from people being confused by, and misusing
> super, because it is so very very easy to do so, and so very hard to
> use correctly.

And using multiple inheritance the old way was not confusing? Surely you are joking.

> From what I can tell, it is mostly used incorrectly. *Especially* uses
> in __init__ or __new__. Many people seem to use super in their __init__
> methods thinking that it'll magically improve something (like perhaps
> making multiple inheritance trees that include their class work
> better), only to just cause a different set of problems for multiple
> inheritance trees, instead, because they don't realize they need to
> follow those recommendations.

If they're happy with single inheritance, let them use super() incorrectly. It works, and that's what counts. Their code didn't work right with multiple inheritance before, it still doesn't.
Some people just are uncomfortable with calling Base.method(self, ...) and feel super is "more correct". Let them. > Here's another page that says much the same thing, but from the > viewpoint of recommending the use of super and showing you all the > hoops to use it right: > http://wiki.osafoundation.org/bin/view/Chandler/UsingSuper The problem isn't caused by super but by multiple inheritance. > James > > PS, I wrote that page last pycon but never got around to finishing it > up and therefore never really publicly announced it. But I told some > people about it and then they kept asking me for the URL so I linked > to it, and well, then google found it of course, so I guess it's public > now. ;) Doesn't mean you can't fix it. :) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Wed Jan 5 19:23:49 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Jan 5 19:23:53 2005 Subject: [Python-Dev] ast branch pragmatics In-Reply-To: References: Message-ID: > I think it would be easier to create a new branch from the current > head, integrate the small number of changed files from ast-branch, and > work with that branch instead. The idea is that it's an end-run > around doing an automatic CVS merge and relying on someone to manually > merge the changes. > > At the same time, since there is a groundswell of support for > finishing the AST work, I'd like to propose that we stop making > compiler / bytecode changes until it is done. Every change to > compile.c or the bytecode ends up creating a new incompatibility that > needs to be merged. > > If these two plans sound good, I'll get started on the new branch.
+1 -- --Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Wed Jan 5 19:28:11 2005 From: python at rcn.com (Raymond Hettinger) Date: Wed Jan 5 19:31:22 2005 Subject: [Python-Dev] Please help complete the AST branch In-Reply-To: <87d5wjagak.fsf@hydra.bayview.thirdcreek.com> Message-ID: <001501c4f354$500c43c0$e841fea9@oemcomputer> Would it be helpful for me to move the peepholer out of compile.c into a separate source file? Raymond Hettinger From kbk at shore.net Wed Jan 5 19:35:24 2005 From: kbk at shore.net (Kurt B. Kaiser) Date: Wed Jan 5 19:35:39 2005 Subject: [Python-Dev] ast branch pragmatics In-Reply-To: (Jeremy Hylton's message of "Wed, 5 Jan 2005 11:49:05 -0500") References: Message-ID: <878y77afmb.fsf@hydra.bayview.thirdcreek.com> Jeremy Hylton writes: > The existing ast-branch mostly works, but it does not include most of > the new features of Python 2.4. There is a substantial integration > effort, perhaps easy for someone who does a lot of CVS branch merges. > (In particular, the head has already been merged to this branch once.) > > I think it would be easier to create a new branch from the current > head, integrate the small number of changed files from ast-branch, and > work with that branch instead. The idea is that it's an end-run > around doing an automatic CVS merge and relying on someone to manually > merge the changes. > > At the same time, since there is a groundswell of support for > finishing the AST work, I'd like to propose that we stop making > compiler / bytecode changes until it is done. Every change to > compile.c or the bytecode ends up creating a new incompatibility that > needs to be merged. > > If these two plans sound good, I'll get started on the new branch. Hm, I saw this after making my previous post. Well, you can see from that post that it's a bit of work, but not overwhelming. You have a better feel for how much change was made on ast-branch and how complete the previous merge was.
So, you decide: if you want me to do the merge, I can. But ast-branch-2 sounds OK, also. -- KBK From magnus at hetland.org Wed Jan 5 13:19:21 2005 From: magnus at hetland.org (Magnus Lie Hetland) Date: Wed Jan 5 19:48:51 2005 Subject: [Python-Dev] Re: csv module TODO list In-Reply-To: <20050105075506.314C93C8E5@coffee.object-craft.com.au> References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> <16859.38960.9935.682429@montanaro.dyndns.org> <20050105075506.314C93C8E5@coffee.object-craft.com.au> Message-ID: <20050105121921.GB24030@idi.ntnu.no> Quite a while ago I posted some material to the csv-list about problems using the csv module on Unix-style colon-separated files -- it just doesn't deal properly with backslash escaping and is quite useless for this kind of file. I seem to recall the general view was that it wasn't intended for this kind of thing -- only the sort of csv that Microsoft Excel outputs/inputs, but if I am mistaken about this, perhaps fixing this issue might be put on the TODO-list? I'll be happy to re-send or summarize the relevant emails, if needed. -- Magnus Lie Hetland Fallen flower I see / Returning to its branch http://hetland.org Ah! a butterfly. [Arakida Moritake] From jhylton at gmail.com Wed Jan 5 19:54:03 2005 From: jhylton at gmail.com (Jeremy Hylton) Date: Wed Jan 5 19:54:06 2005 Subject: [Python-Dev] Please help complete the AST branch In-Reply-To: <001501c4f354$500c43c0$e841fea9@oemcomputer> References: <87d5wjagak.fsf@hydra.bayview.thirdcreek.com> <001501c4f354$500c43c0$e841fea9@oemcomputer> Message-ID: On Wed, 5 Jan 2005 13:28:11 -0500, Raymond Hettinger wrote: > Would it be helpful for me to move the peepholer out of compile.c into a > separate source file? It doesn't really matter. There are two reasons. 1) We've been working on the new compiler code in newcompile.c, rather than compile.c. When it is finished, we'll replace compile.c with newcompile.c, but it was helpful to have both around at first. 
2) Peephole optimizations would be done on the basic block intermediate representation rather than code objects. So we'll need to rewrite it anyway to use the new IR. Jeremy From jhylton at gmail.com Wed Jan 5 19:58:02 2005 From: jhylton at gmail.com (Jeremy Hylton) Date: Wed Jan 5 19:58:05 2005 Subject: [Python-Dev] Please help complete the AST branch In-Reply-To: <87d5wjagak.fsf@hydra.bayview.thirdcreek.com> References: <41D9F94C.3020005@ocf.berkeley.edu> <20050104021909.GB11833@unpythonic.net> <41DB0FA4.3070405@ocf.berkeley.edu> <87d5wjagak.fsf@hydra.bayview.thirdcreek.com> Message-ID: On Wed, 05 Jan 2005 13:20:51 -0500, Kurt B. Kaiser wrote: > Jeremy Hylton writes: > > > Does anyone want to volunteer to integrate the current head to the > > branch? I think that's a pretty important near-term step. > > I'll take a shot at it. Great! I say this after reading your other message in response to my suggestion to create a new branch. If you can manage to do the integration, it's simpler for everyone to stick to a single branch. (For example, there will be no opportunity for someone to work on the wrong branch.) > 29 conflicts Oh. That's not as bad as I expected. > Major conflict: > Python/compile.c (Probably not merged during 1st merge) I think that's right. I didn't merge any of the changes, then. > Lib/test/test_compile.c (ditto) Probably. > Lib/test/test_os.py (AST?) > Lib/test/test_re.py (AST?) I wonder if these two were edited to worm around some bugs in early versions of newcompile.c. You could check the revision history. If that's the case, it's safe to drop the changes. > Major conflict probably not AST related: > Lib/test/test_bool.py > Lib/test/test_urllib.py > Lib/test/output/test_profile > Python/pythonrun.c (check brackets!) There are actually a lot of AST-related changes in pythonrun.c, because it is the gunk between files and stdin and the actual compiler and runtime. 
Jeremy From martin at v.loewis.de Wed Jan 5 22:15:48 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed Jan 5 22:15:41 2005 Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in In-Reply-To: <019B7FCB-5F2D-11D9-9DC0-000A9567635C@redivi.com> References: <41DB2C9A.4070800@v.loewis.de> <41DBA649.3080008@v.loewis.de> <019B7FCB-5F2D-11D9-9DC0-000A9567635C@redivi.com> Message-ID: <41DC5904.4070507@v.loewis.de> Bob Ippolito wrote: > Then you haven't done the appropriate research by searching > pythonmac-sig. Hmm. > Do you even own a Mac? Do I have to, in order to understand the issues? But to answer your question: yes, I do. Regards, Martin From tjreedy at udel.edu Wed Jan 5 22:24:14 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Wed Jan 5 22:24:24 2005 Subject: [Python-Dev] Re: Please help complete the AST branch References: <41D9F94C.3020005@ocf.berkeley.edu> Message-ID: "Terry Reedy" wrote in message news:crha6n$tgf$1@sea.gmane.org... http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Python/Attic/compile.txt?rev=1.1.2.8&only_with_tag=ast-branch&view=auto > > Clicking on the above gave me: > > (502) Bad Gateway > The proxy server received an invalid response from an upstream server > > ??? Perhaps it is a temporary glitch on SF's backend cvs server. Seems so, working now. From kbk at shore.net Wed Jan 5 22:30:34 2005 From: kbk at shore.net (Kurt B. Kaiser) Date: Wed Jan 5 22:31:01 2005 Subject: [Python-Dev] Please help complete the AST branch References: <41D9F94C.3020005@ocf.berkeley.edu> <20050104021909.GB11833@unpythonic.net> <41DB0FA4.3070405@ocf.berkeley.edu> <87d5wjagak.fsf@hydra.bayview.thirdcreek.com> Message-ID: <87llb78sxx.fsf@hydra.bayview.thirdcreek.com> Jeremy Hylton writes: >> 29 conflicts > > Oh. That's not as bad as I expected. Proceeding.... >> Major conflict: >> Python/compile.c (Probably not merged during 1st merge) > > I think that's right. I didn't merge any of the changes, then. 
> >> Lib/test/test_compile.c > > Probably. So maybe it's not necessary to merge these two; just leave them behind? That would lighten the load quite a bit. -- KBK From bob at redivi.com Wed Jan 5 22:39:11 2005 From: bob at redivi.com (Bob Ippolito) Date: Wed Jan 5 22:39:19 2005 Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in In-Reply-To: <41DC5904.4070507@v.loewis.de> References: <41DB2C9A.4070800@v.loewis.de> <41DBA649.3080008@v.loewis.de> <019B7FCB-5F2D-11D9-9DC0-000A9567635C@redivi.com> <41DC5904.4070507@v.loewis.de> Message-ID: <3CAA6728-5F62-11D9-AB1C-000A95BA5446@redivi.com> On Jan 5, 2005, at 16:15, Martin v. Löwis wrote: > Bob Ippolito wrote: >> Then you haven't done the appropriate research by searching >> pythonmac-sig. > > Hmm. > > > Do you even own a Mac? > > Do I have to, in order to understand the issues? > > But to answer your question: yes, I do. Well, this issue has been discussed over and over again on pythonmac-sig over the past year or so (perhaps as far back as the 10.3.0 release). I do not have time at the moment to summarize, but the solution proposed is sane and there is no known better way. If you take a look at the WWDC2004 sources for Python, a similar patch is applied by Apple. However, Apple's patch breaks (at least) C++ compilation and SciPy's distutils extension for compiling Fortran due to distutils' stupidity. -bob From martin at v.loewis.de Wed Jan 5 22:58:04 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed Jan 5 22:57:58 2005 Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in In-Reply-To: References: <41DB2C9A.4070800@v.loewis.de> <41DBA649.3080008@v.loewis.de> Message-ID: <41DC62EC.6060608@v.loewis.de> Ronald Oussoren wrote: > It gets worse when you have a user-installed python 2.3 and a > user-installed python 2.4. Those will both be installed as > /Library/Frameworks/Python.framework.
Yes, but one is installed in Versions/2.3, and the other in Versions/2.4. > This means that you cannot use the > -F flag to select which one you want to link to, '-framework Python' > will only link to the python that was installed the latest. What about using -F /Library/Frameworks/Python.framework/Versions/2.3? Or, would there be a different way to specify the version of a framework when linking, in addition to -F? What about -framework Python,/Versions/2.3 I could not find a specification how the suffix in -framework is meant to work - perhaps it could be used here? Regards, Martin From martin at v.loewis.de Wed Jan 5 23:00:26 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed Jan 5 23:00:20 2005 Subject: [Python-Dev] csv module TODO list In-Reply-To: <20050105093414.00DFF3C8E5@coffee.object-craft.com.au> References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> <41DBA7D0.80101@v.loewis.de> <41DBAF06.6020401@egenix.com> <20050105093414.00DFF3C8E5@coffee.object-craft.com.au> Message-ID: <41DC637A.5050105@v.loewis.de> Andrew McNamara wrote: >>>Can you please elaborate on that? What needs to be done, and how is >>>that going to be done? It might be possible to avoid considerable >>>uglification. > > > I'm not altogether sure there. The parsing state machine is all written in > C, and deals with signed chars - I expect we'll need two versions of that > (or one version that's compiled twice using pre-processor macros). Quite > a large job. Suggestions gratefully received. I'm still trying to understand what *needs* to be done - I would move to how this is done only later. What APIs should be extended/changed, and in what way? 
Regards, Martin From bob at redivi.com Wed Jan 5 23:06:10 2005 From: bob at redivi.com (Bob Ippolito) Date: Wed Jan 5 23:06:17 2005 Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in In-Reply-To: <41DC62EC.6060608@v.loewis.de> References: <41DB2C9A.4070800@v.loewis.de> <41DBA649.3080008@v.loewis.de> <41DC62EC.6060608@v.loewis.de> Message-ID: <01B2E2DC-5F66-11D9-AB1C-000A95BA5446@redivi.com> On Jan 5, 2005, at 16:58, Martin v. Löwis wrote: > Ronald Oussoren wrote: > >> It gets worse when you have a user-installed python 2.3 and a >> user-installed python 2.4. Those will both be installed as >> /Library/Frameworks/Python.framework. > > Yes, but one is installed in Versions/2.3, and the other in > Versions/2.4. > >> This means that you cannot use the -F flag to select which one you >> want to link to, '-framework Python' will only link to the python >> that was installed the latest. > > What about using -F /Library/Frameworks/Python.framework/Versions/2.3? > Or, would there be a different way to specify the version of a > framework when linking, in addition to -F? What about > > -framework Python,/Versions/2.3 Nope. The only way to link to a non-current framework version is to forego any linker searching and specify the dyld file directly, i.e. /Library/Frameworks/Python.framework/Versions/2.3/Python. The gcc toolchain does not in any way whatsoever understand versioned frameworks, period. > I could not find a specification how the suffix in -framework is meant > to work - perhaps it could be used here? dylib suffixes are used for having separate versions of the dylib (debug, profile, etc.). It is NOT for general production use, ever.
-bob From martin at v.loewis.de Wed Jan 5 23:19:40 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed Jan 5 23:19:33 2005 Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in In-Reply-To: <2mzmzogf5a.fsf@starship.python.net> References: <41DB2C9A.4070800@v.loewis.de> <41DBA649.3080008@v.loewis.de> <2mzmzogf5a.fsf@starship.python.net> Message-ID: <41DC67FC.8020703@v.loewis.de> Michael Hudson wrote: > Martin, can you please believe that Jack, Bob, Ronald et al know what > they are talking about here? I find that really hard to believe, because it contradicts what I think Apple wants me to believe. I'm willing to follow a series of statements that I can confirm to be facts somehow (e.g. "As TechNote XY says, OSX has a bug in that it loads the Current version at run-time, no matter what version the binary says should be used"). I'm not really willing to believe a statement without any kind of proof - regardless who made that statement. "Read the mailing lists" is no proof. If I was to accept anything said without doubt, Jack would not have needed to post his message in the first place - he expressed his opinion that he believed the changes to be appropriate. It was his doubt that triggered mine. I am not going to interfere with the changes -- it's just that I want to understand them. Kind regards, Martin From martin at v.loewis.de Wed Jan 5 23:38:26 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed Jan 5 23:38:19 2005 Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in In-Reply-To: <01B2E2DC-5F66-11D9-AB1C-000A95BA5446@redivi.com> References: <41DB2C9A.4070800@v.loewis.de> <41DBA649.3080008@v.loewis.de> <41DC62EC.6060608@v.loewis.de> <01B2E2DC-5F66-11D9-AB1C-000A95BA5446@redivi.com> Message-ID: <41DC6C62.90101@v.loewis.de> Bob Ippolito wrote: > Nope.
The only way to link to a non-current framework version is to > forego any linker searching and specify the dyld file directly, i.e. > /Library/Frameworks/Python.framework/Versions/2.3/Python. The gcc > toolchain does not in any way whatsoever understand versioned > frameworks, period. I see. I wish you had told me right from the beginning. Regards, Martin From arigo at tunes.org Wed Jan 5 23:39:18 2005 From: arigo at tunes.org (Armin Rigo) Date: Wed Jan 5 23:50:12 2005 Subject: [Python-Dev] Let's get rid of unbound methods In-Reply-To: <5.1.1.6.0.20050105130143.02abfec0@mail.telecommunity.com> References: <5.1.1.6.0.20050105130143.02abfec0@mail.telecommunity.com> Message-ID: <20050105223918.GA26613@vicky.ecs.soton.ac.uk> Hi Phillip, On Wed, Jan 05, 2005 at 01:03:42PM -0500, Phillip J. Eby wrote: > >Is there any other use case for 'C.x' not returning the same as > >'appropriate_super_class_of_C.__dict__["x"]' ? > > Er, classmethod would be one; a rather important one at that. Oups. Right, sorry. Armin From foom at fuhm.net Thu Jan 6 00:00:38 2005 From: foom at fuhm.net (James Y Knight) Date: Thu Jan 6 00:00:41 2005 Subject: [Python-Dev] Re: super() harmful? In-Reply-To: References: <1f7befae05010410576effd024@mail.gmail.com> <20050104154707.927B.JCARLSON@uci.edu> Message-ID: <9D5D97D6-5F6D-11D9-8D68-000A95A50FB2@fuhm.net> On Jan 5, 2005, at 1:23 PM, Guido van Rossum wrote: >> The issue of mixing super() and explicit calls to the superclass's >> method occur with any method. (Thus making it difficult/impossible for >> a framework to convert to using super without breaking client code >> that >> subclasses). > > Well, client classes which are leaves of the class tree can still > safely use BaseClass.thisMethod(self, args) -- it's only classes that > are written to be extended that must all be converted to using > super(). So I'm not sure how you think your clients are breaking. See the section "Subclasses must use super if their superclasses do". 
This is particularly a big issue with __init__.

>> Adding optional arguments to one branch of the inheritance tree, but
>> not another, or adding different optional args in both branches.
>> (breaks unless you always pass optional args as keyword args, and all
>> methods take **kwargs and pass that on to super).
>
> But that breaks anyway; I don't see how using the old
> Base.method(self, args) approach makes this easier, *unless* you are
> using single inheritance. If you're expecting single inheritance
> anyway, why bother with super()?

There is a distinction between simple multiple inheritance, which did work in the old system vs. multiple inheritance in a diamond structure which did not work in the old system. However, consider something like the following (ignore the Interface/implements bit if you want. It's just to point out a common situation where two classes can independently implement the same method without having a common superclass):

class IFrob(Interface):
    def frob():
        """Frob the knob"""

class A:
    implements(IFrob)
    def frob(self, foo=False):
        print "A.frob(foo=%r)"%foo

class B:
    implements(IFrob)
    def frob(self, bar=False):
        print "B.frob(bar=%r)"%bar

class C(A,B):
    def m(self, foo=False, bar=False):
        A.m(self, foo=foo)
        B.m(self, bar=bar)
        print "C.frob(foo=%r, bar=%r)"%(foo,bar)

Now, how do you write that to use super?
Here's what I come up with:

class IFrob(Interface):
    def frob():
        """Frob the knob"""

class A(object):
    implements(IFrob)
    def frob(self, foo=False, *args, **kwargs):
        try:
            f = super(A, self).frob
        except AttributeError:
            pass
        else:
            f(foo=foo, *args, **kwargs)
        print "A.frob(foo=%r)"%foo

class B(object):
    implements(IFrob)
    def frob(self, bar=False, *args, **kwargs):
        try:
            f = super(B, self).frob
        except AttributeError:
            pass
        else:
            f(bar=bar, *args, **kwargs)
        print "B.frob(bar=%r)"%bar

class C(A,B):
    def frob(self, foo=False, bar=False, *args, **kwargs):
        super(C, self).frob(foo, bar, *args, **kwargs)
        print "C.frob(foo=%r, bar=%r)"%(foo,bar)

> And using multiple inheritance the old way was not confusing? Surely
> you are joking.

It was pretty simple until you start having diamond structures. Then it's complicated. Now, don't get me wrong, I think that MRO-calculating mechanism really is "the right thing", in the abstract. I just think the way it works out as implemented in python is really confusing and it's easy to be worse off with it than without it.

> If they're happy with single inheritance, let them use super()
> incorrectly. It works, and that's what counts. Their code didn't work
> right with multiple inheritance before, it still doesn't. Some people
> just are uncomfortable with calling Base.method(self, ...) and feel
> super is "more correct". Let them.

Their code worked right in M-I without diamonds before. Now it likely doesn't work in M-I at all. James From bob at redivi.com Thu Jan 6 00:14:19 2005 From: bob at redivi.com (Bob Ippolito) Date: Thu Jan 6 00:14:28 2005 Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in In-Reply-To: <41DC6C62.90101@v.loewis.de> References: <41DB2C9A.4070800@v.loewis.de> <41DBA649.3080008@v.loewis.de> <41DC62EC.6060608@v.loewis.de> <01B2E2DC-5F66-11D9-AB1C-000A95BA5446@redivi.com> <41DC6C62.90101@v.loewis.de> Message-ID: <8735A624-5F6F-11D9-AB1C-000A95BA5446@redivi.com> On Jan 5, 2005, at 17:38, Martin v.
Löwis wrote: > Bob Ippolito wrote: >> Nope. The only way to link to a non-current framework version is to >> forego any linker searching and specify the dyld file directly, i.e. >> /Library/Frameworks/Python.framework/Versions/2.3/Python. The gcc >> toolchain does not in any way whatsoever understand versioned >> frameworks, period. > > I see. I wish you had told me right from the beginning. That is only part of the reason for these changes (concurrent Python 2.3 and Python 2.4 in the same location), and is fringe enough that I wasn't even thinking of it at the time. I just dug up some information I had written on this particular topic but never published, if you're interested: http://bob.pythonmac.org/archives/2005/01/05/versioned-frameworks-considered-harmful/ -bob From Jack.Jansen at cwi.nl Thu Jan 6 00:21:58 2005 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Thu Jan 6 00:21:46 2005 Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in In-Reply-To: References: <41DB2C9A.4070800@v.loewis.de> Message-ID: <98E121A6-5F70-11D9-94C7-000D934FF6B4@cwi.nl> [Grmpf. I should check which account I use before pressing send. Here goes again] On 5-jan-05, at 1:08, Bob Ippolito wrote: >>> The problem we're trying to solve is that due to the way Apple's >>> framework architecture works newer versions of frameworks are >>> preferred (at link time, and sometimes even at runtime) over older >>> ones. >> >> Can you elaborate on that somewhat? According to >> >> http://developer.apple.com/documentation/MacOSX/Conceptual/BPFrameworks/Concepts/VersionInformation.html >> >> there are major and minor versions of frameworks. I would think that >> every Python minor (2.x) release should produce a new major framework >> version of the Python framework. Then, there would be no problem. >> >> Why does this not work? > > It doesn't for reasons I care not to explain in depth, again.
But I do care:-) Specifically because I trust the crowd here to come up with good ideas (even if they're not Mac users:-). Ronald already explained most of the problem, what it boils down to is that multiple versions of a framework can live in a single location. For most applications that's better than the old MacOS9 architecture (which I believe is pretty similar to the Windows dll architecture) because you can ship a single foo.framework that contains both version 1.2 and 1.3. There's also a symlink "Current" that will point to 1.3. At build time the linker will pick the version pointed at by "Current", but in the file it will record the actual version number. Hence, if you ship this framework new development will link to the newest version, but older programs will still load the older one. When I did the framework python design I overlooked the fact that an older Python would have no way to specify that an extension would have to link against its own, old, framework, because on MacOS9 this wasn't a problem (the two had different filenames). As an aside, I also overlooked the fact that a Python framework residing in /System could be overridden by one in /Library because in 2.3 we linked frameworks by relative pathname, because I simply didn't envision any Python living in /System for some time to be. The -F options could solve that problem, but not the 2.3 and 2.4 both in /Library problem. The "new" solution is basically to go back to the Unix way of building an extension: link it against nothing and sort things out at runtime. Not my personal preference, but at least we know that loading an extension into one Python won't bring in a fresh copy of a different interpreter or anything horrible like that. 
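[Editorial note: the framework layout and "Current" symlink Jack describes can be mimicked with plain files and symlinks. This toy sketch (hypothetical paths, no real Python framework involved, POSIX only) shows why '-framework Python' always resolves to whatever "Current" points at, while an already-built binary that recorded 2.3 keeps loading 2.3:]

```python
import os
import shutil

shutil.rmtree("demo", ignore_errors=True)

# Both versions live side by side inside one framework bundle:
os.makedirs("demo/Python.framework/Versions/2.3")
os.makedirs("demo/Python.framework/Versions/2.4")
open("demo/Python.framework/Versions/2.3/Python", "w").close()
open("demo/Python.framework/Versions/2.4/Python", "w").close()

# Shipping 2.4 repoints "Current"; the top-level symlink goes through it,
# and the linker follows these links at build time:
os.symlink("2.4", "demo/Python.framework/Versions/Current")
os.symlink("Versions/Current/Python", "demo/Python.framework/Python")

print(os.readlink("demo/Python.framework/Versions/Current"))  # 2.4
```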
-- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From gvanrossum at gmail.com Thu Jan 6 00:36:02 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu Jan 6 00:36:05 2005 Subject: [Python-Dev] Re: super() harmful? In-Reply-To: <9D5D97D6-5F6D-11D9-8D68-000A95A50FB2@fuhm.net> References: <1f7befae05010410576effd024@mail.gmail.com> <20050104154707.927B.JCARLSON@uci.edu> <9D5D97D6-5F6D-11D9-8D68-000A95A50FB2@fuhm.net> Message-ID: On Wed, 5 Jan 2005 18:00:38 -0500, James Y Knight wrote: > On Jan 5, 2005, at 1:23 PM, Guido van Rossum wrote: > >> The issue of mixing super() and explicit calls to the superclass's > >> method occur with any method. (Thus making it difficult/impossible for > >> a framework to convert to using super without breaking client code > >> that subclasses). > > > > Well, client classes which are leaves of the class tree can still > > safely use BaseClass.thisMethod(self, args) -- it's only classes that > > are written to be extended that must all be converted to using > > super(). So I'm not sure how you think your clients are breaking. > > See the section "Subclasses must use super if their superclasses do". > This is particularly a big issue with __init__. I see. I was thinking about subclassing a single class, you are talking about subclassing multiple bases. Subclassing two or more classes is *always* very subtle. Before 2.2 and super(), the only sane way to do that was to have all except one base class be written as a mix-in class for a specific base class (or family of base classes). The idea of calling both __init__ methods doesn't work if there's a diamond; if there *is* a diamond (or could be one), using super() is the only sane solution. > >> Adding optional arguments to one branch of the inheritance tree, but > >> not another, or adding different optional args in both branches. 
> >> (breaks unless you always pass optional args as keyword args, and all
> >> methods take **kwargs and pass that on to super).
> >
> > But that breaks anyway; I don't see how using the old
> > Base.method(self, args) approach makes this easier, *unless* you are
> > using single inheritance. If you're expecting single inheritance
> > anyway, why bother with super()?
>
> There is a distinction between simple multiple inheritance, which did
> work in the old system

Barely; see above.

> vs. multiple inheritance in a diamond structure
> which did not work in the old system. However, consider something like
> the following (ignore the Interface/implements bit if you want. It's
> just to point out a common situation where two classes can
> independently implement the same method without having a common
> superclass):
>
> class IFrob(Interface):
>     def frob():
>         """Frob the knob"""
>
> class A:
>     implements(IFrob)
>     def frob(self, foo=False):
>         print "A.frob(foo=%r)"%foo
>
> class B:
>     implements(IFrob)
>     def frob(self, bar=False):
>         print "B.frob(bar=%r)"%bar
>
> class C(A,B):
>     def m(self, foo=False, bar=False):  [I presume you meant frob instead of m here]
>         A.m(self, foo=foo)
>         B.m(self, bar=bar)
>         print "C.frob(foo=%r, bar=%r)"%(foo,bar)
>
> Now, how do you write that to use super?

The problem isn't in super(), the problem is that the classes A and B aren't written cooperatively, so attempting to combine them using multiple inheritance is asking for trouble. You'd be better off making C a container class that has separate A and B instances.

> > And using multiple inheritance the old way was not confusing? Surely
> > you are joking.
>
> It was pretty simple until you start having diamond structures. Then
> it's complicated. Now, don't get me wrong, I think that MRO-calculating
> mechanism really is "the right thing", in the abstract. I just think
> the way it works out as implemented in python is really confusing and
> it's easy to be worse off with it than without it.
So then don't use it. You couldn't have diamonds at all before 2.2.
With *care* and *understanding* you can do the right thing in 2.2 and
beyond. I'm getting tired of super() being blamed for the problems
inherent to cooperative multiple inheritance. super() is the tool that
you need to solve a hairy problem; but don't blame super() for the
problem's hairiness.

> > If they're happy with single inheritance, let them use super()
> > incorrectly. It works, and that's what counts. Their code didn't work
> > right with multiple inheritance before, it still doesn't. Some people
> > just are uncomfortable with calling Base.method(self, ...) and feel
> > super is "more correct". Let them.
>
> Their code worked right in M-I without diamonds before. Now it likely
> doesn't work in M-I at all.

If you have a framework with classes written using the old paradigm
that a subclass must call the __init__ (or frob) method of each of its
superclasses, you can't change your framework to use super() instead
while maintaining backwards compatibility. If you didn't realize that
before you made the change and then got bitten by it, tough.
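For contrast, here is how the quoted A/B/C example could be written cooperatively, in the style being defended here. The shared `Frobbable` root class and the keyword-only calling convention appear nowhere in the thread — they are illustrative conventions, not anything super() enforces — but they are what makes super() safe: each method consumes its own keywords and forwards the rest, and the root raises for anything left over.

```python
class Frobbable(object):
    """Root of the cooperative hierarchy; terminates the super() chain."""
    def frob(self, **kwargs):
        if kwargs:  # nobody above us consumed these keywords
            raise TypeError("unexpected arguments: %r" % (kwargs,))

class A(Frobbable):
    def frob(self, foo=False, **kwargs):
        self.foo = foo
        # Dispatches to the next class in the MRO, not necessarily Frobbable.
        super(A, self).frob(**kwargs)

class B(Frobbable):
    def frob(self, bar=False, **kwargs):
        self.bar = bar
        super(B, self).frob(**kwargs)

class C(A, B):
    def frob(self, **kwargs):
        super(C, self).frob(**kwargs)

c = C()
c.frob(foo=True, bar=True)  # A.frob and B.frob each run exactly once
```

In C's MRO (C, A, B, Frobbable), `super(A, self).frob` resolves to `B.frob` — exactly the cooperative dispatch that an explicit `A.frob(self, ...)`/`B.frob(self, ...)` pair of calls cannot express once a diamond is present. An unrecognized keyword (e.g. `c.frob(quaz=1)`) falls all the way through to the root and raises TypeError, which is the behaviour James asks for later in this thread.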
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Thu Jan 6 00:46:32 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu Jan 6 00:46:25 2005 Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in In-Reply-To: <8735A624-5F6F-11D9-AB1C-000A95BA5446@redivi.com> References: <41DB2C9A.4070800@v.loewis.de> <41DBA649.3080008@v.loewis.de> <41DC62EC.6060608@v.loewis.de> <01B2E2DC-5F66-11D9-AB1C-000A95BA5446@redivi.com> <41DC6C62.90101@v.loewis.de> <8735A624-5F6F-11D9-AB1C-000A95BA5446@redivi.com> Message-ID: <41DC7C58.6040006@v.loewis.de> Bob Ippolito wrote: > I just dug up some information I had written on this particular topic > but never published, if you're interested: > http://bob.pythonmac.org/archives/2005/01/05/versioned-frameworks- > considered-harmful/ Interesting. I don't get the part why "-undefined dynamic_lookup" is a good idea (and this is indeed what bothered me most to begin with). As you say, explicitly specifying the target .dylib should work as well, and it also does not require 10.3. Regards, Martin From martin at v.loewis.de Thu Jan 6 00:49:52 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu Jan 6 00:49:45 2005 Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in In-Reply-To: <98E121A6-5F70-11D9-94C7-000D934FF6B4@cwi.nl> References: <41DB2C9A.4070800@v.loewis.de> <98E121A6-5F70-11D9-94C7-000D934FF6B4@cwi.nl> Message-ID: <41DC7D20.4000901@v.loewis.de> Jack Jansen wrote: > But I do care:-) Specifically because I trust the crowd here to come up > with good ideas (even if they're not Mac users:-). Thanks a lot. > The "new" solution is basically to go back to the Unix way of building > an extension: link it against nothing and sort things out at runtime. 
> Not my personal preference, but at least we know that loading an > extension into one Python won't bring in a fresh copy of a different > interpreter or anything horrible like that. This sounds good, except that it only works on OS X 10.3, right? What about older versions? Regards, Martin From andrewm at object-craft.com.au Thu Jan 6 02:10:55 2005 From: andrewm at object-craft.com.au (Andrew McNamara) Date: Thu Jan 6 02:11:02 2005 Subject: [Python-Dev] csv module TODO list In-Reply-To: <41DC637A.5050105@v.loewis.de> References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> <41DBA7D0.80101@v.loewis.de> <41DBAF06.6020401@egenix.com> <20050105093414.00DFF3C8E5@coffee.object-craft.com.au> <41DC637A.5050105@v.loewis.de> Message-ID: <20050106011055.001163C8E5@coffee.object-craft.com.au> >>>>Can you please elaborate on that? What needs to be done, and how is >>>>that going to be done? It might be possible to avoid considerable >>>>uglification. >> >> I'm not altogether sure there. The parsing state machine is all written in >> C, and deals with signed chars - I expect we'll need two versions of that >> (or one version that's compiled twice using pre-processor macros). Quite >> a large job. Suggestions gratefully received. > >I'm still trying to understand what *needs* to be done - I would move to >how this is done only later. What APIs should be extended/changed, and >in what way? That's certainly the first step, and I have to admit that I don't have a clear idea at this time - the unicode issue has been in the "too hard" basket since we started. Marc-Andre Lemburg mentioned that he has encountered UTF-16 encoded csv files, so a reasonable starting point would be the ability to read and parse, as well as the ability to generate, one of these. The reader interface currently returns a row at a time, consuming as many lines from the supplied iterable (with the most common iterable being a file). 
This suggests to me that we will need an optional "encoding" argument
to the reader constructor, and that the reader will need to decode the
source lines. That said, I'm hardly a unicode expert, so I may be
overlooking something (could a utf-16 encoded character span a line
break, for example?). The writer interface probably should have
similar facilities.

However - a number of people have complained about the "iterator"
interface, wanting to supply strings (the iterable is necessary because
a CSV row can span multiple lines). It's also conceivable that the
source lines could already be unicode objects.

--
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/

From andrewm at object-craft.com.au Thu Jan 6 03:03:08 2005
From: andrewm at object-craft.com.au (Andrew McNamara)
Date: Thu Jan 6 03:03:13 2005
Subject: [Csv] Re: [Python-Dev] csv module TODO list
In-Reply-To: <20050106011055.001163C8E5@coffee.object-craft.com.au>
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
	<41DBA7D0.80101@v.loewis.de> <41DBAF06.6020401@egenix.com>
	<20050105093414.00DFF3C8E5@coffee.object-craft.com.au>
	<41DC637A.5050105@v.loewis.de>
	<20050106011055.001163C8E5@coffee.object-craft.com.au>
Message-ID: <20050106020308.EBE5A3C8E5@coffee.object-craft.com.au>

>>I'm still trying to understand what *needs* to be done - I would move to
>>how this is done only later. What APIs should be extended/changed, and
>>in what way?
[...]
>The reader interface currently returns a row at a time, consuming as many
>lines from the supplied iterable (with the most common iterable being
>a file). This suggests to me that we will need an optional "encoding"
>argument to the reader constructor, and that the reader will need to
>decode the source lines. That said, I'm hardly a unicode expert, so I
>may be overlooking something (could a utf-16 encoded character span a
>line break, for example?). The writer interface probably should have
>similar facilities.
Ah - I see that the codecs module provides an EncodedFile class - better to use this than add encoding/decoding cruft to the csv module. So, do we duplicate the current reader and writer as UnicodeReader and UnicodeWriter (how else do we know to use the unicode parser)? What about the "dialects"? I guess if a dialect uses no unicode strings, it can be applied to the current parser, but if it does include unicode strings, then the parser would need to raise an exception. The DictReader and DictWriter classes will probably need matching UnicodeDictReader/UnicodeDictWriter versions (use common base class, just specify alternate parser). -- Andrew McNamara, Senior Developer, Object Craft http://www.object-craft.com.au/ From ilya at bluefir.net Thu Jan 6 06:27:16 2005 From: ilya at bluefir.net (Ilya Sandler) Date: Thu Jan 6 06:24:30 2005 Subject: [Python-Dev] an idea for improving struct.unpack api Message-ID: A problem: The current struct.unpack api works well for unpacking C-structures where everything is usually unpacked at once, but it becomes inconvenient when unpacking binary files where things often have to be unpacked field by field. Then one has to keep track of offsets, slice the strings,call struct.calcsize(), etc... Eg. with a current api unpacking of a record which consists of a header followed by a variable number of items would go like this hdr_fmt="iiii" item_fmt="IIII" item_size=calcsize(item_fmt) hdr_size=calcsize(hdr_fmt) hdr=unpack(hdr_fmt, rec[0:hdr_size]) #rec is the record to unpack offset=hdr_size for i in range(hdr[0]): #assume 1st field of header is a counter item=unpack( item_fmt, rec[ offset: offset+item_size]) offset+=item_size which is quite inconvenient... A solution: We could have an optional offset argument for unpack(format, buffer, offset=None) the offset argument is an object which contains a single integer field which gets incremented inside unpack() to point to the next byte. 
so with a new API the above code could be written as

offset=struct.Offset(0)
hdr=unpack("iiii", rec, offset)
for i in range(hdr[0]):
    item=unpack("IIII", rec, offset)

When an offset argument is provided, unpack() should allow some bytes
to be left unpacked at the end of the buffer.

Does this suggestion make sense? Any better ideas?

Ilya

From bob at redivi.com Thu Jan 6 08:29:23 2005
From: bob at redivi.com (Bob Ippolito)
Date: Thu Jan 6 08:29:36 2005
Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in
In-Reply-To: <41DC7D20.4000901@v.loewis.de>
References: <41DB2C9A.4070800@v.loewis.de>
	<98E121A6-5F70-11D9-94C7-000D934FF6B4@cwi.nl>
	<41DC7D20.4000901@v.loewis.de>
Message-ID:

On Jan 5, 2005, at 18:49, Martin v. Löwis wrote:

> Jack Jansen wrote:
>> The "new" solution is basically to go back to the Unix way of
>> building an extension: link it against nothing and sort things out
>> at runtime. Not my personal preference, but at least we know that
>> loading an extension into one Python won't bring in a fresh copy of
>> a different interpreter or anything horrible like that.
>
> This sounds good, except that it only works on OS X 10.3, right?
> What about older versions?

Older versions do not support this feature and have to deal with the
way things are as-is. Mac OS X 10.2 is the only supported version that
suffers this consequence; I don't think anyone has supported Python on
Mac OS X 10.1 in quite some time.
-bob From bob at redivi.com Thu Jan 6 08:31:45 2005 From: bob at redivi.com (Bob Ippolito) Date: Thu Jan 6 08:31:50 2005 Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in In-Reply-To: <41DC7C58.6040006@v.loewis.de> References: <41DB2C9A.4070800@v.loewis.de> <41DBA649.3080008@v.loewis.de> <41DC62EC.6060608@v.loewis.de> <01B2E2DC-5F66-11D9-AB1C-000A95BA5446@redivi.com> <41DC6C62.90101@v.loewis.de> <8735A624-5F6F-11D9-AB1C-000A95BA5446@redivi.com> <41DC7C58.6040006@v.loewis.de> Message-ID: <04A1624E-5FB5-11D9-AB1C-000A95BA5446@redivi.com> On Jan 5, 2005, at 18:46, Martin v. L?wis wrote: > Bob Ippolito wrote: >> I just dug up some information I had written on this particular topic >> but never published, if you're interested: >> http://bob.pythonmac.org/archives/2005/01/05/versioned-frameworks- >> considered-harmful/ > > Interesting. I don't get the part why "-undefined dynamic_lookup" > is a good idea (and this is indeed what bothered me most to begin > with). > As you say, explicitly specifying the target .dylib should work as > well, and it also does not require 10.3. Without -undefined dynamic_lookup, your Python extensions are bound to a specific Python installation location (i.e. the system 2.3.0 and a user-installed 2.3.4). This tends to be quite a problem. With -undefined dynamic_lookup, they are not. 
Just search for "version mismatch" on pythonmac-sig: http://www.google.com/search?q=%22version+mismatch%22+pythonmac- sig+site:mail.python.org&ie=UTF-8&oe=UTF-8 -bob From python at rcn.com Thu Jan 6 08:33:39 2005 From: python at rcn.com (Raymond Hettinger) Date: Thu Jan 6 08:36:52 2005 Subject: [Python-Dev] an idea for improving struct.unpack api In-Reply-To: Message-ID: <002701c4f3c2$0a7d3800$e841fea9@oemcomputer> [Ilya Sandler] > A problem: > > The current struct.unpack api works well for unpacking C-structures where > everything is usually unpacked at once, but it > becomes inconvenient when unpacking binary files where things > often have to be unpacked field by field. Then one has to keep track > of offsets, slice the strings,call struct.calcsize(), etc... Yes. That bites. > Eg. with a current api unpacking of a record which consists of a > header followed by a variable number of items would go like this > > hdr_fmt="iiii" > item_fmt="IIII" > item_size=calcsize(item_fmt) > hdr_size=calcsize(hdr_fmt) > hdr=unpack(hdr_fmt, rec[0:hdr_size]) #rec is the record to unpack > offset=hdr_size > for i in range(hdr[0]): #assume 1st field of header is a counter > item=unpack( item_fmt, rec[ offset: offset+item_size]) > offset+=item_size > > which is quite inconvenient... > > > A solution: > > We could have an optional offset argument for > > unpack(format, buffer, offset=None) > > the offset argument is an object which contains a single integer field > which gets incremented inside unpack() to point to the next byte. > > so with a new API the above code could be written as > > offset=struct.Offset(0) > hdr=unpack("iiii", offset) > for i in range(hdr[0]): > item=unpack( "IIII", rec, offset) > > When an offset argument is provided, unpack() should allow some bytes to > be left unpacked at the end of the buffer.. > > > Does this suggestion make sense? Any better ideas? 
Rather than alter struct.unpack(), I suggest making a separate class
that tracks the offset and encapsulates some of the logic that
typically surrounds unpacking:

    r = StructReader(rec)
    hdr = r('iiii')
    for item in r.getgroups('IIII', times=hdr[0]):
        . . .

It would be especially nice if it handled the more complex case where
the next offset is determined in part by the data being read (see the
example in section 11.3 of the tutorial):

    r = StructReader(open('myfile.zip', 'rb'))
    for i in range(3):  # show the first 3 file headers
        fields = r.getgroup('LLLHH', offset=14)
        crc32, comp_size, uncomp_size, filenamesize, extra_size = fields
        filename = r.getgroup('c', offset=16, times=filenamesize)
        extra = r.getgroup('c', times=extra_size)
        r.advance(comp_size)
        print filename, hex(crc32), comp_size, uncomp_size

If you come up with something, I suggest posting it as an ASPN recipe
and then announcing it on comp.lang.python. That ought to generate some
good feedback based on other people's real world issues with
struct.unpack().

Raymond Hettinger

From foom at fuhm.net Thu Jan 6 08:46:11 2005
From: foom at fuhm.net (James Y Knight)
Date: Thu Jan 6 08:46:23 2005
Subject: [Python-Dev] Re: super() harmful?
In-Reply-To:
References: <1f7befae05010410576effd024@mail.gmail.com>
	<20050104154707.927B.JCARLSON@uci.edu>
	<9D5D97D6-5F6D-11D9-8D68-000A95A50FB2@fuhm.net>
Message-ID: <091248B6-5FB7-11D9-8D68-000A95A50FB2@fuhm.net>

On Jan 5, 2005, at 6:36 PM, Guido van Rossum wrote:

> The idea of calling both __init__ methods doesn't work if there's a
> diamond; if there *is* a diamond (or could be one), using super() is
> the only sane solution.

Very true.

> So then don't use it. You couldn't have diamonds at all before 2.2.
> With *care* and *understanding* you can do the right thing in 2.2 and
> beyond.
>
> I'm getting tired of super() being blamed for the problems inherent to
> cooperative multiple inheritance.
super() is the tool that you need to > solve a hairy problem; but don't blame super() for the problem's > hairiness. Please notice that I'm talking about concrete, real issues, not just a "super is bad!" rant. These are initially non-obvious (to me, at least) things that will actually happen in real code and that you actually do need to watch out for if you use super. Yes. It is a hard problem. However, the issues I talk about are not issues with the functionality and theory of calling the next method in an MRO, they are issues with the combination of MROs, the implementation of MRO-calling in python (via "super"), and current practices in writing python code. They are not inherent in cooperative multiple inheritance, but occur mostly because of its late addition to python, and the cumbersome way in which you have to invoke super. I wrote up the page as part of an investigation into converting Twisted to use super. I thought it would be a good idea to do the conversion, but others told me it would be a bad idea for backwards compatibility reasons. I did not believe, at first, and conducted experiments. In the end, I concluded that it is not possible, because of the issues with mixing the new and old paradigm. > If you have a framework with classes written using the old paradigm > that a subclass must call the __init__ (or frob) method of each of its > superclasses, you can't change your framework to use super() instead > while maintaining backwards compatibility. Yep, that's what I said, too. > If you didn't realize that > before you made the change and then got bitten by it, tough. Luckily, I didn't get bitten by it because I figured out the consequences and wrote a webpage about them before making an incorrect code change. Leaving behind the backwards compatibility issues... In order to make super really nice, it should be easier to use right. 
Again, the two major issues that cause problems are: 1) having to
declare every method with *args, **kwargs, and having to pass those and
all the arguments you take explicitly to super, and 2) that
traditionally __init__ is called with positional arguments.

To fix #1, it would be really nice if you could write code something
like the following snippet. Notice especially here that the 'bar'
argument gets passed through C.__init__ and A.__init__, into
D.__init__, without the previous two having to do anything about it.
However, if you ask me to detail how this could *possibly* *ever* work
in python, I have no idea. Probably the answer is that it can't.

class A(object):
    def __init__(self):
        print "A"
        next_method

class B(object):
    def __init__(self):
        print "B"
        next_method

class C(A):
    def __init__(self, foo):
        print "C","foo=",foo
        next_method
        self.foo=foo

class D(B):
    def __init__(self, bar):
        print "D", "bar=",bar
        next_method
        self.bar=bar

class E(C,D):
    def __init__(self, foo, bar):
        print "E"
        next_method

class E2(C,D):
    """Even worse, not defining __init__ should work right too."""

E(foo=10, bar=20)
E2(foo=10, bar=20)

# Yet, these ought to result in a TypeError because the quaz keyword
# isn't recognized by any __init__ method on any class in the hierarchy
# above E/E2:
E(foo=10, bar=20, quaz=5)
E2(foo=10, bar=20, quaz=5)

James

From aleax at aleax.it Thu Jan 6 09:33:20 2005
From: aleax at aleax.it (Alex Martelli)
Date: Thu Jan 6 09:33:24 2005
Subject: [Python-Dev] an idea for improving struct.unpack api
In-Reply-To:
References:
Message-ID: <9EF1E97A-5FBD-11D9-ADA4-000A95EFAE9E@aleax.it>

On 2005 Jan 06, at 06:27, Ilya Sandler wrote:
...
> We could have an optional offset argument for > > unpack(format, buffer, offset=None) I do agree on one concept here: when a function wants a string argument S, and the value for that string argument S is likely to come from some other bigger string Z as a subset Z[O:O+L], being able to optionally specify Z, O and L (or the endpoint, O+L), rather than having to do the slicing, can be a simplification and a substantial speedup. When I had this kind of problem in the past I approached it with the buffer built-in. Say I've slurped in a whole not-too-huge binary file into `data', and now need to unpack several pieces of it from different offsets; rather than: somestuff = struct.unpack(fmt, data[offs:offs+struct.calcsize(fmt)]) I can use: somestuff = struct.unpack(fmt, buffer(data, offs, struct.calcsize(fmt))) as a kind of "virtual slicing". Besides the vague-to-me "impending deprecation" state of the buffer builtin, there is some advantage, but it's a bit modest. If I could pass data and offs directly to struct.unpack and thus avoid churning of one-use readonly buffer objects I'd probably be happier. As for "passing offset implies the length is calcsize(fmt)" sub-concept, I find that slightly more controversial. It's convenient, but somewhat ambiguous; in other cases (e.g. string methods) passing a start/offset and no end/length means to go to the end. Maybe something more explicit, such as a length= parameter with a default of None (meaning "go to the end") but which can be explicitly passed as -1 to mean "use calcsize internally", might go down better. As for the next part: > the offset argument is an object which contains a single integer field > which gets incremented inside unpack() to point to the next byte. ...I find this just too "magical". It's only useful when you're specifically unpacking data bytes that are compactly back to back (no "filler" e.g. 
for alignment purposes) and pays some conceptual price -- introducing a
new specialized type to play the role of "mutable int" and having an
argument mutated, which is not usual in Python's library.

> so with a new API the above code could be written as
>
> offset=struct.Offset(0)
> hdr=unpack("iiii", offset)
> for i in range(hdr[0]):
>     item=unpack( "IIII", rec, offset)
>
> When an offset argument is provided, unpack() should allow some bytes
> to be left unpacked at the end of the buffer..
>
> Does this suggestion make sense? Any better ideas?

All in all, I suspect that something like...:

# out of the record-by-record loop:
hdrsize = struct.calcsize(hdr_fmt)
itemsize = struct.calcsize(item_fmt)
reclen = length_of_each_record

# loop record by record
while True:
    rec = binfile.read(reclen)
    if not rec:
        break
    hdr = struct.unpack(hdr_fmt, rec, 0, hdrsize)
    for offs in itertools.islice(xrange(hdrsize, reclen, itemsize), hdr[0]):
        item = struct.unpack(item_fmt, rec, offs, itemsize)
        # process item

might be a better compromise. More verbose, because more explicit, of
course. And if you do this kind of thing often, easy to encapsulate in
a generator with 4 parameters -- the two formats (header and item), the
record length, and the binfile -- just yield the hdr first, then each
struct.unpack result from the inner loop.

Having the offset and length parameters to struct.unpack might still be
a performance gain worth pursuing (of course, we'd need some
performance measurements from real-life use cases) even though from the
point of view of code simplicity, in this example, there appears to be
little or no gain wrt slicing rec[offs:offs+itemsize] or using
buffer(rec, offs, itemsize).
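The 4-parameter generator suggested above can be sketched against the current struct API, using plain slicing instead of the proposed offset/length arguments. `iter_records` is a name invented for this sketch (it is not a stdlib interface), and the assumption that the header's first field counts the items follows Ilya's original example:

```python
import io
import struct

def iter_records(binfile, hdr_fmt, item_fmt, reclen):
    """Yield (header, [items]) pairs from fixed-length records.

    Assumes the header's first field holds the number of items that
    follow it within the record.
    """
    hdr_size = struct.calcsize(hdr_fmt)
    item_size = struct.calcsize(item_fmt)
    while True:
        rec = binfile.read(reclen)
        if not rec:
            break
        hdr = struct.unpack(hdr_fmt, rec[:hdr_size])
        items = [struct.unpack(item_fmt, rec[offs:offs + item_size])
                 for offs in range(hdr_size,
                                   hdr_size + hdr[0] * item_size,
                                   item_size)]
        yield hdr, items

# A tiny demonstration record: count=2, followed by two unsigned items.
data = struct.pack("i", 2) + struct.pack("I", 7) + struct.pack("I", 9)
records = list(iter_records(io.BytesIO(data), "i", "I", len(data)))
# records == [((2,), [(7,), (9,)])]
```

With the offset/length extension being discussed, the slicing in the inner loop would disappear, but the shape of the generator would stay the same.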
Alex From anthony at interlink.com.au Thu Jan 6 11:28:26 2005 From: anthony at interlink.com.au (Anthony Baxter) Date: Thu Jan 6 11:28:21 2005 Subject: [Python-Dev] an idea for improving struct.unpack api In-Reply-To: <9EF1E97A-5FBD-11D9-ADA4-000A95EFAE9E@aleax.it> References: <9EF1E97A-5FBD-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: <200501062128.28905.anthony@interlink.com.au> My take on this: struct.pack/struct.unpack is already one of my least-favourite parts of the stdlib. Of the modules I use regularly, I pretty much only ever have to go back and re-read the struct (and re) documentation because they just won't fit in my brain. Adding additional complexity to them seems like a net loss to me. I'd _love_ to find the time to write a sane replacement for struct - as well as the current use case, I'd also like it to handle things like attribute-length-value 3-tuples nicely (where you get a fixed field which identifies the attribute, a fixed field which specifies the value length, and a value of 'length' bytes). Almost all sane network protocols (i.e. those written before the plague of pointy brackets) use this in some way. I'd much rather specify the format as something like a tuple of values - (INT, UINT, INT, STRING) (where INT &c are objects defined in the struct module). This also then allows users to specify their own formats if they have a particular need for something. Anthony -- Anthony Baxter It's never too late to have a happy childhood. 
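The attribute-length-value layout described above can already be walked in a few lines with today's struct. `iter_tlv` and the `"!BH"` default (1-byte tag, 2-byte length, network byte order) are illustrative choices for this sketch, not an existing API:

```python
import struct

def iter_tlv(data, fmt="!BH"):
    """Yield (tag, value) pairs from tag/length/value-encoded bytes.

    fmt describes the fixed tag and length fields; by default a 1-byte
    tag and a 2-byte length in network byte order.
    """
    head = struct.calcsize(fmt)
    offset = 0
    while offset < len(data):
        tag, length = struct.unpack(fmt, data[offset:offset + head])
        value = data[offset + head:offset + head + length]
        if len(value) != length:
            raise ValueError("truncated record at offset %d" % offset)
        yield tag, value
        offset += head + length

data = struct.pack("!BH", 1, 3) + b"abc" + struct.pack("!BH", 2, 0)
pairs = list(iter_tlv(data))
# pairs == [(1, b"abc"), (2, b"")]
```

The awkward part, as the message says, is that struct alone cannot express "a field whose size depends on an earlier field" in a single format string — the length-driven slice has to live outside the format, which is exactly the gap a higher-level replacement would fill.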
From p.f.moore at gmail.com Thu Jan 6 12:38:39 2005 From: p.f.moore at gmail.com (Paul Moore) Date: Thu Jan 6 12:38:41 2005 Subject: [Python-Dev] an idea for improving struct.unpack api In-Reply-To: <200501062128.28905.anthony@interlink.com.au> References: <9EF1E97A-5FBD-11D9-ADA4-000A95EFAE9E@aleax.it> <200501062128.28905.anthony@interlink.com.au> Message-ID: <79990c6b050106033816e8ea25@mail.gmail.com> On Thu, 6 Jan 2005 21:28:26 +1100, Anthony Baxter wrote: > My take on this: > > struct.pack/struct.unpack is already one of my least-favourite parts > of the stdlib. Of the modules I use regularly, I pretty much only ever > have to go back and re-read the struct (and re) documentation because > they just won't fit in my brain. Adding additional complexity to them > seems like a net loss to me. Have you looked at Thomas Heller's ctypes? Ignoring the FFI stuff, it has a fairly comprehensive interface for defining and using C structure types. A simple example: >>> class POINT(Structure): ... _fields_ = [('x', c_int), ('y', c_int)] ... 
>>> p = POINT(1,2) >>> p.x, p.y (1, 2) >>> str(buffer(p)) '\x01\x00\x00\x00\x02\x00\x00\x00' To convert *from* a byte string is messier, but not too bad: >>> s = str(buffer(p)) >>> s '\x01\x00\x00\x00\x02\x00\x00\x00' >>> p2 = POINT() >>> ctypes.memmove(p2, s, ctypes.sizeof(POINT)) 14688904 >>> p2.x, p2.y (1, 2) It might even be possible to get Thomas to add a small helper classmethod to ctypes types, something like POINT.unpack(str, offset=0, length=None) which does the equivalent of def unpack(cls, str, offset=0, length=None): if length is None: length=sizeof(cls) b = buffer(str, offset, length) new = cls() ctypes.memmove(new, b, length) return new > I'd _love_ to find the time to write a sane replacement for struct - as > well as the current use case, I'd also like it to handle things like > attribute-length-value 3-tuples nicely (where you get a fixed field > which identifies the attribute, a fixed field which specifies the value > length, and a value of 'length' bytes). Almost all sane network protocols > (i.e. those written before the plague of pointy brackets) use this in > some way. I'm not sure ctypes handles that, mainly because I don't think C does (without the usual trick of defining the last field as fixed length) Paul. From theller at python.net Thu Jan 6 13:22:52 2005 From: theller at python.net (Thomas Heller) Date: Thu Jan 6 13:21:32 2005 Subject: [Python-Dev] an idea for improving struct.unpack api In-Reply-To: <79990c6b050106033816e8ea25@mail.gmail.com> (Paul Moore's message of "Thu, 6 Jan 2005 11:38:39 +0000") References: <9EF1E97A-5FBD-11D9-ADA4-000A95EFAE9E@aleax.it> <200501062128.28905.anthony@interlink.com.au> <79990c6b050106033816e8ea25@mail.gmail.com> Message-ID: <4qhuzqzn.fsf@python.net> Paul Moore writes: > On Thu, 6 Jan 2005 21:28:26 +1100, Anthony Baxter > wrote: >> My take on this: >> >> struct.pack/struct.unpack is already one of my least-favourite parts >> of the stdlib. 
Of the modules I use regularly, I pretty much only ever >> have to go back and re-read the struct (and re) documentation because >> they just won't fit in my brain. Adding additional complexity to them >> seems like a net loss to me. > > Have you looked at Thomas Heller's ctypes? Ignoring the FFI stuff, it > has a fairly comprehensive interface for defining and using C > structure types. A simple example: > >>>> class POINT(Structure): > ... _fields_ = [('x', c_int), ('y', c_int)] > ... >>>> p = POINT(1,2) >>>> p.x, p.y > (1, 2) >>>> str(buffer(p)) > '\x01\x00\x00\x00\x02\x00\x00\x00' > > To convert *from* a byte string is messier, but not too bad: [...] For reading structures from files, the undocumented (*) readinto method is very nice. An example: class IMAGE_DOS_HEADER(Structure): .... class IMAGE_NT_HEADERS(Structure): .... class PEReader(object): def read_image(self, pathname): ################ # the MSDOS header image = open(pathname, "rb") self.dos_header = IMAGE_DOS_HEADER() image.readinto(self.dos_header) ################ # The PE header image.seek(self.dos_header.e_lfanew) self.nt_headers = IMAGE_NT_HEADERS() image.readinto(self.nt_headers) > It might even be possible to get Thomas to add a small helper > classmethod to ctypes types, something like > > POINT.unpack(str, offset=0, length=None) Maybe, but I would prefer the unbeloved buffer object (*) as argument, because it has builtin offset and length. > which does the equivalent of > > def unpack(cls, str, offset=0, length=None): > if length is None: > length=sizeof(cls) > b = buffer(str, offset, length) > new = cls() > ctypes.memmove(new, b, length) > return new > >> I'd _love_ to find the time to write a sane replacement for struct - as >> well as the current use case, I'd also like it to handle things like >> attribute-length-value 3-tuples nicely (where you get a fixed field >> which identifies the attribute, a fixed field which specifies the value >> length, and a value of 'length' bytes). 
Almost all sane network protocols >> (i.e. those written before the plague of pointy brackets) use this in >> some way. > > I'm not sure ctypes handles that, mainly because I don't think C does > (without the usual trick of defining the last field as fixed length) Correct. (*) Which brings me to the questions I have in my mind for quite some time: Why is readinto undocumented, and what about the status of the buffer object: do the recent fixes to the buffer object change it's status? Thomas From ncoghlan at iinet.net.au Thu Jan 6 13:53:48 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Thu Jan 6 13:53:52 2005 Subject: [Python-Dev] Subscribing to PEP updates Message-ID: <41DD34DC.5010005@iinet.net.au> Someone asked on python-list about getting notifications of changes to PEP's. As a low-effort solution, would it be possible to add a Sourceforge mailing list hook just for checkins to the nondist/peps directory? Call it python-pep-updates or some such beast. If I remember how checkin notifications work correctly, the updates would even come with automatic diffs :) Cheers, Nick. -- Nick Coghlan | ncoghlan@email.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From Jack.Jansen at cwi.nl Thu Jan 6 14:04:34 2005 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Thu Jan 6 14:04:46 2005 Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in In-Reply-To: <41DC7D20.4000901@v.loewis.de> References: <41DB2C9A.4070800@v.loewis.de> <98E121A6-5F70-11D9-94C7-000D934FF6B4@cwi.nl> <41DC7D20.4000901@v.loewis.de> Message-ID: <834E1A5F-5FE3-11D9-A99E-000A958D1666@cwi.nl> On 6 Jan 2005, at 00:49, Martin v. L?wis wrote: >> The "new" solution is basically to go back to the Unix way of >> building an extension: link it against nothing and sort things out >> at runtime. 
Not my personal preference, but at least we know that >> loading an extension into one Python won't bring in a fresh copy of >> a different interpreter or anything horrible like that. > > This sounds good, except that it only works on OS X 10.3, right? > What about older versions? 10.3 or later. For older OSX releases (either because you build Python on 10.2 or earlier, or because you've set MACOSX_DEPLOYMENT_TARGET to a value of 10.2 or less) we use the old behaviour of linking with "-framework Python". -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From mwh at python.net Thu Jan 6 14:17:40 2005 From: mwh at python.net (Michael Hudson) Date: Thu Jan 6 14:17:42 2005 Subject: [Python-Dev] an idea for improving struct.unpack api In-Reply-To: (Ilya Sandler's message of "Wed, 5 Jan 2005 21:27:16 -0800 (PST)") References: Message-ID: <2mr7kyhf2j.fsf@starship.python.net> Ilya Sandler writes: > A problem: > > The current struct.unpack api works well for unpacking C-structures where > everything is usually unpacked at once, but it > becomes inconvenient when unpacking binary files where things > often have to be unpacked field by field. Then one has to keep track > of offsets, slice the strings,call struct.calcsize(), etc... IMO (and E), struct.unpack is the primitive atop which something more sensible is built. I've certainly tried to build that more sensible thing at least once, but haven't ever got the point of believing what I had would be applicable to the general case... maybe it's time to write such a thing for the standard library. Cheers, mwh -- ARTHUR: Ford, you're turning into a penguin, stop it. 
-- The Hitch-Hikers Guide to the Galaxy, Episode 2 From goodger at python.org Thu Jan 6 15:01:42 2005 From: goodger at python.org (David Goodger) Date: Thu Jan 6 15:01:55 2005 Subject: [Python-Dev] Re: Subscribing to PEP updates In-Reply-To: <41DD34DC.5010005@iinet.net.au> References: <41DD34DC.5010005@iinet.net.au> Message-ID: <41DD44C6.4010909@python.org> [Nick Coghlan] > Someone asked on python-list about getting notifications of changes to > PEP's. > > As a low-effort solution, would it be possible to add a Sourceforge > mailing list hook just for checkins to the nondist/peps directory? -0 Probably possible, but not no-effort, so even if it gets a favorable reaction someone needs to do some work. Why not just subscribe to python-checkins and filter out everything *but* nondist/peps? As PEP editor, that's what I do (although I filter manually/visually, since I'm also interested in other checkins). -- David Goodger -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 253 bytes Desc: OpenPGP digital signature Url : http://mail.python.org/pipermail/python-dev/attachments/20050106/adad03d4/signature.pgp From gjc at inescporto.pt Thu Jan 6 16:21:36 2005 From: gjc at inescporto.pt (Gustavo J. A. M. Carneiro) Date: Thu Jan 6 16:22:08 2005 Subject: [Python-Dev] an idea for improving struct.unpack api In-Reply-To: <2mr7kyhf2j.fsf@starship.python.net> References: <2mr7kyhf2j.fsf@starship.python.net> Message-ID: <1105024896.25031.12.camel@localhost> On Thu, 2005-01-06 at 13:17 +0000, Michael Hudson wrote: > Ilya Sandler writes: > > > A problem: > > > > The current struct.unpack api works well for unpacking C-structures where > > everything is usually unpacked at once, but it > > becomes inconvenient when unpacking binary files where things > > often have to be unpacked field by field. Then one has to keep track > > of offsets, slice the strings,call struct.calcsize(), etc... 
> > IMO (and E), struct.unpack is the primitive atop which something more > sensible is built. I've certainly tried to build that more sensible > thing at least once, but haven't ever got the point of believing what > I had would be applicable to the general case... maybe it's time to > write such a thing for the standard library. I've been using this simple wrapper: def stream_unpack(stream, format): return struct.unpack(format, stream.read(struct.calcsize(format))) It works with file-like objects, such as file, StringIO, socket.makefile(), etc. Working with streams is useful because sometimes you don't know how much you need to read to decode a message in advance. Regards. > > Cheers, > mwh > -- Gustavo J. A. M. Carneiro The universe is always one step beyond logic. -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/x-pkcs7-signature Size: 3086 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20050106/349cb526/smime.bin From martin at v.loewis.de Thu Jan 6 17:05:05 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu Jan 6 17:04:59 2005 Subject: [Python-Dev] csv module TODO list In-Reply-To: <20050106011055.001163C8E5@coffee.object-craft.com.au> References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> <41DBA7D0.80101@v.loewis.de> <41DBAF06.6020401@egenix.com> <20050105093414.00DFF3C8E5@coffee.object-craft.com.au> <41DC637A.5050105@v.loewis.de> <20050106011055.001163C8E5@coffee.object-craft.com.au> Message-ID: <41DD61B1.1030507@v.loewis.de> Andrew McNamara wrote: > Marc-Andre Lemburg mentioned that he has encountered UTF-16 encoded csv > files, so a reasonable starting point would be the ability to read and > parse, as well as the ability to generate, one of these. I see. That would be reasonable, indeed. Notice that this is not so much a "Unicode issue", but more an "encoding" issue. 
If you solve the "arbitrary encodings" problem, you solve UTF-16 as a
side effect.

> The reader interface currently returns a row at a time, consuming as many
> lines from the supplied iterable (with the most common iterable being
> a file). This suggests to me that we will need an optional "encoding"
> argument to the reader constructor, and that the reader will need to
> decode the source lines.

Ok. In this context, I see two possible implementation strategies:

1. Implement the csv module twice: once for bytes, and once for
   Unicode characters. It is likely that the source code would be the
   same for each case; you just need to make sure the "Dialect and
   Formatting Parameters" change their width accordingly. If you use
   the SRE approach, you would do

   #define CSV_ITEM_T char
   #define CSV_NAME_PREFIX byte_
   #include "csvimpl.c"

   #define CSV_ITEM_T Py_Unicode
   #define CSV_NAME_PREFIX unicode_
   #include "csvimpl.c"

2. Use just the existing _csv module, and represent non-byte encodings
   as UTF-8. This will work as long as the delimiters and other markup
   characters always occupy a single byte in UTF-8, which is the case
   for "':\, as well as for \r and \n. Then, when processing using an
   explicit encoding, first convert the input into Unicode objects.
   Then encode the Unicode objects into UTF-8, and pass it to _csv.
   For the results you get back, convert each element back from UTF-8
   to a Unicode object.
This could be implemented as def reader(f, encoding=None): if encoding is None: return _csv.reader(f) enc, dec, reader, writer = codecs.lookup(encoding) utf8_enc, utf8_dec, utf8_r, utf8_w = codecs.lookup("UTF-8") # Make a recoder which can only read utf8_stream = codecs.StreamRecoder(f, utf8_enc, None, Reader, None) csv_reader = _csv.reader(utf8_stream) # For performance reasons, map_result could be implemented in C def map_result(t): result = [None]*len(t) for i, val in enumerate(t): result[i] = utf8_dec(val) return tuple(result) return itertools.imap(map_result, csv_reader) # This code is untested This approach has the disadvantage of performing three recodings: from input charset to Unicode, from Unicode to UTF-8, from UTF-8 to Unicode. One could: - skip the initial recoding if the encoding is already known to be _csv-safe (i.e. if it is a pure ASCII superset). This would be valid for ASCII, iso-8859-n, UTF-8, ... - offer the user to keep the results in the input encoding, instead of always returning Unicode objects. Apart from this disadvantage, I think this gives people what they want: they can specify the encoding of the input, and they get the results not only csv-separated, but also unicode-decode. This approach is the same that is used for Python source code encodings: the source is first recoded into UTF-8, then parsed, then recoded back. > That said, I'm hardly a unicode expert, so I > may be overlooking something (could a utf-16 encoded character span a > line break, for example). This cannot happen: \r, in UTF-16, is also 2 bytes (0D 00, if UTF-16LE). There are issues that Unicode has additional line break characters, which is probably irrelevant. Regards, Martin From ajm at flonidan.dk Thu Jan 6 17:22:12 2005 From: ajm at flonidan.dk (Anders J. Munch) Date: Thu Jan 6 17:22:37 2005 Subject: [Python-Dev] csv module TODO list Message-ID: <6D9E824FA10BD411BE95000629EE2EC3C6DE3C@FLONIDAN-MAIL> Andrew McNamara wrote: > > I'm not altogether sure there. 
The parsing state machine is all > written in C, and deals with signed chars - I expect we'll need two > versions of that (or one version that's compiled twice using > pre-processor macros). Quite a large job. Suggestions gratefully > received. How about using UTF-8 internally? Change nothing in _csv.c, but in csv.py encode/decode any unicode strings into UTF-8 on the way to/from _csv. File-like objects passed in by the user can be wrapped in proxies that take care of encoding and decoding user strings, as well as trans-coding between UTF-8 and the users chosen file encoding. All that coding work may slow things down, but your original fast _csv module will still be there when you need it. - Anders From mchermside at ingdirect.com Thu Jan 6 17:33:54 2005 From: mchermside at ingdirect.com (Chermside, Michael) Date: Thu Jan 6 17:33:58 2005 Subject: [Python-Dev] Re: Subscribing to PEP updates Message-ID: <0CFFADBB825C6249A26FDF11C1772AE101EDEB49@ingdexj1.ingdirect.com> > Why not just subscribe to > python-checkins and filter out everything *but* nondist/peps? But there are lots of people who might be interested in following PEP updates but not other checkins. Pretty much anyone who considers themselves a "user" of Python not a developer. Perhaps they don't even know C. That's a lot to filter through for such people. (After all, I sure HOPE that only a small fraction of checkins are for PEPs not code.) I'm +0 on it... but I'll mention that if such a list were created I'd subscribe. So maybe that's +0.2 instead. -- Michael Chermside This email may contain confidential or privileged information. If you believe you have received the message in error, please notify the sender and delete the message without copying or disclosing it. From gvanrossum at gmail.com Thu Jan 6 18:13:54 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu Jan 6 18:13:57 2005 Subject: [Python-Dev] Re: super() harmful? 
In-Reply-To: <091248B6-5FB7-11D9-8D68-000A95A50FB2@fuhm.net> References: <1f7befae05010410576effd024@mail.gmail.com> <20050104154707.927B.JCARLSON@uci.edu> <9D5D97D6-5F6D-11D9-8D68-000A95A50FB2@fuhm.net> <091248B6-5FB7-11D9-8D68-000A95A50FB2@fuhm.net> Message-ID: > Please notice that I'm talking about concrete, real issues, not just a > "super is bad!" rant. Then why is the title "Python's Super Considered Harmful" ??? Here's my final offer. Change the title to something like "Multiple Inheritance Pitfalls in Python" and nobody will get hurt. > They are not inherent in cooperative > multiple inheritance, but occur mostly because of its late addition to python, Would you rather not have seen it (== cooperative inheritance) added at all? > and the cumbersome way in which you have to invoke super. Given Python's dynamic nature I couldn't think of a way to make it less cumbersome. I see you tried (see below) and couldn't either. At this point I tend to say "put up or shut up." > I wrote up the page as part of an investigation into converting Twisted > to use super. I thought it would be a good idea to do the conversion, > but others told me it would be a bad idea for backwards compatibility > reasons. I did not believe, at first, and conducted experiments. In the > end, I concluded that it is not possible, because of the issues with > mixing the new and old paradigm. So it has nothing to do with the new paradigm, just with backwards compatibility. I appreciate those issues (more than you'll ever know) but I don't see why you should try to discourage others from using the new paradigm, which is what your article appears to do. > Leaving behind the backwards compatibility issues... > > In order to make super really nice, it should be easier to use right. 
> Again, the two major issues that cause problems are: 1) having to
> declare every method with *args, **kwargs, and having to pass those and
> all the arguments you take explicitly to super,

That's only an issue with __init__ or with code written without
cooperative MI in mind. When using cooperative MI, you shouldn't
redefine method signatures, and all is well.

> and 2) that
> traditionally __init__ is called with positional arguments.

Cooperative MI doesn't have a really good solution for __init__.
Defining and calling __init__ only with keyword arguments is a good
solution. But griping about "traditionally" is a backwards
compatibility issue, which you said you were leaving behind.

> To fix #1, it would be really nice if you could write code something
> like the following snippet. Notice especially here that the 'bar'
> argument gets passed through C.__init__ and A.__init__, into
> D.__init__, without the previous two having to do anything about it.
> However, if you ask me to detail how this could *possibly* *ever* work
> in python, I have no idea. Probably the answer is that it can't.

Exactly. What is your next_method statement supposed to do?

No need to reply except when you've changed the article. I'm tired of
the allegations.
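The keyword-arguments convention for cooperative __init__ can be sketched
in a few lines (the class names here are purely illustrative, not code
from the thread): each __init__ binds only the keywords it knows and
forwards the rest up the MRO, so an argument can pass untouched through
classes that never mention it.

```python
class Base(object):
    def __init__(self, **kwargs):
        # End of the cooperative chain; object.__init__ takes no extras.
        super(Base, self).__init__()
        self.leftover = kwargs

class A(Base):
    def __init__(self, a=None, **kwargs):
        self.a = a
        super(A, self).__init__(**kwargs)

class B(Base):
    def __init__(self, b=None, **kwargs):
        self.b = b
        super(B, self).__init__(**kwargs)

class C(A, B):
    def __init__(self, c=None, **kwargs):
        self.c = c
        super(C, self).__init__(**kwargs)

# 'b' flows through C.__init__ and A.__init__ to B.__init__ even though
# neither of the first two declares it.
obj = C(a=1, b=2, c=3)
```

Note that no class redefines the (keyword-only) signature, which is
exactly the constraint being argued for above.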
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From foom at fuhm.net Thu Jan 6 18:22:36 2005 From: foom at fuhm.net (James Y Knight) Date: Thu Jan 6 18:22:34 2005 Subject: [Python-Dev] buffer objects [was: an idea for improving struct.unpack api] In-Reply-To: <4qhuzqzn.fsf@python.net> References: <9EF1E97A-5FBD-11D9-ADA4-000A95EFAE9E@aleax.it> <200501062128.28905.anthony@interlink.com.au> <79990c6b050106033816e8ea25@mail.gmail.com> <4qhuzqzn.fsf@python.net> Message-ID: <8F37C34F-6007-11D9-8D68-000A95A50FB2@fuhm.net> On Jan 6, 2005, at 7:22 AM, Thomas Heller wrote: > (*) Which brings me to the questions I have in my mind for quite some > time: Why is readinto undocumented, and what about the status of the > buffer object: do the recent fixes to the buffer object change it's > status? I, for one, would be very unhappy if the byte buffer object were to go away. It's quite useful. I didn't even realize readinto existed. It'd be great to add more of them. os.readinto for reading from fds and socket.socket.recvinto for reading from sockets. Is there any reason the writable buffer interface isn't exposed to python-land? James From pje at telecommunity.com Thu Jan 6 18:44:58 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu Jan 6 18:46:05 2005 Subject: [Python-Dev] Re: super() harmful? In-Reply-To: <091248B6-5FB7-11D9-8D68-000A95A50FB2@fuhm.net> References: <1f7befae05010410576effd024@mail.gmail.com> <20050104154707.927B.JCARLSON@uci.edu> <9D5D97D6-5F6D-11D9-8D68-000A95A50FB2@fuhm.net> Message-ID: <5.1.1.6.0.20050106122714.02135920@mail.telecommunity.com> At 02:46 AM 1/6/05 -0500, James Y Knight wrote: >To fix #1, it would be really nice if you could write code something like >the following snippet. Notice especially here that the 'bar' argument gets >passed through C.__init__ and A.__init__, into D.__init__, without the >previous two having to do anything about it. 
However, if you ask me to >detail how this could *possibly* *ever* work in python, I have no idea. >Probably the answer is that it can't. > >class A(object): > def __init__(self): > print "A" > next_method > >class B(object): > def __init__(self): > print "B" > next_method Not efficiently, no, but it's *possible*. Just write a 'next_method()' routine that walks the frame stack and self's MRO, looking for a match. You know the method name from f_code.co_name, and you can check each class' __dict__ until you find a function or classmethod object whose code is f_code. If not, move up to the next frame and try again. Once you know the class that the function comes from, you can figure out the "next" method, and pull its args from the calling frame's args, walking backward to other calls on the same object, until you find all the args you need. Oh, and don't forget to make sure that you're inspecting frames that have the same 'self' object. Of course, the result would be a hideous evil ugly hack that should never see the light of day, but you could *do* it, if you *really really* wanted to. And if you wrote it in C, it might be only 50 or 100 times slower than super(). :) From barry at python.org Thu Jan 6 19:01:51 2005 From: barry at python.org (Barry Warsaw) Date: Thu Jan 6 19:02:02 2005 Subject: [Python-Dev] Re: Subscribing to PEP updates In-Reply-To: <0CFFADBB825C6249A26FDF11C1772AE101EDEB49@ingdexj1.ingdirect.com> References: <0CFFADBB825C6249A26FDF11C1772AE101EDEB49@ingdexj1.ingdirect.com> Message-ID: <1105034511.10728.3.camel@geddy.wooz.org> On Thu, 2005-01-06 at 11:33, Chermside, Michael wrote: > > Why not just subscribe to > > python-checkins and filter out everything *but* nondist/peps? > > But there are lots of people who might be interested in > following PEP updates but not other checkins. Pretty > much anyone who considers themselves a "user" of Python > not a developer. Perhaps they don't even know C. That's a > lot to filter through for such people. 
(After all, I > sure HOPE that only a small fraction of checkins are for > PEPs not code.) > > I'm +0 on it... but I'll mention that if such a list were > created I'd subscribe. So maybe that's +0.2 instead. As an experiment, I just added a PEP topic to the python-checkins mailing list. You could subscribe to this list and just select the PEP topic (which matches the regex "PEP" in the Subject header or first few lines of the body). Give it a shot and let's see if that does the trick. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050106/d1ebd9f4/attachment.pgp From tjreedy at udel.edu Thu Jan 6 20:16:23 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Thu Jan 6 20:16:38 2005 Subject: [Python-Dev] Re: super() harmful? References: <1f7befae05010410576effd024@mail.gmail.com><20050104154707.927B.JCARLSON@uci.edu><9D5D97D6-5F6D-11D9-8D68-000A95A50FB2@fuhm.net> <091248B6-5FB7-11D9-8D68-000A95A50FB2@fuhm.net> Message-ID: "James Y Knight" wrote in message news:091248B6-5FB7-11D9-8D68-000A95A50FB2@fuhm.net... > Please notice that I'm talking about concrete, real issues, not just a > "super is bad!" rant. Umm, James, come on. Let's be really real and concrete ;-). Your title "Python's Super Considered Harmful" is an obvious reference to and takeoff on Dijkstra's influential polemic "Goto Considered Harmful". To me, the obvious message therefore is that super(), like goto, is an ill-conceived monstrosity that warps peoples' minds and should be banished. I can also see a slight dig at Guido for introducing such a thing decades after Dijkstra taught us to know better. If that is your summary message for me, fine. If not, try something else. The title of a piece is part of its message -- especially when it has an intelligible meaning. 
For people who read the title in, for instance, a clp post (as I did), but don't follow the link and read what is behind the title (which I did do), the title *is* the message. Terry J. Reedy From janssen at parc.com Thu Jan 6 20:25:34 2005 From: janssen at parc.com (Bill Janssen) Date: Thu Jan 6 20:25:49 2005 Subject: [Python-Dev] Re: super() harmful? In-Reply-To: Your message of "Thu, 06 Jan 2005 09:13:54 PST." Message-ID: <05Jan6.112539pst."58617"@synergy1.parc.xerox.com> > Then why is the title "Python's Super Considered Harmful" ??? > > Here's my final offer. Change the title to something like "Multiple > Inheritance Pitfalls in Python" and nobody will get hurt. Or better yet, considering the recent thread on Python marketing, "Multiple Inheritance Mastery in Python" :-). Bill From bac at OCF.Berkeley.EDU Thu Jan 6 20:29:45 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Thu Jan 6 20:30:05 2005 Subject: [Python-Dev] Subscribing to PEP updates In-Reply-To: <41DD34DC.5010005@iinet.net.au> References: <41DD34DC.5010005@iinet.net.au> Message-ID: <41DD91A9.1000901@ocf.berkeley.edu> Nick Coghlan wrote: > Someone asked on python-list about getting notifications of changes to > PEP's. > > As a low-effort solution, would it be possible to add a Sourceforge > mailing list hook just for checkins to the nondist/peps directory? > > Call it python-pep-updates or some such beast. If I remember how checkin > notifications work correctly, the updates would even come with automatic > diffs :) > Probably not frequent or comprehensive enough, but I try to always have at least a single news item that clumps all PEP updates that python-dev gets notified about. 
-Brett From bob at redivi.com Thu Jan 6 20:38:56 2005 From: bob at redivi.com (Bob Ippolito) Date: Thu Jan 6 20:39:04 2005 Subject: [Python-Dev] an idea for improving struct.unpack api In-Reply-To: <2mr7kyhf2j.fsf@starship.python.net> References: <2mr7kyhf2j.fsf@starship.python.net> Message-ID: <9AB33D7A-601A-11D9-AB1C-000A95BA5446@redivi.com> On Jan 6, 2005, at 8:17, Michael Hudson wrote: > Ilya Sandler writes: > >> A problem: >> >> The current struct.unpack api works well for unpacking C-structures >> where >> everything is usually unpacked at once, but it >> becomes inconvenient when unpacking binary files where things >> often have to be unpacked field by field. Then one has to keep track >> of offsets, slice the strings,call struct.calcsize(), etc... > > IMO (and E), struct.unpack is the primitive atop which something more > sensible is built. I've certainly tried to build that more sensible > thing at least once, but haven't ever got the point of believing what > I had would be applicable to the general case... maybe it's time to > write such a thing for the standard library. This is my ctypes-like attempt at a high-level interface for struct. It works well for me in macholib: http://svn.red-bean.com/bob/py2app/trunk/src/macholib/ptypes.py -bob From tim.peters at gmail.com Thu Jan 6 20:47:35 2005 From: tim.peters at gmail.com (Tim Peters) Date: Thu Jan 6 20:47:37 2005 Subject: [Python-Dev] Re: super() harmful? In-Reply-To: <4494809200119597707@unknownmsgid> References: <4494809200119597707@unknownmsgid> Message-ID: <1f7befae05010611474d76bebd@mail.gmail.com> [Guido] >> Then why is the title "Python's Super Considered Harmful" ??? >> >> Here's my final offer. Change the title to something like "Multiple >> Inheritance Pitfalls in Python" and nobody will get hurt. [Bill Janssen] > Or better yet, considering the recent thread on Python marketing, > "Multiple Inheritance Mastery in Python" :-). 
I'm sorry, but that's not good marketing -- it contains big words, and
putting the brand name last is ineffective. How about

    Python's Super() is Super -- Over 1528.7% Faster than C!

BTW, it's important that fractional percentages end with an odd digit.
Research shows that if the last digit is even, 34.1% of consumers tend
to suspect the number was made up.

From bac at OCF.Berkeley.EDU  Thu Jan  6 20:50:22 2005
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Thu Jan  6 20:50:40 2005
Subject: [Python-Dev] proto-pep: How to change CPython's bytecode
In-Reply-To: <41CC7F67.9070009@ocf.berkeley.edu>
References: <41CC7F67.9070009@ocf.berkeley.edu>
Message-ID: <41DD967E.3040405@ocf.berkeley.edu>

OK, latest update with all suggested revisions (mentioning that this is
for CPython, plus a section on known previous bytecode work). If no one
has any revisions I will submit to David for official PEP acceptance
this weekend.

----------------------------------

PEP: XXX
Title: How to change CPython's bytecode
Version: $Revision: 1.4 $
Last-Modified: $Date: 2003/09/22 04:51:50 $
Author: Brett Cannon
Status: Draft
Type: Informational
Content-Type: text/x-rst
Created: XX-XXX-XXXX
Post-History: XX-XXX-XXXX

Abstract
========

Python source code is compiled down to something called bytecode. This
bytecode must implement enough semantics to perform the actions
required by the Language Reference [#lang-ref]_. As such, knowing how
to add, remove, or change the bytecode is important to do properly
when changing the abilities of the Python language. This PEP covers
how to accomplish this in the CPython implementation of the language
(referred to as simply "Python" for the rest of this PEP).

Rationale
=========

While changing Python's bytecode is not a frequent occurrence, it
still happens. Having the required steps documented in a single
location should make experimentation with the bytecode easier since
it is not necessarily obvious what the steps are to change the
bytecode.
This PEP, paired with PEP 306 [#PEP-306]_, should provide enough basic
guidelines for handling any changes performed to the Python language
itself in terms of syntactic changes that introduce new semantics.

Checklist
=========

This is a rough checklist of what files need to change and how they
are involved with the bytecode. All paths are given relative to
``/cvsroot/python/dist/src`` (as checked out from CVS). This list
should not be considered exhaustive nor to cover all possible
situations.

- ``Include/opcode.h``

  This include file lists all known opcodes and associates each opcode
  name with a unique number. When adding a new opcode it is important
  to take note of the ``HAVE_ARGUMENT`` value. This ``#define`` marks
  the dividing line: all opcodes with a value greater than
  ``HAVE_ARGUMENT`` are expected to take an argument.

- ``Lib/opcode.py``

  Lists all of the opcodes and their associated values. Used by the
  dis module [#dis]_ to map bytecode values to their names.

- ``Python/ceval.c``

  Contains the main interpreter loop. Code to handle the evaluation of
  an opcode goes here.

- ``Python/compile.c``

  To make sure an opcode is actually used, this file must be altered.
  The emitting of all bytecode occurs here.

- ``Lib/compiler/pyassem.py``, ``Lib/compiler/pycodegen.py``

  The 'compiler' package [#compiler]_ needs to be altered to also
  reflect any changes to the bytecode.

- ``Doc/lib/libdis.tex``

  The documentation [#opcode-list]_ for the dis module contains a
  complete list of all the opcodes.

- ``Python/import.c``

  Defines the magic word (named ``MAGIC``) used in .pyc files to
  detect if the bytecode used matches the one used by the version of
  Python running. This number needs to be changed to make sure that
  the running interpreter does not try to execute bytecode that it
  does not know about.
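While working through the checklist, the opcode tables can be
cross-checked from a running interpreter using the stdlib ``opcode``
and ``dis`` modules themselves (a sketch only; the exact opcode names
emitted for a given function vary between Python versions):

```python
import dis
import opcode

# Lib/opcode.py exposes the name -> number table that mirrors
# Include/opcode.h; every opcode number should be unique.
assert len(opcode.opmap) == len(set(opcode.opmap.values()))

# Disassembling a trivial function shows which opcode names the
# compiler actually emits for it.
def trivial():
    return 1

names = [ins.opname for ins in dis.get_instructions(trivial)]
print(names)
```

This is a handy sanity check that ``Include/opcode.h`` and
``Lib/opcode.py`` have stayed in sync after an edit.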
Suggestions for bytecode development
====================================

A few things can be done to make sure that development goes smoothly
when experimenting with Python's bytecode. One is to delete all
.py(c|o) files after each semantic change to ``Python/compile.c``.
That way all files will use any bytecode changes.

Make sure to run the entire testing suite [#test-suite]_. Since the
``regrtest.py`` driver recompiles all source code before a test is
run, it acts as a good test to make sure that no existing semantics
are broken. Running parrotbench [#parrotbench]_ is also a good way to
make sure existing semantics are not broken; this benchmark is
practically a compliance test.

Previous experiments
====================

Skip Montanaro presented a paper at a Python workshop on a peephole
optimizer [#skip-peephole]_.

Michael Hudson has a non-active SourceForge project named
Bytecodehacks [#Bytecodehacks]_ that provides functionality for
playing with bytecode directly.

References
==========

.. [#lang-ref] Python Language Reference, van Rossum & Drake
   (http://docs.python.org/ref/ref.html)

.. [#PEP-306] PEP 306, How to Change Python's Grammar, Hudson
   (http://www.python.org/peps/pep-0306.html)

.. [#dis] dis Module
   (http://docs.python.org/lib/module-dis.html)

.. [#test-suite] 'test' package
   (http://docs.python.org/lib/module-test.html)

.. [#parrotbench] Parrotbench
   (ftp://ftp.python.org/pub/python/parrotbench/parrotbench.tgz,
   http://mail.python.org/pipermail/python-dev/2003-December/041527.html)

.. [#opcode-list] Python Byte Code Instructions
   (http://docs.python.org/lib/bytecodes.html)

.. [#skip-peephole]
   http://www.foretec.com/python/workshops/1998-11/proceedings/papers/montanaro/montanaro.html

.. [#Bytecodehacks]
   http://bytecodehacks.sourceforge.net/bch-docs/bch/index.html

Copyright
=========

This document has been placed in the public domain.


..
Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 End: From ronaldoussoren at mac.com Thu Jan 6 20:59:30 2005 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Thu Jan 6 20:59:33 2005 Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in In-Reply-To: <834E1A5F-5FE3-11D9-A99E-000A958D1666@cwi.nl> References: <41DB2C9A.4070800@v.loewis.de> <98E121A6-5F70-11D9-94C7-000D934FF6B4@cwi.nl> <41DC7D20.4000901@v.loewis.de> <834E1A5F-5FE3-11D9-A99E-000A958D1666@cwi.nl> Message-ID: <7A747A97-601D-11D9-85EE-000D93AD379E@mac.com> On 6-jan-05, at 14:04, Jack Jansen wrote: > > On 6 Jan 2005, at 00:49, Martin v. L?wis wrote: >>> The "new" solution is basically to go back to the Unix way of >>> building an extension: link it against nothing and sort things out >>> at runtime. Not my personal preference, but at least we know that >>> loading an extension into one Python won't bring in a fresh copy of >>> a different interpreter or anything horrible like that. >> >> This sounds good, except that it only works on OS X 10.3, right? >> What about older versions? > > 10.3 or later. For older OSX releases (either because you build Python > on 10.2 or earlier, or because you've set MACOSX_DEPLOYMENT_TARGET to > a value of 10.2 or less) we use the old behaviour of linking with > "-framework Python". Wouldn't it be better to link with the actual dylib inside the framework on 10.2? Otherwise you can no longer build 2.3 extensions after you've installed 2.4. 
Ronald From martin at v.loewis.de Thu Jan 6 21:03:39 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu Jan 6 21:03:31 2005 Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in In-Reply-To: <7A747A97-601D-11D9-85EE-000D93AD379E@mac.com> References: <41DB2C9A.4070800@v.loewis.de> <98E121A6-5F70-11D9-94C7-000D934FF6B4@cwi.nl> <41DC7D20.4000901@v.loewis.de> <834E1A5F-5FE3-11D9-A99E-000A958D1666@cwi.nl> <7A747A97-601D-11D9-85EE-000D93AD379E@mac.com> Message-ID: <41DD999B.4040206@v.loewis.de> Ronald Oussoren wrote: > Wouldn't it be better to link with the actual dylib inside the framework > on 10.2? Otherwise you can no longer build 2.3 extensions after you've > installed 2.4. That's what I thought, too. Regards, Martin From bob at redivi.com Thu Jan 6 21:03:39 2005 From: bob at redivi.com (Bob Ippolito) Date: Thu Jan 6 21:03:57 2005 Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in In-Reply-To: <7A747A97-601D-11D9-85EE-000D93AD379E@mac.com> References: <41DB2C9A.4070800@v.loewis.de> <98E121A6-5F70-11D9-94C7-000D934FF6B4@cwi.nl> <41DC7D20.4000901@v.loewis.de> <834E1A5F-5FE3-11D9-A99E-000A958D1666@cwi.nl> <7A747A97-601D-11D9-85EE-000D93AD379E@mac.com> Message-ID: <0E9F643A-601E-11D9-AB1C-000A95BA5446@redivi.com> On Jan 6, 2005, at 14:59, Ronald Oussoren wrote: > > On 6-jan-05, at 14:04, Jack Jansen wrote: > >> >> On 6 Jan 2005, at 00:49, Martin v. L?wis wrote: >>>> The "new" solution is basically to go back to the Unix way of >>>> building an extension: link it against nothing and sort things out >>>> at runtime. Not my personal preference, but at least we know that >>>> loading an extension into one Python won't bring in a fresh copy >>>> of a different interpreter or anything horrible like that. >>> >>> This sounds good, except that it only works on OS X 10.3, right? >>> What about older versions? >> >> 10.3 or later. 
For older OSX releases (either because you build >> Python on 10.2 or earlier, or because you've set >> MACOSX_DEPLOYMENT_TARGET to a value of 10.2 or less) we use the old >> behaviour of linking with "-framework Python". > > Wouldn't it be better to link with the actual dylib inside the > framework on 10.2? Otherwise you can no longer build 2.3 extensions > after you've installed 2.4. It would certainly be better to do this for 10.2. -bob From martin at v.loewis.de Thu Jan 6 21:12:40 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu Jan 6 21:12:32 2005 Subject: [Python-Dev] Changing the default value of stat_float_times Message-ID: <41DD9BB8.5060206@v.loewis.de> When support for floating-point stat times was added in 2.3, it was the plan that this should eventually become the default. Does anybody object if I change the default now, for Python 2.5? Applications which then break can globally change it back, with os.stat_float_times(False) Regards, Martin From aleax at aleax.it Thu Jan 6 21:46:39 2005 From: aleax at aleax.it (Alex Martelli) Date: Thu Jan 6 21:46:45 2005 Subject: [Python-Dev] Re: super() harmful? In-Reply-To: References: <1f7befae05010410576effd024@mail.gmail.com><20050104154707.927B.JCARLSON@uci.edu><9D5D97D6-5F6D-11D9-8D68-000A95A50FB2@fuhm.net> <091248B6-5FB7-11D9-8D68-000A95A50FB2@fuhm.net> Message-ID: <10835172-6024-11D9-ADA4-000A95EFAE9E@aleax.it> On 2005 Jan 06, at 20:16, Terry Reedy wrote: > > "James Y Knight" wrote in message > news:091248B6-5FB7-11D9-8D68-000A95A50FB2@fuhm.net... >> Please notice that I'm talking about concrete, real issues, not just a >> "super is bad!" rant. > > Umm, James, come on. Let's be really real and concrete ;-). > > Your title "Python's Super Considered Harmful" is an obvious reference > to > and takeoff on Dijkstra's influential polemic "Goto Considered > Harmful". 
...or any other of the 345,000 google hits on "considered harmful"...?-) Alex From m.bless at gmx.de Thu Jan 6 22:40:36 2005 From: m.bless at gmx.de (Martin Bless) Date: Thu Jan 6 22:53:19 2005 Subject: [Python-Dev] Re: an idea for improving struct.unpack api References: Message-ID: On Wed, 5 Jan 2005 21:27:16 -0800 (PST), Ilya Sandler wrote: >The current struct.unpack api works well for unpacking C-structures where >everything is usually unpacked at once, but it >becomes inconvenient when unpacking binary files where things >often have to be unpacked field by field. It may be helpful to remember Sam Rushing's NPSTRUCT extension which accompanied the Calldll module of that time (2001). Still available from http://www.nightmare.com/~rushing/dynwin/ mb - Martin From tdelaney at avaya.com Thu Jan 6 23:45:55 2005 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Thu Jan 6 23:46:09 2005 Subject: [Python-Dev] Re: super() harmful? Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DE721229@au3010avexu1.global.avaya.com> Guido van Rossum wrote: >> and the cumbersome way in which you have to invoke super. > > Given Python's dynamic nature I couldn't think of a way to make it > less cumbersome. I see you tried (see below) and couldn't either. At > this point I tend to say "put up or shut up." Well, there's my autosuper recipe you've seen before: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/286195 which does basically what Philip describes ... >>> import autosuper >>> >>> class A (autosuper.autosuper): ... def test (self, a): ... print 'A.test: %s' % (a,) ... >>> class B (A): ... def test (self, a): ... print 'B.test: %s' % (a,) ... self.super(a + 1) ... >>> class C (A): ... def test (self, a): ... print 'C.test: %s' % (a,) ... self.super.test(a + 1) ... >>> class D (B, C): ... def test (self, a): ... print 'D.test: %s' % (a,) ... self.super(a + 1) ... >>> D().test(1) D.test: 1 B.test: 2 C.test: 3 A.test: 4 It uses sys._getframe() of course ...
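Tim's closing remark about sys._getframe() is the heart of the recipe: the autosuper machinery inspects the call stack to work out which class and method it was invoked from. A toy illustration of that frame introspection (the helper names here are made up for the sketch, not part of the recipe):

```python
import sys

def calling_function_name():
    # Look one frame up the call stack; f_code.co_name is the name of
    # the function that called us. This CPython-specific trick is what
    # lets autosuper discover its calling context without arguments.
    return sys._getframe(1).f_code.co_name

def some_method():
    return calling_function_name()

print(some_method())  # -> some_method
```

Because it relies on sys._getframe(), the recipe is tied to CPython's frame objects, which is part of the cost being weighed in this thread.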
Tim Delaney From skip at pobox.com Wed Jan 5 21:21:18 2005 From: skip at pobox.com (Skip Montanaro) Date: Fri Jan 7 01:23:51 2005 Subject: [Python-Dev] Re: csv module TODO list In-Reply-To: <20050105121921.GB24030@idi.ntnu.no> References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> <16859.38960.9935.682429@montanaro.dyndns.org> <20050105075506.314C93C8E5@coffee.object-craft.com.au> <20050105121921.GB24030@idi.ntnu.no> Message-ID: <16860.19518.824788.613286@montanaro.dyndns.org> Magnus> Quite a while ago I posted some material to the csv-list about Magnus> problems using the csv module on Unix-style colon-separated Magnus> files -- it just doesn't deal properly with backslash escaping Magnus> and is quite useless for this kind of file. I seem to recall the Magnus> general view was that it wasn't intended for this kind of thing Magnus> -- only the sort of csv that Microsoft Excel outputs/inputs, Yes, that's my recollection as well. It's possible that we can extend the interpretation of the escape char. Magnus> I'll be happy to re-send or summarize the relevant emails, if Magnus> needed. Yes, that would be helpful. Can you send me an example (three or four lines) of the sort of file it won't grok? Skip From skip at pobox.com Wed Jan 5 20:34:09 2005 From: skip at pobox.com (Skip Montanaro) Date: Fri Jan 7 01:23:54 2005 Subject: [Python-Dev] Re: [Csv] csv module TODO list In-Reply-To: <20050105110849.CBA843C8E5@coffee.object-craft.com.au> References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> <20050105110849.CBA843C8E5@coffee.object-craft.com.au> Message-ID: <16860.16689.695012.975520@montanaro.dyndns.org> >> * is CSV going to be maintained outside the python tree? >> If not, remove the 2.2 compatibility macros for: PyDoc_STR, >> PyDoc_STRVAR, PyMODINIT_FUNC, etc. Andrew> Does anyone thing we should continue to maintain this 2.2 Andrew> compatibility? 
With the release of 2.4, 2.2 has officially dropped off the radar screen, right (zero probability of a 2.2.n+1 release, though the probability was vanishingly small before). I'd say toss it. Do just that in a single checkin so someone who's interested can do a simple cvs diff to yield an initial patch file for external maintenance of that feature. >> * inline the following functions since they are used only in one >> place get_string, set_string, get_nullchar_as_None, >> set_nullchar_as_None, join_reset (maybe) Andrew> It was done that way as I felt we would be adding more getters Andrew> and setters to the dialect object in future. The only new dialect attribute I envision is an encoding attribute. >> * is it necessary to have Dialect_methods, can you use 0 for tp_methods? Andrew> I was assuming I would need to add methods at some point (in Andrew> fact, I did have methods, but removed them). Dialect objects are really just data containers, right? I don't see that they would need any methods. >> * remove commented out code (PyMem_DEL) on line 261 >> Have you used valgrind on the test to find memory overwrites/leaks? Andrew> No, valgrind wasn't used. I have it here at work. I'll try to find a few minutes to run the csv tests under valgrind's control. Skip From tjreedy at udel.edu Fri Jan 7 05:23:31 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Fri Jan 7 05:23:35 2005 Subject: [Python-Dev] Re: Re: super() harmful? References: <1f7befae05010410576effd024@mail.gmail.com><20050104154707.927B.JCARLSON@uci.edu><9D5D97D6-5F6D-11D9-8D68-000A95A50FB2@fuhm.net><091248B6-5FB7-11D9-8D68-000A95A50FB2@fuhm.net> <10835172-6024-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: "Alex Martelli" wrote in message news:10835172-6024-11D9-ADA4-000A95EFAE9E@aleax.it... 
> > On 2005 Jan 06, at 20:16, Terry Reedy wrote: > >> [Knight's] title "Python's Super Considered Harmful" is an obvious >> reference to >> and takeoff on Dijkstra's influential polemic "Go To Statement >> Considered Harmful". http://www.acm.org/classics/oct95/ [title corrected from original posting and link added] > ...or any other of the 345,000 google hits on "considered harmful"...?-) Restricting the search space to 'Titles of computer science articles' would reduce the number of hits considerably. Many things have been considered harmful at some time in almost every field of human endeavor. However, according to Eric Meyer ("Considered Harmful" Essays Considered Harmful), even that restriction would lead to thousands of hits inspired directly or indirectly by Niklaus Wirth's title for Dijkstra's Letter to the Editor. Thanks for the link. Terry J. Reedy From andrewm at object-craft.com.au Fri Jan 7 07:13:22 2005 From: andrewm at object-craft.com.au (Andrew McNamara) Date: Fri Jan 7 07:13:24 2005 Subject: [Python-Dev] Re: [Csv] csv module TODO list In-Reply-To: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> Message-ID: <20050107061322.A6A563C8E5@coffee.object-craft.com.au> >There's a bunch of jobs we (CSV module maintainers) have been putting >off - attached is a list (in no particular order): [...] >Also, review comments from Jeremy Hylton, 10 Apr 2003: > > I've been reviewing extension modules looking for C types that should > participate in garbage collection. I think the csv ReaderObj and > WriterObj should participate. The ReaderObj contains a reference to > input_iter that could be an arbitrary Python object. The iterator > object could well participate in a cycle that refers to the ReaderObj. > The WriterObj has a reference to a writeline callable, which could well > be a method of an object that also points to the WriterObj.
I finally got around to looking at this, only to realise Jeremy did the work back in Apr 2003 (thanks). One question, however - the GC doco in the Python/C API seems to suggest to me that PyObject_GC_Track should be called on the newly minted object prior to returning from the initialiser (and correspondingly PyObject_GC_UnTrack should be called prior to dismantling). This isn't being done in the module as it stands. Is the module wrong, or is my understanding of the reference manual incorrect? -- Andrew McNamara, Senior Developer, Object Craft http://www.object-craft.com.au/ From andrewm at object-craft.com.au Fri Jan 7 08:54:54 2005 From: andrewm at object-craft.com.au (Andrew McNamara) Date: Fri Jan 7 08:55:02 2005 Subject: [Python-Dev] Minor change to behaviour of csv module Message-ID: <20050107075454.AC1A13C8E5@coffee.object-craft.com.au> I'm considering a change to the csv module that could potentially break some obscure uses of the module (but CSV files usually quote, rather than escape, so the most common uses aren't affected). Currently, with a non-default escapechar='\\', input like: field one,field \ two,field three Returns: ["field one", "field \\\ntwo", "field three"] In the 2.5 series, I propose changing this to return: ["field one", "field \ntwo", "field three"] Is this reasonable? Is the old behaviour desirable in any way (we could add a switch to enable the new behaviour, but I feel that would only allow the confusion to continue)? BTW, some of my other changes have changed the exceptions raised when bad arguments were passed to the reader and writer factory functions - previously, the exceptions were semi-random, including TypeError, AttributeError and csv.Error - they should now almost always be TypeError (like most other argument passing errors). I can't see this being a problem, but I'm prepared to listen to arguments.
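As it happens, the semantics Andrew proposes here are what the csv module documents today: on reading, the escapechar removes any special meaning from the following character. A quick check against a modern Python (a sketch added for reference, not part of the original thread):

```python
import csv
import io

# Andrew's example: an escaped newline, with escapechar='\\'.
data = 'field one,field \\\ntwo,field three\n'
rows = list(csv.reader(io.StringIO(data), escapechar='\\'))
print(rows)   # [['field one', 'field \ntwo', 'field three']] -- escape char stripped

# The same applies to an escaped delimiter:
rows2 = list(csv.reader(io.StringIO('a\\,b,c\n'), escapechar='\\'))
print(rows2)  # [['a,b', 'c']]
```

So the "old behaviour" of retaining the backslash in the returned field did not survive; the escape character is consumed by the reader.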
-- Andrew McNamara, Senior Developer, Object Craft http://www.object-craft.com.au/ From bob at redivi.com Fri Jan 7 11:08:52 2005 From: bob at redivi.com (Bob Ippolito) Date: Fri Jan 7 11:09:10 2005 Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in In-Reply-To: <0E9F643A-601E-11D9-AB1C-000A95BA5446@redivi.com> References: <41DB2C9A.4070800@v.loewis.de> <98E121A6-5F70-11D9-94C7-000D934FF6B4@cwi.nl> <41DC7D20.4000901@v.loewis.de> <834E1A5F-5FE3-11D9-A99E-000A958D1666@cwi.nl> <7A747A97-601D-11D9-85EE-000D93AD379E@mac.com> <0E9F643A-601E-11D9-AB1C-000A95BA5446@redivi.com> Message-ID: <21A975DC-6094-11D9-922D-000A95BA5446@redivi.com> On Jan 6, 2005, at 15:03, Bob Ippolito wrote: > > On Jan 6, 2005, at 14:59, Ronald Oussoren wrote: > >> >> On 6-jan-05, at 14:04, Jack Jansen wrote: >> >>> >>> On 6 Jan 2005, at 00:49, Martin v. Löwis wrote: >>>>> The "new" solution is basically to go back to the Unix way of >>>>> building an extension: link it against nothing and sort things >>>>> out at runtime. Not my personal preference, but at least we know >>>>> that loading an extension into one Python won't bring in a fresh >>>>> copy of a different interpreter or anything horrible like that. >>>> >>>> This sounds good, except that it only works on OS X 10.3, right? >>>> What about older versions? >>> >>> 10.3 or later. For older OSX releases (either because you build >>> Python on 10.2 or earlier, or because you've set >>> MACOSX_DEPLOYMENT_TARGET to a value of 10.2 or less) we use the old >>> behaviour of linking with "-framework Python". >> >> Wouldn't it be better to link with the actual dylib inside the >> framework on 10.2? Otherwise you can no longer build 2.3 extensions >> after you've installed 2.4. > > It would certainly be better to do this for 10.2.
This patch implements the proposed direct framework linking: http://python.org/sf/1097739 -bob From andrewm at object-craft.com.au Fri Jan 7 13:06:23 2005 From: andrewm at object-craft.com.au (Andrew McNamara) Date: Fri Jan 7 13:06:23 2005 Subject: [Python-Dev] Minor change to behaviour of csv module In-Reply-To: <20050107075454.AC1A13C8E5@coffee.object-craft.com.au> References: <20050107075454.AC1A13C8E5@coffee.object-craft.com.au> Message-ID: <20050107120623.EC0673C8E5@coffee.object-craft.com.au> >I'm considering a change to the csv module that could potentially break >some obscure uses of the module (but CSV files usually quote, rather >than escape, so the most common uses aren't effected). > >Currently, with a non-default escapechar='\\', input like: > > field one,field \ > two,field three > >Returns: > > ["field one", "field \\\ntwo", "field three"] > >In the 2.5 series, I propose changing this to return: > > ["field one", "field \ntwo", "field three"] > >Is this reasonable? Is the old behaviour desirable in any way (we could >add a switch to enable to new behaviour, but I feel that would only >allow the confusion to continue)? Thinking about this further, I suspect we have to retain the current behaviour, as broken as it is, as the default: it's conceivable that someone somewhere is post-processing the result to remove the backslashes, and if we fix the csv module, we'll break their code. Note that PEP-305 had nothing to say about escaping, nor does the module reference manual. 
-- Andrew McNamara, Senior Developer, Object Craft http://www.object-craft.com.au/ From Jack.Jansen at cwi.nl Fri Jan 7 14:05:39 2005 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Fri Jan 7 14:06:00 2005 Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in In-Reply-To: <21A975DC-6094-11D9-922D-000A95BA5446@redivi.com> References: <41DB2C9A.4070800@v.loewis.de> <98E121A6-5F70-11D9-94C7-000D934FF6B4@cwi.nl> <41DC7D20.4000901@v.loewis.de> <834E1A5F-5FE3-11D9-A99E-000A958D1666@cwi.nl> <7A747A97-601D-11D9-85EE-000D93AD379E@mac.com> <0E9F643A-601E-11D9-AB1C-000A95BA5446@redivi.com> <21A975DC-6094-11D9-922D-000A95BA5446@redivi.com> Message-ID: On 7 Jan 2005, at 11:08, Bob Ippolito wrote: >>>> 10.3 or later. For older OSX releases (either because you build >>>> Python on 10.2 or earlier, or because you've set >>>> MACOSX_DEPLOYMENT_TARGET to a value of 10.2 or less) we use the old >>>> behaviour of linking with "-framework Python". >>> >>> Wouldn't it be better to link with the actual dylib inside the >>> framework on 10.2? Otherwise you can no longer build 2.3 extensions >>> after you've installed 2.4. >> >> It would certainly be better to do this for 10.2. > > This patch implements the proposed direct framework linking: > http://python.org/sf/1097739 Looks good, I'll incorporate it. And as I haven't heard of any showstoppers for the -undefined dynamic_lookup (and Anthony seems to be offline this week) I'll put that in too. 
-- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From magnus at hetland.org Fri Jan 7 14:38:17 2005 From: magnus at hetland.org (Magnus Lie Hetland) Date: Fri Jan 7 14:38:32 2005 Subject: [Python-Dev] Re: [Csv] Minor change to behaviour of csv module In-Reply-To: <20050107075454.AC1A13C8E5@coffee.object-craft.com.au> References: <20050107075454.AC1A13C8E5@coffee.object-craft.com.au> Message-ID: <20050107133817.GB5503@idi.ntnu.no> Andrew McNamara : > [snip] > Currently, with a non-default escapechar='\\', input like: > > field one,field \ > two,field three > > Returns: > > ["field one", "field \\\ntwo", "field three"] > > In the 2.5 series, I propose changing this to return: > > ["field one", "field \ntwo", "field three"] IMO this is the *only* reasonable behaviour. I don't understand why the escape character should be left in; this is one of the reasons why UNIX-style colon-separated values don't work with the current module. If one wanted the first version, one would (I presume) write field one,field \\\ two,field three -- Magnus Lie Hetland Fallen flower I see / Returning to its branch http://hetland.org Ah! a butterfly. [Arakida Moritake] From mcherm at mcherm.com Fri Jan 7 14:45:20 2005 From: mcherm at mcherm.com (Michael Chermside) Date: Fri Jan 7 14:45:53 2005 Subject: [Python-Dev] Re: [Csv] Minor change to behaviour of csv module Message-ID: <1105105520.41de927049442@mcherm.com> Andrew explains that in the CSV module, escape characters are not properly removed. Magnus writes: > IMO this is the *only* reasonable behaviour. I don't understand why > the escape character should be left in; this is one of the reasons why > UNIX-style colon-separated values don't work with the current module.
Andrew writes back later: > Thinking about this further, I suspect we have to retain the current > behaviour, as broken as it is, as the default: it's conceivable that > someone somewhere is post-processing the result to remove the backslashes, > and if we fix the csv module, we'll break their code. I'm with Magnus on this. No one has 4 year old code using the CSV module. The existing behavior is just simply WRONG. Sure, of course we should try to maintain backward compatibility, but surely SOME cases don't require it, right? Can't we treat this misbehavior as an outright bug? -- Michael Chermside From aleax at aleax.it Fri Jan 7 14:51:52 2005 From: aleax at aleax.it (Alex Martelli) Date: Fri Jan 7 14:52:03 2005 Subject: [Python-Dev] Re: [Csv] Minor change to behaviour of csv module In-Reply-To: <1105105520.41de927049442@mcherm.com> References: <1105105520.41de927049442@mcherm.com> Message-ID: <48F57F83-60B3-11D9-ADA4-000A95EFAE9E@aleax.it> On 2005 Jan 07, at 14:45, Michael Chermside wrote: > Andrew explains that in the CSV module, escape characters are not > properly removed. > > Magnus writes: >> IMO this is the *only* reasonable behaviour. I don't understand why >> the escape character should be left in; this is one of the reason why >> UNIX-style colon-separated values don't work with the current module. > > Andrew writes back later: >> Thinking about this further, I suspect we have to retain the current >> behaviour, as broken as it is, as the default: it's conceivable that >> someone somewhere is post-processing the result to remove the >> backslashes, >> and if we fix the csv module, we'll break their code. > > I'm with Magnus on this. No one has 4 year old code using the CSV > module. > The existing behavior is just simply WRONG. Sure, of course we should > try to maintain backward compatibility, but surely SOME cases don't > require it, right? Can't we treat this misbehavior as an outright bug? 
+1 -- the nonremoval of escape characters smells like a bug to me, too. Alex From mwh at python.net Fri Jan 7 15:07:21 2005 From: mwh at python.net (Michael Hudson) Date: Fri Jan 7 15:28:37 2005 Subject: [Python-Dev] an idea for improving struct.unpack api In-Reply-To: <9AB33D7A-601A-11D9-AB1C-000A95BA5446@redivi.com> (Bob Ippolito's message of "Thu, 6 Jan 2005 14:38:56 -0500") References: <2mr7kyhf2j.fsf@starship.python.net> <9AB33D7A-601A-11D9-AB1C-000A95BA5446@redivi.com> Message-ID: <2mmzvlgwo6.fsf@starship.python.net> Bob Ippolito writes: > On Jan 6, 2005, at 8:17, Michael Hudson wrote: > >> Ilya Sandler writes: >> >>> A problem: >>> >>> The current struct.unpack api works well for unpacking C-structures >>> where >>> everything is usually unpacked at once, but it >>> becomes inconvenient when unpacking binary files where things >>> often have to be unpacked field by field. Then one has to keep track >>> of offsets, slice the strings,call struct.calcsize(), etc... >> >> IMO (and E), struct.unpack is the primitive atop which something more >> sensible is built. I've certainly tried to build that more sensible >> thing at least once, but haven't ever got the point of believing what >> I had would be applicable to the general case... maybe it's time to >> write such a thing for the standard library. > > This is my ctypes-like attempt at a high-level interface for struct. > It works well for me in macholib: > http://svn.red-bean.com/bob/py2app/trunk/src/macholib/ptypes.py Unsurprisingly, that's fairly similar to mine :) Cheers, mwh -- If trees could scream, would we be so cavalier about cutting them down? We might, if they screamed all the time, for no good reason. 
-- Jack Handey From theller at python.net Fri Jan 7 15:33:52 2005 From: theller at python.net (Thomas Heller) Date: Fri Jan 7 15:32:35 2005 Subject: [Python-Dev] an idea for improving struct.unpack api In-Reply-To: <2mmzvlgwo6.fsf@starship.python.net> (Michael Hudson's message of "Fri, 07 Jan 2005 14:07:21 +0000") References: <2mr7kyhf2j.fsf@starship.python.net> <9AB33D7A-601A-11D9-AB1C-000A95BA5446@redivi.com> <2mmzvlgwo6.fsf@starship.python.net> Message-ID: > Bob Ippolito writes: >> This is my ctypes-like attempt at a high-level interface for struct. >> It works well for me in macholib: >> http://svn.red-bean.com/bob/py2app/trunk/src/macholib/ptypes.py Michael Hudson writes: > > Unsurprisingly, that's fairly similar to mine :) So, why don't you both use the original? ctypes works on the mac, too ;-) Thomas From bob at redivi.com Fri Jan 7 15:41:27 2005 From: bob at redivi.com (Bob Ippolito) Date: Fri Jan 7 15:41:33 2005 Subject: [Python-Dev] an idea for improving struct.unpack api In-Reply-To: References: <2mr7kyhf2j.fsf@starship.python.net> <9AB33D7A-601A-11D9-AB1C-000A95BA5446@redivi.com> <2mmzvlgwo6.fsf@starship.python.net> Message-ID: <36326B12-60BA-11D9-B08A-000A9567635C@redivi.com> On Jan 7, 2005, at 9:33 AM, Thomas Heller wrote: >> Bob Ippolito writes: > >>> This is my ctypes-like attempt at a high-level interface for struct. >>> It works well for me in macholib: >>> http://svn.red-bean.com/bob/py2app/trunk/src/macholib/ptypes.py > > Michael Hudson writes: >> >> Unsurprisingly, that's fairly similar to mine :) > > So, why don't you both use the original? > ctypes works on the mac, too ;-) I did use the original for the prototype of macholib! Then I wrote a version in pure python to eliminate the compiler dependency and ended up adding way more features than I actually needed (variable length nested structures and stuff like that). 
Eventually, I scaled it back to this so that it was easier to maintain and so that I could make some performance optimizations (well as many as you can make with the struct module). -bob From mwh at python.net Fri Jan 7 15:57:25 2005 From: mwh at python.net (Michael Hudson) Date: Fri Jan 7 15:57:27 2005 Subject: [Python-Dev] an idea for improving struct.unpack api In-Reply-To: (Thomas Heller's message of "Fri, 07 Jan 2005 15:33:52 +0100") References: <2mr7kyhf2j.fsf@starship.python.net> <9AB33D7A-601A-11D9-AB1C-000A95BA5446@redivi.com> <2mmzvlgwo6.fsf@starship.python.net> Message-ID: <2mis69gucq.fsf@starship.python.net> Thomas Heller writes: >> Bob Ippolito writes: > >>> This is my ctypes-like attempt at a high-level interface for struct. >>> It works well for me in macholib: >>> http://svn.red-bean.com/bob/py2app/trunk/src/macholib/ptypes.py > > Michael Hudson writes: >> >> Unsurprisingly, that's fairly similar to mine :) > > So, why don't you both use the original? > ctypes works on the mac, too ;-) Well, I probably wrote mine before ctypes worked on the Mac... and certainly when I was away from internet access. I guess I should look at ctypes' interface, at least... Cheers, mwh -- I located the link but haven't bothered to re-read the article, preferring to post nonsense to usenet before checking my facts. -- Ben Wolfson, comp.lang.python From ncoghlan at iinet.net.au Fri Jan 7 16:05:24 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Fri Jan 7 16:05:28 2005 Subject: [Python-Dev] Re: Subscribing to PEP updates In-Reply-To: <1105034511.10728.3.camel@geddy.wooz.org> References: <0CFFADBB825C6249A26FDF11C1772AE101EDEB49@ingdexj1.ingdirect.com> <1105034511.10728.3.camel@geddy.wooz.org> Message-ID: <41DEA534.9090400@iinet.net.au> Barry Warsaw wrote: > As an experiment, I just added a PEP topic to the python-checkins > mailing list. 
You could subscribe to this list and just select the PEP > topic (which matches the regex "PEP" in the Subject header or first few > lines of the body). > > Give it a shot and let's see if that does the trick. Neat - I've subscribed to that topic now :) Do you mind if I suggest this to interested people on c.l.p? Cheers, Nick. -- Nick Coghlan | ncoghlan@email.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From barry at python.org Fri Jan 7 16:09:16 2005 From: barry at python.org (Barry Warsaw) Date: Fri Jan 7 16:09:21 2005 Subject: [Python-Dev] Re: Subscribing to PEP updates In-Reply-To: <41DEA534.9090400@iinet.net.au> References: <0CFFADBB825C6249A26FDF11C1772AE101EDEB49@ingdexj1.ingdirect.com> <1105034511.10728.3.camel@geddy.wooz.org> <41DEA534.9090400@iinet.net.au> Message-ID: <1105110556.26433.57.camel@geddy.wooz.org> On Fri, 2005-01-07 at 10:05, Nick Coghlan wrote: > Barry Warsaw wrote: > > As an experiment, I just added a PEP topic to the python-checkins > > mailing list. You could subscribe to this list and just select the PEP > > topic (which matches the regex "PEP" in the Subject header or first few > > lines of the body). > > > > Give it a shot and let's see if that does the trick. > > Neat - I've subscribed to that topic now :) > > Do you mind if I suggest this to interested people on c.l.p? Please do (he says, hoping it works :). -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050107/51aa8644/attachment-0001.pgp From FBatista at uniFON.com.ar Fri Jan 7 16:40:27 2005 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Fri Jan 7 16:43:17 2005 Subject: [Python-Dev] Re: Subscribing to PEP updates Message-ID: [Barry Warsaw] > As an experiment, I just added a PEP topic to the python-checkins > mailing list. You could subscribe to this list and just select the PEP > topic (which matches the regex "PEP" in the Subject header or first few > lines of the body). > > Give it a shot and let's see if that does the trick. Can the defaults be configured? Because now the config is this: - Which topic categories would you like to subscribe to? (default: "pep" not checked) - Do you want to receive messages that do not match any topic filter? (default: No) So, happens the following (happened to me, je): You subscribe to the list, and confirm the registration, and you never get a message unless you change this. Thanks. . Facundo Bitácora De Vuelo: http://www.taniquetil.com.ar/plog PyAr - Python Argentina: http://pyar.decode.com.ar/ -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20050107/6a350a3d/attachment.htm From tim.peters at gmail.com Fri Jan 7 17:00:42 2005 From: tim.peters at gmail.com (Tim Peters) Date: Fri Jan 7 17:00:45 2005 Subject: [Python-Dev] Re: [Csv] csv module TODO list In-Reply-To: <20050107061322.A6A563C8E5@coffee.object-craft.com.au> References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> <20050107061322.A6A563C8E5@coffee.object-craft.com.au> Message-ID: <1f7befae05010708005275e23d@mail.gmail.com> [Andrew McNamara] >> Also, review comments from Jeremy Hylton, 10 Apr 2003: >> >> I've been reviewing extension modules looking for C types that should >> participate in garbage collection. I think the csv ReaderObj and >> WriterObj should participate. The ReaderObj contains a reference to >> input_iter that could be an arbitrary Python object. The iterator >> object could well participate in a cycle that refers to the ReaderObj. >> The WriterObj has a reference to a writeline callable, which could well >> be a method of an object that also points to the WriterObj. > I finally got around to looking at this, only to realise Jeremy did the > work back in Apr 2003 (thanks).
One question, however - the GC doco in > the Python/C API seems to suggest to me that PyObject_GC_Track should be > called on the newly minted object prior to returning from the initialiser > (and correspondingly PyObject_GC_UnTrack should be called prior to > dismantling). This isn't being done in the module as it stands. Is the > module wrong, or is my understanding of the reference manual incorrect? The purpose of "tracking" and "untracking" is to let cyclic gc know when it (respectively) is and isn't safe to call an object's tp_traverse method. Primarily, when an object is first created at the C level, it may contain NULLs or heap trash in pointer slots, and then the object's tp_traverse could segfault if it were called while the object remained in an insane (wrt tp_traverse) state. Similarly, cleanup actions in the tp_dealloc may make a tp_traverse-sane object tp_traverse-insane, so tp_dealloc should untrack the object before that occurs. If tracking is never done, then the object effectively never participates in cyclic gc: its tp_traverse will never get called, and it will effectively act as an external root (keeping itself and everything reachable from it alive). So, yes, track it during construction, but not before all the members referenced by its tp_traverse are in a sane state. Putting the track call "at the end" of the constructor is usually best practice. tp_dealloc should untrack it then. In a debug build, that will assert-fail if the object hasn't actually been tracked. PyObject_GC_Del will untrack it for you (if it's still tracked), but it's risky to rely on that -- it's too easy to forget that Py_DECREFs on contained objects can end up executing arbitrary Python code (via __del__ and weakref callbacks, and via allowing other threads to run), which can in turn trigger a round of cyclic gc *while* your tp_dealloc is still running. So it's safest to untrack the object very early in tp_dealloc. 
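Tim's "external root" point can be seen from pure Python through the gc module (using gc.is_tracked(), which was added to Python well after this thread): objects the collector does not track are invisible to cyclic gc, while tracked containers participate and are reclaimed when they form unreachable cycles. A small sketch:

```python
import gc

# Containers that can hold references to arbitrary objects are tracked.
assert gc.is_tracked([])
assert gc.is_tracked({'key': []})

# Atomic objects can never form reference cycles, so they are not tracked.
assert not gc.is_tracked(42)
assert not gc.is_tracked('hello')

# A tracked object in a self-referential cycle is found by the collector:
class Node:
    pass

n = Node()
n.self_ref = n            # refcounting alone can never free this
assert gc.is_tracked(n)
del n
print(gc.collect() >= 1)  # cyclic gc traverses, finds the cycle, frees it -> True
```

This is the Python-level consequence of the C-level PyObject_GC_Track call Tim describes: without it, a ReaderObj caught in such a cycle would simply never be freed.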
I doubt this happens in the csv module, but an untrack/track pair should also be put around any block of method code that temporarily puts the object into a tp_traverse-insane state and that contains any C API calls that may end up triggering cyclic gc. That's very rare. From sjoerd at acm.org Fri Jan 7 17:01:31 2005 From: sjoerd at acm.org (Sjoerd Mullender) Date: Fri Jan 7 17:01:42 2005 Subject: [Python-Dev] Re: Subscribing to PEP updates In-Reply-To: References: Message-ID: <41DEB25B.7040202@acm.org> Batista, Facundo wrote: > [Barry Warsaw] > > > As an experiment, I just added a PEP topic to the python-checkins > > mailing list. You could subscribe to this list and just select the PEP > > topic (which matches the regex "PEP" in the Subject header or first few > > lines of the body). > > > > Give it a shot and let's see if that does the trick. > > Can the defaults be configured? Because now the config is this: > > - Which topic categories would you like to subscribe to? (default: "pep" > not checked) > > - Do you want to receive messages that do not match any topic filter? > (default: No) > > So, happens the following (happened to me, je): You subscribe to the > list, and confirm the registration, and you never get a message unless > you change this. However, there is an additional line in the description: "If no topics of interest are selected above, then you will receive every message sent to the mailing list." In other words, don't check any topics, and you get everything. If you *do* check a topic, you only get the messages belonging to that topic. This seems to me a reasonable default. -- Sjoerd Mullender -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 374 bytes Desc: OpenPGP digital signature Url : http://mail.python.org/pipermail/python-dev/attachments/20050107/7807b763/signature.pgp From foom at fuhm.net Fri Jan 7 17:51:37 2005 From: foom at fuhm.net (James Y Knight) Date: Fri Jan 7 17:51:35 2005 Subject: [Python-Dev] Re: super() harmful? In-Reply-To: <338366A6D2E2CA4C9DAEAE652E12A1DE721229@au3010avexu1.global.avaya.com> References: <338366A6D2E2CA4C9DAEAE652E12A1DE721229@au3010avexu1.global.avaya.com> Message-ID: <653D5CE4-60CC-11D9-8D68-000A95A50FB2@fuhm.net> On Jan 6, 2005, at 5:45 PM, Delaney, Timothy C (Timothy) wrote: > Well, there's my autosuper recipe you've seen before: > http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/286195 > > which does basically what Philip descibes ... You missed the most important part of the example -- the automatic argument passing through unknowing methods. James From skip at pobox.com Fri Jan 7 17:09:13 2005 From: skip at pobox.com (Skip Montanaro) Date: Fri Jan 7 18:10:27 2005 Subject: [Python-Dev] Re: [Csv] Minor change to behaviour of csv module In-Reply-To: <20050107075454.AC1A13C8E5@coffee.object-craft.com.au> References: <20050107075454.AC1A13C8E5@coffee.object-craft.com.au> Message-ID: <16862.46121.778915.968964@montanaro.dyndns.org> Andrew> I'm considering a change to the csv module that could Andrew> potentially break some obscure uses of the module (but CSV files Andrew> usually quote, rather than escape, so the most common uses Andrew> aren't effected). I'm with the other respondents. This looks like a bug that should be squashed. 
Skip From ncoghlan at iinet.net.au Fri Jan 7 18:30:37 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Fri Jan 7 18:30:41 2005 Subject: [Python-Dev] Re: Subscribing to PEP updates In-Reply-To: <1105110556.26433.57.camel@geddy.wooz.org> References: <0CFFADBB825C6249A26FDF11C1772AE101EDEB49@ingdexj1.ingdirect.com> <1105034511.10728.3.camel@geddy.wooz.org> <41DEA534.9090400@iinet.net.au> <1105110556.26433.57.camel@geddy.wooz.org> Message-ID: <41DEC73D.4090908@iinet.net.au> Barry Warsaw wrote: > Please do (he says, hoping it works :). Speaking of which. . . care to poke PEP 0 or one of the other PEP's? There's probably a couple of PEP's which could be moved from 'Open' to 'Accepted' or 'Accepted' to 'Implemented' to try it out. Cheers, Nick. -- Nick Coghlan | ncoghlan@email.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From foom at fuhm.net Fri Jan 7 18:45:53 2005 From: foom at fuhm.net (James Y Knight) Date: Fri Jan 7 18:45:53 2005 Subject: [Python-Dev] Re: super() harmful? In-Reply-To: References: <1f7befae05010410576effd024@mail.gmail.com> <20050104154707.927B.JCARLSON@uci.edu> <9D5D97D6-5F6D-11D9-8D68-000A95A50FB2@fuhm.net> <091248B6-5FB7-11D9-8D68-000A95A50FB2@fuhm.net> Message-ID: On Jan 6, 2005, at 12:13 PM, Guido van Rossum wrote: > So it has nothing to do with the new paradigm, just with backwards > compatibility. I appreciate those issues (more than you'll ever know) > but I don't see why you should try to discourage others from using the > new paradigm, which is what your article appears to do. This is where I'm coming from: In my own code, it is very rare to have diamond inheritance structures. And if there are, even more rare that both sides need to cooperatively override a method. Given that, super has no necessary advantage. And it has disadvantages. 
- Backwards compatibility issues - Going along with that, inadvertent mixing of paradigms (you have to remember which classes you use super with and which you don't or your code might have hard-to-find errors). - Take your choice of: a) inability to add optional arguments to your methods, or b) having to use *args, **kwargs on every method and call super with those. - Having to try/catch AttributeErrors from super if you use interfaces instead of a base class to define the methods in use. So, I am indeed attempting to discourage people from using it, despite its importance. And also trying to educate people as to what they need to do if they have a case where it is necessary to use or if they just decide I'm full of crap and want to use it anyways. >> In order to make super really nice, it should be easier to use right. >> Again, the two major issues that cause problems are: 1) having to >> declare every method with *args, **kwargs, and having to pass those >> and >> all the arguments you take explicitly to super, > > That's only an issue with __init__ or with code written without > cooperative MI in mind. When using cooperative MI, you shouldn't > redefine method signatures, and all is well. I have two issues with that statement. Firstly, it's often quite useful to be able to add optional arguments to methods. Secondly, that's not a property of cooperative MI, but one of cooperative MI in python. As a counterpoint, with Dylan, you can add optional keyword arguments to a method as long as the generic was defined with the notation #key (specifying that it will accept keyword arguments at all). This is of course even true in a single inheritance situation like in the example below. Now please don't misunderstand me, here. I'm not at all trying to say that Python sucks because it's not Dylan. I don't even particularly like Dylan, but it does have a number of good ideas. 
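For concreteness, the cooperative pattern being argued about — every __init__ taking only keyword arguments, peeling off its own, and forwarding the rest through super — looks like this (class names invented for illustration):

```python
class Base(object):
    def __init__(self, **kwargs):
        # Terminates the cooperative chain; leftover kwargs would be a bug.
        super(Base, self).__init__(**kwargs)

class A(Base):
    def __init__(self, a=1, **kwargs):
        self.a = a
        super(A, self).__init__(**kwargs)

class B(Base):
    def __init__(self, b=2, **kwargs):
        self.b = b
        super(B, self).__init__(**kwargs)

class C(A, B):
    """Diamond: the MRO is C -> A -> B -> Base -> object."""

# Each class consumes its own keyword and passes the rest along,
# so every __init__ in the diamond runs exactly once.
c = C(a=10, b=20)
assert (c.a, c.b) == (10, 20)

# A used on its own still works with its defaults.
assert A().a == 1
```

This is exactly the discipline the complaint is about: A and B cannot grow positional parameters without breaking the chain, and every method must accept and forward **kwargs whether it cares about them or not.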
Additionally, Python and Dylan differ in fundamental ways: Python has classes and inheritance, Dylan has generic functions/multimethods. Dylan is (I believe) generally whole-program-at-a-time compiled/optimized, Python is not. So, I think a solution for python would have to be fundamentally different as well. But anyways, an example of what I'm talking about: define generic g (arg1 :: , #key); define method g (arg1 :: , #key) format-out("number.g\n"); end method g; define method g (arg1 :: , #key base :: = 10) next-method(); format-out("rational.g %d\n", base); end method g; define method g (arg1 :: , #key) next-method(); format-out("integer.g\n"); end method g; // Prints: // number.g // rational.g 1 // integer.g g(1, base: 1); // Produces: Error: Unrecognized keyword (base) as the second argument in call of g g(1.0, base: 1); > Cooperative MI doesn't have a really good solution for __init__. > Defining and calling __init__ only with keyword arguments is a good > solution. But griping about "traditionally" is a backwards > compatibility issue, which you said you were leaving behind. Well, kind of. In my mind, it was a different kind of issue, as it isn't solved by everyone moving over to using super. As nearly all the code that currently uses super does so without using keyword arguments for __init__, I considered it not so much backwards compatibility as a re-educating users kind of issue, the same as the requirement for passing along all your arguments. > Exactly. What is your next_method statement supposed to do? Well that's easy. It's supposed to call the next function in the MRO with _all_ the arguments passed along, even the ones that the current function didn't explicitly ask for. I was afraid you might ask a hard question, like: if E2 inherits C's __init__, how the heck is it supposed to manage to take two arguments nonetheless. That one I *really* don't have an answer for. > No need to reply except when you've changed the article. 
I'm tired of > the allegations. Sigh. James From edcjones at erols.com Fri Jan 7 20:44:51 2005 From: edcjones at erols.com (Edward C. Jones) Date: Fri Jan 7 20:43:21 2005 Subject: [Python-Dev] Concurrency and Python Message-ID: <41DEE6B3.8020006@erols.com> Today's Slashdot (http://slashdot.org/articles/05/01/07/158236.shtml?tid=137) points to: "The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software" by Herb Sutter at "http://www.gotw.ca/publications/concurrency-ddj.htm". Is Python a suitable language for concurrent programming? Should Python be a good language for concurrent programming? Python nicely satisfies several user needs now including teaching beginners, scripting, algorithm development, non time-critical code, and wrapping libraries. Which of these users will be needing concurrency? What is the state of programming theory for concurrency? From jhylton at gmail.com Fri Jan 7 21:37:50 2005 From: jhylton at gmail.com (Jeremy Hylton) Date: Fri Jan 7 21:37:53 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python pythonrun.c, 2.161.2.15, 2.161.2.16 In-Reply-To: References: Message-ID: How's the merge going? And if I haven't already said thanks, then, thanks for doing it! Jeremy From foom at fuhm.net Fri Jan 7 21:56:55 2005 From: foom at fuhm.net (James Y Knight) Date: Fri Jan 7 21:56:59 2005 Subject: [Python-Dev] Concurrency and Python In-Reply-To: <41DEE6B3.8020006@erols.com> References: <41DEE6B3.8020006@erols.com> Message-ID: On Jan 7, 2005, at 2:44 PM, Edward C. Jones wrote: > Is Python a suitable language for concurrent programming? Depends on what you mean. Python is not very good for shared-memory concurrent programming. (That is, threads). The library doesn't have enough good abstractions for locks/synchronization/etc, and, of course the big issue of CPython only allowing one thread to execute bytecode at a time. At the moment, threads are the fad, but I don't believe that will be scaling very well. 
As you scale up the number of CPUs, the amount of time wasted on memory synchronization similarly goes up, until you're wasting more time on memory consistency than doing actual work. Thus, I expect the trend to be more towards async message passing architectures (that is, multiple processes each with their own memory), instead, and I think Python is about as good for that as any existing language. Which is to say: reasonable, but not insanely great. > What is the state of programming theory for concurrency? For an example of the kind of new language being developed around a asynchronous message passing model, see IBM's poorly-named "X10" language. I saw a talk on it and thought it sounded very promising. What it adds over the usual message passing system is an easier way to name and access remote data and to spawn parallel activities that operates on that data. The part about arrays of data spread out over a number of different "places" (roughly, a CPU and its own memory) and how to operate on them I found especially interesting. I tried to find their project website, but since their name conflicts with the home automation system, it's hard to google for. Or perhaps they don't have a website. Short summary information: http://www.csail.mit.edu/events/eventcalendar/calendar.php? show=event&id=131 Talk slides: http://www.cs.ualberta.ca/~amaral/cascon/CDP04/slides/sarkar.pdf More talk slides, and a video: http://www.research.ibm.com/vee04/video.html#sarkar "Vivek Sarkar, Language and Virtual Machine Challenges for Large-Scale Parallel Systems" James From kbk at shore.net Fri Jan 7 22:18:11 2005 From: kbk at shore.net (Kurt B. Kaiser) Date: Fri Jan 7 22:18:43 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python pythonrun.c, 2.161.2.15, 2.161.2.16 In-Reply-To: (Jeremy Hylton's message of "Fri, 7 Jan 2005 15:37:50 -0500") References: Message-ID: <87brc17xbg.fsf@hydra.bayview.thirdcreek.com> Jeremy Hylton writes: > How's the merge going? 
Looks like it's done. I tagged ast-branch when I finished: merged_from_MAIN_07JAN05 Right now I'm trying to get Python-ast.c to compile. It wasn't modified by the merge, so there's some other issue. > And if I haven't already said thanks, then, thanks for doing it! You're welcome! I volunteer to keep ast-branch synch'd, how often do you want to do it? -- KBK From olsongt at verizon.net Fri Jan 7 22:37:47 2005 From: olsongt at verizon.net (olsongt@verizon.net) Date: Fri Jan 7 22:37:50 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python pythonrun.c, 2.161.2.15, 2.161.2.16 Message-ID: <20050107213748.KMJJ28362.out005.verizon.net@outgoing.verizon.net> > > From: kbk@shore.net (Kurt B. Kaiser) > Date: 2005/01/07 Fri PM 09:18:11 GMT > To: python-dev@python.org > Subject: Re: [Python-Dev] Re: [Python-checkins] python/dist/src/Python > pythonrun.c, 2.161.2.15, 2.161.2.16 > > Jeremy Hylton writes: > > > How's the merge going? > > Looks like it's done. I tagged ast-branch when I finished: > > merged_from_MAIN_07JAN05 > > Right now I'm trying to get Python-ast.c to compile. It wasn't > modified by the merge, so there's some other issue. > Python-ast.c should be autogenerated in the make process by asdl_c.py. There are still some bugs in it. The fix I think you need is posted. A full diff against the current python_ast.c is attached to patch 742621. @@ -1310,7 +1310,7 @@ free_expr(o->v.Repr.value); break; case Num_kind: - Py_DECREF(o->v.Num.n); + free_expr(o->v.Num.n); break; case Str_kind: Py_DECREF(o->v.Str.s) From jhylton at gmail.com Fri Jan 7 22:43:56 2005 From: jhylton at gmail.com (Jeremy Hylton) Date: Fri Jan 7 22:43:59 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python pythonrun.c, 2.161.2.15, 2.161.2.16 In-Reply-To: <87brc17xbg.fsf@hydra.bayview.thirdcreek.com> References: <87brc17xbg.fsf@hydra.bayview.thirdcreek.com> Message-ID: On Fri, 07 Jan 2005 16:18:11 -0500, Kurt B. Kaiser wrote: > Looks like it's done. 
I tagged ast-branch when I finished: > > merged_from_MAIN_07JAN05 > > Right now I'm trying to get Python-ast.c to compile. It wasn't > modified by the merge, so there's some other issue. I'm getting a compilation failure in symtable.c: gcc -pthread -c -fno-strict-aliasing -g -Wall -Wstrict-prototypes -I. -I../Include -DPy_BUILD_CORE -o Python/symtable.o ../Python/symtable.c ../Python/symtable.c: In function `symtable_new': ../Python/symtable.c:193: structure has no member named `st_tmpname' Do you see that? There is this one ugly corner of Python-ast.c. There's a routine that expects to take a pointer to a node, but instead gets passed an int. The generated code is bogus, and I haven't decided if it needs to be worried about. You need to manually edit the generated code to add a cast. > > And if I haven't already said thanks, then, thanks for doing it! > > You're welcome! I volunteer to keep ast-branch synch'd, how often > do you want to do it? I don't think we'll need to merge again. This last merge got all the language changes that were made for 2.4. Since we've agreed to a moratorium on more compiler/bytecode changes, we shouldn't need to merge from the head again. Jeremy From kbk at shore.net Sat Jan 8 01:14:00 2005 From: kbk at shore.net (Kurt B. Kaiser) Date: Sat Jan 8 01:14:25 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python pythonrun.c, 2.161.2.15, 2.161.2.16 In-Reply-To: (Jeremy Hylton's message of "Fri, 7 Jan 2005 16:43:56 -0500") References: <87brc17xbg.fsf@hydra.bayview.thirdcreek.com> Message-ID: <877jmo93qv.fsf@hydra.bayview.thirdcreek.com> Jeremy Hylton writes: > ../Python/symtable.c:193: structure has no member named `st_tmpname' > > Do you see that? Yeah, the merge eliminated it from the symtable struct in symtable.h. You moved it to symtable_entry at rev 2.12 in MAIN :-) I'll research it. Apparently my build differs enough so that I'm still stuck in Python-ast.c (once I had fixed pythonrun.c). 
> There is this one ugly corner of Python-ast.c. There's a routine > that expects to take a pointer to a node, but instead gets passed an > int. The generated code is bogus, and I haven't decided if it needs > to be worried about. You need to manually edit the generated code to > add a cast. OK, I was looking in that direction. Problem is with cmpop stuff. Three hard errors when compiling. OpenBSD. [...] > I don't think we'll need to merge again. This last merge got all the > language changes that were made for 2.4. Since we've agreed to a > moratorium on more compiler/bytecode changes, we shouldn't need to > merge from the head again. Is the plan to merge ast-branch to MAIN? If so, it's a little tricky since all the changes to MAIN are on ast-branch. So just before the final merge we need to merge MAIN to ast-branch once more and then merge the diff from HEAD to ast-branch back to MAIN. Or something like that. -- KBK From ilya at bluefir.net Sat Jan 8 04:40:18 2005 From: ilya at bluefir.net (Ilya Sandler) Date: Sat Jan 8 04:37:35 2005 Subject: [Python-Dev] an idea for improving struct.unpack api In-Reply-To: <002701c4f3c2$0a7d3800$e841fea9@oemcomputer> References: <002701c4f3c2$0a7d3800$e841fea9@oemcomputer> Message-ID: I will try to respond to all comments at once. But first a clarification: -I am not trying to design a high-level API on top of existing struct.unpack and -I am not trying to design a replacement for struct.unpack (If I were to replace struct.unpack(), then I would probably go along the lines of StructReader suggested by Raymond) I view struct module as a low-level (un)packing library on top on which a more complex stuff can be built and I am simply suggesting a way to improve this low level functionality... > > We could have an optional offset argument for > > > > unpack(format, buffer, offset=None) > > > > the offset argument is an object which contains a single integer field > > which gets incremented inside unpack() to point to the next byte. 
> As for "passing offset implies the length is calcsize(fmt)" sub-concept, > I find that slightly more controversial. It's convenient, > but somewhat ambiguous; in other cases (e.g. string methods) passing a > start/offset and no end/length means to go to the end. I am not sure I agree: in most cases starting offset and no length/end just means: "start whatever you are doing at this offset and stop it whenever you are happy.." At least that's the way I was alway thinking about functions like string.find() and friends Suggested struct.unpack() change seems to fit this mental model very well >> the offset argument is an object which contains a single integer field >> which gets incremented inside unpack() to point to the next byte. > I find this just too "magical". Why would it be magical? There is no guessing of user intentions involved. The function simply returns/uses an extra piece of information if the user asks for it. And the function already computes this piece of information.. > It's only useful when you're specifically unpacking data bytes that are > compactly back to back (no "filler" e.g. for alignment purposes) Yes, but it's a very common case when dealing with binary files formats. Eg. I just looked at xdrlib.py code and it seems that almost every invocation of struct._unpack would shrink from 3 lines to 1 line of code ( i = self.__pos self.__pos = j = i+4 data = self.__buf[i:j] return struct.unpack('>l', data)[0] would become: return struct.unpack('>l', self.__buf, self.__pos)[0] ) There are probably other places in stdlib which would benefit from this api and stdlib does not deal with binary files that much.. >and pays some conceptual price -- introducing a new specialized type > to play the role of "mutable int" but the user does not have to pay anything if he does not need it! The change is backward compatible. 
(Note that just supporting int offsets would eliminate slicing, but it would not eliminate other annoyances, and it's possible to support both Offset and int args, is it worth the hassle?) > and having an argument mutated, which is not usual in Python's library. Actually, it's so common that we simply stop noticing it :-) Eg. when we call a superclass's method: SuperClass.__init__(self) So, while I agree that there is an element of unusualness in the suggested unpack() API, this element seems pretty small to me > All in all, I suspect that something like. > hdrsize = struct.calcsize(hdr_fmt) > itemsize = struct.calcsize(item_fmt) > reclen = length_of_each_record > rec = binfile.read(reclen) > hdr = struct.unpack(hdr_fmt, rec, 0, hdrsize) > for offs in itertools.islice(xrange(hdrsize, reclen, itemsize), hdr[0]): > item = struct.unpack(item_fmt, rec, offs, itemsize) > # process item >might be a better compromise I think I again disagree: your example is almost as verbose as the current unpack() api and you still need to call calcsize() explicitly and I don't think there is any chance of gaining any noticeable perfomance benefit. Too little gain to bother with any changes... > struct.pack/struct.unpack is already one of my least-favourite parts > of the stdlib. Of the modules I use regularly, I pretty much only ever > have to go back and re-read the struct (and re) documentation because > they just won't fit in my brain. Adding additional complexity to them > seems like a net loss to me. Net loss to the end programmer? But if he does not need new functionality he doesnot have to use it! In fact, I started with providing an example of how new api makes client code simpler > I'd much rather specify the format as something like a tuple of values - > (INT, UINT, INT, STRING) (where INT &c are objects defined in the > struct module). 
This also then allows users to specify their own formats > if they have a particular need for something I don't disagree, but I think it's orthogonal to offset issue Ilya On Thu, 6 Jan 2005, Raymond Hettinger wrote: > [Ilya Sandler] > > A problem: > > > > The current struct.unpack api works well for unpacking C-structures > where > > everything is usually unpacked at once, but it > > becomes inconvenient when unpacking binary files where things > > often have to be unpacked field by field. Then one has to keep track > > of offsets, slice the strings,call struct.calcsize(), etc... > > Yes. That bites. > > > > Eg. with a current api unpacking of a record which consists of a > > header followed by a variable number of items would go like this > > > > hdr_fmt="iiii" > > item_fmt="IIII" > > item_size=calcsize(item_fmt) > > hdr_size=calcsize(hdr_fmt) > > hdr=unpack(hdr_fmt, rec[0:hdr_size]) #rec is the record to unpack > > offset=hdr_size > > for i in range(hdr[0]): #assume 1st field of header is a counter > > item=unpack( item_fmt, rec[ offset: offset+item_size]) > > offset+=item_size > > > > which is quite inconvenient... > > > > > > A solution: > > > > We could have an optional offset argument for > > > > unpack(format, buffer, offset=None) > > > > the offset argument is an object which contains a single integer field > > which gets incremented inside unpack() to point to the next byte. > > > > so with a new API the above code could be written as > > > > offset=struct.Offset(0) > > hdr=unpack("iiii", offset) > > for i in range(hdr[0]): > > item=unpack( "IIII", rec, offset) > > > > When an offset argument is provided, unpack() should allow some bytes > to > > be left unpacked at the end of the buffer.. > > > > > > Does this suggestion make sense? Any better ideas? 
> > Rather than alter struct.unpack(), I suggest making a separate class > that tracks the offset and encapsulates some of the logic that typically > surrounds unpacking: > > r = StructReader(rec) > hdr = r('iiii') > for item in r.getgroups('IIII', times=rec[0]): > . . . > > It would be especially nice if it handled the more complex case where > the next offset is determined in-part by the data being read (see the > example in section 11.3 of the tutorial): > > r = StructReader(open('myfile.zip', 'rb')) > for i in range(3): # show the first 3 file headers > fields = r.getgroup('LLLHH', offset=14) > crc32, comp_size, uncomp_size, filenamesize, extra_size = fields > filename = g.getgroup('c', offset=16, times=filenamesize) > extra = g.getgroup('c', times=extra_size) > r.advance(comp_size) > print filename, hex(crc32), comp_size, uncomp_size > > If you come up with something, I suggest posting it as an ASPN recipe > and then announcing it on comp.lang.python. That ought to generate some > good feedback based on other people's real world issues with > struct.unpack(). > > > Raymond Hettinger > > From ncoghlan at iinet.net.au Sat Jan 8 05:50:47 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Sat Jan 8 05:50:51 2005 Subject: [Python-Dev] an idea for improving struct.unpack api In-Reply-To: References: Message-ID: <41DF66A7.60800@iinet.net.au> Ilya Sandler wrote: > item=unpack( "IIII", rec, offset) How about making offset a standard integer, and change the signature to return a tuple when it is used: item = unpack(format, rec) # Full unpacking offset = 0 item, offset = unpack(format, rec, offset) # Partial unpacking The second item in the returned tuple being the offset of the first byte after the end of the unpacked item. Cheers, Nick. 
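Raymond's StructReader, quoted above, can likewise be sketched in a few lines. This toy version (hypothetical, and without the data-driven offset handling his zipfile example needs) covers the header-plus-items case from the original post:

```python
import struct

class StructReader(object):
    """Minimal sketch of the proposed reader object: tracks its own offset."""
    def __init__(self, buf):
        self.buf = buf
        self.pos = 0

    def __call__(self, fmt):
        size = struct.calcsize(fmt)
        fields = struct.unpack(fmt, self.buf[self.pos:self.pos + size])
        self.pos += size
        return fields

    def getgroups(self, fmt, times):
        return [self(fmt) for _ in range(times)]

    def advance(self, nbytes):
        self.pos += nbytes

# A header holding an item count, followed by that many fixed-size items.
rec = struct.pack('<i', 2) + struct.pack('<2i', 10, 20) + struct.pack('<2i', 30, 40)
r = StructReader(rec)
(count,) = r('<i')
assert r.getgroups('<2i', times=count) == [(10, 20), (30, 40)]
```

The state lives in the reader rather than in a mutated argument, which is the csv-module precedent Raymond appeals to.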
-- Nick Coghlan | ncoghlan@email.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From python at rcn.com Sat Jan 8 06:20:43 2005 From: python at rcn.com (Raymond Hettinger) Date: Sat Jan 8 06:24:03 2005 Subject: [Python-Dev] an idea for improving struct.unpack api In-Reply-To: Message-ID: <000a01c4f541$cd8a78a0$86b89d8d@oemcomputer> [Ilya Sandler] > I view struct module as a low-level (un)packing library on top on which > a more complex stuff can be built and I am simply suggesting a way to > improve this low level functionality... > > > We could have an optional offset argument for > > > > > > unpack(format, buffer, offset=None) > > > > > > the offset argument is an object which contains a single integer field > > > which gets incremented inside unpack() to point to the next byte. -1 on any modification of the existing unpack() function. It is already at its outer limits of complexity. Attaching a stateful tracking object (needing its own constructor and api) is not an improvement IMO. Also, I find the proposed "offset" object to be conceptually difficult to follow for anything other than the simplest case -- for anything else, it will make designing, reviewing, and debugging more difficult than it is now. In contrast, code built using the StructReader proposal leads to more flexible, readable code. Experience with the csv module points to reader objects being a better solution. [Nick Coghlan] > How about making offset a standard integer, and change the signature to > return a > tuple when it is used: > > item = unpack(format, rec) # Full unpacking > offset = 0 > item, offset = unpack(format, rec, offset) # Partial unpacking > > The second item in the returned tuple being the offset of the first byte > after > the end of the unpacked item Using standard integers helps improve the proposal by making the operation less obscure. 
But having the signature change is bad; create a separate function instead: item, offset = unpack_here(format, rec, offset) One other wrinkle is that "item" is itself a tuple and the whole thing looks odd if unpacked: ((var0, var1, var2, var3), offset) = unpack_here(fmtstr, rec, offset) Raymond From ilya at bluefir.net Sat Jan 8 06:37:36 2005 From: ilya at bluefir.net (Ilya Sandler) Date: Sat Jan 8 06:34:52 2005 Subject: [Python-Dev] an idea for improving struct.unpack api In-Reply-To: <41DF66A7.60800@iinet.net.au> References: <41DF66A7.60800@iinet.net.au> Message-ID: > How about making offset a standard integer, and change the signature to > return tuple when it is used: > item, offset = unpack(format, rec, offset) # Partial unpacking Well, it would work well when unpack results are assigned to individual vars: x,y,offset=unpack( "ii", rec, offset) but it gets more complicated if you have something like: coords=unpack("10i", rec) How would you pass/return offsets here? As an extra element in coords? coords=unpack("10i", rec, offset) offset=coords.pop() But that would be counterintuitive and somewhat inconvenient.. Ilya On Sat, 8 Jan 2005, Nick Coghlan wrote: > Ilya Sandler wrote: > > item=unpack( "IIII", rec, offset) > > How about making offset a standard integer, and change the signature to return a > tuple when it is used: > > item = unpack(format, rec) # Full unpacking > offset = 0 > item, offset = unpack(format, rec, offset) # Partial unpacking > > The second item in the returned tuple being the offset of the first byte after > the end of the unpacked item. > > Cheers, > Nick. 
> > -- > Nick Coghlan | ncoghlan@email.com | Brisbane, Australia > --------------------------------------------------------------- > http://boredomandlaziness.skystorm.net > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ilya%40bluefir.net > From kbk at shore.net Sat Jan 8 07:33:23 2005 From: kbk at shore.net (Kurt B. Kaiser) Date: Sat Jan 8 07:34:41 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python pythonrun.c, 2.161.2.15, 2.161.2.16 In-Reply-To: <877jmo93qv.fsf@hydra.bayview.thirdcreek.com> (Kurt B. Kaiser's message of "Fri, 07 Jan 2005 19:14:00 -0500") References: <87brc17xbg.fsf@hydra.bayview.thirdcreek.com> <877jmo93qv.fsf@hydra.bayview.thirdcreek.com> Message-ID: <873bxc8m6k.fsf@hydra.bayview.thirdcreek.com> kbk@shore.net (Kurt B. Kaiser) writes: > [JH] >> ../Python/symtable.c:193: structure has no member named `st_tmpname' >> >> Do you see that? > > Yeah, the merge eliminated it from the symtable struct in symtable.h. > You moved it to symtable_entry at rev 2.12 in MAIN :-) > > I'll research it. I think it would be more efficient if you tackled it since almost all the work is in compile.c ==> newcompile.c The relevant changes are compile.c 2.286 symtable.h 2.12 symtable.c 2.11 www.python.org/sf/734869 > Apparently my build differs enough so that I'm still stuck in > Python-ast.c (once I had fixed pythonrun.c). I resolved all the errors/warnings and diffed against the repository. I was astonished to see the same changes, slightly different, being replaced by mine. Those were /your/ tweaks. Apparently the $(AST_H) $(AST_C): target ran and Python-ast.c was recreated (without the changes). It's not clear to me how/why that happened. 
I did start with a clean checkout, but it seems that the target only runs if Python-ast.c and/or its .h are missing (they should have been in the checkout), or older than Python.asdl, which they are not. I don't see them in the .cvsignore. Very amusing. -- KBK From ncoghlan at iinet.net.au Sat Jan 8 08:15:23 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Sat Jan 8 08:15:27 2005 Subject: [Python-Dev] an idea for improving struct.unpack api In-Reply-To: References: <41DF66A7.60800@iinet.net.au> Message-ID: <41DF888B.2020208@iinet.net.au> Ilya Sandler wrote: >>How about making offset a standard integer, and change the signature to >>return tuple when it is used: >> item, offset = unpack(format, rec, offset) # Partial unpacking > > > Well, it would work well when unpack results are assigned to individual > vars: > > x,y,offset=unpack( "ii", rec, offset) > > but it gets more complicated if you have something like: > coords=unpack("10i", rec) > > How would you pass/return offsets here? As an extra element in coords? > coords=unpack("10i", rec, offset) > offset=coords.pop() > > But that would be counterintuitive and somewhat inconvinient.. I was thinking more along the lines of returning a 2-tuple with the 'normal' result of unpack as the first element: coords, offset = unpack("ii", rec, offset) x, y = coords Raymond's suggestion of a separate function like 'unpack_here' is probably a good one, as magically changing function signatures are evil. Something like: def unpack_here(format, record, offset = 0): end = offset + calcsize(format) return (unpack(format, record[offset:end]), end) Presumably, a C version could avoid the slicing and hence be significantly more efficient. Yes, the return type is a little clumsy, but it should still make it easier to write more efficient higher-level API's that unpack the structure a piece at a time. Cheers, Nick. 
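Filled out with an import and standard-size ('<') format codes, so the byte counts are platform-independent, Nick's unpack_here sketch runs as follows:

```python
import struct

def unpack_here(fmt, record, offset=0):
    # Nick's sketch: return (fields_tuple, offset_of_first_unconsumed_byte).
    end = offset + struct.calcsize(fmt)
    return (struct.unpack(fmt, record[offset:end]), end)

# Two pairs of little-endian ints, 8 bytes each.
rec = struct.pack('<2i', 1, 2) + struct.pack('<2i', 3, 4)
coords, offset = unpack_here('<2i', rec)
assert coords == (1, 2) and offset == 8
coords, offset = unpack_here('<2i', rec, offset)
assert coords == (3, 4) and offset == 16
```

The nested tuple keeps Ilya's coords=... case clean at the cost of one extra unpacking step per call.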
-- Nick Coghlan | ncoghlan@email.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From p.f.moore at gmail.com Sat Jan 8 12:09:47 2005 From: p.f.moore at gmail.com (Paul Moore) Date: Sat Jan 8 12:09:51 2005 Subject: [Python-Dev] an idea for improving struct.unpack api In-Reply-To: References: <002701c4f3c2$0a7d3800$e841fea9@oemcomputer> Message-ID: <79990c6b05010803092b1570d1@mail.gmail.com> On Fri, 7 Jan 2005 19:40:18 -0800 (PST), Ilya Sandler wrote: > Eg. I just looked at xdrlib.py code and it seems that almost every > invocation of struct._unpack would shrink from 3 lines to 1 line of code > > ( i = self.__pos > self.__pos = j = i+4 > data = self.__buf[i:j] > return struct.unpack('>l', data)[0] > > would become: > return struct.unpack('>l', self.__buf, self.__pos)[0] > ) FWIW, I could read and understand your original code without any problems, whereas in the second version I would completely miss the fact that self.__pos is updated, precisely because mutating arguments are very rare in Python functions. OTOH, Nick's idea of returning a tuple with the new offset might make your example shorter without sacrificing readability: result, newpos = struct.unpack('>l', self.__buf, self.__pos) self.__pos = newpos # retained "newpos" for readability... return result A third possibility - rather than "magically" adding an additional return value because you supply a position, you could have a "where am I?" format symbol (say & by analogy with the C "address of" operator). Then you'd say result, newpos = struct.unpack('>l&', self.__buf, self.__pos) Please be aware, I don't have a need myself for this feature - my interest is as a potential reader of others' code... Paul. 
From anthony at interlink.com.au Sat Jan 8 14:05:15 2005 From: anthony at interlink.com.au (Anthony Baxter) Date: Sat Jan 8 14:05:04 2005 Subject: [Python-Dev] 2.3.5 schedule, and something I'd like to get in In-Reply-To: References: <21A975DC-6094-11D9-922D-000A95BA5446@redivi.com> Message-ID: <200501090005.15921.anthony@interlink.com.au> On Saturday 08 January 2005 00:05, Jack Jansen wrote: > > This patch implements the proposed direct framework linking: > > http://python.org/sf/1097739 > > Looks good, I'll incorporate it. And as I haven't heard of any > showstoppers for the -undefined dynamic_lookup (and Anthony seems to be > offline this week) I'll put that in too. Sorry, I've been busy on other projects for the last couple of weeks, and email's backed up to an alarming degree. Currently I'm thinking of a 2.3.5 sometime around the 20th or so. I'll have a better idea next week, once I've been back at work for a couple of days and I've seen what stuff's backed up awaiting my time. At the moment I'm thinking of a 2.4.1 in maybe early March. The only really outstanding bugfix is the marshal one, afaik. Anthony -- Anthony Baxter It's never too late to have a happy childhood. From gvanrossum at gmail.com Sat Jan 8 17:52:31 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sat Jan 8 17:52:34 2005 Subject: [Python-Dev] an idea for improving struct.unpack api In-Reply-To: <79990c6b05010803092b1570d1@mail.gmail.com> References: <002701c4f3c2$0a7d3800$e841fea9@oemcomputer> <79990c6b05010803092b1570d1@mail.gmail.com> Message-ID: First, let me say two things: (a) A higher-level API can and should be constructed which acts like a (binary) stream but has additional methods for reading and writing values using struct format codes (or, preferably, somewhat higher-level type names, as suggested). Instances of this API should be constructable from a stream or from a "buffer" (e.g. a string). 
(b) -1 on Ilya's idea of having a special object that acts as an input-output integer; it is too unpythonic (no matter your objection). [Paul Moore] > OTOH, Nick's idea of returning a tuple with the new offset might make > your example shorter without sacrificing readability: > > result, newpos = struct.unpack('>l', self.__buf, self.__pos) > self.__pos = newpos # retained "newpos" for readability... > return result This is okay, except I don't want to overload this on unpack() -- let's pick a different function name like unpack_at(). > A third possibility - rather than "magically" adding an additional > return value because you supply a position, you could have a "where am > I?" format symbol (say & by analogy with the C "address of" operator). > Then you'd say > > result, newpos = struct.unpack('>l&', self.__buf, self.__pos) > > Please be aware, I don't have a need myself for this feature - my > interest is as a potential reader of others' code... I think that adding more magical format characters is probably not doing the readers of this code a service. I do like the idea of not introducing an extra level of tuple to accommodate the position return value but instead make it the last item in the tuple when using unpack_at(). 
Then the definition would be:

def unpack_at(fmt, buf, pos):
    size = calcsize(fmt)
    end = pos + size
    data = buf[pos:end]
    if len(data) < size:
        raise struct.error("not enough data for format")
    # if data is too long that would be a bug in buf[pos:size] and cause an error below
    ret = unpack(fmt, data)
    ret = ret + (end,)
    return ret

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From m.bless at gmx.de Sat Jan 8 18:40:33 2005
From: m.bless at gmx.de (Martin Bless)
Date: Sat Jan 8 18:40:26 2005
Subject: [Python-Dev] Re: csv module TODO list
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
Message-ID: <3o50u0tmv1hbpt71jkre94n32q38cdpbdb@4ax.com>

I'd love to see a 'split' and a 'join' function in the csv module to just convert between string and list without having to bother about files.

Something like

csv.split(aStr [, dialect='excel'[, fmtparam]]) -> list object

and

csv.join(aList, e[, dialect='excel'[, fmtparam]]) -> str object

Feasible?

mb - Martin

From kbk at shore.net Sat Jan 8 20:15:36 2005
From: kbk at shore.net (Kurt B.
Kaiser) Date: Sat Jan 8 20:16:04 2005 Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200501081915.j08JFaqo021328@bayview.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 267 open ( +6) / 2727 closed ( +9) / 2994 total (+15) Bugs : 798 open ( -3) / 4748 closed (+15) / 5546 total (+12) RFE : 165 open ( +0) / 140 closed ( +1) / 305 total ( +1) New / Reopened Patches ______________________ Remove witty comment in pydoc.py (2005-01-01) CLOSED http://python.org/sf/1094007 opened by Reinhold Birkenfeld Docs for file() vs open() (2005-01-01) CLOSED http://python.org/sf/1094011 opened by Reinhold Birkenfeld Improvements for shutil.copytree() (2005-01-01) CLOSED http://python.org/sf/1094015 opened by Reinhold Birkenfeld xml.dom.minidom.Node.replaceChild(obj, x, x) removes child x (2005-01-01) http://python.org/sf/1094164 opened by Felix Rabe os.py: base class _Environ on dict instead of UserDict (2005-01-02) http://python.org/sf/1094387 opened by Matthias Klose add Bunch type to collections module (2005-01-02) http://python.org/sf/1094542 opened by Steven Bethard self.button.pack() in tkinter.tex example (2005-01-03) http://python.org/sf/1094815 opened by [N/A] fixes urllib2 digest to allow arbitrary methods (2005-01-03) http://python.org/sf/1095362 opened by John Reese Argument passing from /usr/bin/idle2.3 to idle.py (2003-11-30) http://python.org/sf/851459 reopened by jafo fix for trivial flatten bug in astgen (2005-01-04) http://python.org/sf/1095541 opened by DSM exclude CVS conflict files in sdist command (2005-01-04) http://python.org/sf/1095784 opened by Wummel Fix for wm_iconbitmap to allow .ico files under Windows. (2005-01-05) http://python.org/sf/1096231 opened by John Fouhy Info Associated with Merge to AST (2005-01-07) http://python.org/sf/1097671 opened by Kurt B. 
Kaiser Direct framework linking for MACOSX_DEPLOYMENT_TARGET < 10.3 (2005-01-07) http://python.org/sf/1097739 opened by Bob Ippolito Encoding for Code Page 273 used by EBCDIC Germany Austria (2005-01-07) http://python.org/sf/1097797 opened by Michael Bierenfeld Patches Closed ______________ locale.getdefaultlocale does not return tuple in some OS (2004-10-21) http://python.org/sf/1051395 closed by rhettinger imghdr -- identify JPEGs in EXIF format (2003-06-08) http://python.org/sf/751031 closed by rhettinger Remove witty comment in pydoc.py (2005-01-01) http://python.org/sf/1094007 closed by rhettinger Docs for file() vs open() (2005-01-01) http://python.org/sf/1094011 closed by rhettinger Improvements for shutil.copytree() (2005-01-01) http://python.org/sf/1094015 closed by jlgijsbers a new subprocess.call which raises an error on non-zero rc (2004-11-23) http://python.org/sf/1071764 closed by astrand Argument passing from /usr/bin/idle2.3 to idle.py (2003-11-30) http://python.org/sf/851459 closed by jafo @decorators, including classes (2004-08-12) http://python.org/sf/1007991 closed by jackdied Convert glob.glob to generator-based DFS (2004-04-27) http://python.org/sf/943206 closed by jlgijsbers Make cgi.py use email instead of rfc822 or mimetools (2004-12-06) http://python.org/sf/1079734 closed by jlgijsbers New / Reopened Bugs ___________________ marshal.dumps('hello',0) "Access violation" (2005-01-03) CLOSED http://python.org/sf/1094960 opened by Mark Brophy General FAW - incorrect "most stable version" (2005-01-03) http://python.org/sf/1095328 opened by Tim Delaney Python FAQ: list.sort() out of date (2005-01-03) CLOSED http://python.org/sf/1095342 opened by Tim Delaney Bug In Python (2005-01-04) CLOSED http://python.org/sf/1095789 opened by JastheAce "Macintosh" references in the docs need to be checked. 
(2005-01-04) http://python.org/sf/1095802 opened by Jack Jansen The doc for DictProxy is missing (2005-01-04) http://python.org/sf/1095821 opened by Colin J. Williams Apple-installed Python fails to build extensions (2005-01-04) http://python.org/sf/1095822 opened by Jack Jansen time.tzset() not built on Solaris (2005-01-05) http://python.org/sf/1096244 opened by Gregory Bond sys.__stdout__ doco isn't discouraging enough (2005-01-05) http://python.org/sf/1096310 opened by Just van Rossum _DummyThread() objects not freed from threading._active map (2004-12-22) http://python.org/sf/1089632 reopened by saravanand Example needed in os.stat() (2005-01-06) CLOSED http://python.org/sf/1097229 opened by Facundo Batista SimpleHTTPServer sends wrong Content-Length header (2005-01-06) http://python.org/sf/1097597 opened by David Schachter urllib2 doesn't handle urls without a scheme (2005-01-07) http://python.org/sf/1097834 opened by Jack Jansen getsource and getsourcelines in the inspect module (2005-01-07) CLOSED http://python.org/sf/1098134 opened by Bj?rn Lindqvist mailbox should use email not rfc822 (2003-06-19) http://python.org/sf/756982 reopened by jlgijsbers typo in "Python Tutorial": 1. 
Whetting your appetite (2005-01-08) http://python.org/sf/1098497 opened by Ludootje Bugs Closed ___________ Don't define _SGAPI on IRIX (2003-04-27) http://python.org/sf/728330 closed by loewis python24.msi install error (2004-12-01) http://python.org/sf/1076500 closed by loewis garbage collector still documented as optional (2004-12-27) http://python.org/sf/1091740 closed by rhettinger marshal.dumps('hello',0) "Access violation" (2005-01-03) http://python.org/sf/1094960 closed by rhettinger General FAQ: list.sort() out of date (2005-01-04) http://python.org/sf/1095342 closed by jlgijsbers Bug In Python (2005-01-04) http://python.org/sf/1095789 closed by rhettinger test_macostools fails when running from source (2004-07-16) http://python.org/sf/992185 closed by jackjansen Sate/Save typo in Mac/scripts/BuildApplication.py (2004-12-01) http://python.org/sf/1076490 closed by jackjansen _DummyThread() objects not freed from threading._active map (2004-12-22) http://python.org/sf/1089632 closed by bcannon Example needed in os.stat() (2005-01-06) http://python.org/sf/1097229 closed by facundobatista gethostbyaddr on redhat for multiple hostnames (2004-12-14) http://python.org/sf/1085069 closed by loewis Unable to see Python binary (2004-12-10) http://python.org/sf/1082874 closed by loewis Change in signal function in the signal module (2004-12-10) http://python.org/sf/1083177 closed by akuchling test_descr fails on win2k (2004-07-12) http://python.org/sf/989337 closed by rhettinger test_imp failure (2004-07-15) http://python.org/sf/991708 closed by rhettinger getsource and getsourcelines in the inspect module (2005-01-07) http://python.org/sf/1098134 closed by jlgijsbers crash (SEGV) in Py_EndInterpreter() (2002-11-17) http://python.org/sf/639611 closed by jlgijsbers shutil.copytree copies stat of files, but not of dirs (2004-10-18) http://python.org/sf/1048878 closed by jlgijsbers shutil.copytree uses os.mkdir instead of os.mkdirs (2004-06-19) 
http://python.org/sf/975763 closed by jlgijsbers RFE Closed __________ optparse .error() should print options list (2004-12-22) http://python.org/sf/1089955 closed by gward From skip at pobox.com Sat Jan 8 21:45:25 2005 From: skip at pobox.com (Skip Montanaro) Date: Sat Jan 8 21:45:30 2005 Subject: [Python-Dev] os.removedirs() vs. shutil.rmtree() Message-ID: <16864.18021.476235.551214@montanaro.dyndns.org> Is there a reason the standard library needs both os.removedirs and shutil.rmtree? They seem awful similar to me (I can see they aren't really identical). Ditto for os.renames and shutil.move. Presuming they are all really needed, is there some reason they don't all belong in the same module? Skip From jlg at dds.nl Sat Jan 8 22:05:29 2005 From: jlg at dds.nl (Johannes Gijsbers) Date: Sat Jan 8 22:02:10 2005 Subject: [Python-Dev] os.removedirs() vs. shutil.rmtree() In-Reply-To: <16864.18021.476235.551214@montanaro.dyndns.org> References: <16864.18021.476235.551214@montanaro.dyndns.org> Message-ID: <20050108210529.GA29102@authsmtp.dds.nl> On Sat, Jan 08, 2005 at 02:45:25PM -0600, Skip Montanaro wrote: > Is there a reason the standard library needs both os.removedirs and > shutil.rmtree? They seem awful similar to me (I can see they aren't really > identical). Ditto for os.renames and shutil.move. Presuming they are all > really needed, is there some reason they don't all belong in the same > module? os.removedirs() only removes directories, it will fail to remove a non-empty directory, for example. It also doesn't have the ignore_errors/onerror arguments [1]. os.renames() is different from shutil.move() in that it also creates intermediate directories (and deletes any left empty). So they're not identical, but I do agree they should be consolidated and moved into one module. 
I'd say shutil, both because the os module is already awfully crowded, and because these functions are "high-level operations on files and collections of files" rather than "a more portable way of using operating system dependent functionality [...]". Johannes [1] That may actually be a good thing, though. It was a pain to keep those working backwards-compatibly when shutil.rmtree was recently rewritten. From irmen at xs4all.nl Sun Jan 9 04:11:11 2005 From: irmen at xs4all.nl (Irmen de Jong) Date: Sun Jan 9 04:11:13 2005 Subject: [Python-Dev] Possible bug in codecs readline? It breaks lines apart. Message-ID: <41E0A0CF.1070502@xs4all.nl> Hello using current cvs Python on Linux, I observe this weird behavior of the readline() method on file-like objects returned from the codecs module: [irmen@atlantis ypage]$ cat testfile1.txt xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy offending line: ladfj askldfj klasdj fskla dfzaskdj fasklfj laskd fjasklfzzzzaa%whereisthis!!! next line. [irmen@atlantis ypage]$ cat testfile2.txt aaaaaaaaaaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbbbbbbbb stillokay:bbbbxx broken!!!!badbad againokay. [irmen@atlantis ypage]$ cat bug.py import codecs for name in ("testfile1.txt","testfile2.txt"): f=codecs.open(name,encoding="iso-8859-1") # precise encoding doesn't matter print "----",name,"----" for line in f: print "LINE:"+repr(line) [irmen@atlantis ypage]$ python25 bug.py ---- testfile1.txt ---- LINE:u'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy\r\n' LINE:u'offendi' LINE:u'ng line: ladfj askldfj klasdj fskla dfzaskdj fasklfj laskd fjasklfzzzzaa' LINE:u'%whereisthis!!!\r\n' LINE:u'next line.\r\n' ---- testfile2.txt ---- LINE:u'aaaaaaaaaaaaaaaaaaaaaaaa\n' LINE:u'bbbbbbbbbbbbbbbbbbbbbbbb\n' LINE:u'stillokay:bbbbxx\n' LINE:u'broke' LINE:u'n!!!!badbad\n' LINE:u'againokay.\n' [irmen@atlantis ypage]$ See how it breaks certain lines in half? 
It only happens when a certain encoding is used, so regular file objects behave as expected. Also, readlines() works fine. Python 2.3.4 and Python 2.4 do not have this problem. Am I missing something or is this a bug? Thanks! --Irmen From s.percivall at chello.se Sun Jan 9 04:38:53 2005 From: s.percivall at chello.se (Simon Percivall) Date: Sun Jan 9 04:38:56 2005 Subject: [Python-Dev] Possible bug in codecs readline? It breaks lines apart. In-Reply-To: <41E0A0CF.1070502@xs4all.nl> References: <41E0A0CF.1070502@xs4all.nl> Message-ID: On 2005-01-09, at 04.11, Irmen de Jong wrote: > Hello > using current cvs Python on Linux, I observe this weird > behavior of the readline() method on file-like objects > returned from the codecs module: > > [...] > > See how it breaks certain lines in half? > It only happens when a certain encoding is used, so regular > file objects behave as expected. Also, readlines() works fine. > > Python 2.3.4 and Python 2.4 do not have this problem. > > Am I missing something or is this a bug? Thanks! It looks like the readline method broke at revision 1.36 of codecs.py, when it was modified, yes. //Simon From irmen at xs4all.nl Sun Jan 9 17:49:29 2005 From: irmen at xs4all.nl (Irmen de Jong) Date: Sun Jan 9 17:49:30 2005 Subject: [Python-Dev] Possible bug in codecs readline? It breaks lines apart. In-Reply-To: References: <41E0A0CF.1070502@xs4all.nl> Message-ID: <41E16099.9080804@xs4all.nl> Simon Percivall wrote: > It looks like the readline method broke at revision 1.36 of codecs.py, > when it was modified, yes. Okay. 
I've created a bug report 1098990: codec readline() splits lines apart

--Irmen

From ilya at bluefir.net Sun Jan 9 21:19:42 2005
From: ilya at bluefir.net (Ilya Sandler)
Date: Sun Jan 9 21:16:52 2005
Subject: [Python-Dev] an idea for improving struct.unpack api
In-Reply-To: 
References: <002701c4f3c2$0a7d3800$e841fea9@oemcomputer> <79990c6b05010803092b1570d1@mail.gmail.com>
Message-ID: 

> (a) A higher-level API can and should be constructed which acts like a
> (binary) stream but has additional methods for reading and writing
> values using struct format codes (or, preferably, somewhat
> higher-level type names, as suggested). Instances of this API should
> be constructable from a stream or from a "buffer" (e.g. a string).

Ok, I think it's getting much bigger than what I was initially aiming for ;-)... One more comment though regarding unpack_at

> Then the definition would be:
>
> def unpack_at(fmt, buf, pos):
>     size = calcsize(fmt)
>     end = pos + size
>     data = buf[pos:end]
>     if len(data) < size:
>         raise struct.error("not enough data for format")
>     ret = unpack(fmt, data)
>     ret = ret + (end,)
>     return ret

While I see the usefulness of this, I think it's too limited, e.g.

result = unpack_at(fmt, buf, offset)
offset = result[-1]

feels quite unnatural... So my feeling is that adding this new API is not worth the trouble. Especially if there are plans for anything higher level... Instead, I would suggest that even a very limited initial implementation of a StructReader() like object suggested by Raymond would be more useful...

class StructReader: # or maybe call it Unpacker?
    def __init__(self, buf):
        self._buf = buf
        self._offset = 0
    def unpack(self, format):
        """unpack at current offset, advance internal offset accordingly"""
        size = struct.calcsize(format)
        ret = struct.unpack(format, self._buf[self._offset:self._offset+size])
        self._offset += size
        return ret
    # or maybe just make _offset public??
    def tell(self):
        "return current offset"
        return self._offset
    def seek(self, offset, whence=0):
        "set current offset"
        self._offset = offset

This solves the original offset tracking problem completely (at least as far as inconvenience is concerned; improving unpack() performance would require the struct reader to be written in C), while allowing the rest to be added later. E.g. the original "hdr + variable number of data items" code would look like:

buf = StructReader(rec)
hdr = buf.unpack("iiii")
for i in range(hdr[0]):
    item = buf.unpack("IIII")

Ilya

PS with unpack_at() this code would look like:

offset = 0
hdr = unpack_at("iiii", rec, offset)
offset = hdr[-1]
for i in range(hdr[0]):
    item = unpack_at("IIII", rec, offset)
    offset = item[-1]

On Sat, 8 Jan 2005, Guido van Rossum wrote:

> First, let me say two things:
>
> (a) A higher-level API can and should be constructed which acts like a
> (binary) stream but has additional methods for reading and writing
> values using struct format codes (or, preferably, somewhat
> higher-level type names, as suggested). Instances of this API should
> be constructable from a stream or from a "buffer" (e.g. a string).
>
> (b) -1 on Ilya's idea of having a special object that acts as an
> input-output integer; it is too unpythonic (no matter your objection).
>
> [Paul Moore]
> > OTOH, Nick's idea of returning a tuple with the new offset might make
> > your example shorter without sacrificing readability:
> >
> > result, newpos = struct.unpack('>l', self.__buf, self.__pos)
> > self.__pos = newpos # retained "newpos" for readability...
> > return result
>
> This is okay, except I don't want to overload this on unpack() --
> let's pick a different function name like unpack_at().
>
> > A third possibility - rather than "magically" adding an additional
> > return value because you supply a position, you could have a "where am
> > I?" format symbol (say & by analogy with the C "address of" operator).
> > Then you'd say > > > > result, newpos = struct.unpack('>l&', self.__buf, self.__pos) > > > > Please be aware, I don't have a need myself for this feature - my > > interest is as a potential reader of others' code... > > I think that adding more magical format characters is probably not > doing the readers of this code a service. > > I do like the idea of not introducing an extra level of tuple to > accommodate the position return value but instead make it the last > item in the tuple when using unpack_at(). > > Then the definition would be: > > def unpack_at(fmt, buf, pos): > size = calcsize(fmt) > end = pos + size > data = buf[pos:end] > if len(data) < size: > raise struct.error("not enough data for format") > # if data is too long that would be a bug in buf[pos:size] and > cause an error below > ret = unpack(fmt, data) > ret = ret + (end,) > return ret > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > From irmen at xs4all.nl Sun Jan 9 21:42:20 2005 From: irmen at xs4all.nl (Irmen de Jong) Date: Sun Jan 9 21:42:20 2005 Subject: [Python-Dev] Possible bug in codecs readline? It breaks lines apart. In-Reply-To: <41E16099.9080804@xs4all.nl> References: <41E0A0CF.1070502@xs4all.nl> <41E16099.9080804@xs4all.nl> Message-ID: <41E1972C.5010807@xs4all.nl> > Okay. I've created a bug report 1098990: codec readline() splits lines > apart Btw, I've set it to group Python 2.5, is that correct? Or should bugs that relate to the current CVS trunk have no group? Thx Irmen. 
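Martin Bless's csv.split/csv.join proposal from the csv TODO thread can be approximated today by round-tripping through an in-memory file. The names csv_split and csv_join below mirror his proposal and are not an existing API; io.StringIO is used as the modern spelling of the StringIO class of the time:

```python
import csv
import io

def csv_split(line, dialect='excel', **fmtparams):
    # parse one CSV record from a string -> list of fields
    return next(csv.reader(io.StringIO(line), dialect=dialect, **fmtparams))

def csv_join(fields, dialect='excel', **fmtparams):
    # format a list of fields as a single CSV record string
    out = io.StringIO()
    csv.writer(out, dialect=dialect, **fmtparams).writerow(fields)
    return out.getvalue().rstrip('\r\n')
```

As Andrew notes in his reply below, this only behaves sensibly for records without embedded newlines; a string ending mid-field would have to raise an error.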
From andrewm at object-craft.com.au Mon Jan 10 00:37:17 2005
From: andrewm at object-craft.com.au (Andrew McNamara)
Date: Mon Jan 10 00:37:22 2005
Subject: [Python-Dev] Re: csv module TODO list
In-Reply-To: <3o50u0tmv1hbpt71jkre94n32q38cdpbdb@4ax.com>
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> <3o50u0tmv1hbpt71jkre94n32q38cdpbdb@4ax.com>
Message-ID: <20050109233717.8A3C33C8E5@coffee.object-craft.com.au>

>I'd love to see a 'split' and a 'join' function in the csv module to
>just convert between string and list without having to bother about
>files.
>
>Something like
>
>csv.split(aStr [, dialect='excel'[, fmtparam]]) -> list object
>
>and
>
>csv.join(aList, e[, dialect='excel'[, fmtparam]]) -> str object
>
>Feasible?

Yes, it's feasible, although newlines can be embedded within fields of a CSV record, hence the use of the iterator, rather than working with strings. In your example above, if the parser gets to the end of the string and finds it's still within a field, I'd propose just raising an exception.

No promises, however - I only have a finite amount of time to work on this at the moment.

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/

From python at rcn.com Mon Jan 10 01:19:03 2005
From: python at rcn.com (Raymond Hettinger)
Date: Mon Jan 10 01:22:23 2005
Subject: [Python-Dev] an idea for improving struct.unpack api
In-Reply-To: 
Message-ID: <001b01c4f6a9$fdf033e0$e841fea9@oemcomputer>

> Instead, I would suggest that even a very limited initial
> implementation of StructReader() like object suggested by Raymond would
> be more useful...

I have a draft patch also. Let's work out improvements off-list (perhaps on ASPN). Feel free to email me directly.
Raymond

From andrewm at object-craft.com.au Mon Jan 10 01:40:06 2005
From: andrewm at object-craft.com.au (Andrew McNamara)
Date: Mon Jan 10 01:40:10 2005
Subject: [Python-Dev] Re: [Csv] Minor change to behaviour of csv module
In-Reply-To: <48F57F83-60B3-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <1105105520.41de927049442@mcherm.com> <48F57F83-60B3-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <20050110004006.88CB63C8E5@coffee.object-craft.com.au>

>> Andrew explains that in the CSV module, escape characters are not
>> properly removed.
>>
>> Magnus writes:
>>> IMO this is the *only* reasonable behaviour. I don't understand why
>>> the escape character should be left in; this is one of the reasons why
>>> UNIX-style colon-separated values don't work with the current module.
>>
>> Andrew writes back later:
>>> Thinking about this further, I suspect we have to retain the current
>>> behaviour, as broken as it is, as the default: it's conceivable that
>>> someone somewhere is post-processing the result to remove the
>>> backslashes, and if we fix the csv module, we'll break their code.
>>
>> I'm with Magnus on this. No one has 4 year old code using the CSV module.
>> The existing behavior is just simply WRONG. Sure, of course we should
>> try to maintain backward compatibility, but surely SOME cases don't
>> require it, right? Can't we treat this misbehavior as an outright bug?
>
>+1 -- the nonremoval of escape characters smells like a bug to me, too.

Okay, I'm glad the community agrees (less work, less crustification). For what it's worth, it wasn't a bug so much as a misfeature. I was explicitly adding the escape character back in. The intention was to make the feature more forgiving on users who accidentally set the escape character - in other words, only special (quoting, escaping, field delimiter) characters received special treatment. With the benefit of hindsight, that was an inadequately considered choice.
-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/

From andrewm at object-craft.com.au Mon Jan 10 05:44:41 2005
From: andrewm at object-craft.com.au (Andrew McNamara)
Date: Mon Jan 10 05:44:45 2005
Subject: [Python-Dev] csv module and universal newlines
In-Reply-To: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au>
Message-ID: <20050110044441.250103C889@coffee.object-craft.com.au>

This item, from the TODO list, has been bugging me for a while:

>* Reader and universal newlines don't interact well, reader doesn't
> honour Dialect's lineterminator setting. All outstanding bug IDs
> (789519, 944890, 967934 and 1072404) are related to this - it's
> a difficult problem and further discussion is needed.

The csv parser consumes lines from an iterator, but it also has its own idea of end-of-line conventions, which are currently only used by the writer, not the reader, which is a source of much confusion. The writer, by default, also attempts to emit a \r\n sequence, which results in more confusion unless the file is opened in binary mode.

I'm looking for suggestions for how we can mitigate these problems (without breaking things for existing users).

The standard file iterator includes the end-of-line characters in the returned string. One potential solution is, then, to ignore the line chunking done by the file iterator, and logically concatenate the source lines until the csv parser's idea of lineterminator is seen - but this negates the benefits of using an iterator.

Another option might be to provide a new interface that relies on a file-like object being supplied. The lineterminator character would only be used with this interface, with the current interface falling back to using only \n. Rather a drastic solution.

Any other ideas?
-- Andrew McNamara, Senior Developer, Object Craft http://www.object-craft.com.au/ From mcherm at mcherm.com Mon Jan 10 15:40:10 2005 From: mcherm at mcherm.com (Michael Chermside) Date: Mon Jan 10 15:40:13 2005 Subject: [Python-Dev] Re: Subscribing to PEP updates Message-ID: <1105368010.41e293ca97620@mcherm.com> Barry writes: > As an experiment, I just added a PEP topic to the python-checkins > mailing list. You could subscribe to this list and just select the PEP > topic (which matches the regex "PEP" in the Subject header or first few > lines of the body). > > Give it a shot and let's see if that does the trick. I just got notification of the change to PEP 246 (and I haven't received other checkin notifications), so I guess I can report that this is working. Thanks, Barry. Should we now mention this on c.l.py for others who may be interested? -- Michael Chermside From aleax at aleax.it Mon Jan 10 15:42:11 2005 From: aleax at aleax.it (Alex Martelli) Date: Mon Jan 10 16:08:58 2005 Subject: [Python-Dev] PEP 246, redux Message-ID: I had been promising to rewrite PEP 246 to incorporate the last several years' worth of discussions &c about it, and Guido's recent "stop the flames" artima blog post finally pushed me to complete the work. Feedback is of course welcome, so I thought I had better repost it here, rather than relying on would-be commenters to get it from CVS... I'm also specifically CC'ing Clark, the co-author, since he wasn't involved in this rewrite and of course I owe it to him to change or clearly attribute to myself anything he doesn't like to have "under his own name"! Thanks, Alex PEP: 246 Title: Object Adaptation Version: $Revision: 1.6 $ Author: aleax@aleax.it (Alex Martelli), cce@clarkevans.com (Clark C. 
Evans) Status: Draft Type: Standards Track Created: 21-Mar-2001 Python-Version: 2.5 Post-History: 29-Mar-2001, 10-Jan-2005 Abstract This proposal puts forth an extensible cooperative mechanism for the adaptation of an incoming object to a context which expects an object supporting a specific protocol (say a specific type, class, or interface). This proposal provides a built-in "adapt" function that, for any object X and any protocol Y, can be used to ask the Python environment for a version of X compliant with Y. Behind the scenes, the mechanism asks object X: "Are you now, or do you know how to wrap yourself to provide, a supporter of protocol Y?". And, if this request fails, the function then asks protocol Y: "Does object X support you, or do you know how to wrap it to obtain such a supporter?" This duality is important, because protocols can be developed after objects are, or vice-versa, and this PEP lets either case be supported non-invasively with regard to the pre-existing component[s]. Lastly, if neither the object nor the protocol know about each other, the mechanism may check a registry of adapter factories, where callables able to adapt certain objects to certain protocols can be registered dynamically. This part of the proposal is optional: the same effect could be obtained by ensuring that certain kinds of protocols and/or objects can accept dynamic registration of adapter factories, for example via suitable custom metaclasses. However, this optional part allows adaptation to be made more flexible and powerful in a way that is not invasive to either protocols or other objects, thereby gaining for adaptation much the same kind of advantage that Python standard library's "copy_reg" module offers for serialization and persistence. This proposal does not specifically constrain what a protocol _is_, what "compliance to a protocol" exactly _means_, nor what precisely a wrapper is supposed to do. 
These omissions are intended to leave this proposal compatible with both existing categories of protocols, such as the existing system of type and classes, as well as the many concepts for "interfaces" as such which have been proposed or implemented for Python, such as the one in PEP 245 [1], the one in Zope3 [2], or the ones discussed in the BDFL's Artima blog in late 2004 and early 2005 [3]. However, some reflections on these subjects, intended to be suggestive and not normative, are also included. Motivation Currently there is no standardized mechanism in Python for checking if an object supports a particular protocol. Typically, existence of certain methods, particularly special methods such as __getitem__, is used as an indicator of support for a particular protocol. This technique works well for a few specific protocols blessed by the BDFL (Benevolent Dictator for Life). The same can be said for the alternative technique based on checking 'isinstance' (the built-in class "basestring" exists specifically to let you use 'isinstance' to check if an object "is something like a string"). Neither approach is easily and generally extensible to other protocols, defined by applications and third party frameworks, outside of the standard Python core. Even more important than checking if an object already supports a given protocol can be the task of obtaining a suitable adapter (wrapper or proxy) for the object, if the support is not already there. For example, a string does not support the file protocol, but you can wrap it into a StringIO instance to obtain an object which does support that protocol and gets its data from the string it wraps; that way, you can pass the string (suitably wrapped) to subsystems which require as their arguments objects that are readable as files. Unfortunately, there is currently no general, standardized way to automate this extremely important kind of "adaptation by wrapping" operations. 
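The string-to-file wrapping described above, done by hand, illustrates the kind of one-off adaptation the PEP wants to standardize. A small sketch (io.StringIO is used here as the modern equivalent of the StringIO class mentioned in the text):

```python
import io

def count_lines(stream):
    # a consumer written against the file protocol: it only needs iteration
    return sum(1 for _ in stream)

text = "one\ntwo\nthree\n"
# a str does not support the file protocol, so the caller wraps it manually;
# under PEP 246 this one-off wrapping would become a call to adapt()
wrapped = io.StringIO(text)
n = count_lines(wrapped)
```

Every such call site currently has to know both the consumer's expectation and the appropriate wrapper, which is exactly the duplication the Motivation section complains about.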
Typically, today, when you pass objects to a context expecting a particular protocol, either the object knows about the context and provides its own wrapper, or the context knows about the object and wraps it appropriately.  The difficulty with these approaches is that such adaptations are one-offs: they are not centralized in a single place of the user's code, and are not executed with a common technique.  This lack of standardization increases code duplication, with the same adapter occurring in more than one place, or it encourages classes to be re-written instead of adapted.  In either case, maintainability suffers.

It would be very nice to have a standard function that can be called upon to verify an object's compliance with a particular protocol and provide a wrapper if one is readily available -- all without having to hunt through each library's documentation for the incantation appropriate to that particular, specific case.


Requirements

When considering an object's compliance with a protocol, there are several cases to be examined:

a) When the protocol is a type or class, and the object has exactly that type or is an instance of exactly that class (not a subclass).  In this case, compliance is automatic.

b) When the object knows about the protocol, and either considers itself compliant, or knows how to wrap itself suitably.

c) When the protocol knows about the object, and either the object already complies or the protocol knows how to suitably wrap the object.

d) When the protocol is a type or class, and the object is an instance of a subclass.  This is distinct from the first case (a) above, since inheritance (unfortunately) does not necessarily imply substitutability, and thus must be handled carefully.

e) When the context knows about the object and the protocol and knows how to adapt the object so that the required protocol is satisfied.  This could use an adapter registry or similar approaches.

The fourth case above is subtle.
A break of substitutability can occur when a subclass changes a method's signature, or restricts the domains accepted for a method's argument ("co-variance" on argument types), or extends the co-domain to include return values which the base class may never produce ("contra-variance" on return types).  While compliance based on class inheritance _should_ be automatic, this proposal allows an object to signal that it is not compliant with a base class protocol.

If Python gains some standard "official" mechanism for interfaces, however, then the "fast-path" case (a) can and should be extended to the protocol being an interface, and the object an instance of a type or class claiming compliance with that interface.  For example, if the "interface" keyword discussed in [3] is adopted into Python, the "fast path" of case (a) could be used, since instantiable classes implementing an interface would not be allowed to break substitutability.


Specification

This proposal introduces a new built-in function, adapt(), which is the basis for supporting these requirements.  The adapt() function has three parameters:

- `obj', the object to be adapted

- `protocol', the protocol requested of the object

- `alternate', an optional object to return if the object could not be adapted

A successful result of the adapt() function returns either the object passed as `obj', if the object is already compliant with the protocol, or a secondary object `wrapper', which provides a view of the object compliant with the protocol.  The definition of wrapper is deliberately vague, and a wrapper is allowed to be a full object with its own state if necessary.  However, the design intention is that an adaptation wrapper should hold a reference to the original object it wraps, plus (if needed) a minimum of extra state which it cannot delegate to the wrapped object.
An excellent example of an adaptation wrapper is an instance of StringIO which adapts an incoming string to be read as if it were a text file: the wrapper holds a reference to the string, but deals by itself with the "current point of reading" (from _where_ in the wrapped string the characters for the next, e.g., "readline" call will come), because it cannot delegate that to the wrapped object (a string has no concept of "current point of reading", nor anything else even remotely related to that concept).

A failure to adapt the object to the protocol raises an AdaptationError (which is a subclass of TypeError), unless the alternate parameter is used, in which case the alternate argument is returned instead.

To enable the first case listed in the requirements, the adapt() function first checks to see if the object's type or the object's class is identical to the protocol.  If so, then the adapt() function returns the object directly without further ado.

To enable the second case, when the object knows about the protocol, the object must have a __conform__() method.  This optional method takes two arguments:

- `self', the object being adapted

- `protocol', the protocol requested

Just like any other special method in today's Python, __conform__ is meant to be taken from the object's class, not from the object itself (for all objects, except instances of "classic classes", as long as we must still support the latter).  This enables a possible 'tp_conform' slot to be added to Python's type objects in the future, if desired.

The object may return itself as the result of __conform__ to indicate compliance.  Alternatively, the object also has the option of returning a wrapper object compliant with the protocol.  If the object knows it is not compliant although it belongs to a type which is a subclass of the protocol, then __conform__ should raise a LiskovViolation exception (a subclass of AdaptationError).
Finally, if the object cannot determine its compliance, it should return None to enable the remaining mechanisms.  If __conform__ raises any other exception, "adapt" just propagates it.

To enable the third case, when the protocol knows about the object, the protocol must have an __adapt__() method.  This optional method takes two arguments:

- `self', the protocol requested

- `obj', the object being adapted

If the protocol finds the object to be compliant, it can return obj directly.  Alternatively, the method may return a wrapper compliant with the protocol.  If the protocol knows the object is not compliant although it belongs to a type which is a subclass of the protocol, then __adapt__ should raise a LiskovViolation exception (a subclass of AdaptationError).  Finally, when compliance cannot be determined, this method should return None to enable the remaining mechanisms.  If __adapt__ raises any other exception, "adapt" just propagates it.

The fourth case, when the object's class is a subclass of the protocol, is handled by the built-in adapt() function.  Under normal circumstances, if "isinstance(object, protocol)" then adapt() returns the object directly.  However, if the object is not substitutable, either the __conform__() or __adapt__() method, as mentioned above, may raise a LiskovViolation (a subclass of AdaptationError) to prevent this default behavior.

If none of the first four mechanisms worked, as a last-ditch attempt, 'adapt' falls back to checking a registry of adapter factories, indexed by the protocol and the type of `obj', to meet the fifth case.  Adapter factories may be dynamically registered in and removed from that registry to provide "third party adaptation" of objects and protocols that have no knowledge of each other, in a way that is not invasive to either the object or the protocols.
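To make the handshake just specified concrete, here is a minimal stand-alone sketch covering only cases (b) and (c); ReadableText and the other names are invented for illustration, and the PEP's full reference implementation appears later:

```python
import io

class AdaptationError(TypeError):
    pass

def adapt(obj, protocol):
    # (b) ask the object's type: does it know how to conform?
    conform = getattr(type(obj), '__conform__', None)
    if conform is not None:
        result = conform(obj, protocol)
        if result is not None:
            return result
    # (c) ask the protocol: does it know how to adapt the object?
    adapt_method = getattr(type(protocol), '__adapt__', None)
    if adapt_method is not None:
        result = adapt_method(protocol, obj)
        if result is not None:
            return result
    raise AdaptationError("cannot adapt %r to %r" % (obj, protocol))

class ReadableText:
    """Toy protocol object: anything with a read() method complies,
    and plain strings can be wrapped to comply."""
    def __adapt__(self, obj):
        if hasattr(obj, 'read'):
            return obj                  # already compliant
        if isinstance(obj, str):
            return io.StringIO(obj)     # wrap into a compliant object
        return None                     # undetermined: let adapt() fail

READABLE = ReadableText()

print(adapt("hello", READABLE).read())   # hello
f = io.StringIO("already a file")
print(adapt(f, READABLE) is f)           # True
```

Note that, per the specification, __conform__ and __adapt__ are looked up on the type, not the instance, and a None return simply hands control to the next mechanism.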
Intended Use

The typical intended use of adapt is in code which has received some object X "from the outside", either as an argument or as the result of calling some function, and needs to use that object according to a certain protocol Y.  A "protocol" such as Y is meant to indicate an interface, usually enriched with some semantic constraints (such as are typically used in the "design by contract" approach), and often also some pragmatic expectations (such as "the running time of a certain operation should be no worse than O(N)", or the like); this proposal does not specify how protocols are designed as such, nor how or whether compliance to a protocol is checked, nor what the consequences may be of claiming compliance but not actually delivering it (lack of "syntactic" compliance -- names and signatures of methods -- will often lead to exceptions being raised; lack of "semantic" compliance may lead to subtle and perhaps occasional errors [imagine a method claiming to be threadsafe but being in fact subject to some subtle race condition, for example]; lack of "pragmatic" compliance will generally lead to code that runs "correctly", but too slowly for practical use, or sometimes to exhaustion of resources such as memory or disk space).

When protocol Y is a concrete type or class, compliance to it is intended to mean that an object allows all of the operations that could be performed on instances of Y, with "comparable" semantics and pragmatics.  For example, a hypothetical object X that is a singly-linked list should not claim compliance with protocol 'list', even if it implements all of list's methods: the fact that indexing X[n] takes time O(n), while the same operation would be O(1) on a list, makes a difference.
On the other hand, an instance of StringIO.StringIO does comply with protocol 'file', even though some operations (such as those of module 'marshal') may not allow substituting one for the other because they perform explicit type-checks: such type-checks are "beyond the pale" from the point of view of protocol compliance.

While this convention makes it feasible to use a concrete type or class as a protocol for purposes of this proposal, such use will often not be optimal.  Rarely will the code calling 'adapt' need ALL of the features of a certain concrete type, particularly for such rich types as file, list, dict; rarely can all those features be provided by a wrapper with good pragmatics, as well as syntax and semantics that are really the same as a concrete type's.

Rather, once this proposal is accepted, a design effort needs to start to identify the essential characteristics of those protocols which are currently used in Python, particularly within the standard library, and to formalize them using some kind of "interface" construct (not necessarily requiring any new syntax: a simple custom metaclass would let us get started, and the results of the effort could later be migrated to whatever "interface" construct is eventually accepted into the Python language).  With such a palette of more formally designed protocols, the code using 'adapt' will be able to ask for, say, adaptation into "a filelike object that is readable and seekable", or whatever else it specifically needs with some decent level of "granularity", rather than too-generically asking for compliance to the 'file' protocol.

Adaptation is NOT "casting".  When object X itself does not conform to protocol Y, adapting X to Y means using some kind of wrapper object Z, which holds a reference to X, and implements whatever operation Y requires, mostly by delegating to X in appropriate ways.
For example, if X is a string and Y is 'file', the proper way to adapt X to Y is to make a StringIO(X), *NOT* to call file(X) [which would try to open a file named by X].

Numeric types and protocols may need to be an exception to this "adaptation is not casting" mantra, however.


Guido's "Optional Static Typing: Stop the Flames" Blog Entry

A typical simple use case of adaptation would be:

    def f(X):
        X = adapt(X, Y)
        # continue by using X according to protocol Y

In [4], the BDFL has proposed introducing the syntax:

    def f(X: Y):
        # continue by using X according to protocol Y

to be a handy shortcut for exactly this typical use of adapt, and, as a basis for experimentation until the parser has been modified to accept this new syntax, a semantically equivalent decorator:

    @arguments(Y)
    def f(X):
        # continue by using X according to protocol Y

These BDFL ideas are fully compatible with this proposal, as are other of Guido's suggestions in the same blog.


Reference Implementation and Test Cases

The following reference implementation does not deal with classic classes: it considers only new-style classes.  If classic classes need to be supported, the additions should be pretty clear, though a bit messy (x.__class__ vs type(x), getting bound methods directly from the object rather than from the type, and so on).
-----------------------------------------------------------------
adapt.py
-----------------------------------------------------------------

class AdaptationError(TypeError):
    pass

class LiskovViolation(AdaptationError):
    pass

_adapter_factory_registry = {}

def registerAdapterFactory(objtype, protocol, factory):
    _adapter_factory_registry[objtype, protocol] = factory

def unregisterAdapterFactory(objtype, protocol):
    del _adapter_factory_registry[objtype, protocol]

def _adapt_by_registry(obj, protocol, alternate):
    factory = _adapter_factory_registry.get((type(obj), protocol))
    if factory is None:
        adapter = alternate
    else:
        adapter = factory(obj, protocol, alternate)
    if adapter is AdaptationError:
        raise AdaptationError
    else:
        return adapter

def adapt(obj, protocol, alternate=AdaptationError):

    t = type(obj)

    # (a) first check to see if object has the exact protocol
    if t is protocol:
        return obj

    try:
        # (b) next check if t.__conform__ exists & likes protocol
        conform = getattr(t, '__conform__', None)
        if conform is not None:
            result = conform(obj, protocol)
            if result is not None:
                return result

        # (c) then check if protocol.__adapt__ exists & likes obj
        adapt = getattr(type(protocol), '__adapt__', None)
        if adapt is not None:
            result = adapt(protocol, obj)
            if result is not None:
                return result

    except LiskovViolation:
        pass

    else:
        # (d) check if object is instance of protocol
        if isinstance(obj, protocol):
            return obj

    # (e) last chance: try the registry
    return _adapt_by_registry(obj, protocol, alternate)

-----------------------------------------------------------------
test.py
-----------------------------------------------------------------

from adapt import AdaptationError, LiskovViolation, adapt
from adapt import registerAdapterFactory, unregisterAdapterFactory
import doctest

class A(object):
    '''
    >>> a = A()
    >>> a is adapt(a, A)     # case (a)
    True
    '''

class B(A):
    '''
    >>> b = B()
    >>> b is adapt(b, A)     # case (d)
    True
    '''

class C(object):
    '''
    >>> c = C()
    >>> c is adapt(c, B)     # case (b)
    True
    >>> c is adapt(c, A)     # a failure case
    Traceback (most recent call last):
        ...
    AdaptationError
    '''
    def __conform__(self, protocol):
        if protocol is B:
            return self

class D(C):
    '''
    >>> d = D()
    >>> d is adapt(d, D)     # case (a)
    True
    >>> d is adapt(d, C)     # case (d) explicitly blocked
    Traceback (most recent call last):
        ...
    AdaptationError
    '''
    def __conform__(self, protocol):
        if protocol is C:
            raise LiskovViolation

class MetaAdaptingProtocol(type):
    def __adapt__(cls, obj):
        return cls.adapt(obj)

class AdaptingProtocol:
    __metaclass__ = MetaAdaptingProtocol

    @classmethod
    def adapt(cls, obj):
        pass

class E(AdaptingProtocol):
    '''
    >>> a = A()
    >>> a is adapt(a, E)     # case (c)
    True
    >>> b = B()
    >>> b is adapt(b, E)     # case (c)
    True
    >>> c = C()
    >>> c is adapt(c, E)     # a failure case
    Traceback (most recent call last):
        ...
    AdaptationError
    '''
    @classmethod
    def adapt(cls, obj):
        if isinstance(obj, A):
            return obj

class F(object):
    pass

def adapt_F_to_A(obj, protocol, alternate):
    if isinstance(obj, F) and issubclass(protocol, A):
        return obj
    else:
        return alternate

def test_registry():
    '''
    >>> f = F()
    >>> f is adapt(f, A)     # a failure case
    Traceback (most recent call last):
        ...
    AdaptationError
    >>> registerAdapterFactory(F, A, adapt_F_to_A)
    >>> f is adapt(f, A)     # case (e)
    True
    >>> unregisterAdapterFactory(F, A)
    >>> f is adapt(f, A)     # a failure case again
    Traceback (most recent call last):
        ...
    AdaptationError
    >>> registerAdapterFactory(F, A, adapt_F_to_A)
    '''

doctest.testmod()


Relationship To Microsoft's QueryInterface

Although this proposal has some similarities to Microsoft's (COM) QueryInterface, it differs in a number of aspects.

First, adaptation in this proposal is bi-directional, allowing the interface (protocol) to be queried as well, which gives more dynamic abilities (more Pythonic).
Second, there is no special "IUnknown" interface which can be used to check or obtain the original unwrapped object identity, although this could be proposed as one of those "special" blessed interface protocol identifiers.

Third, with QueryInterface, once an object supports a particular interface it must always thereafter support this interface; this proposal makes no such guarantee, since, in particular, adapter factories can be dynamically added to the registry and removed again later.

Fourth, implementations of Microsoft's QueryInterface must support a kind of equivalence relation -- they must be reflexive, symmetrical, and transitive, in specific senses.  The equivalent conditions for protocol adaptation according to this proposal would also represent desirable properties:

    # given, to start with, a successful adaptation:
    X_as_Y = adapt(X, Y)

    # reflexive:
    assert adapt(X_as_Y, Y) is X_as_Y

    # transitive:
    X_as_Z = adapt(X, Z, None)
    X_as_Y_as_Z = adapt(X_as_Y, Z, None)
    assert (X_as_Y_as_Z is None) == (X_as_Z is None)

    # symmetrical:
    X_as_Z_as_Y = adapt(X_as_Z, Y, None)
    assert (X_as_Y_as_Z is None) == (X_as_Z_as_Y is None)

However, while these properties are desirable, it may not be possible to guarantee them in all cases.  QueryInterface can impose their equivalents because it dictates, to some extent, how objects, interfaces, and adapters are to be coded; this proposal is meant to be not necessarily invasive: it should be usable to "retrofit" adaptation between two frameworks coded in mutual ignorance of each other, without having to modify either framework.

Transitivity of adaptation is in fact somewhat controversial, as is the relationship (if any) between adaptation and inheritance.  The latter would not be controversial if we knew that inheritance always implies Liskov substitutability, which, unfortunately, we don't.
If some special form, such as the interfaces proposed in [4], could indeed ensure Liskov substitutability, then for that kind of inheritance, only, we could perhaps assert that if X conforms to Y and Y inherits from Z then X conforms to Z... but only if substitutability was taken in a very strong sense to include semantics and pragmatics, which seems doubtful.  (For what it's worth: in QueryInterface, inheritance does not require nor imply conformance.)

This proposal does not include any "strong" effects of inheritance, beyond the small ones specifically detailed above.

Similarly, transitivity might imply multiple "internal" adaptation passes to get the result of adapt(X, Z) via some intermediate Y, intrinsically like adapt(adapt(X, Y), Z), for some suitable and automatically chosen Y.  Again, this may perhaps be feasible under suitably strong constraints, but the practical implications of such a scheme are still unclear to this proposal's authors.  Thus, this proposal does not include any automatic or implicit transitivity of adaptation, under whatever circumstances.

For an implementation of the original version of this proposal which performs more advanced processing in terms of transitivity, and of the effects of inheritance, see Phillip J. Eby's PyProtocols [5].  The documentation accompanying PyProtocols is well worth studying for its considerations on how adapters should be coded and used, and on how adaptation can remove any need for typechecking in application code.


Questions and Answers

Q: What benefit does this proposal provide?

A: The typical Python programmer is an integrator, someone who is connecting components from various suppliers.  Often, to interface between these components, one needs intermediate adapters.  Usually the burden falls upon the programmer to study the interface exposed by one component and required by another, determine if they are directly compatible, or develop an adapter.
Sometimes a supplier may even include the appropriate adapter, but even then searching for the adapter and figuring out how to deploy it takes time.

This technique enables suppliers to work with each other directly, by implementing __conform__ or __adapt__ as necessary.  This frees the integrator from having to write their own adapters.  In essence, this allows the components to have a simple dialogue among themselves.  The integrator simply connects one component to another, and if the types don't automatically match, an adapting mechanism is built in.

Moreover, thanks to the adapter registry, a "fourth party" may supply adapters to allow interoperation of frameworks which are totally unaware of each other, non-invasively, and without requiring the integrator to do anything more than install the appropriate adapter factories in the registry at start-up.  As long as libraries and frameworks cooperate with the adaptation infrastructure proposed here (essentially by defining and using protocols appropriately, and calling 'adapt' as needed on arguments received and results of call-back factory functions), the integrator's work thereby becomes much simpler.

For example, consider SAX1 and SAX2 interfaces: there is an adapter required to switch between them.  Normally, the programmer must be aware of this; however, with this adaptation proposal in place, this is no longer the case -- indeed, thanks to the adapter registry, this need may be removed even if the framework supplying SAX1 and the one requiring SAX2 are unaware of each other.

Q: Why does this have to be built-in; can't it be standalone?

A: Yes, it does work standalone.  However, if it is built-in, it has a greater chance of usage.  The value of this proposal is primarily in standardization: having libraries and frameworks coming from different suppliers, including the Python standard library, use a single approach to adaptation.  Furthermore:

0. The mechanism is by its very nature a singleton.

1.
If used frequently, it will be much faster as a built-in.

2. It is extensible and unassuming.

3. Once 'adapt' is built-in, it can support syntax extensions and even be of some help to a type inference system.

Q: Why the verbs __conform__ and __adapt__?

A: conform, verb intransitive

   1. To correspond in form or character; be similar.
   2. To act or be in accord or agreement; comply.
   3. To act in accordance with current customs or modes.

   adapt, verb transitive

   1. To make suitable to or fit for a specific use or situation.

   Source: The American Heritage Dictionary of the English Language, Third Edition


Backwards Compatibility

There should be no problem with backwards compatibility unless someone had used the special names __conform__ or __adapt__ in other ways, but this seems unlikely, and, in any case, user code should never use special names for non-standard purposes.  This proposal could be implemented and tested without changes to the interpreter.


Credits

This proposal was created in large part by the feedback of the talented individuals on the main Python mailing lists and the type-sig list.  To name specific contributors (with apologies if we missed anyone!), besides the proposal's authors: the main suggestions for the proposal's first versions came from Paul Prescod, with significant feedback from Robin Thomas, and we also borrowed ideas from Marcin 'Qrczak' Kowalczyk and Carlos Ribeiro.  Other contributors (via comments) include Michel Pelletier, Jeremy Hylton, Aahz Maruch, Fredrik Lundh, Rainer Deyke, Timothy Delaney, and Huaiyu Zhu.  The current version owes a lot to discussions with (among others) Phillip J. Eby, Guido van Rossum, Bruce Eckel, Jim Fulton, and Ka-Ping Yee, and to study and reflection of their proposals, implementations, and documentation about use and adaptation of interfaces and protocols in Python.
References and Footnotes

[1] PEP 245, Python Interface Syntax, Pelletier
    http://www.python.org/peps/pep-0245.html

[2] http://www.zope.org/Wikis/Interfaces/FrontPage

[3] http://www.artima.com/weblogs/index.jsp?blogger=guido

[4] http://www.artima.com/weblogs/viewpost.jsp?thread=87182

[5] http://peak.telecommunity.com/PyProtocols.html


Copyright

This document has been placed in the public domain.


Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End:

From FBatista at uniFON.com.ar Mon Jan 10 16:08:03 2005 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Mon Jan 10 16:10:28 2005 Subject: [Python-Dev] os.removedirs() vs. shutil.rmtree() Message-ID: [Johannes Gijsbers] #- So they're not identical, but I do agree they should be consolidated #- and moved into one module. I'd say shutil, both because the os #- module is already awfully crowded, and because these functions are #- "high-level operations on files and collections of files" rather #- than "a more portable way of using operating system dependent #- functionality [...]". +1. We should keep this "should change this way" idea for when we restructure the std lib. Is there already a wiki somewhere for this? . Facundo Bitácora De Vuelo: http://www.taniquetil.com.ar/plog PyAr - Python Argentina: http://pyar.decode.com.ar/
-------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20050110/e5dfab1a/attachment.htm From barry at python.org Mon Jan 10 16:13:13 2005 From: barry at python.org (Barry Warsaw) Date: Mon Jan 10 16:13:21 2005 Subject: [Python-Dev] Re: Subscribing to PEP updates In-Reply-To: <1105368010.41e293ca97620@mcherm.com> References: <1105368010.41e293ca97620@mcherm.com> Message-ID: <1105369993.29934.4.camel@geddy.wooz.org> On Mon, 2005-01-10 at 09:40, Michael Chermside wrote: > Barry writes: > > As an experiment, I just added a PEP topic to the python-checkins > > mailing list. You could subscribe to this list and just select the PEP > > topic (which matches the regex "PEP" in the Subject header or first few > > lines of the body). > > > > Give it a shot and let's see if that does the trick. > > I just got notification of the change to PEP 246 (and I haven't received > other checkin notifications), so I guess I can report that this is > working. Excellent! > Thanks, Barry. Should we now mention this on c.l.py for others who > may be interested? Sure, I think that would be great. Thanks. -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050110/5b83fbed/attachment.pgp From gvanrossum at gmail.com Mon Jan 10 16:46:39 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon Jan 10 16:46:43 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: References: Message-ID: > I had been promising to rewrite PEP 246 to incorporate the last several > years' worth of discussions &c about it, and Guido's recent "stop the > flames" artima blog post finally pushed me to complete the work. > Feedback is of course welcome, so I thought I had better repost it > here, rather than relying on would-be commenters to get it from CVS... Thanks for doing this, Alex! I yet have to read the whole thing [will attempt do so later today] but the few snippets I caught make me feel this is a big step forward. I'm wondering if someone could do a similar thing for PEP 245, interfaces syntax? Alex hinted that it's a couple of rounds behind the developments in Zope and Twisted. I'm personally not keen on needing *two* new keywords (interface and implements) so I hope that whoever does the rewrite could add a section on the advantages and disadvantages of the 'implements' keyword (my simplistic alternative proposal is to simply include interfaces in the list of bases in the class statement; the metaclass can then sort it out). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Mon Jan 10 18:43:44 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon Jan 10 18:42:32 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: Message-ID: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> At 03:42 PM 1/10/05 +0100, Alex Martelli wrote: > The fourth case above is subtle. 
A break of substitutability can > occur when a subclass changes a method's signature, or restricts > the domains accepted for a method's argument ("co-variance" on > arguments types), or extends the co-domain to include return > values which the base class may never produce ("contra-variance" > on return types). While compliance based on class inheritance > _should_ be automatic, this proposal allows an object to signal > that it is not compliant with a base class protocol. -1 if this introduces a performance penalty to a wide range of adaptations (i.e. those using abstract base classes), just to support people who want to create deliberate Liskov violations. I personally don't think that we should pander to Liskov violators, especially since Guido seems to be saying that there will be some kind of interface objects available in future Pythons. > Just like any other special method in today's Python, __conform__ > is meant to be taken from the object's class, not from the object > itself (for all objects, except instances of "classic classes" as > long as we must still support the latter). This enables a > possible 'tp_conform' slot to be added to Python's type objects in > the future, if desired. One note here: Zope and PEAK sometimes use interfaces that a function or module may implement. PyProtocols' implementation does this by adding a __conform__ object to the function's dictionary so that the function can conform to a particular signature. If and when __conform__ becomes tp_conform, this may not be necessary any more, at least for functions, because there will probably be some way for an interface to tell if the function at least conforms to the appropriate signature. But for modules this will still be an issue. I am not saying we shouldn't have a tp_conform; just suggesting that it may be appropriate for functions and modules (as well as classic classes) to have their tp_conform delegate back to self.__dict__['__conform__'] instead of a null implementation. 
> The object may return itself as the result of __conform__ to > indicate compliance. Alternatively, the object also has the > option of returning a wrapper object compliant with the protocol. > If the object knows it is not compliant although it belongs to a > type which is a subclass of the protocol, then __conform__ should > raise a LiskovViolation exception (a subclass of AdaptationError). > Finally, if the object cannot determine its compliance, it should > return None to enable the remaining mechanisms. If __conform__ > raises any other exception, "adapt" just propagates it. > > To enable the third case, when the protocol knows about the > object, the protocol must have an __adapt__() method. This > optional method takes two arguments: > > - `self', the protocol requested > > - `obj', the object being adapted > > If the protocol finds the object to be compliant, it can return > obj directly. Alternatively, the method may return a wrapper > compliant with the protocol. If the protocol knows the object is > not compliant although it belongs to a type which is a subclass of > the protocol, then __adapt__ should raise a LiskovViolation > exception (a subclass of AdaptationError). Finally, when > compliance cannot be determined, this method should return None to > enable the remaining mechanisms. If __adapt__ raises any other > exception, "adapt" just propagates it. > The fourth case, when the object's class is a sub-class of the > protocol, is handled by the built-in adapt() function. Under > normal circumstances, if "isinstance(object, protocol)" then > adapt() returns the object directly. However, if the object is > not substitutable, either the __conform__() or __adapt__() > methods, as above mentioned, may raise an LiskovViolation (a > subclass of AdaptationError) to prevent this default behavior. I don't see the benefit of LiskovViolation, or of doing the exact type check vs. the loose check. What is the use case for these? 
Is it to allow subclasses to say, "Hey I'm not my superclass?" It's also a bit confusing to say that if the routines "raise any other exceptions" they're propagated. Are you saying that LiskovViolation is *not* propagated? > If none of the first four mechanisms worked, as a last-ditch > attempt, 'adapt' falls back to checking a registry of adapter > factories, indexed by the protocol and the type of `obj', to meet > the fifth case. Adapter factories may be dynamically registered > and removed from that registry to provide "third party adaptation" > of objects and protocols that have no knowledge of each other, in > a way that is not invasive to either the object or the protocols. This should either be fleshed out to a concrete proposal, or dropped. There are many details that would need to be answered, such as whether "type" includes subtypes and whether it really means type or __class__. (Note that isinstance() now uses __class__, allowing proxy objects to lie about their class; the adaptation system should support this too, and both the Zope and PyProtocols interface systems and PyProtocols' generic functions support it.) One other issue: it's not possible to have standalone interoperable PEP 246 implementations using a registry, unless there's a standardized place to put it, and a specification for how it gets there. Otherwise, if someone is using both say Zope and PEAK in the same application, they would have to take care to register adaptations in both places. This is actually a pretty minor issue since in practice both frameworks' interfaces handle adaptation, so there is no *need* for this extra registry in such cases. > Adaptation is NOT "casting". When object X itself does not > conform to protocol Y, adapting X to Y means using some kind of > wrapper object Z, which holds a reference to X, and implements > whatever operation Y requires, mostly by delegating to X in > appropriate ways. 
For example, if X is a string and Y is 'file', > the proper way to adapt X to Y is to make a StringIO(X), *NOT* to > call file(X) [which would try to open a file named by X]. > > Numeric types and protocols may need to be an exception to this > "adaptation is not casting" mantra, however. The issue isn't that adaptation isn't casting; why would casting a string to a file mean that you should open that filename? I don't think that "adaptation isn't casting" is enough to explain appropriate use of adaptation. For example, I think it's quite valid to adapt a filename to a *factory* for opening files, or a string to a "file designator". However, it doesn't make any sense (to me at least) to adapt from a file designator to a file, which IMO is the reason it's wrong to adapt from a string to a file in the way you suggest. However, casting doesn't come into it anywhere that I can see. If I were going to say anything about that case, I'd say that adaptation should not be "lossy"; adapting from a designator to a file loses information like what mode the file should be opened in. (Similarly, I don't see adapting from float to int; if you want a cast to int, cast it.) Or to put it another way, adaptability should imply substitutability: a string may be used as a filename, a filename may be used to designate a file. But a filename cannot be used as a file; that makes no sense. >Reference Implementation and Test Cases > > The following reference implementation does not deal with classic > classes: it consider only new-style classes. If classic classes > need to be supported, the additions should be pretty clear, though > a bit messy (x.__class__ vs type(x), getting boundmethods directly > from the object rather than from the type, and so on). Please base a reference implementation off of either Zope or PyProtocols' field-tested implementations which deal correctly with __class__ vs. 
type(), and can detect whether they're calling a __conform__ or __adapt__ at the wrong metaclass level, etc. Then, if there is a reasonable use case for LiskovViolation and the new type checking rules that justifies adding them, let's do so. > Transitivity of adaptation is in fact somewhat controversial, as > is the relationship (if any) between adaptation and inheritance. The issue is simply this: what is substitutability? If you say that interface B is substitutable for A, and C is substitutable for B, then C *must* be substitutable for A, or we have inadequately defined "substitutability". If adaptation is intended to denote substitutability, then there can be absolutely no question that it is transitive, or else it is not possible to have any meaning for interface inheritance! Thus, the controversies are: 1) whether adaptation should be required to indicate substitutability (and I think that your own presentation of the string->file example supports this), and 2) whether the adaptation system should automatically provide an A when provided with a C. Existing implementations of interfaces for Python all do this where interface C is a subclass of A. However, they differ as to whether *all* adaptation should indicate substitutability. The Zope and Twisted designers believe that adaptation should not be required to imply substitutability, and that only interface and implementation inheritance imply substitutability. (Although, as you point out, the latter is not always the case.) PyProtocols OTOH believes that *all* adaptation must imply substitutability; non-substitutable adaptation or inheritance is a design error: "adaptation abuse", if you will. So, in the PyProtocols view, it would never make sense to define an adaptation from float or decimal to integer that would permit loss of precision. 
If you did define such an adaptation, it must refuse to adapt a float or decimal with a fractional part, since the number would no longer be substitutable if data loss occurred. Of course, this is a separate issue from automatic transitive adaptation, in the sense that even if you agree that adaptation must imply substitutability, you can still disagree as to whether automatically locating a multi-step adaptation is desirable enough to be worth implementing. However, if substitutability is guaranteed, then such multi-step adaptation cannot result in anything "controversial" occurring. > The latter would not be controversial if we knew that inheritance > always implies Liskov substitutability, which, unfortunately we > don't. If some special form, such as the interfaces proposed in > [4], could indeed ensure Liskov substitutability, then for that > kind of inheritance, only, we could perhaps assert that if X > conforms to Y and Y inherits from Z then X conforms to Z... but > only if substitutability was taken in a very strong sense to > include semantics and pragmatics, which seems doubtful. As a practical matter, all of the existing interface systems (Zope, PyProtocols, and even the defunct Twisted implementation) treat interface inheritance as guaranteeing substitutability for the base interface, and do so transitively. However, it seems to me to be a common programming error among people new to interfaces to inherit from an interface when they intend to *require* the base interface's functionality, rather than *offer* the base interface's functionality. It may be worthwhile to address this issue in the design of "standard" interfaces for Python. This educational issue regarding substitutability is I believe inherent to the concept of interfaces, however, and does not go away simply by making non-inheritance adaptation non-transitive in the implementation. 
It may, however, make it take longer for people to encounter the issue, thereby slowing their learning process. ;) >Backwards Compatibility > > There should be no problem with backwards compatibility unless > someone had used the special names __conform__ or __adapt__ in > other ways, but this seems unlikely, and, in any case, user code > should never use special names for non-standard purposes. Production implementations of the old version of PEP 246 exist, so the changes in semantics you've proposed may introduce backward compatibility issues. More specifically, some field code may not work correctly with your proposed reference implementation, in the sense that code that worked with Zope or PyProtocols before, may not work with the reference implementation's adapt(), resulting in failure of adaptation where success occurred before, or in exceptions raised where no exception was raised before. From pje at telecommunity.com Mon Jan 10 18:59:10 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon Jan 10 18:57:57 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> References: Message-ID: <5.1.1.6.0.20050110125415.02f28ec0@mail.telecommunity.com> At 12:43 PM 1/10/05 -0500, Phillip J. Eby wrote: >As a practical matter, all of the existing interface systems (Zope, >PyProtocols, and even the defunct Twisted implementation) treat interface >inheritance as guaranteeing substitutability for the base interface, and >do so transitively. An additional data point, by the way: the Eclipse Java IDE has an adaptation system that works very much like PEP 246 does, and it appears that in a future release they intend to support automatic adapter transitivity, so as to avoid requiring each provider of an interface to "provide O(n^2) adapters when writing the nth version of an interface." 
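The O(n^2)-adapters point can be made concrete with a toy transitive registry (purely illustrative; none of these names come from Eclipse, Zope, or PyProtocols): register only single-step adapter factories, and let a breadth-first search compose the shortest chain.

```python
from collections import deque

_adapters = {}  # (source protocol, target protocol) -> adapter factory

def register(src, dst, factory):
    _adapters[(src, dst)] = factory

def adapt_transitively(obj, src, dst):
    # Breadth-first search finds the shortest adapter chain, so a
    # provider needs only O(n) single-step adapters per interface
    # rather than O(n^2) direct ones.
    seen = {src}
    queue = deque([(src, obj)])
    while queue:
        proto, value = queue.popleft()
        if proto is dst:
            return value
        for (s, d), factory in _adapters.items():
            if s is proto and d not in seen:
                seen.add(d)
                queue.append((d, factory(value)))
    raise TypeError('no adapter chain from %r to %r' % (src, dst))
```

Whether such automatic chaining is desirable at all is, of course, exactly the transitivity controversy under discussion in this thread.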
IOW, their current release is transitive only for interface inheritance ala Zope or Twisted; their future release will be transitive for adapter chains ala PyProtocols. From cce at clarkevans.com Mon Jan 10 19:19:22 2005 From: cce at clarkevans.com (Clark C. Evans) Date: Mon Jan 10 19:19:25 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> References: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> Message-ID: <20050110181922.GC47082@prometheusresearch.com> Alex, This is wonderful work, thank you for keeping the ball in the air; I'm honored to keep my name as a co-author -- kinda like a free lunch. Phillip, Once again, thank you! Without PyProtocols and your advocacy, this proposal might have been buried in the historical bit-bucket. On Mon, Jan 10, 2005 at 12:43:44PM -0500, Phillip J. Eby wrote: | -1 if this introduces a performance penalty to a wide range of | adaptations (i.e. those using abstract base classes), just to support | people who want to create deliberate Liskov violations. I personally | don't think that we should pander to Liskov violators, especially since | Guido seems to be saying that there will be some kind of interface | objects available in future Pythons. I particularly like Alex's Liskov violation error; although it is not hugely common, it does happen, and there should be a way for a class to indicate that it's only being used for implementation. Perhaps... if the class doesn't have a __conform__ method, then its adaptation is automatic (that is, only the class can raise this case). The rationale for only enabling one of the two paths is that the base class would have been in-place before the derived class was created; therefore, it is highly unlikely that __adapt__ would ever be of help. Therefore, there might be a performance penalty, but it'd be really small, simply checking to see if the slot is filled in. 
Best, Clark From pje at telecommunity.com Mon Jan 10 19:34:59 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon Jan 10 19:33:48 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <20050110181922.GC47082@prometheusresearch.com> References: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050110132407.0344a9d0@mail.telecommunity.com> At 01:19 PM 1/10/05 -0500, Clark C. Evans wrote: >Alex, > > This is wonderful work, thank you for keeping the ball in the air; > I'm honored to keep my name as a co-author -- kinda like a free lunch. > >Phillip, > > Once again, thank you! Without PyProtocols and your advocacy, > this proposal might have been buried in the historical bit-bucket. > >On Mon, Jan 10, 2005 at 12:43:44PM -0500, Phillip J. Eby wrote: >| -1 if this introduces a performance penalty to a wide range of >| adaptations (i.e. those using abstract base classes), just to support >| people who want to create deliberate Liskov violations. I personally >| don't think that we should pander to Liskov violators, especially since >| Guido seems to be saying that there will be some kind of interface >| objects available in future Pythons. > >I particularly like Alex's Liskov violation error; although it is >not hugely common, it does happen, and there should be a way for a >class to indicate that it's only being used for implementation. > >Perhaps... if the class doesn't have a __conform__ method, then its >adaptation is automatic (that is, only the class can raise this >case). The rationale for only enabling one of the two paths is that >the base class would have been in-place before the derived class was >created; therefore, it is highly unlikely that __adapt__ would ever >be of help. Therefore, there might be a performance penalty, but it'd >be really small, simply checking to see if the slot is filled in. 
The performance penalty I was talking about was for using an abstract base class, in a subclass with a __conform__ method for conformance to other protocols. In this case, __conform__ will be uselessly called every time the object is adapted to the abstract base class. IMO it's more desirable to support abstract base classes than to allow classes to "opt out" of inheritance when testing conformance to a base class. If you don't have an "is-a" relationship to your base class, you should be using delegation, not inheritance. (E.g. 'set' has-a 'dict', not 'set' is-a 'dict', so 'adapt(set,dict)' should fail, at least on the basis of isinstance checking.) The other problem with a Liskov opt-out is that you have to explicitly do a fair amount of work to create a LiskovViolation-raising subclass; that work would be better spent migrating to use delegation instead of inheritance, which would also be cleaner and more comprehensible code than writing a __conform__ hack to announce your bad style in having chosen to use inheritance where delegation is more appropriate. ;) This latter problem is actually much worse than the performance issue, which was just my initial impression. Now that I've thought about it some more, I think I'm against supporting Liskov violations even if it were somehow *faster*. :) From aleax at aleax.it Mon Jan 10 19:42:11 2005 From: aleax at aleax.it (Alex Martelli) Date: Mon Jan 10 19:42:16 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> References: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> Message-ID: <5691D454-6337-11D9-ADA4-000A95EFAE9E@aleax.it> On 2005 Jan 10, at 18:43, Phillip J. Eby wrote: ... > At 03:42 PM 1/10/05 +0100, Alex Martelli wrote: >> The fourth case above is subtle. 
A break of substitutability can >> occur when a subclass changes a method's signature, or restricts >> the domains accepted for a method's argument ("co-variance" on >> arguments types), or extends the co-domain to include return >> values which the base class may never produce ("contra-variance" >> on return types). While compliance based on class inheritance >> _should_ be automatic, this proposal allows an object to signal >> that it is not compliant with a base class protocol. > > -1 if this introduces a performance penalty to a wide range of > adaptations (i.e. those using abstract base classes), just to support > people who want to create deliberate Liskov violations. I personally > don't think that we should pander to Liskov violators, especially > since Guido seems to be saying that there will be some kind of > interface objects available in future Pythons. If interfaces can ensure against Liskov violations in instances of their subclasses, then they can follow the "case (a)" fast path, sure. Inheriting from an interface (in Guido's current proposal, as per his Artima blog) is a serious commitment from the inheritor's part; inheriting from an ordinary type, in real-world current practice, need not be -- too many cases of assumed covariance, for example, are around in the wild, to leave NO recourse in such cases and just assume compliance. >> Just like any other special method in today's Python, __conform__ >> is meant to be taken from the object's class, not from the object >> itself (for all objects, except instances of "classic classes" as >> long as we must still support the latter). This enables a >> possible 'tp_conform' slot to be added to Python's type objects in >> the future, if desired. > > One note here: Zope and PEAK sometimes use interfaces that a function > or module may implement. PyProtocols' implementation does this by > adding a __conform__ object to the function's dictionary so that the > function can conform to a particular signature. 
If and when > __conform__ becomes tp_conform, this may not be necessary any more, at > least for functions, because there will probably be some way for an > interface to tell if the function at least conforms to the appropriate > signature. But for modules this will still be an issue. > > I am not saying we shouldn't have a tp_conform; just suggesting that > it may be appropriate for functions and modules (as well as classic > classes) to have their tp_conform delegate back to > self.__dict__['__conform__'] instead of a null implementation. I have not considered conformance of such objects as functions or modules; if that is important, I need to add it to the reference implementation in the PEP. I'm reluctant to just get __conform__ from the object, though; it leads to all sort of issues with a *class* conforming vs its *instances*, etc. Maybe Guido can Pronounce a little on this sub-issue... > I don't see the benefit of LiskovViolation, or of doing the exact type > check vs. the loose check. What is the use case for these? Is it to > allow subclasses to say, "Hey I'm not my superclass?" It's also a bit > confusing to say that if the routines "raise any other exceptions" > they're propagated. Are you saying that LiskovViolation is *not* > propagated? Indeed I am -- I thought that was very clearly expressed! LiskovViolation means to skip the loose isinstance check, but it STILL allows explicitly registered adapter factories a chance (if somebody registers such an adapter factory, presumably they've coded a suitable adapter object type to deal with some deuced Liskov violation, see...). On the other hand, if some random exception occurs in __conform__ or __adapt__, that's a bug somewhere, so the exception propagates in order to help debugging. The previous version treated TypeError specially, but I think (on the basis of just playing around a bit, admittedly) that offers no real added value and sometimes will hide bugs. 
>> If none of the first four mechanisms worked, as a last-ditch >> attempt, 'adapt' falls back to checking a registry of adapter >> factories, indexed by the protocol and the type of `obj', to meet >> the fifth case. Adapter factories may be dynamically registered >> and removed from that registry to provide "third party adaptation" >> of objects and protocols that have no knowledge of each other, in >> a way that is not invasive to either the object or the protocols. > > This should either be fleshed out to a concrete proposal, or dropped. > There are many details that would need to be answered, such as whether > "type" includes subtypes and whether it really means type or > __class__. (Note that isinstance() now uses __class__, allowing proxy > objects to lie about their class; the adaptation system should support > this too, and both the Zope and PyProtocols interface systems and > PyProtocols' generic functions support it.) I disagree: I think the strawman-level proposal as fleshed out in the pep's reference implementation is far better than nothing. I mention the issue of subtypes explicitly later, including why the pep does NOT do anything special with them -- the reference implementation deals with specific types. And I use type(X) consistently, explicitly mentioning in the reference implementation that old-style classes are not covered. I didn't know about the "let the object lie" quirk in isinstance. If that quirk is indeed an intended design feature, rather than an implementation 'oops', it might perhaps be worth documenting it more clearly; I do not find that clearly spelled out in the place I'd expect it to be, namely under 'isinstance'. If the "let the object lie" quirk is indeed a designed-in feature, then, I agree, using x.__class__ rather than type(x) is mandatory in the PEP and its reference implementation; however, I'll wait for confirmation of design intent before I change the PEP accordingly. 
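The quirk in question is easy to demonstrate (an illustrative sketch; `Proxy` and `Real` are made-up names): isinstance() consults __class__, so a proxy can claim its target's class, while type() still reports the truth.

```python
class Real:
    pass

class Proxy:
    def __init__(self, target):
        self._target = target

    @property
    def __class__(self):
        # Lie about our class: report the target's class instead.
        return type(self._target)

p = Proxy(Real())
assert isinstance(p, Real)   # isinstance() believes the lie
assert type(p) is Proxy      # type() does not
```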
> One other issue: it's not possible to have standalone interoperable > PEP 246 implementations using a registry, unless there's a > standardized place to put it, and a specification for how it gets > there. Otherwise, if someone is using both say Zope and PEAK in the > same application, they would have to take care to register adaptations > in both places. This is actually a pretty minor issue since in > practice both frameworks' interfaces handle adaptation, so there is no > *need* for this extra registry in such cases. I'm not sure I understand this issue, so I'm sure glad it's "pretty minor". >> Adaptation is NOT "casting". When object X itself does not >> conform to protocol Y, adapting X to Y means using some kind of >> wrapper object Z, which holds a reference to X, and implements >> whatever operation Y requires, mostly by delegating to X in >> appropriate ways. For example, if X is a string and Y is 'file', >> the proper way to adapt X to Y is to make a StringIO(X), *NOT* to >> call file(X) [which would try to open a file named by X]. >> >> Numeric types and protocols may need to be an exception to this >> "adaptation is not casting" mantra, however. > > The issue isn't that adaptation isn't casting; why would casting a > string to a file mean that you should open that filename? Because, in most contexts, "casting" object X to type Y means calling Y(X). > I don't think that "adaptation isn't casting" is enough to explain > appropriate use of adaptation. For example, I think it's quite valid > to adapt a filename to a *factory* for opening files, or a string to a > "file designator". However, it doesn't make any sense (to me at > least) to adapt from a file designator to a file, which IMO is the > reason it's wrong to adapt from a string to a file in the way you > suggest. However, casting doesn't come into it > anywhere that I can see. Maybe we're using different definitions of "casting"? 
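For the string-to-file example itself the difference is easy to show concretely (a sketch using io.StringIO, where the old StringIO class now lives; this code is not from the PEP):

```python
from io import StringIO

s = 'hello\nworld\n'

# Adaptation: wrap the string so its *contents* behave like a file.
f = StringIO(s)
assert f.read() == 'hello\nworld\n'

# "Casting" in the Y(X) sense would instead be open(s): the string is
# then treated as the *name* of a file on disk -- a designator, not a
# substitute -- which is the different meaning being debated here.
```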
> If I were going to say anything about that case, I'd say that > adaptation should not be "lossy"; adapting from a designator to a file > loses information like what mode the file should be opened in. > (Similarly, I don't see adapting from float to int; if you want a cast > to int, cast it.) Or to put it another way, adaptability should imply > substitutability: a string may be used as a filename, a filename may > be used to designate a file. But a filename cannot be used as a file; > that makes no sense. I don't understand this "other way" -- nor, to be honest, what you "would say" earlier, either. I think it's pretty normal for adaptation to be "lossy" -- to rely on some but not all of the information in the original object: that's the "facade" design pattern, after all. It doesn't mean that some info in the original object is lost forever, since the original object need not be altered; it just means that not ALL of the info that's in the original object is used in the adapter -- and what's wrong with that?! For example, say that I have some immutable "record" types. One, type Person, defined in some framework X, has a huge lot of immutable data fields, including firstName, middleName, lastName, and many, many others. Another, type Employee, defined in some separate framework Y (which has no knowledge of X, and vice versa), has fewer data fields, and in particular one called 'fullName' which is supposed to be a string such as 'Firstname M. Lastname'. I would like to register an adapter factory from type Person to protocol Employee. Since we said Person has many more data fields, adaptation will be "lossy" -- it will look upon Employee essentially as a "facade" (a simplified interface) for Person. Given the immutability, we MIGHT as well 'cast' here...:

    def adapt_Person_to_Employee(person, protocol, alternate):
        assert issubclass(protocol, Y.Employee)
        return protocol(fullName='%s %s. %s' % (
            person.firstName, person.middleName[0], person.lastName), ... 
although the canonical approach would be to make a wrapper:

    class adapt_Person_to_Employee(object):
        def __init__(self, person, protocol, alternate):
            assert issubclass(protocol, Y.Employee)
            self.p = person
        def getFullName(self):
            return '%s %s. %s' % (
                self.p.firstName, self.p.middleName[0], self.p.lastName)
        fullName = property(getFullName)

which would be more general (work fine even for a mutable Person). So, can you please explain your objections to what I said about adapting vs casting in terms of this example? Do you think the example, or some variation thereof, should go in the PEP? >> Reference Implementation and Test Cases >> >> The following reference implementation does not deal with classic >> classes: it consider only new-style classes. If classic classes >> need to be supported, the additions should be pretty clear, though >> a bit messy (x.__class__ vs type(x), getting boundmethods directly >> from the object rather than from the type, and so on). > > Please base a reference implementation off of either Zope or > PyProtocols' field-tested implementations which deal correctly with > __class__ vs. type(), and can detect whether they're calling a > __conform__ or __adapt__ at the wrong metaclass level, etc. Then, if > there is a reasonable use case for LiskovViolation and the new type > checking rules that justifies adding them, let's do so. I think that if a PEP includes a reference implementation, it should be self-contained rather than require some other huge package. If you can critique specific problems in the reference implementation, I'll be very grateful and eager to correct them. >> Transitivity of adaptation is in fact somewhat controversial, as >> is the relationship (if any) between adaptation and inheritance. > > The issue is simply this: what is substitutability? If you say that > interface B is substitutable for A, and C is substitutable for B, then > C *must* be substitutable for A, or we have inadequately defined > "substitutability". 
Not necessarily, depending on the pragmatics involved. > If adaptation is intended to denote substitutability, then there can > be absolutely no question that it is transitive, or else it is not > possible to have any meaning for interface inheritance! If interface inheritance is intended to express ensured substitutability (finessing pragmatics), fine. I'm not willing to commit to that meaning in the PEP. Dinnertime -- I'd better send this already-long answer, and deal with the highly controversial remaining issues later. Thanks, BTW, for your highly detailed feedback. Alex From mwh at python.net Mon Jan 10 19:53:33 2005 From: mwh at python.net (Michael Hudson) Date: Mon Jan 10 19:53:35 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5691D454-6337-11D9-ADA4-000A95EFAE9E@aleax.it> (Alex Martelli's message of "Mon, 10 Jan 2005 19:42:11 +0100") References: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5691D454-6337-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: <2moefxf74i.fsf@starship.python.net> Alex Martelli writes: > I didn't know about the "let the object lie" quirk in isinstance. If > that quirk is indeed an intended design feature, rather than an > implementation 'oops', it might perhaps be worth documenting it more > clearly; I do not find that clearly spelled out in the place I'd > expect it to be, namely > under 'isinstance'. Were you not at the PyPy sprint where bugs in some __getattr__ method caused infinite recursions in isinstance's code attempting to access __class__? The isinstance code then silently eats the error, so we had (a) a massive slowdown and (b) isinstance failing in an "impossible" way. A clue was that if you ran the code on OS X with its silly default stack limits the code dumped core instead of going slowly insane. This is one quirk I'm not likely to forget in a hurry... Cheers, mwh -- If trees could scream, would we be so cavalier about cutting them down? 
We might, if they screamed all the time, for no good reason. -- Jack Handey From cce at clarkevans.com Mon Jan 10 20:27:23 2005 From: cce at clarkevans.com (Clark C. Evans) Date: Mon Jan 10 20:27:26 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5.1.1.6.0.20050110132407.0344a9d0@mail.telecommunity.com> References: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110132407.0344a9d0@mail.telecommunity.com> Message-ID: <20050110192723.GA94340@prometheusresearch.com> On Mon, Jan 10, 2005 at 01:34:59PM -0500, Phillip J. Eby wrote: | The performance penalty I was talking about was for using an abstract | base class, in a subclass with a __conform__ method for conformance to | other protocols. In this case, __conform__ will be uselessly called | every time the object is adapted to the abstract base class. *nod* If this proposal were "packaged" with an "interface" mechanism, would this address your concern? In this scenario, there are two cases: - Older classes will most likely not have a __conform__ method. - Newer classes will use the 'interface' mechanism. In this scenario, there isn't a performance penalty for the usual case; and for migration purposes, a flag could be added to disable the checking. 
Best, Clark From michel at dialnetwork.com Tue Jan 11 01:16:04 2005 From: michel at dialnetwork.com (Michel Pelletier) Date: Mon Jan 10 22:46:41 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <20050110175818.429661E4002@bag.python.org> References: <20050110175818.429661E4002@bag.python.org> Message-ID: <200501101616.05123.michel@dialnetwork.com> On Monday 10 January 2005 09:58 am, python-dev-request@python.org wrote: > Message: 3 > Date: Mon, 10 Jan 2005 07:46:39 -0800 > From: Guido van Rossum > Subject: Re: [Python-Dev] PEP 246, redux > To: Alex Martelli > Cc: "Clark C.Evans" , Python Dev > > Message-ID: > Content-Type: text/plain; charset=US-ASCII > > > I had been promising to rewrite PEP 246 to incorporate the last several > > years' worth of discussions &c about it, and Guido's recent "stop the > > flames" artima blog post finally pushed me to complete the work. > > Feedback is of course welcome, so I thought I had better repost it > > here, rather than relying on would-be commenters to get it from CVS... > > Thanks for doing this, Alex! I yet have to read the whole thing [will > attempt do so later today] but the few snippets I caught make me feel > this is a big step forward. Me too! I didn't realize the first time 246 came around how important adaptation was and how interfaces just aren't as useful without it. > > I'm wondering if someone could do a similar thing for PEP 245, > interfaces syntax? Alex hinted that it's a couple of rounds behind the > developments in Zope and Twisted. Nothing implements 245, which is just about the syntax. I intended to write another PEP describing an implementation (at the time, Jim's original straw-man), which I'm glad I didn't do as it would have been a waste of time. Had I written that document, then it would be a couple of rounds behind Zope and Twisted. But as it stands now nothing need be based on 245. 
> I'm personally not keen on needing *two* new keywords (interface and
> implements) so I hope that whoever does the rewrite could add a section
> on the advantages and disadvantages of the 'implements' keyword (my
> simplistic alternative proposal is to simply include interfaces in the
> list of bases in the class statement; the metaclass can then sort it
> out).

I like implements, but any spelling works for me.  "implements" strikes
me as an elegant counterpart to "interface" and risks minimal breakage.
Can we still import and say "implements()" for b/w compatibility, and
for those of us who do want an explicit statement like that?

-Michel

From m.bless at gmx.de Mon Jan 10 23:04:25 2005
From: m.bless at gmx.de (Martin Bless)
Date: Mon Jan 10 23:04:11 2005
Subject: [Python-Dev] Re: csv module TODO list
References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> <3o50u0tmv1hbpt71jkre94n32q38cdpbdb@4ax.com> <20050109233717.8A3C33C8E5@coffee.object-craft.com.au>
Message-ID: <0ns5u0t0jihihultsjulm5nicosb02c6uj@4ax.com>

On Mon, 10 Jan 2005 10:37:17 +1100, Andrew McNamara wrote:

>>csv.join(aList, e[, dialect='excel'[, fmtparam]]) -> str object

Oops, should have been

csv.join(aList [, dialect='excel'[, fmtparam]]) -> str object

>Yes, it's feasible,

Good!

>although newlines can be embedded in within fields
>of a CSV record, hence the use of the iterator, rather than working with
>strings.

In my use cases newlines usually don't come into play.  It would be ok
for me if they were treated as any other char.

> In your example above, if the parser gets to the end of the
>string and finds it's still within a field, I'd propose just raising
>an exception.

Yes, that seems to be "the right answer".

>No promises, however - I only have a finite amount of time to work on
>this at the moment.

Sure!  To my mind, these "intelligent split and join" functions most
naturally should be string methods.
I can see that - considering the conceivable variety of dialects - this
can't be done.  One more reason to have 'split' and 'join' available
from the csv module!

mb - Martin

From pje at telecommunity.com Mon Jan 10 23:12:40 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon Jan 10 23:11:28 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <200501101616.05123.michel@dialnetwork.com>
References: <20050110175818.429661E4002@bag.python.org> <20050110175818.429661E4002@bag.python.org>
Message-ID: <5.1.1.6.0.20050110170408.039f12e0@mail.telecommunity.com>

At 04:16 PM 1/10/05 -0800, Michel Pelletier wrote:
> > From: Guido van Rossum
> > Subject: Re: [Python-Dev] PEP 246, redux
> >
> > I'm wondering if someone could do a similar thing for PEP 245,
> > interfaces syntax?  Alex hinted that it's a couple of rounds behind the
> > developments in Zope and Twisted.
>
>Nothing implements 245, which is just about the syntax,

The comment Guido's alluding to was mine; I was referring to PEP 245's
use of '__implements__', and the difference between what a "class
implements" and an "instance provides".  Twisted and Zope's early
implementations just looked for ob.__implements__, which leads to issues
with distinguishing what a "class provides" from what its "instances
provide".

So, I was specifically saying that this aspect of PEP 245 (and Guido's
basing a Python interface implementation thereon) should be re-examined
in the light of current practices that avoid this issue.  (I don't
actually know what Zope currently does; it was changed after I had moved
to using PyProtocols.  But the PyProtocols test suite tests that Zope
does in fact have correct behavior for instances versus classes, because
it's needed to exercise the PyProtocols-Zope interop tests.)

>I like implements, but any spelling works for me. "implements" strikes me as
>an elegant counterpart to "interface" and risks minimal breakage.
>Can we still import and say "implements()" for b/w compatibility and for
>those of us who do want an explicit statement like that?

If I understand Guido's proposal correctly, it should be possible to
make a backward-compatible 'implements()' declaration function.  Maybe
not *easy*, but certainly possible.

From theller at python.net Mon Jan 10 23:15:38 2005
From: theller at python.net (Thomas Heller)
Date: Mon Jan 10 23:14:16 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: (Alex Martelli's message of "Mon, 10 Jan 2005 15:42:11 +0100")
References:
Message-ID:

Alex Martelli writes:

> PEP: 246
> Title: Object Adaptation

Minor nit (or not?): You could provide a pointer to the Liskov
substitution principle, for those readers who aren't too familiar with
that term.

Besides, the text mentions three times that LiskovViolation is a
subclass of AdaptationError (plus once in the ref impl section).

Thomas

From pje at telecommunity.com Mon Jan 10 23:19:02 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon Jan 10 23:17:50 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <20050110192723.GA94340@prometheusresearch.com>
References: <5.1.1.6.0.20050110132407.0344a9d0@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110132407.0344a9d0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050110171247.039f35d0@mail.telecommunity.com>

At 02:27 PM 1/10/05 -0500, Clark C. Evans wrote:
>If this proposal was "packaged" with an "interface" mechanism, would
>this address your concern?  In this scenario, there are two cases:
>
> - Older classes will most likely not have a __conform__ method.
> - Newer classes will use the 'interface' mechanism.
>
>In this scenario, there isn't a performance penalty for the
>usual case; and for migration purposes, a flag could be added
>to disable the checking.
As I said, after more thought, I'm actually less concerned about the performance than I am about even remotely encouraging the combination of Liskov violation *and* concrete adaptation targets. But, if "after the dust settles" it turns out this is going to be supported after all, then we can worry about the performance if need be. Note, however, that your statements actually support the idea of *not* adding a special case for Liskov violators. If newer code uses interfaces, the Liskov-violation mechanism is useless. If older code doesn't have __conform__, it cannot possibly *use* the Liskov-violation mechanism. So, if neither old code nor new code will use the mechanism, why have it? :) From pje at telecommunity.com Mon Jan 10 22:38:55 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon Jan 10 23:28:38 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5691D454-6337-11D9-ADA4-000A95EFAE9E@aleax.it> References: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com> At 07:42 PM 1/10/05 +0100, Alex Martelli wrote: >On 2005 Jan 10, at 18:43, Phillip J. Eby wrote: > ... >>At 03:42 PM 1/10/05 +0100, Alex Martelli wrote: >>> The fourth case above is subtle. A break of substitutability can >>> occur when a subclass changes a method's signature, or restricts >>> the domains accepted for a method's argument ("co-variance" on >>> arguments types), or extends the co-domain to include return >>> values which the base class may never produce ("contra-variance" >>> on return types). While compliance based on class inheritance >>> _should_ be automatic, this proposal allows an object to signal >>> that it is not compliant with a base class protocol. >> >>-1 if this introduces a performance penalty to a wide range of >>adaptations (i.e. 
those using abstract base classes), just to support >>people who want to create deliberate Liskov violations. I personally >>don't think that we should pander to Liskov violators, especially since >>Guido seems to be saying that there will be some kind of interface >>objects available in future Pythons. > >If interfaces can ensure against Liskov violations in instances of their >subclasses, then they can follow the "case (a)" fast path, sure. >Inheriting from an interface (in Guido's current proposal, as per his >Artima blog) is a serious commitment from the inheritor's part; inheriting >from an ordinary type, in real-world current practice, need not be -- too >many cases of assumed covariance, for example, are around in the wild, to >leave NO recourse in such cases and just assume compliance. I understand that, sure. But I don't understand why we should add complexity to PEP 246 to support not one but *two* bad practices: 1) implementing Liskov violations and 2) adapting to concrete classes. It is only if you are doing *both* of these that this extra feature is needed. If it were to support some kind of backward compatibility, that would be understandable. However, in practice, I don't know of anybody using adapt(x,ConcreteClass), and even if they did, the person subclassing ConcreteClass will need to change their subclass to raise LiskovViolation, so why not just switch to delegation? Anyway, it seems to me a bad idea to add complexity to support this case. Do you have a more specific example of a situation in which a Liskov violation coupled to concrete class adaptation is a good idea? Or am I missing something here? >>I am not saying we shouldn't have a tp_conform; just suggesting that it >>may be appropriate for functions and modules (as well as classic classes) >>to have their tp_conform delegate back to self.__dict__['__conform__'] >>instead of a null implementation. 
> >I have not considered conformance of such objects as functions or modules; >if that is important, It's used in at least Zope and PEAK; I don't know if it's in use in Twisted. > I need to add it to the reference implementation in the PEP. I'm > reluctant to just get __conform__ from the object, though; it leads to > all sort of issues with a *class* conforming vs its *instances*, > etc. Maybe Guido can Pronounce a little on this sub-issue... Actually, if you looked at the field-tested implementations of the old PEP 246, they actually have code that deals with this issue effectively, by recognizing TypeError when raised by attempting to invoke __adapt__ or __conform__ with the wrong number of arguments or argument types. (The traceback for such errors does not include a frame for the called method, versus a TypeError raised *within* the function, which does have such a frame. AFAIK, this technique should be compatible with any Python implementation that has traceback objects and does signature validation in its "native" code rather than in a new Python frame.) >>I don't see the benefit of LiskovViolation, or of doing the exact type >>check vs. the loose check. What is the use case for these? Is it to >>allow subclasses to say, "Hey I'm not my superclass?" It's also a bit >>confusing to say that if the routines "raise any other exceptions" >>they're propagated. Are you saying that LiskovViolation is *not* propagated? > >Indeed I am -- I thought that was very clearly expressed! The PEP just said that it would be raised by __conform__ or __adapt__, not that it would be caught by adapt() or that it would be used to control the behavior in that way. Re-reading, I see that you do mention it much farther down. But at the point where __conform__ and __adapt__ are explained, it has not been explained that adapt() should catch the error or do anything special with it. It is simply implied by the "to prevent this default behavior" at the end of the section. 
If this approach is accepted, the description should be made explicit,
because for me at least it required a retroactive re-interpretation of
the earlier part of the spec.

>The previous version treated TypeError specially, but I think (on the
>basis of just playing around a bit, admittedly) that offers no real added
>value and sometimes will hide bugs.

See http://peak.telecommunity.com/protocol_ref/node9.html for an
analysis of the old PEP 246 TypeError behavior, and the changes made by
PyProtocols and Zope to deal with the situation better, while still
respecting the fact that __conform__ and __adapt__ may be retrieved from
the wrong "meta level" of descriptor.

Your new proposal does not actually fix this problem in the absence of
tp_conform/tp_adapt slots; it merely substitutes possible confusion at
the metaclass/class level for confusion at the class/instance level.
The only way to actually fix this is to detect when you have called the
wrong level, and that is what the PyProtocols and Zope implementations
of "old PEP 246" do.  (PyProtocols also introduces a special descriptor
for methods defined on metaclasses, to help avoid creating this possible
confusion in the first place, but that is a separate issue.)

>>> If none of the first four mechanisms worked, as a last-ditch
>>> attempt, 'adapt' falls back to checking a registry of adapter
>>> factories, indexed by the protocol and the type of `obj', to meet
>>> the fifth case.  Adapter factories may be dynamically registered
>>> and removed from that registry to provide "third party adaptation"
>>> of objects and protocols that have no knowledge of each other, in
>>> a way that is not invasive to either the object or the protocols.

>>This should either be fleshed out to a concrete proposal, or dropped.
>>There are many details that would need to be answered, such as whether
>>"type" includes subtypes and whether it really means type or
>>__class__.
(Note that isinstance() now uses __class__, allowing proxy >>objects to lie about their class; the adaptation system should support >>this too, and both the Zope and PyProtocols interface systems and >>PyProtocols' generic functions support it.) > >I disagree: I think the strawman-level proposal as fleshed out in the >pep's reference implementation is far better than nothing. I'm not proposing to flesh out the functionality, just the specification; it should not be necessary to read the reference implementation and try to infer intent from it. What part is implementation accident, and what is supposed to be the specification? That's all I'm talking about here. As currently written, the proposal is just, "we should have a registry", and is not precise enough to allow someone to implement it based strictly on the specification. > I mention the issue of subtypes explicitly later, including why the pep > does NOT do anything special with them -- the reference implementation > deals with specific types. And I use type(X) consistently, explicitly > mentioning in the reference implementation that old-style classes are not > covered. As a practical matter, classic classes exist and are useful, and PEP 246 implementations already exist that work with them. Dropping that functionality is a major step backward for PEP 246, IMO. >I didn't know about the "let the object lie" quirk in isinstance. If that >quirk is indeed an intended design feature, It is; it's in one of the "what's new" feature highlights for either 2.3 or 2.4, I forget which. It was intended to allow proxy objects (like security proxies in Zope 3) to pretend to be an instance of the class they are proxying. >>One other issue: it's not possible to have standalone interoperable PEP >>246 implementations using a registry, unless there's a standardized place >>to put it, and a specification for how it gets there. 
Otherwise, if >>someone is using both say Zope and PEAK in the same application, they >>would have to take care to register adaptations in both places. This is >>actually a pretty minor issue since in practice both frameworks' >>interfaces handle adaptation, so there is no *need* for this extra >>registry in such cases. > >I'm not sure I understand this issue, so I'm sure glad it's "pretty minor". All I was saying is that if you have two 'adapt()' implementations, each using its own registry, you have a possible interoperability problem. Two 'adapt()' implementations that conform strictly to PEP 246 *without* a registry are interoperable because their behavior is the same. As a practical matter, all this means is that standalone PEP 246 implementations for older versions of Python either shouldn't implement a registry, or they need to have a standard place to put it that they can share with each other, and it needs to be implemented the same way. (This is one reason I think the registry specification should be more formal; it may be necessary for existing PEP 246 implementations to be forward-compatible with the spec as implemented in later Python versions.) >>The issue isn't that adaptation isn't casting; why would casting a string >>to a file mean that you should open that filename? > >Because, in most contexts, "casting" object X to type Y means calling Y(X). Ah; I had not seen that called "casting" in Python, at least not to my immediate recollection. However, if that is what you mean, then why not say it? :) >Maybe we're using different definitions of "casting"? I'm most accustomed to the C and Java definitions of casting, so that's probably why I can't see how it relates at all. :) >>If I were going to say anything about that case, I'd say that adaptation >>should not be "lossy"; adapting from a designator to a file loses >>information like what mode the file should be opened in. 
>>(Similarly, I don't see adapting from float to int; if you want a cast to >>int, cast it.) Or to put it another way, adaptability should imply >>substitutability: a string may be used as a filename, a filename may be >>used to designate a file. But a filename cannot be used as a file; that >>makes no sense. > >I don't understand this "other way" -- nor, to be honest, what you "would >say" earlier, either. I think it's pretty normal for adaptation to be >"lossy" -- to rely on some but not all of the information in the original >object: that's the "facade" design pattern, after all. It doesn't mean >that some info in the original object is lost forever, since the original >object need not be altered; it just means that not ALL of the info that's >in the original object used in the adapter -- and, what's wrong with that?! I think we're using different definitions of "lossy", too. I mean that defining an adaptation relationship between two types when there is more than one "sensible" way to get from one to the other is "lossy" of semantics/user choice. If I have a file designator (such as a filename), I can choose how to open it. If I adapt directly from string to file by way of filename, I lose this choice (it is "lossy" adaptation). Here's a better way of phrasing it (I hope): adaptation should be unambiguous. There should only be one sensible way to interpret a thing as implementing a particular interface, otherwise, adaptation itself has no meaning. Whether an adaptation adds or subtracts behavior, it does not really change the underlying *intended* meaning of a thing, or else it is not really adaptation. Adapting 12.0 to 12 does not change the meaning of the value, but adapting from 12.1 to 12 does. Does that make more sense? I think that some people start using adaptation and want to use it for all kinds of crazy things because it seems cool. 
However, it takes a while to see that adaptation is just about removing
unnecessary accidents-of-incompatibility; it's not a license to
transform arbitrary things into arbitrary things.  There has to be some
*meaning* to a particular adaptation, or the whole concept rapidly
degenerates into an undifferentiated mess.

(Or else, you decide to "fix" it by disallowing transitive adaptation,
which IMO is like cutting off your hand because it hurts when you punch
a brick wall.  Stop punching brick walls (i.e. using semantic-lossy
adaptations), and the problem goes away.  But I realize that I'm in the
minority here with regards to this opinion.)

>For example, say that I have some immutable "record" types.  One, type
>Person, defined in some framework X, has a huge lot of immutable data
>fields, including firstName, middleName, lastName, and many, many
>others.  Another, type Employee, defined in some separate framework Y
>(that has no knowledge of X, and vice versa), has fewer data fields, and
>in particular one called 'fullName' which is supposed to be a string such
>as 'Firstname M. Lastname'.  I would like to register an adapter factory
>from type Person to protocol Employee.  Since we said Person has many
>more data fields, adaptation will be "lossy" -- it will look upon
>Employee essentially as a "facade" (a simplified-interface) for Person.

But it doesn't change the *meaning*.  I realize that "meaning" is not an
easy concept to pin down into a nice formal definition.  I'm just saying
that adaptation is about semantics-preserving transformations, otherwise
you could just tack an arbitrary object on to something and call it an
adapter.  Adapters should be about exposing an object's *existing
semantics* in terms of a different interface, whether the interface is a
subset or superset of the original object's interface.  However, they
should not add or remove arbitrary semantics that are not part of the
difference in interfaces.
For example, adding a "current position" to a string to get a StringIO is a difference that is reflected in the difference in interface: a StringIO *is* just a string of characters with a current position that can be used in place of slicing. But interpreting a string as a *file* doesn't make sense because of added semantics that have to be "made up", and are not merely interpreting the string's semantics "as a" file. I suppose you could say that this is "noisy" adaptation rather than "lossy". That is, to treat a string as a file by using it as a filename, you have to make up things that aren't present in the string. (Versus the StringIO, where there's a sensible interpretation of a string "as a" StringIO.) IOW, adaptation is all about "as a" relationships from concrete objects to abstract roles, and between abstract roles. Although one may colloquially speak of using a screwdriver "as a" hammer, this is not the case in adaptation. One may use a screwdriver "as a" pounder-of-nails. The difference is that a hammer might also be usable "as a" remover-of-nails. Therefore, there is no general "as a" relationship between pounder-of-nails and remover-of-nails, even though a hammer is usable "as" either one. Thus, it does not make sense to say that a screwdriver is usable "as a" hammer, because this would imply it's also usable to remove nails. This is why I don't believe it makes sense in the general case to adapt to concrete classes; such classes usually have many roles where they are usable. I think the main difference in your position and mine is that I think one should adapt primarily to interfaces, and interface-to-interface adaptation should be reserved for non-lossy, non-noisy adapters. Where if I understand the opposing position correctly, it is instead that one should avoid transitivity so that loss and noise do not accumulate too badly. >So, can you please explain your objections to what I said about adapting >vs casting in terms of this example? 
>Do you think the example, or some
>variation thereof, should go in the PEP?

I'm not sure I see how that helps.  I think it might be more useful to
say that adaptation is not *conversion*, which is not the same thing
(IME) as casting.  Casting in C and Java does not actually "convert"
anything; it simply treats a value or object as if it were of a
different type.

ISTM that bringing casting into the terminology just complicates the
picture, because e.g. casting in Java actually corresponds to the subset
of PEP 246 adaptation for cases where adapt() returns the original
object or raises an error.  (That is, if adapt() could only ever return
the original object or raise an error, it would be precisely equivalent
to Java casting, if I understand it correctly.)  Thus, at least with
regard to object casting in Java, adaptation is a superset, and saying
that it's not casting is just confusing.

>>>Reference Implementation and Test Cases
>>>
>>> The following reference implementation does not deal with classic
>>> classes: it considers only new-style classes.  If classic classes
>>> need to be supported, the additions should be pretty clear, though
>>> a bit messy (x.__class__ vs type(x), getting boundmethods directly
>>> from the object rather than from the type, and so on).

>>Please base a reference implementation off of either Zope or PyProtocols'
>>field-tested implementations which deal correctly with __class__ vs.
>>type(), and can detect whether they're calling a __conform__ or __adapt__
>>at the wrong metaclass level, etc.  Then, if there is a reasonable use
>>case for LiskovViolation and the new type checking rules that justifies
>>adding them, let's do so.

>I think that if a PEP includes a reference implementation, it should be
>self-contained rather than require some other huge package.  If you can
>critique specific problems in the reference implementation, I'll be very
>grateful and eager to correct them.

Sure, I've got some above (e.g.
your implementation will raise a spurious TypeError if it calls an
__adapt__ or __conform__ at the wrong metaclass level, and getting them
from the type does *not* fix this issue, it just bumps it up by one
metalevel).  I wasn't proposing you pull in either whole package,
though; just adapt() itself.  Here's the existing Python one from
PyProtocols (there's also a more low-level one using the Python/C API,
but it's probably not appropriate for the spec):

from sys import exc_info
from types import ClassType

ClassTypes = type, ClassType

def adapt(obj, protocol, default=_marker):

    """PEP 246-alike: Adapt 'obj' to 'protocol', return 'default'

    If 'default' is not supplied and no implementation is found,
    the result of 'factory(obj,protocol)' is returned.  If 'factory'
    is also not supplied, 'NotImplementedError' is then raised."""

    if isinstance(protocol,ClassTypes) and isinstance(obj,protocol):
        return obj

    try:
        _conform = obj.__conform__
    except AttributeError:
        pass
    else:
        try:
            result = _conform(protocol)
            if result is not None:
                return result
        except TypeError:
            if exc_info()[2].tb_next is not None:
                raise

    try:
        _adapt = protocol.__adapt__
    except AttributeError:
        pass
    else:
        try:
            result = _adapt(obj)
            if result is not None:
                return result
        except TypeError:
            if exc_info()[2].tb_next is not None:
                raise

    if default is _marker:
        raise AdaptationFailure("Can't adapt", obj, protocol)

    return default

Obviously, some changes would need to be made to implement your newly
proposed functionality, but this one does support classic classes,
modules, and functions, and it has neither the TypeError-hiding problem
of the original PEP 246 nor the TypeError-raising problem of your new
version.

>>> Transitivity of adaptation is in fact somewhat controversial, as
>>> is the relationship (if any) between adaptation and inheritance.

>>The issue is simply this: what is substitutability?
If you say that >>interface B is substitutable for A, and C is substitutable for B, then C >>*must* be substitutable for A, or we have inadequately defined >>"substitutability". > >Not necessarily, depending on the pragmatics involved. In that case, I generally prefer to be explicit and use conversion rather than using adaptation. For example, if I really mean to truncate the fractional part of a number, I believe it's then appropriate to use 'int(someNumber)' and make it clear that I'm intentionally using a lossy conversion rather than simply treating a number "as an" integer without changing its meaning. >Thanks, BTW, for your highly detailed feedback. No problem; talking this out helps me clarify my own thoughts on these matters. I haven't had much occasion to clarify these matters, and when they come up, it's usually in the context of arguing some specific inappropriate use of adaptation, so I can easily present an alternative that makes sense in that context. This discussion is helping me clarify the general principle, since I have to try to argue the general case, not just N specific cases. :) From bob at redivi.com Mon Jan 10 23:42:52 2005 From: bob at redivi.com (Bob Ippolito) Date: Mon Jan 10 23:43:05 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com> References: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com> Message-ID: On Jan 10, 2005, at 16:38, Phillip J. Eby wrote: > At 07:42 PM 1/10/05 +0100, Alex Martelli wrote: > >> On 2005 Jan 10, at 18:43, Phillip J. Eby wrote: >> ... >>> I am not saying we shouldn't have a tp_conform; just suggesting that >>> it may be appropriate for functions and modules (as well as classic >>> classes) to have their tp_conform delegate back to >>> self.__dict__['__conform__'] instead of a null implementation. 
>> >> I have not considered conformance of such objects as functions or >> modules; if that is important, > > It's used in at least Zope and PEAK; I don't know if it's in use in > Twisted. SVN trunk of Twisted (what will be 2.0) uses zope.interface. It still has the older stuff implemented as a wrapper on top of zope.interface, but I think the guideline is to just use zope.interface directly for new code dependent on Twisted 2.0. -bob From pje at telecommunity.com Mon Jan 10 23:59:40 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon Jan 10 23:58:30 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: References: <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050110175734.02dfcda0@mail.telecommunity.com> At 05:42 PM 1/10/05 -0500, Bob Ippolito wrote: >On Jan 10, 2005, at 16:38, Phillip J. Eby wrote: > >>At 07:42 PM 1/10/05 +0100, Alex Martelli wrote: >> >>>On 2005 Jan 10, at 18:43, Phillip J. Eby wrote: >>> ... >>>>I am not saying we shouldn't have a tp_conform; just suggesting that it >>>>may be appropriate for functions and modules (as well as classic >>>>classes) to have their tp_conform delegate back to >>>>self.__dict__['__conform__'] instead of a null implementation. >>> >>>I have not considered conformance of such objects as functions or >>>modules; if that is important, >> >>It's used in at least Zope and PEAK; I don't know if it's in use in Twisted. > >SVN trunk of Twisted (what will be 2.0) uses zope.interface. What I meant was, I don't know if Twisted actually *uses* interface declarations for modules and functions. It has the ability to do so, certainly. I was just saying I didn't know if the ability is actually used. 
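[For readers unfamiliar with declaring conformance on functions and modules: since both accept arbitrary attributes, a declaration can simply be stored on the object itself, and the class-vs-instance confusion Phillip mentions is why lookup for ordinary instances should go through the type instead. A toy sketch — the find_conform helper and the render example are invented here; real frameworks such as zope.interface do considerably more:]

```python
import types

def find_conform(obj):
    # Functions and modules: the hook lives in the object's own
    # __dict__, where an attached declaration ends up.
    if isinstance(obj, (types.FunctionType, types.ModuleType)):
        return obj.__dict__.get('__conform__')
    # Class instances: look on the type, so that a __conform__ meant
    # for *instances* is not confused with one describing the class
    # object itself.
    return getattr(type(obj), '__conform__', None)

def render(data):
    return str(data)

# Declaring conformance on a function object is just attribute
# assignment; here the declaration trivially returns the function.
render.__conform__ = lambda protocol: render
```

Modules work the same way: setting `__conform__` at module level puts it in the module's `__dict__`, where `find_conform` would find it.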
PEAK uses some interfaces for functions, but I don't think I've ever used them for modules, and can think of only one place in PEAK where it would make sense to declare a module as supporting an interface. Zope policy is to use interfaces for *everything*, though, including documenting the interface provided by modules. From phil at secdev.org Mon Jan 10 17:17:49 2005 From: phil at secdev.org (Philippe Biondi) Date: Tue Jan 11 01:53:33 2005 Subject: [Python-Dev] PATCH/RFC for AF_NETLINK support Message-ID: Hi, I've done a small patch to use linux AF_NETLINK sockets (see below). Please comment! Is there a reason for recvmsg() and sendmsg() not to be implemented yet in socketmodule ? The integration with autoconf has not been done, even if this patch should be ok :

--- configure.in.ori	2005-01-10 17:09:32.000000000 +0100
+++ configure.in	2005-01-06 18:53:18.000000000 +0100
@@ -967,7 +967,7 @@
 sys/audioio.h sys/bsdtty.h sys/file.h sys/loadavg.h sys/lock.h sys/mkdev.h \
 sys/modem.h \
 sys/param.h sys/poll.h sys/select.h sys/socket.h sys/time.h sys/times.h \
-sys/un.h sys/utsname.h sys/wait.h pty.h libutil.h \
+sys/un.h linux/netlink.h sys/utsname.h sys/wait.h pty.h libutil.h \
 sys/resource.h netpacket/packet.h sysexits.h bluetooth.h \
 bluetooth/bluetooth.h)
 AC_HEADER_DIRENT

--- pyconfig.h.ori	2005-01-10 17:11:11.000000000 +0100
+++ pyconfig.h	2005-01-06 19:27:33.000000000 +0100
@@ -559,6 +559,9 @@
 /* Define to 1 if you have the <sys/un.h> header file. */
 #define HAVE_SYS_UN_H 1
 
+/* Define to 1 if you have the <linux/netlink.h> header file. */
+#define HAVE_LINUX_NETLINK_H 1
+
 /* Define to 1 if you have the <sys/utsname.h> header file. */
 #define HAVE_SYS_UTSNAME_H 1

--- socketmodule.h.ori	2005-01-07 19:25:18.000000000 +0100
+++ socketmodule.h	2005-01-06 18:20:54.000000000 +0100
@@ -32,6 +32,12 @@
 # undef AF_UNIX
 #endif
 
+#ifdef HAVE_LINUX_NETLINK_H
+# include <linux/netlink.h>
+#else
+# undef AF_NETLINK
+#endif
+
 #ifdef HAVE_BLUETOOTH_BLUETOOTH_H
 #include <bluetooth/bluetooth.h>
 #include
@@ -87,6 +93,9 @@ typedef struct {
 #ifdef AF_UNIX
 	struct sockaddr_un un;
 #endif
+#ifdef AF_NETLINK
+	struct sockaddr_nl nl;
+#endif
 #ifdef ENABLE_IPV6
 	struct sockaddr_in6 in6;
 	struct sockaddr_storage storage;

--- socketmodule.c.ori	2005-01-07 19:25:19.000000000 +0100
+++ socketmodule.c	2005-01-10 17:04:38.000000000 +0100
@@ -948,6 +948,14 @@ makesockaddr(int sockfd, struct sockaddr
 	}
 #endif /* AF_UNIX */
 
+#if defined(AF_NETLINK)
+	case AF_NETLINK:
+	{
+		struct sockaddr_nl *a = (struct sockaddr_nl *) addr;
+		return Py_BuildValue("ii", a->nl_pid, a->nl_groups);
+	}
+#endif /* AF_NETLINK */
+
 #ifdef ENABLE_IPV6
 	case AF_INET6:
 	{
@@ -1084,6 +1092,31 @@ getsockaddrarg(PySocketSockObject *s, Py
 	}
 #endif /* AF_UNIX */
 
+#if defined(AF_NETLINK)
+	case AF_NETLINK:
+	{
+		struct sockaddr_nl* addr;
+		int pid, groups;
+		addr = (struct sockaddr_nl *)&(s->sock_addr).nl;
+		if (!PyTuple_Check(args)) {
+			PyErr_Format(
+				PyExc_TypeError,
+				"getsockaddrarg: "
+				"AF_NETLINK address must be tuple, not %.500s",
+				args->ob_type->tp_name);
+			return 0;
+		}
+		if (!PyArg_ParseTuple(args, "II", &pid, &groups))
+			return 0;
+		addr->nl_family = AF_NETLINK;
+		addr->nl_pid = pid;
+		addr->nl_groups = groups;
+		*addr_ret = (struct sockaddr *) addr;
+		*len_ret = sizeof(*addr);
+		return 1;
+	}
+#endif
+
 	case AF_INET:
 	{
 		struct sockaddr_in* addr;
@@ -1280,6 +1313,13 @@ getsockaddrlen(PySocketSockObject *s, so
 		return 1;
 	}
 #endif /* AF_UNIX */
+#if defined(AF_NETLINK)
+	case AF_NETLINK:
+	{
+		*len_ret = sizeof (struct sockaddr_nl);
+		return 1;
+	}
+#endif
 
 	case AF_INET:
 	{
@@ -3938,8 +3978,20 @@ init_socket(void)
 	PyModule_AddIntConstant(m, "AF_KEY", AF_KEY);
 #endif
 #ifdef AF_NETLINK
-	/* */
+	/* Netlink socket */
 	PyModule_AddIntConstant(m, "AF_NETLINK", AF_NETLINK);
+	PyModule_AddIntConstant(m, "NETLINK_ROUTE", NETLINK_ROUTE);
+	PyModule_AddIntConstant(m, "NETLINK_SKIP", NETLINK_SKIP);
+	PyModule_AddIntConstant(m, "NETLINK_USERSOCK", NETLINK_USERSOCK);
+	PyModule_AddIntConstant(m, "NETLINK_FIREWALL", NETLINK_FIREWALL);
+	PyModule_AddIntConstant(m, "NETLINK_TCPDIAG", NETLINK_TCPDIAG);
+	PyModule_AddIntConstant(m, "NETLINK_NFLOG", NETLINK_NFLOG);
+	PyModule_AddIntConstant(m, "NETLINK_XFRM", NETLINK_XFRM);
+	PyModule_AddIntConstant(m, "NETLINK_ARPD", NETLINK_ARPD);
+	PyModule_AddIntConstant(m, "NETLINK_ROUTE6", NETLINK_ROUTE6);
+	PyModule_AddIntConstant(m, "NETLINK_IP6_FW", NETLINK_IP6_FW);
+	PyModule_AddIntConstant(m, "NETLINK_DNRTMSG", NETLINK_DNRTMSG);
+	PyModule_AddIntConstant(m, "NETLINK_TAPBASE", NETLINK_TAPBASE);
 #endif
 #ifdef AF_ROUTE
 	/* Alias to emulate 4.4BSD */

-- Philippe Biondi SecDev.org Security Consultant/R&D http://www.secdev.org PGP KeyID:3D9A43E2 FingerPrint:C40A772533730E39330DC0985EE8FF5F3D9A43E2 From neal at metaslash.com Tue Jan 11 00:31:26 2005 From: neal at metaslash.com (Neal Norwitz) Date: Tue Jan 11 01:53:34 2005 Subject: [Python-Dev] Re: [Csv] csv module TODO list In-Reply-To: <20050105110849.CBA843C8E5@coffee.object-craft.com.au> References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> <20050105110849.CBA843C8E5@coffee.object-craft.com.au> Message-ID: <20050110233126.GA14363@janus.swcomplete.com> On Wed, Jan 05, 2005 at 10:08:49PM +1100, Andrew McNamara wrote: > >Also, review comments from Neal Norwitz, 22 Mar 2003 (some of these should > >already have been addressed): > > I should apologise to Neal here for not replying to him at the time. Hey, I'm impressed you got to them. :-) I completely forgot about it. > >* rather than use PyErr_BadArgument, should you use assert? > > (first example, Dialect_set_quoting, line 218) > > You mean C assert()?
I don't think I'm really following you here - > where would the type of the object be checked in a way the user could > recover from? IIRC, I meant C assert(). This goes back to a discussion a long time ago about what is the preferred way to handle invalid arguments. I doubt it's important to change. > >* I think you need PyErr_NoMemory() before returning on line 768, 1178 > > The examples I looked at in the Python core didn't do this - are you sure? > (now lines 832 and 1280). Originally, they were a plain PyObject_NEW(). Now they are a PyObject_GC_New() so it seems no further change is necessary. > >* is PyString_AsString(self->dialect->lineterminator) on line 994 > > guaranteed not to return NULL? If not, it could crash by > > passing to memmove. > >* PyString_AsString() can return NULL on line 1048 and 1063, > > the result is passed to join_append() > > Looking at the PyString_AsString implementation, it looks safe (we ensure > it's really a string elsewhere)? Ok. Then it should be fine. I spot checked lineterminator and it looked ok. > >* iteratable should be iterable? (line 1088) > > Sorry, I don't know what you're getting at here? (now line 1162). Heh, I had to read that twice myself. It was a typo (assuming I wasn't completely wrong)--an extra "at", but it doesn't exist any longer. I don't think there are any changes remaining to be done from my original code review. BTW, I always try to run valgrind before a release, especially major releases. Neal From dw-python.org at botanicus.net Tue Jan 11 02:32:52 2005 From: dw-python.org at botanicus.net (David Wilson) Date: Tue Jan 11 02:32:56 2005 Subject: [Python-Dev] PATCH/RFC for AF_NETLINK support In-Reply-To: References: Message-ID: <20050111013252.GA216@thailand.botanicus.net> On Mon, Jan 10, 2005 at 05:17:49PM +0100, Philippe Biondi wrote: > I've done a small patch to use linux AF_NETLINK sockets (see below). > Please comment! 
As of 2.6.10, a very useful new netlink family was merged - NETLINK_KOBJECT_UEVENT. I'd imagine quite a lot of interest from Python developers for NETLINK support will come from this new interface in the coming years.

http://lwn.net/Articles/101210/
http://lkml.org/lkml/2004/9/10/315
http://vrfy.org/projects/kevents/
http://www.kernel.org/pub/linux/kernel/v2.6/ChangeLog-2.6.10

I would like to see (optional?) support for this before your patch is merged. I have a long-term interest in a Python-based service control / init replacement / system management application, for use in specialised environments. I could definitely use this. :) Thanks, David. -- Harmless - and in its harmlessness, diabolical. -- The Mold Of Yancy (Philip K. Dick) From martin at v.loewis.de Tue Jan 11 08:54:42 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Tue Jan 11 08:54:41 2005 Subject: [Python-Dev] PATCH/RFC for AF_NETLINK support In-Reply-To: References: Message-ID: <41E38642.9080108@v.loewis.de> Philippe Biondi wrote: > I've done a small patch to use linux AF_NETLINK sockets (see below). > Please comment! I have a high-level comment - python-dev is normally the wrong place for patches; please submit them to sf.net/projects/python instead. Apart from that, the patch looks fine. > Is there a reason for recvmsg() and sendmsg() not to be implemented > yet in socketmodule ? I'm not sure what you mean by "implemented": these functions are implemented by the operating system, not in the socketmodule. If you ask "why are they not exposed to Python yet?": There has been no need to do so, so far. What do I get with recvmsg that I cannot get with recv/recvfrom just as well?
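[Editor's note: as a concrete illustration of the point that plain send()/recv() suffice for simple netlink exchanges, here is a sketch of the fixed nlmsghdr framing such a socket carries. The constant values are copied from linux/netlink.h and linux/rtnetlink.h and are illustrative, not taken from the patch; only the (pid, groups) address tuple format comes from the patch above.]

```python
import struct

# Constant values from <linux/netlink.h> / <linux/rtnetlink.h>;
# treat them as illustrative, not authoritative.
NLMSG_HDRLEN  = 16      # sizeof(struct nlmsghdr)
NLM_F_REQUEST = 0x1
NLM_F_DUMP    = 0x300   # NLM_F_ROOT | NLM_F_MATCH
RTM_GETLINK   = 18

def nlmsg(payload, msg_type, flags, seq, pid):
    """Prefix payload with a struct nlmsghdr (native byte order)."""
    return struct.pack('=IHHII',
                       NLMSG_HDRLEN + len(payload),  # nlmsg_len
                       msg_type, flags, seq, pid) + payload

# A "dump all network interfaces" request: the payload is a one-byte
# struct rtgenmsg (address family AF_UNSPEC = 0), padded to 4 bytes.
request = nlmsg(struct.pack('=Bxxx', 0), RTM_GETLINK,
                NLM_F_REQUEST | NLM_F_DUMP, seq=1, pid=0)
assert len(request) == 20
```

With the patch applied, one would then (hypothetically) create the socket as socket.socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE), bind((pid, groups)) using the tuple format getsockaddrarg() parses above, and send(request).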
Regards, Martin From aleax at aleax.it Tue Jan 11 10:34:08 2005 From: aleax at aleax.it (Alex Martelli) Date: Tue Jan 11 10:34:16 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com> References: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com> Message-ID: The volume of these discussions is (as expected) growing beyond any reasonable bounds; I hope the BDFL can find time to read them but I'm starting to doubt he will. Since obviously we're not going to convince each other, and it seems to me we're at least getting close to pinpointing our differences, maybe we should try to jointly develop an "executive summary" of our differences and briefly stated pros and cons -- a PEP is _supposed_ to have such a section, after all. Anyway, for now, here goes another LONG mail...: On 2005 Jan 10, at 22:38, Phillip J. Eby wrote: ... >> If interfaces can ensure against Liskov violations in instances of >> their subclasses, then they can follow the "case (a)" fast path, >> sure. >> Inheriting from an interface (in Guido's current proposal, as per his >> Artima blog) is a serious commitment from the inheritor's part; >> inheriting from an ordinary type, in real-world current practice, >> need not be -- too many cases of assumed covariance, for example, are >> around in the wild, to leave NO recourse in such cases and just >> assume compliance. > > I understand that, sure. But I don't understand why we should add > complexity to PEP 246 to support not one but *two* bad practices: 1) > implementing Liskov violations and 2) adapting to concrete classes. > It is only if you are doing *both* of these that this extra feature is > needed. s/support/deal with/ . 
If I was designing a "greenfield" system, I'd love nothing better than making a serious split between concrete classes (directly instantiable), abstract classes (subclassable), and protocols (adaptable-to). Scott Meyers' suggestion, in one of his Effective C++ books, to never subclass a concrete class, has much to recommend itself, in particular. But in the real world people do want to subclass concrete classes, just as much as they want covariance AND, I'll bet, adapting to classes, too, not just to interfaces seen as a separate category from classes. I think PEP 246 should deal with the real world. If the BDFL thinks otherwise, and believes that in this case Python should impose best practices rather than pragmatically deal with the way people's minds (as opposed to type-system maths) appear to work, I'll be glad to recast the PEP in that light. Contrary to what you state, adapting to concrete (instantiable) classes rather than abstract (not directly instantiable) ones is not necessary to make the mechanism required, by the way. Consider an abstract class such as:

class Abstract(object):
    def tp1(self):
        ''' template method 1 (calls hook method 1) '''
    def tp2(self):
        ''' template method 2 (calls hook method 2) '''
    def hook1(self):
        raise NotImplementedError
    def hook2(self):
        raise NotImplementedError

One could subclass it just to get tp1...:

class Dubious(Abstract):
    def hook1(self):
        ''' implementing just hook1 '''

Now, instantiating d=Dubious() is dubious practice, but, absent specific checks, it "works", as long as only d.hook1() and d.tp1() are ever called -- never d.hook2() nor d.tp2(). I would like adapt(d, Abstract) to fail. I'm not claiming that the ability to have a __conform__ method in Dubious to specifically block this adaptation is anywhere like a perfect solution, mind you -- it does require some change to the source of Dubious, for example. I'm just saying that I think it's better than nothing, and that is where we seem to disagree.
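The veto mechanism Alex describes can be made concrete in a few lines. This is a toy sketch, not the PEP's reference implementation (in particular it skips the protocol-side __adapt__ step and the registry); the error messages and dispatch order are invented here:

```python
class LiskovViolation(Exception):
    """Raised by __conform__ to veto the isinstance() fast path."""

def adapt(obj, protocol):
    conform = getattr(type(obj), '__conform__', None)
    if conform is not None:
        try:
            result = conform(obj, protocol)
        except LiskovViolation:
            # The subclass explicitly refuses the IS-A claim.
            raise TypeError('%r refuses to conform to %r' % (obj, protocol))
        if result is not None:
            return result
    if isinstance(obj, protocol):   # the default fast path
        return obj
    raise TypeError('cannot adapt %r to %r' % (obj, protocol))

class Abstract(object):
    def hook1(self): raise NotImplementedError
    def hook2(self): raise NotImplementedError

class Dubious(Abstract):
    # "Private" inheritance: reuses the template-method machinery but
    # is not a full Abstract, so it vetoes adaptation to it.
    def hook1(self): return 'hook1'
    def __conform__(self, protocol):
        if protocol is Abstract:
            raise LiskovViolation
        return None

d = Dubious()
assert isinstance(d, Abstract)   # inheritance still says IS-A...
# ...but adapt(d, Abstract) raises TypeError instead of taking the
# fast path.
```

A subclass that implements both hooks and defines no __conform__ still adapts via the plain isinstance() check.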
> If it were to support some kind of backward compatibility, that would > be understandable. However, in practice, I don't know of anybody > using adapt(x,ConcreteClass), and even if they did, the person > subclassing ConcreteClass will need to change their subclass to raise > LiskovViolation, so why not just switch to delegation? Because delegation doesn't give you _easy_ access to Template Method design patterns which may well be the one reason you're subclassing Abstract in the first place. TP hinges on a method calling some self.dothis(), self.dothat() hook methods; to get it via delegation rather than inheritance requires a more complicated arrangement where that 'self' actually belongs to a "private" concrete class which delegates some things back to the "real" class. In practice, inheritance as a means of code reuse (rather than as a pristine Liskovian assertion of purity) is quite popular because of that. C++ essentially acknowledges this fact by allowing _private_ inheritance, essentially meaning "I'm reusing that code but don't really mean to assert IS-A"; the effects of private inheritance could be simulated by delegation to a private auxiliary class, but the extra indirections and complications aren't negligible costs in terms of code complexity and maintainability. In Python, we don't distinguish between private inheritance (to just reuse code) and ordinary inheritance (assumed to imply Liskov sub.), but that doesn't make the need go away. The __conform__ raising LiskovViolation could be seen as a way to give the subclass the ability to say "this inheritance here is ``private'', an issue of implementation only and not Liskov compliant". Maybe the ability to ``fake'' __class__ can help, but right now I don't see how, because setting __class__ isn't fake at all -- it really affects object behavior and type:

>>> class A(object):
...     def x(self): print "A"
...
>>> a = A()
>>> class B(object):
...     def x(self): print "B"
...
>>> a.__class__ = B
>>> a.x()
B
>>> type(a)
<class '__main__.B'>
>>>

So, it doesn't seem to offer a way to fake out isinstance only, without otherwise affecting behavior. > Anyway, it seems to me a bad idea to add complexity to support this > case. Do you have a more specific example of a situation in which a > Liskov violation coupled to concrete class adaptation is a good idea? > Or am I missing something here? I can give no example at all in which adapting to a concrete class is a _good_ idea, and I tried to indicate that in the PEP. I just believe that if adaptation does not offer the possibility of using concrete classes as protocols, but rather requires the usage as protocols of some specially blessed 'interface' objects or whatever, then PEP 246 will never fly, (a) because it would then require waiting for the interface thingies to appear, and (b) because people will find it pragmatically useful to just reuse the same classes as protocols too, and too limiting to have to design protocols specifically instead. So, I see the ability to adapt to concrete (or, for that matter, abstract) classes as a "practicality beats purity" idea, needed to deal with the real world and the way people's minds work. In practice we need covariance at least until a perfect system of parameterized interfaces is in place, and you can see from the Artima discussion just how difficult that is. I want to reuse (say) DictMixin on my mappings which restrict keys to be strings, for example, even though such a mapping does not conform to an unrestricted, unparameterized Mapping protocol. >> I need to add it to the reference implementation in the PEP. I'm >> reluctant to just get __conform__ from the object, though; it leads >> to all sorts of issues with a *class* conforming vs its *instances*, >> etc. Maybe Guido can Pronounce a little on this sub-issue...
> Actually, if you looked at the field-tested implementations of the old > PEP 246, they actually have code that deals with this issue > effectively, by recognizing TypeError when raised by attempting to > invoke __adapt__ or __conform__ with the wrong number of arguments or > argument types. (The traceback for such errors does not include a > frame for the called method, versus a TypeError raised *within* the > function, which does have such a frame. AFAIK, this technique should > be compatible with any Python implementation that has traceback > objects and does signature validation in its "native" code rather than > in a new Python frame.) I do not like the idea of making 'adapt' such a special case compared with other built-in functions which internally call special methods. What I mean is, for example:

>>> class H(object):
...     def __hash__(self): return 23
...
>>> h = H()
>>> h.__hash__ = lambda: 42
>>> hash(h)
23

For hash, and all kinds of other built-in functions and operations, it *does not matter* whether instance h has its own per-instance __hash__ -- H.__hash__ is what gets called anyway. Making adapt work differently gives me the shivers. Moreover, the BDFL is thinking of removing the "unbound method" concept and having such accesses as Aclass.somemethod return just a plain function. The internal typechecks done by unbound methods, on which such techniques as you mention may depend, might therefore be about to go away; this doesn't make it look nice to depend on them in a reference implementation. If Guido pronounces otherwise, I'll gladly change the reference implementation accordingly (or remove said reference implementation, as you appear to suggest elsewhere), but unless and until this happens, I'm not convinced. >>> I don't see the benefit of LiskovViolation, or of doing the exact >>> type check vs. the loose check. What is the use case for these? Is >>> it to allow subclasses to say, "Hey I'm not my superclass?"
It's >>> also a bit confusing to say that if the routines "raise any other >>> exceptions" they're propagated. Are you saying that LiskovViolation >>> is *not* propagated? >> >> Indeed I am -- I thought that was very clearly expressed! > > The PEP just said that it would be raised by __conform__ or __adapt__, > not that it would be caught by adapt() or that it would be used to > control the behavior in that way. Re-reading, I see that you do > mention it much farther down. But at the point where __conform__ and > __adapt__ are explained, it has not been explained that adapt() should > catch the error or do anything special with it. It is simply implied > by the "to prevent this default behavior" at the end of the section. > If this approach is accepted, the description should be made explicit, > because for me at least it required a retroactive re-interpretation > of the earlier part of the spec. OK, I'll add more repetition to the specs, trying to make it more "sequentially readable", even though they were already criticized because they do repeat some aspects more than once. >> The previous version treated TypeError specially, but I think (on the >> basis of just playing around a bit, admittedly) that offers no real >> added value and sometimes will hide bugs. > See http://peak.telecommunity.com/protocol_ref/node9.html for an > analysis of the old PEP 246 TypeError behavior, and the changes made > by PyProtocols and Zope to deal with the situation better, while > still respecting the fact that __conform__ and __adapt__ may be > retrieved from the wrong "meta level" of descriptor. I've read that, and I'm not convinced, see above. > Your new proposal does not actually fix this problem in the absence of > tp_conform/tp_adapt slots; it merely substitutes possible confusion at > the metaclass/class level for confusion at the class/instance level.
> The only way to actually fix this is to detect when you have called > the wrong level, and that is what the PyProtocols and Zope > implementations of "old PEP 246" do. (PyProtocols also introduces a > special descriptor for methods defined on metaclasses, to help avoid > creating this possible confusion in the first place, but that is a > separate issue.) Can you give an example of "confusion at metaclass/class level"? I can't see it. >>> This should either be fleshed out to a concrete proposal, or dropped. >>> There are many details that would need to be answered, such as >>> whether "type" includes subtypes and whether it really means type or >>> __class__. (Note that isinstance() now uses __class__, allowing >>> proxy objects to lie about their class; the adaptation system should >>> support this too, and both the Zope and PyProtocols interface >>> systems and PyProtocols' generic functions support it.) >> >> I disagree: I think the strawman-level proposal as fleshed out in the >> pep's reference implementation is far better than nothing. > > I'm not proposing to flesh out the functionality, just the > specification; it should not be necessary to read the reference > implementation and try to infer intent from it. What part is > implementation accident, and what is supposed to be the specification? > That's all I'm talking about here. As currently written, the > proposal is just, "we should have a registry", and is not precise > enough to allow someone to implement it based strictly on the > specification. Wasn't python supposed to be executable pseudocode, and isn't pseudocode an acceptable way to express specs?-) Ah well, I see your point, so that may well require more repetitious expansion, too. >> I mention the issue of subtypes explicitly later, including why the >> pep does NOT do anything special with them -- the reference >> implementation deals with specific types. 
And I use type(X) >> consistently, explicitly mentioning in the reference implementation >> that old-style classes are not covered. > > As a practical matter, classic classes exist and are useful, and PEP > 246 implementations already exist that work with them. Dropping that > functionality is a major step backward for PEP 246, IMO. I disagree that entirely new features of Python (as opposed to external third party add-ons) should add complications to deal with old-style classes. Heh, shades of the "metatype conflict removal" recipe discussion a couple months ago, right?-) But then that recipe WAS a "third-party add-on". If Python grew an intrinsic way to deal with metaclass conflicts, I'd be DELIGHTED if it didn't work for old-style classes, as long as this simplified it. Basically, we both agree that adaptation must accept some complication to deal with practical real-world issues that are gonna stay around, we just disagree on what those issues are. You appear to think old-style classes will stay around and need to be supported by new core Python functionality, while I think they can be pensioned off; you appear to think that programmers' minds will miraculously shift into a mode where they don't need covariance or other Liskov violations, and programmers will happily extract the protocol-ish aspects of their classes into neat pristine protocol objects rather than trying to double-use the classes as protocols too, while I think human nature won't budge much on this respect in the near future. Having, I hope, amply clarified the roots of our disagreements, so we can wait for BDFL input before the needed PEP 246 rewrites. If his opinions are much closer to yours than to mine, then perhaps the best next step would be to add you as the first author of the PEP and let you perform the next rewrite -- would you be OK with that? >> I didn't know about the "let the object lie" quirk in isinstance. 
If >> that quirk is indeed an intended design feature, > > It is; it's in one of the "what's new" feature highlights for either > 2.3 or 2.4, I forget which. It was intended to allow proxy objects > (like security proxies in Zope 3) to pretend to be an instance of the > class they are proxying. I just grepped through whatsnew23.tex and whatsnew24.tex and could not find it. Can you please help me find the exact spot? Thanks! >>> The issue isn't that adaptation isn't casting; why would casting a >>> string to a file mean that you should open that filename? >> >> Because, in most contexts, "casting" object X to type Y means calling >> Y(X). > > Ah; I had not seen that called "casting" in Python, at least not to my > immediate recollection. However, if that is what you mean, then why > not say it? :) What _have_ you seen called "casting" in Python? >> Maybe we're using different definitions of "casting"? > > I'm most accustomed to the C and Java definitions of casting, so > that's probably why I can't see how it relates at all. :) Well, in C++ you can call (int)x or int(x) with the same semantics -- they're both casts. In C or Java you must use the former syntax, in Python the latter, but they still relate. >>> If I were going to say anything about that case, I'd say that >>> adaptation should not be "lossy"; adapting from a designator to a >>> file loses information like what mode the file should be opened in. >>> (Similarly, I don't see adapting from float to int; if you want a >>> cast to int, cast it.) Or to put it another way, adaptability >>> should imply substitutability: a string may be used as a filename, a >>> filename may be used to designate a file. But a filename cannot be >>> used as a file; that makes no sense. >> >> I don't understand this "other way" -- nor, to be honest, what you >> "would say" earlier, either. 
I think it's pretty normal for >> adaptation to be "lossy" -- to rely on some but not all of the >> information in the original object: that's the "facade" design >> pattern, after all. It doesn't mean that some info in the original >> object is lost forever, since the original object need not be >> altered; it just means that not ALL of the info that's in the >> original object is used in the adapter -- and, what's wrong with that?! > > I think we're using different definitions of "lossy", too. I mean > that defining an adaptation relationship between two types when there > is more than one "sensible" way to get from one to the other is > "lossy" of semantics/user choice. If I have a file designator (such > as a filename), I can choose how to open it. If I adapt directly from > string to file by way of filename, I lose this choice (it is "lossy" > adaptation). You could have specified some options (such as the mode) but they took their default value instead ('r' in this case). What's ``lossy'' about accepting defaults?! The adjective "lossy" is overwhelmingly often used in describing compression, and in that context it means, can every bit of the original be recovered (then the compression is lossless) or not (then it's lossy). I can't easily find "lossy" used elsewhere than in compression, it's not even in American Heritage. Still, when you describe a transformation such as 12.3 -> 12 as "lossy", the analogy is quite clear to me. When you so describe the transformation 'foo.txt' -> file('foo.txt'), you've lost me completely: every bit of the original IS still there, as the .name attribute of the file object, so by no stretch of the imagination can I see the "lossiness" -- what bits of information are LOST? I'm not just belaboring a term, I think the concept is very important, see later. > > Here's a better way of phrasing it (I hope): adaptation should be > unambiguous.
There should only be one sensible way to interpret a > thing as implementing a particular interface, otherwise, adaptation > itself has no meaning. Whether an adaptation adds or subtracts > behavior, it does not really change the underlying *intended* meaning > of a thing, or else it is not really adaptation. Adapting 12.0 to 12 > does not change the meaning of the value, but adapting from 12.1 to 12 > does. > > Does that make more sense? I think that some people start using > adaptation and want to use Definitely more sense than 'lossy', but that's only because the latter didn't make any sense to me at all (when stretched to include, e.g., opening files). Again, see later. > it for all kinds of crazy things because it seems cool. However, it > takes a while to see that adaptation is just about removing > unnecessary accidents-of-incompatibility; it's not a license to > transform arbitrary things into arbitrary things. There has to be > some *meaning* to a particular adaptation, or the whole concept > rapidly degenerates into an undifferentiated mess. We agree, philosophically. Not sure how the PEP could be enriched to get this across. We still disagree, pragmatically, see later. > (Or else, you decide to "fix" it by disallowing transitive adaptation, > which IMO is like cutting off your hand because it hurts when you > punch a brick wall. Stop punching brick walls (i.e. using > semantic-lossy adaptations), and the problem goes away. But I realize > that I'm in the minority here with regards to this opinion.) I'm not so sure about your being in the minority, having never read for example Guido's opinion in the matter. But, let's take an example of Facade. (Here's the 'later' I kept pointing to;-). I have three data types / protocols: LotsOfInfo has a bazillion data fields, including personFirstName, personMiddleName, personLastName, ... PersonName has just two data fields, theFirstName and theLastName. FullName has three, itsFirst, itsMiddle, itsLast. 
The adaptation between such types/protocols has meaning: drop/ignore redundant fields, rename relevant fields, make up missing ones by some convention (empty strings if they have to be strings, None to mean "I dunno" like SQL NULL, etc). But, this *IS* lossy in some cases, in the normal sense: through the facade (simplified interface) I can't access ALL of the bits in the original (information-richer). Adapting LotsOfInfo -> PersonName is fine; so does LotsOfInfo -> FullName. Adapting PersonName -> FullName is iffy, because I don't have the deuced middlename information. But that's what NULL aka None is for, so if that's allowed, I can survive. But going from LotsOfInfo to FullName transitively, by way of PersonName, cannot give the same result as going directly -- the middle name info disappears, because there HAS been a "lossy" step. So the issue of "lossy" DOES matter, and I think you muddy things up when you try to apply it to a string -> file adaptation ``by casting'' (opening the file thus named). Forbidding lossy adaptation means forbidding facade here; not being allowed to get adaptation from a rich source of information when what's needed is a subset of that info with some renaming and perhaps mixing. I would not like that *AT ALL*; I believe it's unacceptable. Forbidding indications of "I don't know" comparable to SQL's NULL (thus forbidding the adaptation PersonName -> FullName) might make the whole scheme incompatible with the common use of relational databases and the like -- probably not acceptable, either. Allowing both lossy adaptations, NULLs, _and_ transitivity inevitably leads sooner or later to ACCIDENTAL info loss -- the proper adapter to go directly LotsOfInfo -> FullName was not registered, and instead of getting an exception to point out that error, your program limps along having accidentally dropped a piece of information, here the middle-name. So, I'd like to disallow transitivity. 
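Alex's LotsOfInfo -> PersonName -> FullName point can be spelled out with plain functions standing in for registered adapter factories (the field values and function names are invented for the illustration):

```python
class LotsOfInfo(object):
    def __init__(self):
        self.personFirstName = 'First'
        self.personMiddleName = 'M.'
        self.personLastName = 'Last'

class PersonName(object):
    def __init__(self, first, last):
        self.theFirstName, self.theLastName = first, last

class FullName(object):
    def __init__(self, first, middle, last):
        self.itsFirst, self.itsMiddle, self.itsLast = first, middle, last

def info_to_person(i):
    # Facade: hides (does not destroy) the fields PersonName lacks.
    return PersonName(i.personFirstName, i.personLastName)

def person_to_full(p):
    # Widening: must supply a NULL-like marker for the middle name.
    return FullName(p.theFirstName, None, p.theLastName)

def info_to_full(i):
    # The direct adapter keeps the middle name.
    return FullName(i.personFirstName, i.personMiddleName,
                    i.personLastName)

info = LotsOfInfo()
direct = info_to_full(info)
composed = person_to_full(info_to_person(info))
assert direct.itsMiddle == 'M.'
assert composed.itsMiddle is None   # the transitive route silently lost it
```

The composed route raises no error at all, which is exactly the accidental-info-loss hazard being argued about.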
>> For example, say that I have some immutable "record" types. One, >> type Person, defined in some framework X, has a huge lot of immutable >> data fields, including firstName, middleName, lastName, and many, >> many others. Another, type Employee, defined in some separate >> framework Y (that has no knowledge of X, and vice versa), has fewer >> data fields, and in particular one called 'fullName' which is >> supposed to be a string such as 'Firstname M. Lastname'. I would >> like to register an adapter factory from type Person to protocol >> Employee. Since we said Person has many more data fields, >> adaptation will be "lossy" -- it will look upon Employee essentially >> as a "facade" (a simplified-interface) for Person. > But it doesn't change the *meaning*. I realize that "meaning" is not > an easy concept to pin down into a nice formal definition. I'm just > saying that adaptation is about semantics-preserving transformations, > otherwise you could just tack an arbitrary object on to something and > call it an adapter. Adapters should be about exposing an object's > *existing semantics* > in terms of a different interface, whether the interface is a subset > or superset of the original object's interface. However, they should > not add or remove arbitrary semantics that are not part of the > difference in interfaces. OK, but then 12.3 -> 12 should be OK, since the loss of the fractional part IS part of the difference in interfaces, right? And yet it doesn't SMELL like adaptation to me -- which is why I tried to push the issue away with the specific disclaimer about numbers. > For example, adding a "current position" to a string to get a StringIO > is a difference that is reflected in the difference in interface: a > StringIO *is* just a string of characters with a current position that > can be used in place of slicing.
> > But interpreting a string as a *file* doesn't make sense because of > added semantics that have to be "made up", and are not merely > interpreting the string's semantics "as a" file. I suppose you could > say that this is "noisy" adaptation rather than "lossy". That is, to > treat a string as a file by using it as a filename, you have to make > up things that aren't present in the string. (Versus the StringIO, > where there's a sensible interpretation of a string "as a" StringIO.) > > IOW, adaptation is all about "as a" relationships from concrete > objects to abstract roles, and between abstract roles. Although one > may colloquially speak of using a screwdriver "as a" hammer, this is > not the case in adaptation. One may use a screwdriver "as a" > pounder-of-nails. The difference is that a hammer might also be > usable "as a" remover-of-nails. Therefore, there is no general "as a" > relationship between pounder-of-nails and remover-of-nails, even > though a hammer is usable "as" either one. Thus, it does not make > sense to say that a screwdriver is usable "as a" hammer, because this > would imply it's also usable to remove nails. I like the "as a" -- but it can't ignore Facade, I think. > > This is why I don't believe it makes sense in the general case to > adapt to concrete classes; such classes usually have many roles where > they are usable. I think the main difference in your position and > mine is that I think one should adapt primarily to interfaces, and I fully agree here. I see the need to adapt to things that aren't protocols as an unpleasant reality we have to (heh) adapt to, not ideal by any means. > interface-to-interface adaptation should be reserved for non-lossy, > non-noisy adapters. No Facade, no NULLs? 
Yes, we disagree about this one: I believe adaptation that occurs by showing just a subset of the info, with renaming etc, is absolutely fine (Facade); and adaptation by using an allowed NULL (say None) to mean "missing information", when going to a "wider" interface, is not pleasant but is sometimes indispensable in the real world -- that's why SQL works in the real world, even though SQL beginners and a few purists hate NULLs with a vengeance. > Where if I understand the opposing position correctly, it is instead > that one should avoid transitivity so that loss and noise do not > accumulate too badly. In a sense -- but that has nothing to do with concrete classes etc, in this context. All of the "records"-like datatypes I'm using around here may perfectly well be as interfacey as you please, as long as interfaces/protocols let you access attributes property-like, and if they don't just transliterate to getThis, getThat, getTheOther, no big deal. The points are rather that adaptation that "loses" (actually "hides") some information is something we MUST have; and adaptation that supplies "I don't know" markers (NULL-like) for some missing information, where that's allowed, is really very desirable. Call this lossy and noisy if you wish, we still can't do without. Transitivity is a nice convenience, IF it could be something that an adapter EXPLICITLY claims rather than something just happening by default. I might live with it, grudgingly, if it was the default with some nice easy way to turn it off; my problem with that is -- even if 90% of the cases could afford to be transitive, people will routinely forget to mark the other 10% and mysterious, hard-to-find bugs will result. The identical objection can be raised about the LiskovViolation mechanism, which is why I say it's not perfect by any stretch of the imagination, btw (I just think SOME mechanism to turn off the default is needed and can't think of a better one yet). 
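What "transitivity only when an adapter EXPLICITLY claims it" might look like, as a rough hypothetical sketch (the registry, the `chainable` flag, and the toy classes A/B/C are all invented for illustration):

```python
_registry = {}  # (source type, destination protocol) -> (factory, chainable)

def register(src, dst, factory, chainable=False):
    _registry[(src, dst)] = (factory, chainable)

def adapt(obj, dst):
    src = type(obj)
    if (src, dst) in _registry:
        # a direct adapter always wins
        return _registry[(src, dst)][0](obj)
    # consider two-step chains only through adapters that claim chainability
    for (a, mid), (f1, ok1) in list(_registry.items()):
        if a is src and ok1 and (mid, dst) in _registry:
            f2, ok2 = _registry[(mid, dst)]
            if ok2:
                return f2(f1(obj))
    raise TypeError("no adapter from %s to %s" % (src.__name__, dst.__name__))

class A: pass
class B: pass
class C: pass

register(A, B, lambda a: B(), chainable=True)
register(B, C, lambda b: C())   # NOT declared chainable: no silent chaining
```

With this, adapt(A(), C) raises TypeError instead of limping along through B; re-registering the B->C adapter with chainable=True makes the chain available again, as an explicit decision rather than a default.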
In PyProtocols docs you specifically warn against adapting from an adapter... yet that's what transitivity intrinsically does! >> So, can you please explain your objections to what I said about >> adapting vs casting in terms of this example? Do you think the >> example, or some variation thereof, should go in the PEP? > > I'm not sure I see how that helps. I think it might be more useful to > say that adaptation is not *conversion*, which is not the same thing > (IME) as casting. Casting in C and Java does not actually "convert" > anything; it simply treats a value or object as if it were of a Uh? (int)12.34 DOES "convert" to the integer 12, creating an entirely new object. SOME casting does convert, other doesn't (C++ clears up this mess by introducing many separate casts such as reinterpret_cast when you specifically want reinterpretation of bits, etc, etc, but for backwards compatibility keeps supporting the mess too). > different type. ISTM that bringing casting into the terminology just > complicates the picture, because e.g. casting in Java actually > corresponds to the subset of PEP 246 adaptation for cases where > adapt() returns the original object or raises an error. (That is, if > adapt() could only ever return the original object or raise an error, > it would be precisely equivalent to Java casting, if I understand it > correctly.) Thus, at least with regard to object casting in Java, > adaptation is a superset, and saying that it's not casting is just > confusing. OK, I'll try to rephrase that part. Obviously "casting" is too overloaded. > Obviously, some changes would need to be made to implement your newly > proposed functionality, but this one does support classic classes, > modules, and functions, and it has neither the TypeError-hiding > problem of the original PEP 246 nor the TypeError-raising problem of > your new version. 
...but it DOES break the normal semantics of relationship between builtins and special methods, as I exemplified above with hash and __hash__. >>>> Transitivity of adaptation is in fact somewhat controversial, as >>>> is the relationship (if any) between adaptation and inheritance. >>> >>> The issue is simply this: what is substitutability? If you say that >>> interface B is substitutable for A, and C is substitutable for B, >>> then C *must* be substitutable for A, or we have inadequately >>> defined "substitutability". >> >> Not necessarily, depending on the pragmatics involved. > > In that case, I generally prefer to be explicit and use conversion > rather than using adaptation. For example, if I really mean to > truncate the fractional part of a number, I believe it's then > appropriate to use 'int(someNumber)' and make it clear that I'm > intentionally using a lossy conversion rather than simply treating a > number "as an" integer without changing its meaning. That's how it feels to me FOR NUMBERS, but I can't generalize the feeling to the general case of facade between "records" with many fields of information, see above. Alex From aleax at aleax.it Tue Jan 11 10:39:30 2005 From: aleax at aleax.it (Alex Martelli) Date: Tue Jan 11 10:39:40 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5.1.1.6.0.20050110171247.039f35d0@mail.telecommunity.com> References: <5.1.1.6.0.20050110132407.0344a9d0@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110132407.0344a9d0@mail.telecommunity.com> <5.1.1.6.0.20050110171247.039f35d0@mail.telecommunity.com> Message-ID: On 2005 Jan 10, at 23:19, Phillip J. Eby wrote: ... 
> As I said, after more thought, I'm actually less concerned about the > performance than I am about even remotely encouraging the combination > of Liskov violation *and* concrete adaptation As per other msg, abstract classes have just the same issues as concrete ones. If Ka-Ping Yee's idea (per Artima cmts on BDFL blog) of having interfaces supply template methods too ever flies, the issue would arise there, too (a BDFL comment with a -1 suggests it won't fly, though). > targets. But, if "after the dust settles" it turns out this is going > to be supported after all, then we can worry about the performance if > need be. > > Note, however, that your statements actually support the idea of *not* > adding a special case for Liskov violators. If newer code uses > interfaces, the Liskov-violation mechanism is useless. If older code > doesn't have __conform__, it cannot possibly *use* the > Liskov-violation mechanism. Adding __conform__ to a class to raise a LiskovViolation when needed is a TINY change compared to the refactoring needed to use template-methods without subclassing. Alex From aleax at aleax.it Tue Jan 11 10:40:35 2005 From: aleax at aleax.it (Alex Martelli) Date: Tue Jan 11 10:40:39 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: References: Message-ID: On 2005 Jan 10, at 23:15, Thomas Heller wrote: > Alex Martelli writes: > >> PEP: 246 >> Title: Object Adaptation > > Minor nit (or not?): You could provide a pointer to the Liskov > substitution principle, for those reader that aren't too familiar with > that term. Excellent idea, thanks. > Besides, the text mentions three times that LiskovViolation is a > subclass of AdaptionError (plus once in the ref impl section). Always hard to strike a balance between what's repeated and what isn't, I'll try to get a better one on this point on the next edit. 
Alex From aleax at aleax.it Tue Jan 11 10:59:07 2005 From: aleax at aleax.it (Alex Martelli) Date: Tue Jan 11 10:59:12 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5.1.1.6.0.20050110132407.0344a9d0@mail.telecommunity.com> References: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110132407.0344a9d0@mail.telecommunity.com> Message-ID: <6EA3D69F-63B7-11D9-ADA4-000A95EFAE9E@aleax.it> On 2005 Jan 10, at 19:34, Phillip J. Eby wrote: ... > IMO it's more desirable to support abstract base classes than to allow > classes to "opt out" of inheritance when testing conformance to a base > class. If you don't have an "is-a" relationship to your base class, > you should be using delegation, not inheritance. (E.g. 'set' has-a > 'dict', not 'set' is-a 'dict', so 'adapt(set,dict)' should fail, at > least on the basis of isinstance checking.) C++'s private inheritance explicitly acknowledges how HANDY subclassing can be for pure implementation purposes; we don't have private inheritance but that doesn't mean subclassing becomes any less handy;-) > The other problem with a Liskov opt-out is that you have to explicitly > do a fair amount of work to create a LiskovViolation-raising subclass; A TINY amount of work. Specifically, starting from:

class X(Abstract):
    """ most of X omitted """
    def hook1(self):
        """ implement hook1 so as to get tm1 from Abstract """
        if self.foo() > self.bar():
            self.baz()
        else:
            self.fie = self.flup - self.flum
        return self.zap()

all you have to do is ADD

    def __conform__(self, protocol):
        if issubclass(protocol, Abstract):
            raise LiskovViolation

that's all. (See my big post about what Abstract is, with template methods tm1 and tm2 respectively using hook methods hook1 and hook2: X doesn't implement hook2).
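For concreteness, here is roughly how that opt-out plays out against a stripped-down adapt() -- a simplified sketch of the PEP 246 flow (no registry, no __adapt__ step), with all class names invented:

```python
class AdaptationError(TypeError):
    pass

class LiskovViolation(AdaptationError):
    pass

def adapt(obj, protocol):
    # simplified: try __conform__, then fall back on isinstance --
    # unless the object explicitly disclaims substitutability
    skip_inheritance = False
    conform = getattr(type(obj), '__conform__', None)
    if conform is not None:
        try:
            result = conform(obj, protocol)
        except LiskovViolation:
            skip_inheritance = True
        else:
            if result is not None:
                return result
    if not skip_inheritance and isinstance(obj, protocol):
        return obj
    raise AdaptationError("cannot adapt %r to %s" % (obj, protocol.__name__))

class Abstract:
    def tm1(self):              # template method relying on hook1
        return self.hook1()

class X(Abstract):              # inherits purely for implementation reuse
    def hook1(self):
        return "hooked"
    def __conform__(self, protocol):
        if issubclass(protocol, Abstract):
            raise LiskovViolation

class Y(Abstract):              # a genuine is-a subclass: no opt-out
    def hook1(self):
        return "y"
```

adapt(X(), Abstract) raises AdaptationError even though isinstance(X(), Abstract) holds, while adapt(Y(), Abstract) succeeds through the plain isinstance branch.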
> that work would be better spent migrating to use delegation instead > of inheritance, which would also be cleaner and more comprehensible > code than writing a __conform__ hack to announce your bad style in > having chosen to use inheritance where delegation is more appropriate. > ;) The amount of effort is MUCH vaster. Essentially RECODE everything so s/thing like:

class X(object):
    """ most of X omitted """

    class PrivateAuxiliaryClass(Abstract):
        def __init__(self, x):
            self.x = x
        def hook1(self):
            return self.x.hook1()

    def __init__(self):
        self.pac = self.PrivateAuxiliaryClass(self)
        # rest of X.__init__ omitted

    def tm1(self):
        return self.pac.tm1()

this isn't just a tiny band-aid to say "I really wish the language had private inheritance because I'm using Abstract as a base just for code reuse" -- it's a rich and complex restructuring, and in fact it's just the beginning; now you have a deuced reference loop between each instance x of X, and its x.pac, so you'll probably want to pull in weakref, too, to avoid giving too much work to the cyclical garbage collector. Basically, rephrasing private inheritance with containment and delegation is a lot of messy work, and results in far more complicated structures. And instead of paying the tiny price of a __conform__ call at adaptation time, you pay the price of delegating calls over and over at each x.tm1() call, so it's unlikely performance will improve. By pushing Liskov conformance without supporting "private inheritance" or its equivalent, you're really pushing people to use much more complicated and sophisticated structures of objects than "private inheritance" affords when properly used...
and the LAST thing OO programmers need is any encouragement towards more complicated structures!-) Alex From aleax at aleax.it Tue Jan 11 11:01:29 2005 From: aleax at aleax.it (Alex Martelli) Date: Tue Jan 11 11:01:35 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5.1.1.6.0.20050110125415.02f28ec0@mail.telecommunity.com> References: <5.1.1.6.0.20050110125415.02f28ec0@mail.telecommunity.com> Message-ID: On 2005 Jan 10, at 18:59, Phillip J. Eby wrote: > At 12:43 PM 1/10/05 -0500, Phillip J. Eby wrote: >> As a practical matter, all of the existing interface systems (Zope, >> PyProtocols, and even the defunct Twisted implementation) treat >> interface inheritance as guaranteeing substitutability for the base >> interface, and do so transitively. > > An additional data point, by the way: the Eclipse Java IDE has an > adaptation system that works very much like PEP 246 does, and it > appears that in a future release they intend to support automatic > adapter transitivity, so as to avoid requiring each provider of an > interface to "provide O(n^2) adapters when writing the nth version of > an interface." IOW, their current release is transitive only for > interface inheritance ala Zope or Twisted; their future release will > be transitive for adapter chains ala PyProtocols. This is definitely relevant prior art, so thanks for pointing it out. If interfaces change so often that 'n' can become worryingly high, this is a valid concern. In my world, though, published interfaces do NOT change as often as to require such remedies;-). Alex From aleax at aleax.it Tue Jan 11 11:59:06 2005 From: aleax at aleax.it (Alex Martelli) Date: Tue Jan 11 11:59:15 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: References: <5.1.1.6.0.20050110125415.02f28ec0@mail.telecommunity.com> Message-ID: On 2005 Jan 11, at 11:01, Alex Martelli wrote: > > On 2005 Jan 10, at 18:59, Phillip J. Eby wrote: > >> At 12:43 PM 1/10/05 -0500, Phillip J. 
Eby wrote: >>> As a practical matter, all of the existing interface systems (Zope, >>> PyProtocols, and even the defunct Twisted implementation) treat >>> interface inheritance as guaranteeing substitutability for the base >>> interface, and do so transitively. >> >> An additional data point, by the way: the Eclipse Java IDE has an >> adaptation system that works very much like PEP 246 does, and it >> appears that in a future release they intend to support automatic >> adapter transitivity, so as to avoid requiring each provider of an >> interface to "provide O(n^2) adapters when writing the nth version of >> an interface." IOW, their current release is transitive only for >> interface inheritance ala Zope or Twisted; their future release will >> be transitive for adapter chains ala PyProtocols. > > This is definitely relevant prior art, so thanks for pointing it out. > If interfaces change so often that 'n' can become worryingly high, > this is a valid concern. In my world, though, published interfaces do > NOT change as often as to require such remedies;-). ...that was a bit too flippant -- I apologize. It DOES happen that interfaces keep changing, and other situations where "adapter-chain transitivity" is quite handy do, absolutely!, occur, too. Reflecting on Microsoft's QI (QueryInterface), based on a very strong injunction against changing interfaces and yet mandating transitivity, points that out -- that's prior art, too, and a LOT more of it than Eclipse can accumulate any time soon, considering how long COM has been at the heart of Microsoft's components strategy, how many millions of programmers have used or abused it. Still, QI's set of constraints, amounting to a full-fledged equivalence relationship among all the "adapters" for a single underlying "object", is, I fear, stronger than we can impose (so it may be that Eclipse is a better parallel, but I know little of it while COM is in my bones, so that's what I keep thinking of;-). 
So, I see transitivity as a nice thing to have _IF_ it's something that gets explicitly asserted for a certain adapter -- if the adapter has to explicitly state to the system that it "isn't lossy" (maybe), or "isn't noisy" (perhaps more useful), or something like that... some amount of reassurance about the adapter that makes it fully safe to use in such a chain. Maybe it might suffice to let an adapter which IS 'lossy' (or, more likely, one that is 'noisy') state the fact. I'm always reluctant by instinct to default to convenient but risky behavior, trusting programmers to explicitly assert otherwise when needed; but in many ways this kind of design is a part of Python and works fine (_with_ the BDFL's fine nose/instinct for making the right compromise between convenience and safety in each case, of course). I'm still pondering the "don't adapt an adapter" suggestion, which seems a sound one, and yet also seems to be, intrinsically, what transitivity-by-chaining does. Note that QI does not suffer from this, because it lets you get the underlying object identity (IUnknown) from any interface adapter. Maybe, just maybe, we should also consider that -- a distinguished protocol bereft of any real substance but acting as a flag for "real unadapted object identity". Perhaps we could use 'object' for that, at least if the flow of logic in 'adapt' stays as in the current PEP 246 draft (i.e., __conform__ is given a chance before isinstance triggers -- so, all adapters could __conform__ to object by returning the underlying object being adapted, while other objects without such a feature in __conform__ would end up with 'adapt(x, object) is x'). Or, if object in this role turns out to be confusing, IDentity (;-) or some other specially designed protocol. If we had this ability to "get at the underlying object" we could at least write clearer axioms about what transitivity must mean, as well as, help out with the "adapting an adapter" problems. 
E.g., imagine:

def f(x: IFoo, y: IFoo):
    if x is y:
        ...

that wouldn't work if adapt(x, IFoo) returns a separate adapter each time, which is the most likely situation (think, again, of str->file adaptation by StringIO wrapping); but recovering underlying identities by "adapt(x, object) is adapt(y, object)" would work. I don't think that IUnknown or an equivalent, per se, can do *instead* of the need to have an adapter explicitly state it's non-noisy (or VV). Besides the need to check for object identity, which is pretty rare except when writing axioms, invariants or pre/post-conds;-), the IUnknown equivalent would perhaps be more of a conceptual/philosophical 'prop' than a practically useful feature -- while I see the ability to block unintended consequences of inheritance and transitivity (or even better, state explicitly when those consequences are wanted, even if that should be 90% of the time...) as practically very, VERY useful, even if "conceptually" or "philosophically" dubious. Alex From arigo at tunes.org Tue Jan 11 13:41:57 2005 From: arigo at tunes.org (Armin Rigo) Date: Tue Jan 11 13:53:17 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com> References: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com> Message-ID: <20050111124157.GA16642@vicky.ecs.soton.ac.uk> Hi Phillip, On Mon, Jan 10, 2005 at 04:38:55PM -0500, Phillip J. Eby wrote: > Your new proposal does not actually fix this problem in the absence of > tp_conform/tp_adapt slots; it merely substitutes possible confusion at the > metaclass/class level for confusion at the class/instance level. I think that what Alex has in mind is that the __adapt__() and __conform__() methods should work just like all other special methods for new-style classes.
The confusion comes from the fact that the reference implementation doesn't do that. It should be fixed by replacing:

conform = getattr(type(obj), '__conform__', None)

with:

for basecls in type(obj).__mro__:
    if '__conform__' in basecls.__dict__:
        conform = basecls.__dict__['__conform__']
        break
else:
    # not found

and the same for '__adapt__'. The point about tp_xxx slots is that when implemented in C with slots, you get the latter (correct) effect for free. This is how metaconfusion is avoided in post-2.2 Python. Using getattr() for that is essentially broken. Trying to call the method and catching TypeErrors seems pretty fragile -- e.g. if you are calling a __conform__() which is implemented in C you won't get a Python frame in the traceback either. A bientot, Armin From exarkun at divmod.com Tue Jan 11 15:14:13 2005 From: exarkun at divmod.com (Jp Calderone) Date: Tue Jan 11 15:14:18 2005 Subject: [Python-Dev] PATCH/RFC for AF_NETLINK support In-Reply-To: <41E38642.9080108@v.loewis.de> Message-ID: <20050111141413.32125.2127123498.divmod.quotient.6072@ohm> On Tue, 11 Jan 2005 08:54:42 +0100, "Martin v. Löwis" wrote: >Philippe Biondi wrote: > > I've done a small patch to use linux AF_NETLINK sockets (see below). > > Please comment! > > I have a high-level comment - python-dev is normally the wrong place > for patches; please submit them to sf.net/projects/python instead. > > Apart from that, the patch looks fine. > > > Is there a reason for recvmsg() and sendmsg() not to be implemented > > yet in socketmodule ? > > I'm not sure what you mean by "implemented": these functions are > implemented by the operating system, not in the socketmodule. > > If you ask "why are they not exposed to Python yet?": There has been no > need to do so, so far. What do I get with recvmsg that I cannot get with > recv/recvfrom just as well? Everything that recvmsg() does. recv() and recvfrom() give you "regular" bytes - data sent to the socket using send() or sendto().
recvmsg() gives you messages - data sent to the socket using sendmsg(). There is no way to receive messages by using recv() or recvfrom() (and no way to send them using send() or sendto()). Conversely, I believe send() and recv() can be implemented in terms of sendmsg() and recvmsg(). Perhaps we should get rid of socket.send() and socket.recv()? Other things that sendmsg() and recvmsg() can do include passing file descriptors, receiving notice of OOB TCP data, peeking at bytes from the kernel's buffer without actually reading them, and implementing scatter/gather IO functions (although you'd probably just want to wrap and use writev() and readv() instead). Jp From exarkun at divmod.com Tue Jan 11 15:15:23 2005 From: exarkun at divmod.com (Jp Calderone) Date: Tue Jan 11 15:15:26 2005 Subject: [Python-Dev] PATCH/RFC for AF_NETLINK support In-Reply-To: <20050111013252.GA216@thailand.botanicus.net> Message-ID: <20050111141523.32125.1902928401.divmod.quotient.6074@ohm> On Tue, 11 Jan 2005 01:32:52 +0000, David Wilson wrote: >On Mon, Jan 10, 2005 at 05:17:49PM +0100, Philippe Biondi wrote: > > > I've done a small patch to use linux AF_NETLINK sockets (see below). > > Please comment! > > As of 2.6.10, a very useful new netlink family was merged - > NETLINK_KOBJECT_UEVENT. I'd imagine quite a lot of interest from Python > developers for NETLINK support will come from this new interface in the > coming years. > > [snip] > > I would like to see (optional?) support for this before your patch is > merged. I have a long-term interest in a Python-based service control / > init replacement / system management application, for use in specialised > environments. I could definitely use this. :) Useful indeed, but I'm not sure why basic NETLINK support should be held up for it?
Jp From phil at secdev.org Tue Jan 11 09:45:09 2005 From: phil at secdev.org (Philippe Biondi) Date: Tue Jan 11 16:13:26 2005 Subject: [Python-Dev] PATCH/RFC for AF_NETLINK support In-Reply-To: <41E38642.9080108@v.loewis.de> Message-ID: On Tue, 11 Jan 2005, [ISO-8859-1] "Martin v. Löwis" wrote: > Philippe Biondi wrote: > > I've done a small patch to use linux AF_NETLINK sockets (see below). > > Please comment! > > I have a high-level comment - python-dev is normally the wrong place > for patches; please submit them to sf.net/projects/python instead. OK, I'll post it here. > > Apart from that, the patch looks fine. Fine! > > > Is there a reason for recvmsg() and sendmsg() not to be implemented > > yet in socketmodule ? > > I'm not sure what you mean by "implemented": these functions are > implemented by the operating system, not in the socketmodule. > > If you ask "why are they not exposed to Python yet?": There has been no > need to do so, so far. What do I get with recvmsg that I cannot get with > recv/recvfrom just as well? You can have access to ancillary messages. You can, for example transmit credentials or file descriptors through unix sockets, which is very interesting for privilege separation. -- Philippe Biondi SecDev.org Security Consultant/R&D http://www.secdev.org PGP KeyID:3D9A43E2 FingerPrint:C40A772533730E39330DC0985EE8FF5F3D9A43E2 From pje at telecommunity.com Tue Jan 11 16:34:20 2005 From: pje at telecommunity.com (Phillip J.
Eby) Date: Tue Jan 11 16:33:16 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <20050111124157.GA16642@vicky.ecs.soton.ac.uk> References: <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050111102823.03a8f8c0@mail.telecommunity.com> At 12:41 PM 1/11/05 +0000, Armin Rigo wrote: >The point about tp_xxx slots is that when implemented in C with slots, you get >the latter (correct) effect for free. This is how metaconfusion is avoided in >post-2.2 Python. Using getattr() for that is essentially broken. Trying to >call the method and catching TypeErrors seems pretty fragile -- e.g. if you >are calling a __conform__() which is implemented in C you won't get a Python >frame in the traceback either. An excellent point. The issue hasn't come up before now, though, because there aren't any __conform__ methods written in C in the field that I know of. Presumably, if there are any added to CPython in future, it will be because there's a tp_conform slot and it's needed for built-in types, in which case the problem is again moot for the implementation. (FYI, C methods implemented in Pyrex add a dummy frame to the traceback such that you see the file and line number of the original Pyrex source code. Very handy for debugging.) Anyway, I agree that your version of the code should be used to form the reference implementation, since the purpose of the reference implementation is to show the complete required semantics. 
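Armin's point about getattr() versus an MRO scan can be demonstrated directly. The following is a hypothetical standalone sketch (written with current Python 3 class syntax; `lookup_special`, `Meta`, and `C` are invented names):

```python
def lookup_special(obj, name):
    # the lookup Armin describes: scan the type's MRO via __dict__,
    # which is effectively what tp_xxx slots do in C
    for basecls in type(obj).__mro__:
        if name in basecls.__dict__:
            return basecls.__dict__[name]
    return None

class Meta(type):
    def __conform__(cls, protocol):
        # a __conform__ that belongs to *classes*, not to their instances
        return cls

class C(metaclass=Meta):
    pass

obj = C()
# the draft reference implementation's lookup reaches Meta.__conform__
# through type(obj) == C -- the "metaconfusion" under discussion:
broken = getattr(type(obj), '__conform__', None)   # finds Meta's method!
fixed = lookup_special(obj, '__conform__')         # correctly None
```

The getattr() form wrongly reports that instances of C have a __conform__ (it found the metaclass's), while the MRO scan only finds it when asked about C itself.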
From aleax at aleax.it Tue Jan 11 17:52:57 2005 From: aleax at aleax.it (Alex Martelli) Date: Tue Jan 11 17:53:05 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5.1.1.6.0.20050111102823.03a8f8c0@mail.telecommunity.com> References: <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com> <5.1.1.6.0.20050111102823.03a8f8c0@mail.telecommunity.com> Message-ID: <3F03D3E4-63F1-11D9-ADA4-000A95EFAE9E@aleax.it> On 2005 Jan 11, at 16:34, Phillip J. Eby wrote: ... > Anyway, I agree that your version of the code should be used to form > the reference implementation, since the purpose of the reference > implementation is to show the complete required semantics. Great, one point at last on which we fully agree -- thanks Armin!-) I was waiting for BDFL feedback before editing the PEP again, but if none is forthcoming I guess at some point I'll go ahead and at least to the edits that are apparently not controversial, like this one. I'd like to have a summary of controversial points and short pro and con args, too, but I'm not unbiased enough to write it all by myself...;-) Alex From mcherm at mcherm.com Tue Jan 11 18:27:26 2005 From: mcherm at mcherm.com (Michael Chermside) Date: Tue Jan 11 18:27:27 2005 Subject: [Python-Dev] PEP 246, redux Message-ID: <1105464446.41e40c7e27114@mcherm.com> Phillip: I think you must inhabit a far more perfect world than I do. You say, for instance, that: > ...-1 if this introduces a performance penalty [...] just to > support people who want to create deliberate Liskov violations. > I personally don't think that we should pander to Liskov > violators ... but in my world, people violate Liskov all the time, even in languages that attempt (unsuccessfully) to enforce it. 
[1] You say that: > I think one should adapt primarily to interfaces, and > interface-to-interface adaptation should be reserved for > non-lossy, non-noisy adapters. ... but in my world, half the time I'm using adaptation to correct for the fact that someone else's poorly-written code requests some class where it should have just used an interface. You seem to inhabit a world in which transitivity of adaptation can be enforced. But in my world, people occasionally misuse adaptation because they think they know what they're doing or because they're in a big hurry and it's the most convenient tool at hand. I wish I lived in your world, but I don't. -- Michael Chermside [1] - Except for Eiffel. Eiffel seems to do a pretty good job of enforcing it. From stephan.stapel at web.de Tue Jan 11 18:26:56 2005 From: stephan.stapel at web.de (Stephan Stapel) Date: Tue Jan 11 18:28:06 2005 Subject: [Python-Dev] logging class submission Message-ID: <41E40C60.1090707@web.de> Dear people on the dev list! I hope that this is the right environment to post my submission request (I'm new to the scene). I have modified the RotatingFileHandler of the logging module to create a daily rolling file handler. As it works quite well, I would like to suggest inclusion into the standard logging module of Python. I know that the code is quite trivial but the class solves the problem of the RotatingFileHandler that you don't know where to find a certain log entry. By using dates within the log file name, one can exactly determine which log file to observe when searching for specific errors. I hope you like the code and/or point to improvements on it and finally move it into the logging module. cheers, Stephan Here comes the code:

# Copyright 2004-2005 by Stephan Stapel . All Rights Reserved.
#
# Permission to use, copy, modify, and distribute this software and its
# documentation for any purpose and without fee is hereby granted,
# provided that the above copyright notice appear in all copies and that
# both that copyright notice and this permission notice appear in
# supporting documentation, and that the name of Stephan Stapel
# not be used in advertising or publicity pertaining to distribution
# of the software without specific, written prior permission.
#
# STEPHAN STAPEL DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING
# ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL
# STEPHAN STAPEL BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR
# ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
# IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT
# OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

import logging
from datetime import date
import string

class DailyRollingFileHandler(logging.FileHandler):
    """
    The class is based on the standard RotatingFileHandler class from the
    official logging module. It rolls over each day, thus one log file per
    day is created in the form myapp-2005-01-07.log, myapp-2005-01-08.log etc.
    """
    def __init__(self, filename, mode="a"):
        """
        Open the specified file and use it as the stream for logging.
        Rollover occurs whenever the day changes. The names of the log files
        each contain the date when they were created. Thus if the filename
        "myapp.log" was used, the log files each look like
        myapp-2005-01-07.log etc. The date is inserted at the position of
        the last '.' in the filename if any or simply appended to the given
        name if no dot was present.
        """
        self.currentDay = date.today()

        # create the logfile name parts (base part and extension)
        nameparts = string.split(string.strip(filename), ".")
        self.filestub = ""
        self.fileext = ""

        # remove empty items
        while nameparts.count("") > 0:
            nameparts.remove("")

        if len(nameparts) < 2:
            self.filestub = nameparts[0]
        else:
            # construct the filename
            for part in nameparts[0:-2]:
                self.filestub += part + "."
            self.filestub += nameparts[-2]
            self.fileext = "." + nameparts[-1]

        logging.FileHandler.__init__(self, self.getFilename(), mode)

    def getFilename(self):
        return self.filestub + "-" + self.currentDay.isoformat() + self.fileext

    def doRollover(self):
        """
        Do a rollover, as described in __init__().
        """
        self.stream.close()
        self.currentDay = date.today()
        self.baseFilename = self.getFilename()
        self.stream = open(self.baseFilename, "w")

    def emit(self, record):
        """
        Emit a record.

        Output the record to the file, catering for rollover as described
        in doRollover().
        """
        msg = "%s\n" % self.format(record)
        self.stream.seek(0, 2)  # due to non-posix-compliant Windows feature
        if date.today() != self.currentDay:
            self.doRollover()
        logging.FileHandler.emit(self, record)

From FBatista at uniFON.com.ar Tue Jan 11 18:38:00 2005 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Tue Jan 11 18:40:40 2005 Subject: [Python-Dev] logging class submission Message-ID: [Stephan Stapel]
#- # Copyright 2004-2005 by Stephan Stapel . All
#- Rights Reserved.
#- #
#- # Permission to use, copy, modify, and distribute this
#- software and its
#- # documentation for any purpose and without fee is hereby granted,
#- # provided that the above copyright notice appear in all
#- copies and that
#- # both that copyright notice and this permission notice appear in
#- # supporting documentation, and that the name of Stephan Stapel
#- # not be used in advertising or publicity pertaining to distribution
#- # of the software without specific, written prior permission.

There's a license issue here? .
Facundo

Bitácora De Vuelo: http://www.taniquetil.com.ar/plog
PyAr - Python Argentina: http://pyar.decode.com.ar/

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

WARNING. The information contained in this message and in any file
attached to it is for the exclusive use of the intended recipient and may
contain confidential or proprietary information, the disclosure of which
is punishable by law. If you are not one of the intended recipients, or
the person responsible for delivering this message to the intended
recipients, you are not authorized to disclose, copy, distribute or
retain the information (or any part of it) contained in this message.
Please notify us by replying to the sender, delete the original message
and destroy any copies (printed or recorded on any magnetic medium) that
you may have made of it. All opinions contained in this mail are those of
the author of the message and do not necessarily coincide with those of
Telefónica Comunicaciones Personales S.A. or any associated company.
Electronic messages can be altered, for which reason Telefónica
Comunicaciones Personales S.A. will not accept any liability whatsoever
arising from this message. Thank you very much.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20050111/10a47150/attachment.html

From aleax at aleax.it Tue Jan 11 18:43:48 2005
From: aleax at aleax.it (Alex Martelli)
Date: Tue Jan 11 18:43:56 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <1105464446.41e40c7e27114@mcherm.com>
References: <1105464446.41e40c7e27114@mcherm.com>
Message-ID: <59332A98-63F8-11D9-ADA4-000A95EFAE9E@aleax.it>

On 2005 Jan 11, at 18:27, Michael Chermside wrote:
...
> ...
but in my world, people violate Liskov all the time, even
> in languages that attempt (unsuccessfully) to enforce it. [1]
...
> [1] - Except for Eiffel. Eiffel seems to do a pretty good job
> of enforcing it.

...has Eiffel stopped its heroic efforts to support covariance...? It's
been years since I last looked seriously into Eiffel (it was one of the
languages we considered as a successor to Fortran and C as main
application language, at my previous employer), but at that time that was
one of the main differences between Eiffel (then commercial-only) and its
imitator (freeware) Sather: Sather succumbed to mathematical type-theory
and enforced contravariance, Eiffel still tried to pander to how the human
mind works by allowing covariance (which implies a Liskov violation and is
probably the main serious reason for it) and striving horrendously to
shoehorn it in. So what's the score now...?

Alex

From pje at telecommunity.com Tue Jan 11 18:54:36 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue Jan 11 18:53:34 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To:
References: <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>
 <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
 <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com>
 <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050111103431.03a8fc60@mail.telecommunity.com>

At 10:34 AM 1/11/05 +0100, Alex Martelli wrote:
>The volume of these discussions is (as expected) growing beyond any
>reasonable bounds; I hope the BDFL can find time to read them but I'm
>starting to doubt he will. Since obviously we're not going to convince
>each other, and it seems to me we're at least getting close to pinpointing
>our differences, maybe we should try to jointly develop an "executive
>summary" of our differences and briefly stated pros and cons -- a PEP is
>_supposed_ to have such a section, after all.
Yes, hopefully we will have sufficient convergence to do that soon. For
example, I'm going to stop arguing against the use case for Liskov
violation, and try looking at alternative implementations. If those don't
work out, I'll stop objecting to that item altogether.

>the effects of private inheritance could be simulated by delegation to a
>private auxiliary class, but the extra indirections and complications
>aren't negligible costs in terms of code complexity and maintainability.

Ah. Well, in PEAK, delegation of methods or even read-only attributes is
trivial:

    class SomeObj(object):
        meth1 = meth2 = meth3 = binding.Delegate('_delegatee')
        _delegatee = binding.Make(OtherClass)

This class will create a private instance of OtherClass for a given
SomeObj instance the first time meth1, meth2, or meth3 are retrieved from
that instance.

I bring this up not to say that people should use PEAK for this, just
explaining why my perspective was biased; I'm so used to doing this that I
tend to forget it's nontrivial if you don't already have these sorts of
descriptors available.

>Maybe the ability to ``fake'' __class__ can help, but right now I don't
>see how, because setting __class__ isn't fake at all -- it really affects
>object behavior and type:
>
>...
>
>So, it doesn't seem to offer a way to fake out isinstance only, without
>otherwise affecting behavior.

Python 2.3.4 (#53, May 25 2004, 21:17:02) [MSC v.1200 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> class Phony(object):
        def getClass(self):
            return Dummy
        __class__ = property(getClass)

>>> class Dummy: pass

>>> Phony().__class__
<class __main__.Dummy at ...>
>>> isinstance(Phony(),Dummy)
True

Unfortunately, this still doesn't really help, because isinstance() seems
to apply to a union of __class__ and type:

>>> isinstance(Phony(),Phony)
True

So, lying about __class__ doesn't fix the issue because you're still
considered isinstance, unless adapt() just uses __class__ and doesn't use
isinstance().
>I can give no example at all in which adapting to a concrete class is a >_good_ idea, and I tried to indicate that in the PEP. I just believe that >if adaptation does not offer the possibility of using concrete classes as >protocols, but rather requires the usage as protocols of some specially >blessed 'interface' objects or whatever, then PEP 246 will never fly, (a) >because it would then require waiting for the interface thingies to >appear, and (b) because people will find it pragmatically useful to just >reuse the same classes as protocols too, and too limiting to have to >design protocols specifically instead. Okay, I strongly disagree on this point, because there are people using zope.interface and PyProtocols today, and they are *not* using concrete classes. If PEP 246 were to go into Python 2.5 without interface types, all that would change is that Zope and PyProtocols would check to see if there is an adapt() in builtins and, if not, install their own version. PEP 246 would certainly be more useful *with* some kind of interface type, but Guido has strongly implied that PEP 246 won't be going in *without* some kind of interface type, so it seems to me academic to say that PEP 246 needs adaptation to concrete types based on isinstance(). In fact, maybe we should drop isinstance() from PEP 246 altogether, and only use __conform__ and __adapt__ to implement adaptation. Thus, to say that you conform to a concrete type, you have to implement __conform__. If this is done, then an abstract base used as an interface can have a __conform__ that answers 'self' for the abstract base used as a protocol, and a Liskov-violating subclass can return 'None' for the abstract base. Inheritance of __conform__ will do the rest. This approach allows concrete classes and Liskov violations, but simplifies adapt() since it drops the need for isinstance and for the Liskov exception. Further, we could have a default object.__conform__ that does the isinstance check. 
Then, a Liskov-violating subclass just overrides that __conform__ to block
the inheritance it wants to block. This approach can't work with a
separately-distributed PEP 246 implementation, but it should work quite
well for a built-in implementation and it's backward compatible with the
semantics expected by "old" PEP 246 implementations. It means that all
objects will have a tp_conform slot that will have to be called, but in
most cases it's just going to be a roundabout way of calling isinstance.

>For hash, and all kinds of other built-in functions and operations, it
>*does not matter* whether instance h has its own per-instance __hash__ --
>H.__hash__ is what gets called anyway. Making adapt work differently
>gives me the shivers.

It's only different because of metaclasses and the absence of
tp_conform/tp_adapt issues (assuming the function and module use cases are
taken care of by having their tp_conform slots invoke
self.__dict__['__conform__'] first). Anyway, if you adapt a *class* that
defines __conform__, you really want to be invoking the *metaclass*
__conform__. See Armin Rigo's post re: "metaconfusion" as he calls it.

>>The PEP just said that it would be raised by __conform__ or __adapt__,
>>not that it would be caught by adapt() or that it would be used to
>>control the behavior in that way. Re-reading, I see that you do mention
>>it much farther down. But at the point where __conform__ and __adapt__
>>are explained, it has not been explained that adapt() should catch the
>>error or do anything special with it. It is simply implied by the "to
>>prevent this default behavior" at the end of the section.
>>If this approach is accepted, the description should be made explicit,
>>because for me at least it required a retroactive re-interpretation of
>>the earlier part of the spec.
>
>OK, I'll add more repetition to the specs, trying to make it more
>"sequentially readable", even though they were already criticized because
>they do repeat some aspects more than once.

It might not be necessary if we agree that the isinstance check should be
moved to an object.__conform__ method, and there is no longer a need for a
LiskovViolation error to exist.

>Basically, we both agree that adaptation must accept some complication to
>deal with practical real-world issues that are gonna stay around, we just
>disagree on what those issues are. You appear to think old-style classes
>will stay around and need to be supported by new core Python
>functionality, while I think they can be pensioned off;

Currently, exceptions must be classic classes. Do you want to disallow
adaptation of exceptions? Are you proposing that ClassType.tp_conform not
invoke self.__conform__? I don't see any benefit to omitting that
functionality.

> you appear to think that programmers' minds will miraculously shift into
> a mode where they don't need covariance or other Liskov violations, and
> programmers will happily extract the protocol-ish aspects of their
> classes into neat pristine protocol objects rather than trying to
> double-use the classes as protocols too, while I think human nature won't
> budge much on this respect in the near future.

Well, they're doing it now with Zope and PyProtocols, so it didn't seem
like such a big assumption to me. :)

>Having, I hope, amply clarified the roots of our disagreements, so we can
>wait for BDFL input before the needed PEP 246 rewrites. If his opinions
>are much closer to yours than to mine, then perhaps the best next step
>would be to add you as the first author of the PEP and let you perform the
>next rewrite -- would you be OK with that?

Sure, although I think that if you're willing to not *object* to classic
class support, and if we reach agreement on the other issues, it might not
be necessary.
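For concreteness, the __conform__-only scheme discussed in this message
might be sketched roughly as follows. All names here are invented for
illustration; this is not PEP 246's reference implementation, and a real
version would hang the default __conform__ on 'object' itself (and use
tp_conform slots) rather than on a stand-in base class:

```python
class Base(object):
    """Stand-in for the proposed default object.__conform__."""
    def __conform__(self, protocol):
        # default behavior: conform to any type we're an instance of
        if isinstance(protocol, type) and isinstance(self, protocol):
            return self
        return None

def adapt(obj, protocol, default=None):
    # adapt() itself no longer needs an isinstance check or a
    # LiskovViolation exception; it only consults the two hooks.
    conform = getattr(obj, '__conform__', None)
    if conform is not None:
        result = conform(protocol)
        if result is not None:
            return result
    adapt_hook = getattr(protocol, '__adapt__', None)
    if adapt_hook is not None:
        result = adapt_hook(obj)
        if result is not None:
            return result
    if default is not None:
        return default
    raise TypeError("can't adapt %r to %r" % (obj, protocol))

class Abstract(Base):
    pass

class Violator(Abstract):
    """A Liskov-violating subclass blocks the inherited conformance
    simply by answering None for the abstract base."""
    def __conform__(self, protocol):
        if protocol is Abstract:
            return None
        return Base.__conform__(self, protocol)
```

With this, adapt(Abstract(), Abstract) returns the instance itself, while
adapt(Violator(), Abstract, default="refused") returns "refused" instead
of treating the subclass as conforming; adapt(Violator(), Violator) still
succeeds via the inherited default.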
>>>I didn't know about the "let the object lie" quirk in isinstance. If >>>that quirk is indeed an intended design feature, >> >>It is; it's in one of the "what's new" feature highlights for either 2.3 >>or 2.4, I forget which. It was intended to allow proxy objects (like >>security proxies in Zope 3) to pretend to be an instance of the class >>they are proxying. > >I just grepped through whatsnew23.tex and whatsnew24.tex and could not >find it. Can you please help me find the exact spot? Thanks! Googling "isinstance __class__" returns this as the first hit: http://mail.python.org/pipermail/python-bugs-list/2003-February/016098.html Adding "2.3 new" to the query returns this: http://www.python.org/2.3/highlights.html which is the "highlights" document I alluded to. >What _have_ you seen called "casting" in Python? Er, I haven't seen anything called casting in Python, which is why I was confused. :) >>>Maybe we're using different definitions of "casting"? >> >>I'm most accustomed to the C and Java definitions of casting, so that's >>probably why I can't see how it relates at all. :) > >Well, in C++ you can call (int)x or int(x) with the same semantics -- >they're both casts. In C or Java you must use the former syntax, in >Python the latter, but they still relate. Okay, but if you get your definition of "cast" from C and Java then what C++ and Python do are *conversion*, not casting, and what PEP 246 does *is* "casting". That's why I think there should be no mention of "casting" in the PEP unless you explicitly mention what language you're talking about -- and Python shouldn't be a candidate language. I've been trying to Google references to type casting in Python, and have so far mainly found arguments that Python does not have casting, and one that further asserts that even in C++, "conversion by constructor is not considered a cast." 
Also, "cast" is of relatively recent vintage in Python documentation; outside of the C API and optional static typing discussions, it seems to have made its debut in a presentation about Python 2.2's changing 'int' and 'str' to type objects. So, IMO the term has too many uses to add any clarification; it confused me because I thought that in C++ the things you're talking about were called "conversions", not casts. >You could have specified some options (such as the mode) but they took >their default value instead ('r' in this case). What's ``lossy'' about >accepting defaults?! Because it means you're making stuff up and tacking it onto the object, not "adapting" the object. As discussed later, this would probably be better called "noisy" adaptation than "lossy". >The adjective "lossy" is overwhelmingly often used in describing >compression, and in that context it means, can every bit of the original >be recovered (then the compression is lossless) or not (then it's >lossy). I can't easily find "lossy" used elsewhere than in compression, >it's not even in American Heritage. Still, when you describe a >transformation such as 12.3 -> 12 as "lossy", the analogy is quite clear >to me. When you so describe the transformation 'foo.txt' -> >file('foo.txt'), you've lost me completely: every bit of the original IS >still there, as the .name attribute of the file object, so by no stretch >of the imagination can I see the "lossiness" -- what bits of information >are LOST? Right, "noisy" is a better word for this; let's move on. >>it for all kinds of crazy things because it seems cool. However, it >>takes a while to see that adaptation is just about removing unnecessary >>accidents-of-incompatibility; it's not a license to transform arbitrary >>things into arbitrary things. There has to be some *meaning* to a >>particular adaptation, or the whole concept rapidly degenerates into an >>undifferentiated mess. > >We agree, philosophically. 
Not sure how the PEP could be enriched to get
>this across.

A few examples of "good" vs. "bad" adaptation might suffice, if each is
accompanied by a brief justification for its classification. The
filename/file thing is a good one, int/float or decimal/float is good too.
We should present "bad" first, then show how to fix the example to
accomplish the intent in a good way. (Like filename->file factory +
file->file factory, explicit type conversion for precision-losing
conversion, etc.)

>>(Or else, you decide to "fix" it by disallowing transitive adaptation,
>>which IMO is like cutting off your hand because it hurts when you punch a
>>brick wall. Stop punching brick walls (i.e. using semantic-lossy
>>adaptations), and the problem goes away. But I realize that I'm in the
>>minority here with regards to this opinion.)
>
>I'm not so sure about your being in the minority, having never read for
>example Guido's opinion in the matter.

I don't know if he has one; I mean that Jim Fulton, Glyph Lefkowitz, and
yourself have been outspoken about the "potential danger" of transitive
adaptation, apparently based on experience with other systems. (Which
seems to me a lot like the "potential danger" of whitespace that people
speak of based on bad experiences with Make or Fortran.) There have been
comparatively few people who have been outspoken about the virtues of
transitive adaptation, perhaps because for those who use it, it seems
quite natural. (I have seen one blog post by someone that was like, "What
do you mean those other systems aren't transitive? I thought that was the
whole point of adaptation. How else would you do it?")

>But, let's take an example of Facade. (Here's the 'later' I kept pointing
>to;-).
>
>I have three data types / protocols: LotsOfInfo has a bazillion data
>fields, including personFirstName, personMiddleName, personLastName, ...
>PersonName has just two data fields, theFirstName and theLastName.
>FullName has three, itsFirst, itsMiddle, itsLast.
>
>The adaptation between such types/protocols has meaning: drop/ignore
>redundant fields, rename relevant fields, make up missing ones by some
>convention (empty strings if they have to be strings, None to mean "I
>dunno" like SQL NULL, etc). But, this *IS* lossy in some cases, in the
>normal sense: through the facade (simplified interface) I can't access ALL
>of the bits in the original (information-richer).
>
>Adapting LotsOfInfo -> PersonName is fine; so is LotsOfInfo -> FullName.
>
>Adapting PersonName -> FullName is iffy, because I don't have the deuced
>middlename information. But that's what NULL aka None is for, so if
>that's allowed, I can survive.
>
>But going from LotsOfInfo to FullName transitively, by way of PersonName,
>cannot give the same result as going directly -- the middle name info
>disappears, because there HAS been a "lossy" step.

Certainly it is preferable to go direct if it's possible, which is why
PyProtocols always converges to the "shortest adapter path". However, if
you did *not* have a direct adaptation available from LotsOfInfo to
FullName, would it not be *preferable* to have some adaptation rather than
none?

The second point is that conversion from PersonName->FullName is only
correct if FullName allows "I don't know" as a valid answer for the middle
name. If that's *not* the case, then such a conversion is "noisy" because
it is pretending to know the middle name, when that isn't possible.

>So the issue of "lossy" DOES matter, and I think you muddy things up when
>you try to apply it to a string -> file adaptation ``by casting'' (opening
>the file thus named).

Right; as I keep saying, that isn't adaptation, it's conversion. The
closest adaptation you can get for the intent is to adapt a string to a
file *factory*, that can then be used to open a file.
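That last distinction could be sketched like so. The FileFactory class is
invented purely for illustration -- it comes from no real library, PEP 246
implementation, or PyProtocols interface:

```python
class FileFactory(object):
    """A filename viewed 'as a' producer of open files.
    (Hypothetical illustration; not from any real adaptation library.)"""

    def __init__(self, name):
        self.name = name

    def open(self, mode="r"):
        # The caller, not the adapter, picks the mode and the moment of
        # opening -- so the adaptation makes nothing up ("noisy") and
        # throws nothing away ("lossy").
        return open(self.name, mode)

# By contrast, converting the name directly:
#     f = open(filename)    # silently assumes mode 'r'
# decides options on the caller's behalf -- that is the "noisy" step.
```

A consumer that receives a FileFactory still holds every bit of the
original filename and all the choices about how to use it; nothing has
been decided on its behalf.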
>Forbidding lossy adaptation means forbidding facade here; not being >allowed to get adaptation from a rich source of information when what's >needed is a subset of that info with some renaming and perhaps mixing. No, it means it's a bad idea to have implicit conversions that result in unintended data loss or "making up" things to fill out data the original data doesn't have. You should explicitly state that you mean to get rid of things, or what things you want to make up. By the way, the analogy you're drawing between loss of floating point precision and dropping fields from information about a person isn't valid for the definition of "lossy" I'm struggling to clarify. A floating point number is an atomic value, but facts about a person are not made atomic simply by storing them in the same object. So, separating those facts or using only some of them does not lose any relevant semantics. >Forbidding indications of "I don't know" comparable to SQL's NULL (thus >forbidding the adaptation PersonName -> FullName) might make the whole >scheme incompatible with the common use of relational databases and the >like -- probably not acceptable, either. If your target protocol allows for "I don't know", then consumers of that protocol must be willing to accept "I don't know" for an answer, in which case everything is fine. It's *faking* when you don't know, and the target protocol does *not* allow for not knowing, that is a problem. ("Noisy" adaptation.) >Allowing both lossy adaptations, NULLs, _and_ transitivity inevitably >leads sooner or later to ACCIDENTAL info loss -- the proper adapter to go >directly LotsOfInfo -> FullName was not registered, and instead of getting >an exception to point out that error, your program limps along having >accidentally dropped a piece of information, here the middle-name. But in this case you have explicitly designed a protocol that does not guarantee that you get all the required information! 
If the information is in fact required, why did you allow it to be null? This makes no sense to me. >OK, but then 12.3 -> 12 should be OK, since the loss of the fractionary >part IS part of the difference in interfaces, right? And yet it doesn't >SMELL like adaptation to me -- which is why I tried to push the issue away >with the specific disclaimer about numbers. The semantics of 12.3 are atomic. Let us say it represents some real-world measurement, 12.3 inches perhaps. In the real world, are those .3 inches somehow separable from the 12? That makes no sense. >>IOW, adaptation is all about "as a" relationships from concrete objects >>to abstract roles, and between abstract roles. Although one may >>colloquially speak of using a screwdriver "as a" hammer, this is not the >>case in adaptation. One may use a screwdriver "as a" >>pounder-of-nails. The difference is that a hammer might also be usable >>"as a" remover-of-nails. Therefore, there is no general "as a" >>relationship between pounder-of-nails and remover-of-nails, even though a >>hammer is usable "as" either one. Thus, it does not make sense to say >>that a screwdriver is usable "as a" hammer, because this would imply it's >>also usable to remove nails. > >I like the "as a" -- but it can't ignore Facade, I think. I don't think it's a problem, because 1) your example at least represents facts with relatively independent semantics: you *can* separate a first name from a last name, even though they belong to the same person. And 2) if a target protocol has optional aspects, then lossy adaptation to it is okay by definition. Conversely, if the aspect is *not* optional, then lossy adaptation to it is not acceptable. I don't think there can really be a middle ground; you have to decide whether the information is required or not. 
If you have a protocol whose semantics cannot provide the required target semantics, then you should explicitly perform the loss or addition of information, rather than doing so implicitly via adaptation. >>interface-to-interface adaptation should be reserved for non-lossy, >>non-noisy adapters. > >No Facade, no NULLs? Yes, we disagree about this one: I believe >adaptation that occurs by showing just a subset of the info, with renaming >etc, is absolutely fine (Facade); and adaptation by using an allowed NULL >(say None) to mean "missing information", when going to a "wider" >interface, is not pleasant but is sometimes indispensable in the real >world -- that's why SQL works in the real world, even though SQL beginners >and a few purists hate NULLs with a vengeance. If you allow for nulls, that's fine -- just be prepared to get them. Real-world databases also have NOT NULL columns for this reason. :) >The points are rather that adaptation that "loses" (actually "hides") some >information is something we MUST have; Agreed. > and adaptation that supplies "I don't know" markers (NULL-like) for some > missing information, where that's allowed, is really very desirable. Also agreed, emphasizing "where that's allowed". The point is, if it's allowed, it's not a problem, is it? > Call this lossy and noisy if you wish, we still can't do without. No; it's noisy only if the target requires a value and the source has no reasonable way to supply it, requiring you to make something up. And leaving out independent semantics (like first name vs. last name) isn't lossy IMO. >Transitivity is a nice convenience, IF it could be something that an >adapter EXPLICITLY claims rather than something just happening by >default. 
I might live with it, grudgingly, if it was the default with >some nice easy way to turn it off; my problem with that is -- even if 90% >of the cases could afford to be transitive, people will routinely forget >to mark the other 10% and mysterious, hard-to-find bugs will result. Actually, in the cases where I have mistakenly defined a lossy or noisy adaptation, my experience has been that it blows up very rapidly and obviously, often because PyProtocols will detect an adapter ambiguity (two adaptation paths of equal length), and it detects this at adapter registration time, not adaptation time. However, the more *common* source of a transitivity problem in my experience is in *interface inheritance*, not oddball adapters. As I mentioned previously, a common error is to derive an interface from an interface you require, rather than one you intend your new interface to provide. In the presence of inheritance transitivity (which I have not heard you argue against), this means that you may provide something you don't intend, and therefore allow your interface to be used for something that you didn't intend to guarantee. Anyway, this problem manifests when you try to adapt something to the base interface, and it works when it really shouldn't. It's more difficult to track down than it ought to be, because looking at the base interface won't tell you anything, and the derived interface might be buried deep in a base class of the concrete object. But there's no way to positively prevent this class of bugs without prohibiting interface inheritance, which is the most common source of adaptation transitivity bugs in my experience. >In PyProtocols docs you specifically warn against adapting from an >adapter... yet that's what transitivity intrinsically does! I warn against *not keeping an original object*, because the original object may be adaptable to things that an adapter is *not*. 
This is because we don't have an 'IUnknown' to recover the original object, not because of transitivity. >>In that case, I generally prefer to be explicit and use conversion rather >>than using adaptation. For example, if I really mean to truncate the >>fractional part of a number, I believe it's then appropriate to use >>'int(someNumber)' and make it clear that I'm intentionally using a lossy >>conversion rather than simply treating a number "as an" integer without >>changing its meaning. > >That's how it feels to me FOR NUMBERS, but I can't generalize the feeling >to the general case of facade between "records" with many fields of >information, see above. Then perhaps we have made some progress; "records" are typically a collection of facts with independent semantics, while a number is an atomic value. Facts taken in isolation do not alter their semantics, but dropping precision from a value does. So, to summarize my thoughts from this post: * Replacing LiskovViolation is possible by dropping type/isinstance checks from adapt(), and adding an isinstance check to object.__conform__; Liskov violators then override __conform__ in their class to return None when asked to conform to a protocol they wish to reject, and return super().__conform__ for all other cases. This achieves your use case while simplifying both the implementation and the usage. * Classic class support is a must; exceptions are still required to be classic, and even if they weren't in 2.5, backward compatibility should be provided for at least one release. * Lossy/noisy refer to removing or adding dependent semantics, not independent semantics, so facade-ish adaptation is not lossy or noisy. * If a target protocol permits NULL, then adaptation that supplies NULL is not noisy or lossy. If it is NOT NULL, then adaptation that supplies NULL is just plain wrong. Either way, there is no issue with transitivity, because either it's allowed or it isn't. 
(If NULLs aren't allowed, then you should be explicit when you make things
up, and not do it implicitly via adaptation.)

* In my experience, incorrectly deriving an interface from another is the
most common source of unintended adaptation side-effects, not adapter
composition.

From FBatista at uniFON.com.ar Tue Jan 11 18:58:32 2005
From: FBatista at uniFON.com.ar (Batista, Facundo)
Date: Tue Jan 11 19:00:59 2005
Subject: [Python-Dev] logging class submission
Message-ID:

[Stephan Stapel]
#- > There's a license issue here?
#-
#- I was given the advice to use this license. If this license
#- prohibits inclusion into Python, how should I re-license the code?

I was just asking. Who gave you the advice?

.

Facundo

Bitácora De Vuelo: http://www.taniquetil.com.ar/plog
PyAr - Python Argentina: http://pyar.decode.com.ar/
ninguna obligaci?n cualquiera sea el resultante de este mensaje. Muchas Gracias. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20050111/bbc5b682/attachment.htm From pje at telecommunity.com Tue Jan 11 19:03:18 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Jan 11 19:02:15 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <6EA3D69F-63B7-11D9-ADA4-000A95EFAE9E@aleax.it> References: <5.1.1.6.0.20050110132407.0344a9d0@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110132407.0344a9d0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050111125528.028f69e0@mail.telecommunity.com> At 10:59 AM 1/11/05 +0100, Alex Martelli wrote: >all you have to do is ADD > def __conform__(self, protocol): > if issubclass(protocol, Abstract): > raise LiskovViolation > >that's all. That will raise a TypeError if protocol is not a class or type, so this could probably serve as an example of how difficult it is to write a good Liskov-violating __conform__. :) Actually, there's another problem with it; if you do this: class Y(X): pass class Z(Y): pass then 'adapt(Z(),Y)' will now fail because of a Liskov violation. It should really check for 'protocol is Abstract' or 'protocol in (Abstract,..)' in order to avoid this issue. >Basically, rephrasing private inheritance with containment and delegation >is a lot of messy work, and results in far more complicated >structures. And instead of paying the tiny price of a __conform__ call at >adaptation time, you pay the price of delegating calls over and over at >each x.tm1() call, so it's unlikely performance will improve. Well, as I mentioned in my other post, such inheritance is a lot simpler with PEAK, so I've probably forgotten how hard it is if you're not using PEAK. 
:) PEAK also caches the delegated methods in the instance's __dict__, so there's virtually no performance penalty after the first access. Again, not an argument that others should use PEAK, just an explanation as to why I missed this point; I've been using PEAK's delegation features for quite some time and so tend to think of delegation as something relatively trivial. From barry at python.org Tue Jan 11 19:09:02 2005 From: barry at python.org (Barry Warsaw) Date: Tue Jan 11 19:09:10 2005 Subject: [Python-Dev] logging class submission In-Reply-To: References: Message-ID: <1105466942.18590.23.camel@geddy.wooz.org> On Tue, 2005-01-11 at 12:58, Batista, Facundo wrote: > [Stephan Stapel] > > #- > There's a license issue here? > #- > #- I was given the advice to use this license. If this license > #- prohibits inclusion into Python, how should I re-license the code? > > I was just asking. Who gave you the advice? Here's a link to the PSF contribution form: http://www.python.org/psf/contrib.html This contains links to the recommended licenses for software that might be included in Python. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050111/3f9fdb4c/attachment.pgp From stephan.stapel at web.de Tue Jan 11 19:12:29 2005 From: stephan.stapel at web.de (Stephan Stapel) Date: Tue Jan 11 19:12:33 2005 Subject: [Python-Dev] logging class submission Message-ID: <1490506001@web.de> > I was just asking. Who gave you the advice? Someone in a German Python forum. I'll change the license ASAP. I'm just curious, but do I really have to use the contributor agreement etc.? I mean I'm just trying to submit a small class, no big framework.
cheers, Stephan From pje at telecommunity.com Tue Jan 11 19:20:14 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Jan 11 19:19:10 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <1105464446.41e40c7e27114@mcherm.com> Message-ID: <5.1.1.6.0.20050111131211.0293eec0@mail.telecommunity.com> At 09:27 AM 1/11/05 -0800, Michael Chermside wrote: >Phillip: > >I think you must inhabit a far more perfect world than I do. > >You say, for instance, that: > > ...-1 if this introduces a performance penalty [...] just to > > support people who want to create deliberate Liskov violations. > > I personally don't think that we should pander to Liskov > > violators I've since dropped both the performance objection and the objection to supporting Liskov violation; in a more recent post I've proposed an alternative algorithm for allowing it, that has a simpler implementation. >You say that: > > I think one should adapt primarily to interfaces, and > > interface-to-interface adaptation should be reserved for > > non-lossy, non-noisy adapters. > >... but in my world, half the time I'm using adaptation to >correct for the fact that someone else's poorly-written >code requests some class where it should have just used >an interface. PEP 246 adaptation? Or are you talking about some other language? (I ask out of curiosity.) I agree that if it's possible to adapt to concrete types, people will do so. However, I think we all agree that this isn't a great idea and should still be considered bad style. That's not the same thing as saying it should be forbidden, and I haven't said it should be forbidden. >You seem to inhabit a world in which transitivity of adaptation
>can be enforced. But in my world, people occasionally misuse >adaptation because they think they know what they're doing >or because they're in a big hurry and it's the most convenient >tool at hand. How is this different from abuse of *any* language feature that you're then forced to work around? Are you saying we should not provide a feature because *some* people will abuse the feature? I don't understand. If you allow interface inheritance, you're just as susceptible to an invalid adaptation path, and in my experience this is more likely to bite you unintentionally, mainly because interface inheritance works differently than class inheritance (which of course is used more often). Do you want to prohibit interface inheritance, too? From pje at telecommunity.com Tue Jan 11 19:32:53 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Jan 11 19:31:53 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: References: <5.1.1.6.0.20050110125415.02f28ec0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050111130333.028f5820@mail.telecommunity.com> At 11:59 AM 1/11/05 +0100, Alex Martelli wrote: >On 2005 Jan 11, at 11:01, Alex Martelli wrote: >>On 2005 Jan 10, at 18:59, Phillip J. Eby wrote: >>>At 12:43 PM 1/10/05 -0500, Phillip J. Eby wrote: >>>>As a practical matter, all of the existing interface systems (Zope, >>>>PyProtocols, and even the defunct Twisted implementation) treat >>>>interface inheritance as guaranteeing substitutability for the base >>>>interface, and do so transitively. >>> >>>An additional data point, by the way: the Eclipse Java IDE has an >>>adaptation system that works very much like PEP 246 does, and it appears >>>that in a future release they intend to support automatic adapter >>>transitivity, so as to avoid requiring each provider of an interface to >>>"provide O(n^2) adapters when writing the nth version of an >>>interface."
IOW, their current release is transitive only for interface >>>inheritance ala Zope or Twisted; their future release will be transitive >>>for adapter chains ala PyProtocols. >> >>This is definitely relevant prior art, so thanks for pointing it out. >>If interfaces change so often that 'n' can become worryingly high, this >>is a valid concern. In my world, though, published interfaces do NOT >>change as often as to require such remedies;-). FWIW, I believe that by "nth version" the original author wasn't referring to changed versions of the same interface, but was instead maybe trying to say that N interfaces that adapt to IFoo, when IFoo has M interfaces that it can be adapted to, means that N*M adapters are required in all, if adapter composition isn't possible. >"adapters" for a single underlying "object", is, I fear, stronger than we >can impose (so it may be that Eclipse is a better parallel, but I know >little of it while COM is in my bones, so that's what I keep thinking of;-). Fair enough. I think Eclipse's *implementation* maps fairly directly onto PEP 246, except that __conform__ is replaced by a 'getAdapter()' method, and an AdapterManager is used to look up adapters in place of both __adapt__ and the PEP 246 registry. So, it is much closer to PEP 246 than COM, in that in COM all adaptation is managed by the object, and it cannot be externally adapted. (At least, the last I looked at COM many years ago that was the case; maybe that has changed now?) >So, I see transitivity as a nice thing to have _IF_ it's something that >gets explicitly asserted for a certain adapter -- if the adapter has to >explicitly state to the system that it "isn't lossy" (maybe), or "isn't >noisy" (perhaps more useful), or something like that... some amount of >reassurance about the adapter that makes it fully safe to use in such a chain.
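[The N-versus-N*M arithmetic above can be made concrete with a toy registry. The code below is purely illustrative -- it is not Eclipse's, Zope's, or PyProtocols' actual API -- and the interface names and `adapt_transitively` helper are invented for this sketch: with composition, registering N adapters into a hub interface and M adapters out of it covers all N*M source/target pairs.]

```python
from collections import deque

registry = {}  # (source protocol, target protocol) -> adapter function


def register(source, target, func):
    registry[(source, target)] = func


def adapt_transitively(obj, source, target):
    """Breadth-first search over registered adapter edges, composing
    adapters along the way (shortest chain found first)."""
    seen = {source}
    queue = deque([(source, obj)])
    while queue:
        proto, value = queue.popleft()
        if proto is target:
            return value
        for (src, dst), func in list(registry.items()):
            if src is proto and dst not in seen:
                seen.add(dst)
                queue.append((dst, func(value)))
    raise TypeError("no adapter path from %r to %r" % (source, target))


class IFoo: pass
class IBar: pass
class IBaz: pass

register(IFoo, IBar, lambda x: ("bar", x))
register(IBar, IBaz, lambda x: ("baz", x))

# No direct IFoo -> IBaz adapter was registered, yet a chain is found:
assert adapt_transitively("x", IFoo, IBaz) == ("baz", ("bar", "x"))
```

Whether finding such chains automatically is safe is exactly the lossy/noisy question being argued in this thread; the sketch only shows why composition removes the need for the pairwise adapters.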
Maybe, although I think in our other thread we may be converging on definitions of lossy and noisy such that we can agree that it's not really a problem. (I hope.) >Maybe it might suffice to let an adapter which IS 'lossy' (or, more >likely, one that is 'noisy') state the fact. I don't see a valid use case for implementing such a thing as an automatically-invoked adapter. > I'm always reluctant by instinct to default to convenient but risky > behavior, trusting programmers to explicitly assert otherwise when > needed; but in many ways this kind of design is a part of Python and > works fine (_with_ the BDFL's fine nose/instinct for making the right > compromise between convenience and safety in each case, of course). Proposal: let adaptation implemented via __conform__ be nontransitive, and adaptation via __adapt__ or the adapter registry be transitive. This would mean that lossy or noisy adapters could be implemented as an implicitly-executed explicit conversion, but only directly on a particular concrete class and its subclasses, thereby further limiting the scope and impact of a lossy or noisy adapter. Also, think about this: technically, if you implement lossy or noisy adaptation in __conform__, it *isn't* lossy or noisy, because you have to do it in the class -- which means that as the class' author, you have decreed it to have such semantics. However, if you are a third party, you will have to explicitly invoke the lossy or noisy adapter. IOW, if you globally register an adapter (with either the interface or the global registry), you are guaranteeing that your adaptation is not lossy or noisy. Otherwise, you need to put it in __conform__ or use it explicitly. >I'm still pondering the "don't adapt an adapter" suggestion, which seems a >sound one, and yet also seems to be, intrinsically, what >transitivity-by-chaining does. Note that QI does not suffer from this, >because it lets you get the underlying object identity (IUnknown) from any >interface adapter.
Maybe, just maybe, we should also consider that -- a >distinguished protocol bereft of any real substance but acting as a flag >for "real unadapted object identity". Perhaps we could use 'object' for >that, at least if the flow of logic in 'adapt' stays as in the current PEP >246 draft (i.e., __conform__ is given a chance before isinstance triggers >-- so, all adapters could __conform__ to object by returning the >underlying object being adapted, while other objects without such a >feature in __conform__ would end up with 'adapt(x, object) is x'). Or, if >object in this role turns out to be confusing, IDentity (;-) or some other >specially designed protocol. It's a nice idea; the only problem I see is how far down it goes. Any adapter composition implies that adapters need to know whether to also call adapt(x,object) on their adaptee. From mcherm at mcherm.com Tue Jan 11 19:47:13 2005 From: mcherm at mcherm.com (Michael Chermside) Date: Tue Jan 11 19:47:15 2005 Subject: [Python-Dev] PEP 246, redux Message-ID: <1105469233.41e41f315092a@mcherm.com> I wrote: > >... but in my world, half the time I'm using adaptation to > >correct for the fact that someone else's poorly-written > >code requests some class where it should have just used > >an interface. Phillip replies: > PEP 246 adaptation? Or are you talking about some other language? > (I ask out of curiosity.) Well, it's partly just a rhetorical device here. I mean PEP 246 adaption, but (unlike you!) I'm not actually using it yet (aside from playing around to try things out), really I'm just guessing how I WOULD be using it if it were part of core python. > I agree that if it's possible to adapt to concrete types, people will do > so. However, I think we all agree that this isn't a great idea and should > still be considered bad style. I'd agree except for the case where I am trying to pass an object into code which is misbehaving. 
If we do add type declarations that trigger an adapt() call, then people WILL write poor code which declares concrete types, and I will find myself writing __conform__ methods to work around it. In this case, I'm the one making use of adaption (the original author was just expecting a TypeError), but what I'm doing isn't (IMO) bad style. > >You seem to inhabit a world in which transitivity of adaptation > >can be enforced. But in my world, people occasionally misuse > >adaptation because they think they know what they're doing > >or because they're in a big hurry and it's the most convenient > >tool at hand. > > How is this different from abuse of *any* language feature that you're > then forced to work around? Are you saying we should not provide a > feature because *some* people will abuse the feature? I don't > understand. If we're just recommending that people design for transitivity, then I don't have a problem (although see Alex's fairly good point illustrated with LotsOfInfo, PersonName, and FullName -- I found it convincing). But I was under the impression that the point of transitivity was to make it "required", then automatically walk chains of adaptions. Then I fear one case of mis-used adaption could "poison" my entire adaption mechanism. The N^2 explosion of pairwise-only adapters scares me less, because I think in most real situations N will be small. > If you allow interface inheritance, you're just as susceptible to an > invalid adaptation path, and in my experience this is more likely to > bite you unintentionally, mainly because interface inheritance works > differently than class inheritance (which of course is used more > often). Do you want to prohibit interface inheritance, too? Hmm. Sounds like you're making a point here that's important, but which I don't quite get. Can you elaborate? I certainly hadn't intended to prohibit interface inheritance... how exactly does it "bite" one?
-- Michael Chermside From cce at clarkevans.com Tue Jan 11 19:50:20 2005 From: cce at clarkevans.com (Clark C. Evans) Date: Tue Jan 11 19:50:22 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5.1.1.6.0.20050111103431.03a8fc60@mail.telecommunity.com> References: <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com> <5.1.1.6.0.20050111103431.03a8fc60@mail.telecommunity.com> Message-ID: <20050111185020.GA28966@prometheusresearch.com> On Tue, Jan 11, 2005 at 12:54:36PM -0500, Phillip J. Eby wrote: | * Replacing LiskovViolation is possible by dropping type/isinstance | checks from adapt(), and adding an isinstance check to | object.__conform__; Liskov violators then override __conform__ in their | class to return None when asked to conform to a protocol they wish to | reject, and return super().__conform__ for all other cases. This | achieves your use case while simplifying both the implementation and the | usage. I'd rather not assume that class inheritance implies substitutability, unless the class is "marked" as an interface (assuming that one doesn't have interfaces). I'd like it to be explicit -- a bit of a nudge to remind a developer to verify substitutability is a good thing. In this scenario, a LiskovViolation exception isn't needed (aside, I don't see the rationale for the exception: to prevent third party adapters?). Could we make a boilerplate __conform__ which enables class-based substitutability a well-known decorator? | * In my experience, incorrectly deriving an interface from another is the | most common source of unintended adaptation side-effects, not adapter | composition It'd be nice if interfaces had a way to specify a test-suite that could be run against a component which claims to be compliant.
For example, it could provide invalid inputs and assert that the proper errors are returned, etc. Best, Clark -- Clark C. Evans Prometheus Research, LLC. http://www.prometheusresearch.com/ o office: +1.203.777.2550 ~/ , mobile: +1.203.444.0557 // (( Prometheus Research: Transforming Data Into Knowledge \\ , \/ - Research Exchange Database /\ - Survey & Assessment Technologies ` \ - Software Tools for Researchers ~ * From DavidA at ActiveState.com Tue Jan 11 19:59:26 2005 From: DavidA at ActiveState.com (David Ascher) Date: Tue Jan 11 20:01:43 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: References: Message-ID: <41E4220E.3080507@ActiveState.com> Alex Martelli wrote: > > On 2005 Jan 10, at 23:15, Thomas Heller wrote: > >> Alex Martelli writes: >> >>> PEP: 246 >>> Title: Object Adaptation >> >> >> Minor nit (or not?): You could provide a pointer to the Liskov >> substitution principle, for those readers that aren't too familiar with >> that term. > > > Excellent idea, thanks. Terminology point: I know that LiskovViolation is technically correct, but I'd really prefer it if exception names (which are sometimes all users get to see) were more informative for people w/o deep technical background. Would that be possible? --david From mcherm at mcherm.com Tue Jan 11 20:26:26 2005 From: mcherm at mcherm.com (Michael Chermside) Date: Tue Jan 11 20:26:26 2005 Subject: [Python-Dev] PEP 246, redux Message-ID: <1105471586.41e42862b9a39@mcherm.com> David Ascher writes: > Terminology point: I know that LiskovViolation is technically correct, > but I'd really prefer it if exception names (which are sometimes all > users get to see) were more informative for people w/o deep technical > background. Would that be possible? I don't see how. Googling on Liskov immediately brings up clear and understandable descriptions of the principle that's being violated. I can't imagine summarizing the issue more concisely than that! What would you suggest?
Including better explanations in the documentation is a must, but "LiskovViolation" in the exception name seems unbeatably clear and concise. -- Michael Chermside From pje at telecommunity.com Tue Jan 11 20:44:29 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Jan 11 20:43:28 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <1105469233.41e41f315092a@mcherm.com> Message-ID: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> At 10:47 AM 1/11/05 -0800, Michael Chermside wrote: >I'd agree except for the case where I am trying to pass an object >into code which is misbehaving. If we do add type declarations that >trigger an adapt() call, then people WILL write poor code which >declares concrete types, and I will find myself writing __conform__ >methods to work around it. In this case, I'm the one making use of >adaption (the original author was just expecting a TypeError), but >what I'm doing isn't (IMO) bad style. Agreed. However, assuming that you're declaring a "clean" adaptation, perhaps it should be registered with the global registry rather than implemented in __conform__, which would be less work for you. >If we're just recommending that people design for transitivity, then I >don't have a problem (although see Alex's fairly good point illustrated >with LotsOfInfo, PersonName, and FullName -- I found it convincing). It's a bit misleading, however; if the target protocol allows for "nulls", then it's allowed to have nulls. If it doesn't allow nulls, then the adaptation is broken. Either way, it seems to me to work out, you just have to decide which way you want it. >But I was under the impression that the point of transitivity was to >make it "required", then automatically walk chains of adaptions. I don't have a problem with making some part of the adaptation process avoid transitivity, such as hand-implemented __conform__ methods. > Then >I fear one case of mis-used adaption could "poison" my entire adaption >mechanism.
The N^2 explosion of pairwise-only adapters scares me less, >because I think in most real situations N will be small. Well, Eclipse is a pretty good example of a large N, and I know that both Twisted and Zope developers have occasionally felt the need to do "double-dip" adaptation in order to work around the absence of transitive adapter composition in their adaptation systems. > > If you allow interface inheritance, you're just as susceptible to an > > invalid adaptation path, and in my experience this is more likely to > > bite you unintentionally, mainly because interface inheritance works > > differently than class inheritance (which of course is used more > > often). Do you want to prohibit interface inheritance, too? > >Hmm. Sounds like you're making a point here that's important, but which >I don't quite get. Can you elaborate? I certainly hadn't intended to >prohibit interface inheritance... how exactly does it "bite" one? If you derive an interface from another interface, this is supposed to mean that your derived interface promises to uphold all the promises of the base interface. That is, your derived interface is always usable where the base interface is required. However, oftentimes one mistakenly derives an interface from another while meaning that the base interface is *required* by the derived interface, which is similar in meaning but subtly different. Here, you mean to say, "IDerived has all of the requirements of IBase", but you have instead said, "You can use IDerived wherever IBase is desired". But now, suppose that you have class Foo, which has an adapter defined to IDerived, and which is looked up for you by IDerived.__adapt__ and IBase.__adapt__. Then, if you pass a Foo instance to a function that expects an *IBase*, then the function will end up with an IDerived. Sometimes this is not at all what you want, at which point I normally go back and copy the relevant methods from IBase to IDerived and remove the inheritance relationship. 
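[A toy model of the trap just described may help. The Interface/registry machinery below is invented for illustration -- it is not Zope's or PyProtocols' real API -- but it captures the lookup rule at issue: an adapter registered for a derived interface is accepted wherever the base interface was requested, because interface inheritance is taken to promise substitutability.]

```python
class Interface: pass
class IBase(Interface): pass
class IDerived(IBase): pass   # author meant "IDerived *requires* IBase"


class Foo: pass


class FooAsDerived:
    """Adapter: Foo -> IDerived (the only adapter anyone registered)."""
    def __init__(self, obj):
        self.obj = obj


registry = {(Foo, IDerived): FooAsDerived}


def lookup(obj, wanted):
    # Accept any adapter whose declared interface is `wanted` or a subclass
    # of it -- this substitutability assumption is what makes the wrongly
    # drawn inheritance arrow "bite".
    for (src, provided), factory in registry.items():
        if isinstance(obj, src) and issubclass(provided, wanted):
            return factory(obj)
    return None


# A caller asking only for IBase silently receives the IDerived adapter:
adapter = lookup(Foo(), IBase)
assert isinstance(adapter, FooAsDerived)
```

Had IDerived not inherited from IBase, `lookup(Foo(), IBase)` would have returned None, and the caller would get a clean failure instead of an adapter carrying stronger semantics than it asked for -- which is exactly why removing the inheritance relationship fixes the problem.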
This problem exists in Zope's adaptation system as well as in PyProtocols. I have found that I am far less likely to have an adaptation problem from defining a questionable adapter, than I am to have one from wrongly-used inheritance. I am now more careful about the inheritance, but it's difficult because intuitively an interface defines a *requirement*, so it seems logical to inherit from an interface in order to add requirements! Now, in the case where both an IBase and an IDerived adapter exist, Zope and PyProtocols prefer to use the IBase adapter when an IBase is requested. But this doesn't address the problem case, which is where there is no IBase-only adaptation. From pje at telecommunity.com Tue Jan 11 20:48:39 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Jan 11 20:47:37 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <20050111185020.GA28966@prometheusresearch.com> References: <5.1.1.6.0.20050111103431.03a8fc60@mail.telecommunity.com> <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com> <5.1.1.6.0.20050111103431.03a8fc60@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050111144531.0299a220@mail.telecommunity.com> At 01:50 PM 1/11/05 -0500, Clark C. Evans wrote: >On Tue, Jan 11, 2005 at 12:54:36PM -0500, Phillip J. Eby wrote: >| * Replacing LiskovViolation is possible by dropping type/isinstance >| checks from adapt(), and adding an isinstance check to >| object.__conform__; Liskov violators then override __conform__ in their >| class to return None when asked to conform to a protocol they wish to >| reject, and return super().__conform__ for all other cases. This >| achieves your use case while simplifying both the implementation and the >| usage. 
> >I'd rather not assume that class inheritance implies substitutability, Hm, you should take that up with Alex then, since that is what his current PEP 246 draft does. :) Actually, the earlier drafts did that too, so I'm not sure why you want to change this now. What I've actually suggested here actually allows for inheritance=substitutability as the default, but also makes it trivially changeable for any given inheritance hierarchy by overriding __conform__ at the base of that hierarchy, and without introducing a special exception class to do it. From aleax at aleax.it Tue Jan 11 21:10:00 2005 From: aleax at aleax.it (Alex Martelli) Date: Tue Jan 11 21:10:05 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5.1.1.6.0.20050111144531.0299a220@mail.telecommunity.com> References: <5.1.1.6.0.20050111103431.03a8fc60@mail.telecommunity.com> <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com> <5.1.1.6.0.20050111103431.03a8fc60@mail.telecommunity.com> <5.1.1.6.0.20050111144531.0299a220@mail.telecommunity.com> Message-ID: On 2005 Jan 11, at 20:48, Phillip J. Eby wrote: ... >> I'd rather not assume that class inheritance implies substitutability, > > Hm, you should take that up with Alex then, since that is what his > current PEP 246 draft does. :) Actually, the earlier drafts did that > too, so I'm not sure why you want to change this now. > > What I've actually suggested here actually allows for > inheritance=substitutability as the default, but also makes it > trivially changeable for any given inheritance hierarchy by overriding > __conform__ at the base of that hierarchy, and without introducing a > special exception class to do it. The base of the hierarchy has no idea of which subclasses follow or break Liskov substitutability. It's just silly to site the check there.
Moreover, having to change the base class is more invasive than being able to do it in the derived class: typically the author of the derived class is taking the base class from some library and does not want to change that library -- changing the derived class is not ideal, but still way better. Alex From aleax at aleax.it Tue Jan 11 21:23:02 2005 From: aleax at aleax.it (Alex Martelli) Date: Tue Jan 11 21:23:07 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> Message-ID: <9832132B-640E-11D9-ADA4-000A95EFAE9E@aleax.it> On 2005 Jan 11, at 20:44, Phillip J. Eby wrote: ... >> If we're just recommending that people design for transitivity, then I >> don't have a problem (although see Alex's fairly good point >> illustrated >> with LotsOfInfo, PersonName, and FullName -- I found it convincing). > > It's a bit misleading, however; if the target protocol allows for > "nulls", then it's allowed to have nulls. If it doesn't allow nulls, > then the adaptation is broken. Either way, it seems to me to work > out, you just have to decide which way you want it. NULLs are allowed, but *PRAGMATICALLY* they shouldn't be used except where there's no alternative. Is the concept of *PRAGMATICS* so deucedly HARD for all of you eggheads?! Maybe a full-credits course in linguistics should be mandatory for CS majors or wherever you got your sheepskin[s]. In terms of syntax and semantics, a TCP/IP stack which just dropped all packets instantly would be compliant with the standards. No GUARANTEE that any given packet will be delivered is ever written down anywhere, after all. The reason such a TCP/IP stack would NOT be a SENSIBLE implementation of the standards is PRAGMATICS. The stack is supposed to do a best-effort ATTEMPT to deliver packets, dammit!
That may be hard to formalize mathematically, but it makes all the difference in the world between a silly joke and a real-world tool. My best example of pragmatics in linguistics: if I state...: """ I never strangle python-dev posters with the initials PJE in months with an "R" in them """ I am saying nothing that is false or incorrect or misleading, in terms of syntax and semantics. This assertion is grammatically correct and semantically true. Does this mean you should worry come May...? Not necessarily, because the assertion is _pragmatically_ dubious. *PRAGMATICALLY*, in all natural languages, when I state "I never do X under condition Y" there's an implication that "condition Y" DOES have something to do with the case -- that if condition Y DOESN'T hold then my assurance about not doing X weakens. If condition Y has nothing to do with my doing or not doing X, then by the PRAGMATICS of natural language I'm NOT supposed to juxtapose the two things -- even though both syntactically and semantically it's perfectly correct to do so. Network protocol specs, programming languages, libraries, etc., have pragmatics, too. They're way harder to formalize, but that doesn't mean they can be blithely ignored in the real world. Yes, you're ALLOWED to stuff with NULL any field that isn't explicitly specified as NOT NULL. But you should ONLY do so when the information is REALLY missing, NOT when you've lost it along the way because you've implemented adapter-chain transitivity: dropping information which you COULD have preserved with a bit more care (==without transitivity) is a violation of PRAGMATICS, of the BEST-EFFORT implication, just as it would be to drop packets once in a while in a TCP/IP stack due to some silly programming bug which was passed silently. Alex From pje at telecommunity.com Tue Jan 11 21:30:19 2005 From: pje at telecommunity.com (Phillip J.
Eby) Date: Tue Jan 11 21:29:18 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: References: <5.1.1.6.0.20050111144531.0299a220@mail.telecommunity.com> <5.1.1.6.0.20050111103431.03a8fc60@mail.telecommunity.com> <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com> <5.1.1.6.0.20050111103431.03a8fc60@mail.telecommunity.com> <5.1.1.6.0.20050111144531.0299a220@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050111152405.028ec210@mail.telecommunity.com> At 09:10 PM 1/11/05 +0100, Alex Martelli wrote: >On 2005 Jan 11, at 20:48, Phillip J. Eby wrote: > ... >>>I'd rather not assume that class inheritance implies substitutability, >> >>Hm, you should take that up with Alex then, since that is what his >>current PEP 246 draft does. :) Actually, the earlier drafts did that >>too, so I'm not sure why you want to change this now. >> >>What I've actually suggested here actually allows for >>inheritance=substitutability as the default, but also makes it trivially >>changeable for any given inheritance hierarchy by overriding __conform__ >>at the base of that hierarchy, and without introducing a special >>exception class to do it. > >The base of the hierarchy has no idea of which subclasses follow or break >Liskov substitutability. It's just silly to site the check >there. Moreover, having to change the base class is more invasive than >being able to do it in the derived class: typically the author of the >derived class is taking the base class from some library and does not want >to change that library -- changing the derived class is not ideal, but >still way better. Stop; you're responding to something I didn't propose! (Maybe you're reading these posts in reverse order, and haven't seen the actual proposal yet?)
Clark said he didn't want to assume substitutability; I was pointing out that he could choose to not assume that, if he wished, by implementing an appropriate __conform__ at the base of his hierarchy. This is entirely unrelated to deliberate Liskov violation, and is in any case not possible with your original proposal. I don't agree with Clark's use case, but my proposal supports it as a possibility, and yours does not.

To implement a Liskov violation with my proposal, you do exactly the same as with your proposal, *except* that you can simply return None instead of raising an exception, and the logic for adapt() is more straightforward.

From pje at telecommunity.com Tue Jan 11 21:53:08 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue Jan 11 21:52:16 2005
Subject: [Python-Dev] Concrete proposals for PEP 246
Message-ID: <5.1.1.6.0.20050111144851.009d2ec0@mail.telecommunity.com>

To save everyone from having to wade through the lengthy discussions between Alex and me, and to avoid putting all the summarization burden on Alex, I thought I would start a new thread listing my concrete proposals for PEP 246 changes, and summarizing my understanding of the current agreements and disagreements we have. (Alex, please correct me if I have misrepresented your position in any respects.)

First, I propose to allow explicitly-declared Liskov violations, but using a different mechanism than the one Alex has proposed. Specifically, I wish to remove all type or isinstance checking from the 'adapt()' function. But, the 'object' type should have a default __conform__ that is equivalent to:

    class object:
        def __conform__(self, protocol):
            if isinstance(protocol, ClassTypes) and isinstance(self, protocol):
                return self
            return None

and is inherited by all object types, including types defined by extension modules, unless overridden.
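Phillip's proposed semantics can be sketched in today's Python. Since `object` itself cannot grow a `__conform__` from pure Python, the sketch below uses a hypothetical `Adaptable` base class to stand in for the proposed default (with plain `type` standing in for `ClassTypes`); all names besides `adapt` and `__conform__` are invented for illustration:

```python
class Adaptable:
    # Stand-in for the proposed default object.__conform__.
    def __conform__(self, protocol):
        if isinstance(protocol, type) and isinstance(self, protocol):
            return self
        return None

def adapt(obj, protocol, default=None):
    # No type/isinstance checks here: all policy lives in
    # __conform__ (looked up on the type) and __adapt__.
    conform = getattr(type(obj), "__conform__", None)
    if conform is not None:
        result = conform(obj, protocol)
        if result is not None:
            return result
    hook = getattr(protocol, "__adapt__", None)
    if hook is not None:
        result = hook(obj)
        if result is not None:
            return result
    return default

class Base(Adaptable):
    pass

class LiskovViolator(Base):
    # A deliberate Liskov violator opts out by returning None,
    # instead of raising a special exception.
    def __conform__(self, protocol):
        if protocol is Base:
            return None
        return super().__conform__(protocol)
```

With this arrangement, `adapt(LiskovViolator(), Base)` quietly yields the default instead of raising -- which is exactly the simplification being argued for.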
This approach provides solid backward compatibility with the previous version of PEP 246, as long as any user-implemented __conform__ methods call their superclass __conform__. But, it also allows Liskov-violating classes to simply return 'None' to indicate their refusal to conform to a base class interface, instead of having to raise a special exception. I think that, all else being equal, this is a simpler implementation approach to providing the feature, which Alex and others have convinced me is a valid (if personally somewhat distasteful) use case. Second: Alex has agreed to drop the "cast" terminology, since its meanings in various languages are too diverse to be illuminating. We have instead begun creeping towards agreement on some concepts such as "lossy" or "noisy" conversions. I think we can also drop my original "lossy" term, because "noisy" can also imply information loss and better explains the real issue anyway. A "noisy" conversion, then, is one where the conversion "makes up" information that was not implicitly present in the original, or drops information that *alters the semantics of the information that is retained*. (This phrasing gets around Alex's LotsOfInfo example/objection, while still covering loss of numeric precision; mere narrowing and renaming of attributes/methods does not alter the semantics of the retained information.) Adaptation is not recommended as a mechanism for noisy conversions, because implicit changes to semantics are a bad idea. Note that this is actually independent of any transitivity issues -- implicit noisy conversion is just a bad idea to start with, which is why 'someList[1.2]' raises a TypeError rather than implicitly converting 1.2 to an integer! If you *mean* to drop the .2, you should say so, by explicitly converting to an integer. 
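The `someList[1.2]` point is easy to verify in any Python; the explicit `int()` call is what signals that dropping the `.2` is intended:

```python
some_list = ["a", "b", "c"]

# The implicit "noisy" conversion is refused outright:
try:
    some_list[1.2]
except TypeError:
    refused = True  # list indices must be integers

# The explicit conversion states the intent to drop the .2:
assert refused
assert some_list[int(1.2)] == "b"
```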
(However, it *might* be acceptable to implicitly convert 1.0 to an integer; I don't currently have a strong opinion either way on that issue, other than to note that the conversion is not "noisy" in that case.)

Anyway, I think that the current level of consensus between Alex and myself on the above is now such that his comparison to casting could now be replaced by some discussion of noisy vs. non-noisy (faithful? high-fidelity?) conversion, and the fact that adaptation is suitable only for the latter, supplemented by some examples of noisy conversion use cases and how to transform them into non-noisy constructs. The string->file vs. string->file_factory example is a particularly good one, I think, because it shows how to address a common, practical issue.

Third: (Proposed) The PEP should explicitly support classic classes, or else there is no way to adapt exception instances. (Yes, I have actually done this; peak.web adapts exception instances to obtain appropriate handlers, for example.)

Fourth: The principal issue from the original discussion that remains open at this time is determining specific policies or recommendations for addressing various forms of transitivity, which we have delved into a little bit. (By "open" I just mean that there is no proposal for this issue currently on the table, not to imply that my proposals are not also "open" in the sense of awaiting consensus.) Anyway, the kinds of transitivity we're discussing are:

1. interface inheritance transitivity (i.e. adapt(x,IBase) if adapt(x,IDerived) and IDerived inherits from IBase)

2. adapter composition transitivity (i.e. adapt(x,ISome) if adapt(x,IOther) and there is a general-purpose IOther->ISome adapter available)

These are mostly issues for the design of an interface system (implementing __adapt__) and for the design of a global adapter registry. I don't think it's practical to implement either kind of transitivity on the __conform__ side, at least not for hand-written __conform__ methods.
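For concreteness, here is one invented sketch (not taken from PEP 246, Zope, or PyProtocols) of what "type 2" adapter composition in a global registry could look like -- a direct adapter wins, otherwise a single intermediate hop is tried:

```python
# (src, dst) -> adapter callable; a hypothetical global registry.
registry = {}

def register(src, dst, adapter):
    registry[(src, dst)] = adapter

def adapt_composed(obj, src, dst):
    # A direct adapter always wins; otherwise try a single
    # intermediate hop, composing two registered adapters.
    if (src, dst) in registry:
        return registry[(src, dst)](obj)
    for (s, mid), first in list(registry.items()):
        if s is src and (mid, dst) in registry:
            return registry[(mid, dst)](first(obj))
    raise TypeError("no adaptation path from %s to %s" % (src, dst))

class ISome: pass
class IOther: pass
class ITarget: pass

register(ISome, IOther, lambda x: "as-IOther")
register(IOther, ITarget, lambda x: "as-ITarget")
```

Whether the silent ISome -> IOther -> ITarget hop performed here is a convenience or a trap is precisely the disagreement running through this thread.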
To summarize current PEP 246 implementations' choices on this issue, Zope implements type 1 transitivity, but not type 2; PyProtocols implements both. Both Zope and PyProtocols allow for individual objects to assert compliance with an interface that their class does not claim compliance with, and to use this assertion as a basis for adaptation. In the case of PyProtocols, this is handled by adding a per-instance __conform__, but Zope has a separate concept of declaring what interfaces an instance "provides", distinct from what it is "adaptable to". PyProtocols in contrast considers "provides" to be the same as "adaptable to with no adapter", i.e. a trivial special case of adaptation rather than a distinct concept. I have also asserted that in practice I have encountered more problems with type 1 transitivity than with type 2, because of the strong temptation to derive an interface to avoid duplicating methods. In other words, inappropriate use of interface inheritance produces roughly the same effect as introducing a noisy adapter into a type 2 adapter mesh, but IMO it's much easier to do innocently and accidentally, as it doesn't even require that you write an adapter! From pje at telecommunity.com Tue Jan 11 22:08:13 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Jan 11 22:07:13 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <9832132B-640E-11D9-ADA4-000A95EFAE9E@aleax.it> References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> At 09:23 PM 1/11/05 +0100, Alex Martelli wrote: >Is the concept of *PRAGMATICS* so deucedly HARD for all of your eggheads?! Hmm. Pot, meet kettle. :) >Yes, you're ALLOWED to stuff with NULL any field that isn't explicitly >specified as NOT NULL. 
>
>But you should ONLY do so when the information is REALLY missing, NOT when
>you've lost it along the way because you've implemented adapter-chain
>transitivity: dropping information which you COULD have preserved with a
>bit more care (==without transitivity) is a violation of PRAGMATICS, of
>the BEST-EFFORT implication, just as it would be to drop packets once in a
>while in a TCP/IP stack due to some silly programming bug which was passed
>silently.

This is again a misleading analogy. You are comparing end-to-end with point-to-point. I am saying that if you have a point-to-point connection that drops all packets of a particular kind, you should not put it into your network, unless you know that an alternate route exists that can ensure those packets get through. Otherwise, you are breaking the network.

Thus, I am saying that PRAGMATICALLY, it is silly to create a cable that drops all ACK packets, for example, and then plug it into your network. And especially, it's silly to turn that around as a reason that one should only use end-to-end leased lines, because that packet forwarding business is dangerously unreliable!

As far as I can tell, you are arguing that you should never use packet forwarding for communication, because somebody might have a router somewhere that drops packets. While I am arguing that if a router is known to drop packets incorrectly, the router is broken and should be removed from the network, or else bypassed via another route. And, in the cases where you have a "leased line" direct from point A to point B, your routers should be smart enough to use that route in place of forwarding from A to C to D to B, or whatever.

From fredrik at pythonware.com Tue Jan 11 23:20:59 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue Jan 11 23:20:58 2005
Subject: [Python-Dev] copy confusion
Message-ID:

back in Python 2.1 (and before), an object could define how copy.copy should work simply by defining a __copy__ method. here's the relevant portion:

    ...
    try:
        copierfunction = _copy_dispatch[type(x)]
    except KeyError:
        try:
            copier = x.__copy__
        except AttributeError:
            raise error, \
                  "un(shallow)copyable object of type %s" % type(x)
        y = copier()
    ...

I recently discovered that this feature has disappeared in 2.3 and 2.4. instead of looking for an instance method, the code now looks at the object's type:

    ...
    cls = type(x)

    copier = _copy_dispatch.get(cls)
    if copier:
        return copier(x)

    copier = getattr(cls, "__copy__", None)
    if copier:
        return copier(x)
    ...

(copy.deepcopy still seems to be able to use __deepcopy__ hooks, though)

is this a bug, or a feature of the revised copy/pickle design? (the code in copy_reg/copy/pickle might be among the more convoluted pieces of python coding that I have ever seen... and what's that smiley doing in copy.py?)

and if it's a bug, does the fact that nobody reported this for 2.3 indicate that I'm the only one using this feature? is there a better way to control copying that I should use instead?

From pje at telecommunity.com Tue Jan 11 23:39:34 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue Jan 11 23:38:36 2005
Subject: [Python-Dev] copy confusion
In-Reply-To:
Message-ID: <5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com>

At 11:20 PM 1/11/05 +0100, Fredrik Lundh wrote:
>I recently discovered that this feature has disappeared in 2.3 and 2.4.
>instead of looking for an instance method, the code now looks at the
>object's type:
>
> ...
>
> cls = type(x)
>
> copier = _copy_dispatch.get(cls)
> if copier:
>     return copier(x)
>
> copier = getattr(cls, "__copy__", None)
> if copier:
>     return copier(x)
>
> ...
>
>(copy.deepcopy still seems to be able to use __deepcopy__ hooks, though)
>
>is this a bug, or a feature of the revised copy/pickle design?

Looks like a bug to me; it breaks the behavior of classic classes, since type(classicInstance) returns InstanceType.
However, it also looks like it might have been introduced to fix the possibility that calling '__copy__' on a new-style class with a custom metaclass would result in ending up with an unbound method. (Similar to the "metaconfusion" issue being recently discussed for PEP 246.) ISTM the way to fix both issues is to switch to using x.__class__ in preference to type(x) to retrieve the __copy__ method from, although this still allows for metaconfusion at higher metaclass levels. Maybe we need a getclassattr to deal with this issue, since I gather from Armin's post that this problem has popped up other places besides here and PEP 246. (Follow-up: Guido's checkin comment for the change suggests it was actually done as a performance enhancement while adding a related feature (copy_reg integration), rather than as a fix for possible metaconfusion, even though it also has that effect.) From aleax at aleax.it Tue Jan 11 23:50:55 2005 From: aleax at aleax.it (Alex Martelli) Date: Tue Jan 11 23:51:01 2005 Subject: [Python-Dev] copy confusion In-Reply-To: References: Message-ID: <40628469-6423-11D9-ADA4-000A95EFAE9E@aleax.it> On 2005 Jan 11, at 23:20, Fredrik Lundh wrote: > back in Python 2.1 (and before), an object could define how copy.copy > should > work simply by definining a __copy__ method. here's the relevant > portion: > > ... > try: > copierfunction = _copy_dispatch[type(x)] > except KeyError: > try: > copier = x.__copy__ > except AttributeError: > raise error, \ > "un(shallow)copyable object of type %s" % type(x) > y = copier() > ... > > I recently discovered that this feature has disappeared in 2.3 and > 2.4. in- > stead of looking for an instance method, the code now looks at the > object's > type: Hmmm, yes, we were discussing this general issue as part of the huge recent thread about pep 246. 
In the new-style object model, special methods are supposed to be looked up on the type, not on the object; otherwise, having a class with special methods would be a problem -- are the methods meant to apply to the class object itself, or to its instances? However, apparently, the code you quote is doing it wrong:

> cls = type(x)
>
> copier = _copy_dispatch.get(cls)
> if copier:
>     return copier(x)
>
> copier = getattr(cls, "__copy__", None)
> if copier:
>     return copier(x)

...because getattr is apparently the wrong way to go about it (e.g., it could get the '__copy__' from type(cls), which would be mistaken). Please see Armin Rigo's only recent post to Python-Dev for the way it should apparently be done instead -- assuming Armin is right (he generally is), there should be plenty of bugs in copy.py (ones that emerge when you're using custom metaclasses &c -- are you doing that?). Still, if you're using an instance of an old-style class, the lookup in _copy_dispatch should be on types.InstanceType -- is that what you're trying to copy, an instance of an old-style class?

> (copy.deepcopy still seems to be able to use __deepcopy__ hooks,
> though)

It starts with a peek into a dispatch dictionary for the type of the object, too, just like shallow copy does. What's the type of what you're trying to copy?

> is this a bug, or a feature of the revised copy/pickle design? (the
> code in copy_reg/copy/pickle might be among the more convoluted
> pieces of python coding that I ever seen... and what's that smiley
> doing in copy.py?)
>
> and if it's a bug, does the fact that nobody reported this for 2.3
> indicate that I'm the only one using this feature? is there a better
> way to control copying that I should use instead?

When I can, I use __getstate__ and __setstate__, simply because they seem clear and flexible to me (usable for copying, deep copying, pickling). But that doesn't mean __copy__ or __deepcopy__ should be left broken, of course.
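The type-not-instance lookup described above is easy to demonstrate with any special method; a small illustrative snippet (class and attribute names invented), using new-style semantics:

```python
class Box:
    def __len__(self):
        return 3

box = Box()
box.__len__ = lambda: 99        # a per-instance attribute

# len() consults type(box), so the instance attribute is ignored...
assert len(box) == 3
# ...while ordinary attribute access still sees it:
assert box.__len__() == 99
```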
Although there are features of design intent here, it does appear to me there may be bugs too (if the getattr on the type is wrong); in this case it's worrisome, not just that nobody else reported problems, but also that the unit tests didn't catch them...:-(

Alex

From aleax at aleax.it Tue Jan 11 23:56:26 2005
From: aleax at aleax.it (Alex Martelli)
Date: Tue Jan 11 23:56:31 2005
Subject: [Python-Dev] copy confusion
In-Reply-To: <5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com>
References: <5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com>
Message-ID: <05E1B32A-6424-11D9-ADA4-000A95EFAE9E@aleax.it>

On 2005 Jan 11, at 23:39, Phillip J. Eby wrote:
...
>> cls = type(x)
>>
>> copier = _copy_dispatch.get(cls)
>> if copier:
>>     return copier(x)
...
>> this a bug, or a feature of the revised copy/pickle design?
>
> Looks like a bug to me; it breaks the behavior of classic classes,
> since type(classicInstance) returns InstanceType.

It doesn't, because types.InstanceType is a key in _copy_dispatch and gets a function that implements old-style class behavior.

> However, it also looks like it might have been introduced to fix the
> possibility that calling '__copy__' on a new-style class with a custom
> metaclass would result in ending up with an unbound method. (Similar
> to the "metaconfusion" issue being recently discussed for PEP 246.)
>
> ISTM the way to fix both issues is to switch to using x.__class__ in
> preference to type(x) to retrieve the __copy__ method from, although
> this still allows for metaconfusion at higher metaclass levels.

What "both issues"? There's only one issue, it seems to me -- one of metaconfusion. Besides, getattr(x.__class__, '__copy__') would not give backwards compatibility if x is an old-style instance -- it would miss the per-instance x.__copy__ if any. Fortunately, _copy_dispatch deals with that. So changing from type(x) to x.__class__ seems useless.
> Maybe we need a getclassattr to deal with this issue, since I gather
> from Armin's post that this problem has popped up other places besides
> here and PEP 246.

Apparently we do: a bug in a reference implementation in a draft PEP is one thing, one that lives so long in a key module of the standard library is quite another.

> (Follow-up: Guido's checkin comment for the change suggests it was
> actually done as a performance enhancement while adding a related
> feature (copy_reg integration), rather than as a fix for possible
> metaconfusion, even though it also has that effect.)

OK, but if Armin is correct about the code in the reference implementation of pep 246, and I think he is, this is still a bug in copy.py (though probably not the specific one that bit /f).

Alex

From gvanrossum at gmail.com Tue Jan 11 23:58:08 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Tue Jan 11 23:58:12 2005
Subject: [Python-Dev] copy confusion
In-Reply-To: <5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com>
References: <5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com>
Message-ID:

[Fredrik]
> >I recently discovered that this feature has disappeared in 2.3 and 2.4.
> >instead of looking for an instance method, the code now looks at the
> >object's type:
> >
> > ...
> >
> > cls = type(x)
> >
> > copier = _copy_dispatch.get(cls)
> > if copier:
> >     return copier(x)
> >
> > copier = getattr(cls, "__copy__", None)
> > if copier:
> >     return copier(x)
> >
> > ...
> >
> >(copy.deepcopy still seems to be able to use __deepcopy__ hooks, though)
> >
> >is this a bug, or a feature of the revised copy/pickle design?

[Phillip]
> Looks like a bug to me; it breaks the behavior of classic classes, since
> type(classicInstance) returns InstanceType.

I'm not so sure. I can't seem to break this for classic classes.
The only thing this intends to break, and then only for new-style classes, is the ability to have __copy__ be an instance variable (whose value should be a callable without arguments) -- it must be a method on the class. This is the same thing that I've done for all built-in operations (__add__, __getitem__ etc.). > However, it also looks like it might have been introduced to fix the > possibility that calling '__copy__' on a new-style class with a custom > metaclass would result in ending up with an unbound method. (Similar to > the "metaconfusion" issue being recently discussed for PEP 246.) Sorry, my head just exploded. :-( I think I did this change (for all slots) to make the operations more efficient by avoiding dict lookups. It does have the desirable property of not confusing a class's attributes with its metaclass's attributes, but only as long as you use the operation's native syntax (e.g. x[y]) rather than the nominally "equivalent" method call (e.g. x.__getitem__(y)). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From aleax at aleax.it Wed Jan 12 00:09:17 2005 From: aleax at aleax.it (Alex Martelli) Date: Wed Jan 12 00:09:21 2005 Subject: [Python-Dev] copy confusion In-Reply-To: References: <5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com> Message-ID: On 2005 Jan 11, at 23:58, Guido van Rossum wrote: ... >>> cls = type(x) >>> copier = _copy_dispatch.get(cls) >>> if copier: >>> return copier(x) ... >>> is this a bug, or a feature of the revised copy/pickle design? > > [Phillip] >> Looks like a bug to me; it breaks the behavior of classic classes, >> since >> type(classicInstance) returns InstanceType. > > I'm not so sure. I can't seem to break this for classic classes. You can't, _copy_dispatch deals with those. 
> The only thing this intends to break, and then only for new-style
> classes, is the ability to have __copy__ be an instance variable
> (whose value should be a callable without arguments) -- it must be a
> method on the class. This is the same thing that I've done for all
> built-in operations (__add__, __getitem__ etc.).

And a wonderful idea it is.

>> However, it also looks like it might have been introduced to fix the
>> possibility that calling '__copy__' on a new-style class with a custom
>> metaclass would result in ending up with an unbound method. (Similar
>> to the "metaconfusion" issue being recently discussed for PEP 246.)
>
> Sorry, my head just exploded. :-(
>
> I think I did this change (for all slots) to make the operations more
> efficient by avoiding dict lookups. It does have the desirable
> property of not confusing a class's attributes with its metaclass's
> attributes, but only as long as you use the operation's native syntax
> (e.g. x[y]) rather than the nominally "equivalent" method call (e.g.
> x.__getitem__(y)).

Unfortunately, we do have a problem with the code in copy.py:

    class MetaCopyableClass(type):
        def __copy__(cls):
            """ code to copy CLASSES of this metaclass """
            # etc, etc, snipped

    class CopyableClass:
        __metaclass__ = MetaCopyableClass
        # rest of class snipped

    x = CopyableClass()

    import copy
    y = copy.copy(x)

kallisti:/tmp alex$ python x.py
Traceback (most recent call last):
  File "x.py", line 14, in ?
    y = copy.copy(x)
  File "/usr/local/lib/python2.4/copy.py", line 79, in copy
    return copier(x)
TypeError: __copy__() takes exactly 1 argument (2 given)
kallisti:/tmp alex$

See? copy.copy(x) ends up using MetaCopyableClass.__copy__ -- because of a getattr on CopyableClass for '__copy__', which gets the BOUND-METHOD defined in the metaclass, with im_self being CopyableClass. I had exactly the same metabug in the pep 246 reference implementation, Armin Rigo showed how to fix it in his only recent post.
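The fix Armin suggested (his post is not reproduced in this archive) amounts to searching only the `__dict__`s along the class's MRO, so a metaclass method can never leak through. A sketch, with `lookup_special` an invented name and the metaclass written in today's syntax:

```python
def lookup_special(obj, name, default=None):
    # Walk the MRO of the instance's class, looking only in each
    # class's own __dict__; the metaclass is never consulted, so a
    # metaclass __copy__ (meant for copying classes) cannot be
    # mistaken for an instance-copying hook.
    for klass in type(obj).__mro__:
        if name in klass.__dict__:
            return klass.__dict__[name].__get__(obj, type(obj))
    return default

class MetaCopyable(type):
    def __copy__(cls):
        """ code to copy CLASSES of this metaclass """
        return cls

class CopyableClass(metaclass=MetaCopyable):  # modern spelling of __metaclass__
    pass

x = CopyableClass()

# getattr on the class finds the metaclass's bound method -- the bug:
assert getattr(type(x), "__copy__", None) is not None
# ...while the MRO-only lookup correctly finds nothing for instances:
assert lookup_special(x, "__copy__") is None
```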
Alex From DavidA at ActiveState.com Wed Jan 12 00:13:43 2005 From: DavidA at ActiveState.com (David Ascher) Date: Wed Jan 12 00:15:33 2005 Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name In-Reply-To: <1105471586.41e42862b9a39@mcherm.com> References: <1105471586.41e42862b9a39@mcherm.com> Message-ID: <41E45DA7.1030302@ActiveState.com> Michael Chermside wrote: > David Ascher writes: > >>Terminology point: I know that LiskovViolation is technically correct, >>but I'd really prefer it if exception names (which are sometimes all >>users get to see) were more informative for people w/o deep technical >>background. Would that be possible? > > > I don't see how. Googling on Liskov immediately brings up clear > and understandable descriptions of the principle that's being violated. > I can't imagine summarizing the issue more concisely than that! What > would you suggest? Including better explanations in the documentation > is a must, but "LiskovViolation" in the exception name seems unbeatably > clear and concise. Clearly, I disagree. My point is that it'd be nice if we could come up with an exception name which could be grokkable without requiring 1) Google, 2) relatively high-level understanding of type theory. Googling on Liskov brings up things like: http://c2.com/cgi/wiki?LiskovSubstitutionPrinciple """What is wanted here is something like the following substitution property: If for each object o1 of type S there is an object o2 of type T such that for all programs P defined in terms of T, the behavior of P is unchanged when o1 is substituted for o2 then S is a subtype of T." - BarbaraLiskov, Data Abstraction and Hierarchy, SIGPLAN Notices, 23,5 (May, 1988).""" If you think that that is clear and understandable to the majority of the Python community, you clearly have a different perspective on that community. 
I have (almost) no doubt that all Python-dev'ers understand it, but maybe we should ask someone like Anna Ravenscroft or Mark Lutz if they think it'd be appropriate from a novice user's POV. I'm quite sure that the experts could understand a more pedestrian name, and quite sure that the reverse isn't true. I also think that the term "violation" isn't necessarily the best word to add to the Python namespace, when error or exception would do just fine. In addition, to say that it's unbeatably clear without a discussion of alternatives (or if I've missed it, please let me know) seems weird.

The point is broader, though -- when I get my turn in the time machine, I'll lobby for replacing NameError with UndefinedVariable or something similar (or more useful still). The former is confusing to novices, and while it can be learned, that's yet another bit of learning which is, IMO, unnecessary, even though it may be technically more correct.

--david ascher

From gvanrossum at gmail.com Wed Jan 12 00:20:12 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Jan 12 00:20:33 2005
Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name
In-Reply-To: <41E45DA7.1030302@ActiveState.com>
References: <1105471586.41e42862b9a39@mcherm.com> <41E45DA7.1030302@ActiveState.com>
Message-ID:

> My point is that it'd be nice if we could come up with an exception name
> which could be grokkable without requiring 1) Google, 2) relatively
> high-level understanding of type theory.

How about SubstitutabilityError?

> The point is broader, though -- when I get my turn in the time machine,
> I'll lobby for replacing NameError with UndefinedVariable or something
> similar (or more useful still). The former is confusing to novices, and
> while it can be learned, that's yet another bit of learning which is,
> IMO, unnecessary, even though it may be technically more correct.

We did that for UnboundLocalError, which subclasses NameError.
Perhaps we can rename NameError to UnboundVariableError (and add NameError as an alias for b/w compat). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From DavidA at ActiveState.com Wed Jan 12 00:21:29 2005 From: DavidA at ActiveState.com (David Ascher) Date: Wed Jan 12 00:23:12 2005 Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name In-Reply-To: References: <1105471586.41e42862b9a39@mcherm.com> <41E45DA7.1030302@ActiveState.com> Message-ID: <41E45F79.609@ActiveState.com> Guido van Rossum wrote: >>My point is that it'd be nice if we could come up with an exception name >>which could be grokkable without requiring 1) Google, 2) relatively >>high-level understanding of type theory. > > > How about SubstitutabilityError? That would be far, far better, yes. > We did that for UnboundLocalError, which subclasses NameError. Perhaps > we can rename NameError to UnboundVariableError (and add NameError as > an alias for b/w compat). Sure, although (and here I'm pushing it, I know, and I should have argued it way back then), the notion of 'unbound' is possibly too low-level still. 'Unknown' would probably carry much more meaning to those people who most need it. But yes, you're catching my drift. --david From pje at telecommunity.com Wed Jan 12 00:26:59 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed Jan 12 00:26:02 2005 Subject: [Python-Dev] copy confusion In-Reply-To: <05E1B32A-6424-11D9-ADA4-000A95EFAE9E@aleax.it> References: <5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com> <5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050111182510.032a14b0@mail.telecommunity.com> At 11:56 PM 1/11/05 +0100, Alex Martelli wrote: >What "both issues"? There's only one issue, it seems to me -- one of >metaconfusion. I was relying on Fredrik's report of a problem with the code; that is the other "issue" I referred to. 
From fredrik at pythonware.com Wed Jan 12 00:30:20 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed Jan 12 00:30:22 2005
Subject: [Python-Dev] Re: copy confusion
References: <5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com>
Message-ID:

Guido van Rossum wrote:

> The only thing this intends to break /.../

it breaks classic C types:

    >>> import cElementTree
    >>> x = cElementTree.Element("tag")
    >>> x
    >>> x.__copy__
    >>> x.__copy__()
    >>> import copy
    >>> y = copy.copy(x)
    Traceback (most recent call last):
      File "", line 1, in ?
      File "C:\python24\lib\copy.py", line 93, in copy
        raise Error("un(shallow)copyable object of type %s" % cls)
    copy.Error: un(shallow)copyable object of type
    >>> dir(x)
    ['__copy__', '__deepcopy__', 'append', 'clear', 'find', 'findall',
    'findtext', 'get', 'getchildren', 'getiterator', 'insert', 'items',
    'keys', 'makeelement', 'set']
    >>> dir(type(x))
    ['__class__', '__delattr__', '__delitem__', '__delslice__', '__doc__',
    '__getattribute__', '__getitem__', '__getslice__', '__hash__',
    '__init__', '__len__', '__new__', '__reduce__', '__reduce_ex__',
    '__repr__', '__setattr__', '__setitem__', '__setslice__', '__str__']

(and of course, custom C types is the only case where I've ever used __copy__; the default behavior has worked just fine for all other cases)

for cElementTree, I've worked around this with an ugly __reduce__ hack, but that doesn't feel right...

From aleax at aleax.it Wed Jan 12 00:33:22 2005
From: aleax at aleax.it (Alex Martelli)
Date: Wed Jan 12 00:33:27 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
Message-ID: <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>

On 2005 Jan 11, at 22:08, Phillip J. Eby wrote:
...
>> Yes, you're ALLOWED to stuff with NULL any field that isn't
>> explicitly specified as NOT NULL.
>>
>> But you should ONLY do so when the information is REALLY missing, NOT
>> when you've lost it along the way because you've implemented
>> adapter-chain transitivity: dropping information which you COULD have
>> preserved with a bit more care (==without transitivity) is a
>> violation of PRAGMATICS, of the BEST-EFFORT implication, just as it
>> would be to drop packets once in a while in a TCP/IP stack due to
>> some silly programming bug which was passed silently.
>
> This is again a misleading analogy. You are comparing end-to-end with
> point-to-point. I am saying that if you have a point-to-point
> connection that drops all packets of a particular kind, you should not
> put it into your network, unless you know that an alternate route
> exists that can ensure those packets get through. Otherwise, you are
> breaking the network.

But adaptation is not transmission! It's PERFECTLY acceptable for an adapter to facade: to show LESS information in the adapted object than was in the original. It's PERFECTLY acceptable for an adapter to say "this piece of information is not known" when it's adapting an object for which that information, indeed, is not known. It's only CONJOINING the two perfectly acceptable adapters, as transitivity by adapter chain would do automatically, that you end up with a situation that is pragmatically undesirable: asserting that some piece of information is not known, when the information IS indeed available -- just not by the route automatically taken by the transitivity-system.

What happened here is not that either of the adapters registered is wrong: each does its job in the best way it can. The programming error, which transitivity hides (degrading the quality of information resulting from the system -- a subtle kind of degradation that will be VERY hard to unearth), is simply that the programmer forgot to register the direct adapter.
Without transitivity, the programmer's mistake emerges easily and immediately; transitivity hides the mistake.  By imposing transitivity, you're essentially asserting that, if a programmer forgets to code and register an A -> C direct adapter, this is never a problem, as long as A -> B and B -> C adapters are registered, because A -> B -> C will give results just as good as the direct A -> C would have, so there's absolutely no reason to trouble the programmer about the trivial detail that transitivity is being used.

At the same time, if I understand correctly, you're ALSO saying that if two other adapters exist, A -> Z and Z -> C, *THEN* it's an error, because you don't know when adapting A -> C whether to go via B or via Z.  Well, if you consistently believe what I state in the previous paragraph, then this is just weird: since you're implicitly asserting that any old A->?->C transitive adaptation is just as good as a direct A->C, why should you worry about there being more than one such 2-step adaptation available?  Roll the dice to pick one and just proceed.

Please note that in the last paragraph I'm mostly trying to "reason by absurd": I do NOT believe one can sensibly assert in the general case that A->?->C is just as good as A->C, without imposing FAR stronger constraints on adaptation than we possibly can (QI gets away with it because, designed from scratch, it can and does impose such constraints, essentially that all interfaces "belong" to ONE single object -- no independent 3rd party adaptation, which may be a bigger loss than the constraints gain, actually).

I'm willing to compromise to the extent of letting any given adaptation somehow STATE, EXPLICITLY, "this adaptation is lossless and perfect, and can be used as a part of transitive chains of adaptation without any cost whatsoever".
If we do that, though, the adaptation system should trust this assertion, so if there are two possibilities of equal minimal length, such as A->B->C or A->Z->C, with all the steps being declared lossless and perfect, then it SHOULD just pick one by whatever criterion, since both will be equally perfect anyway -- so maybe my reasoning by absurd wasn't totally absurd after all;-). Would this compromise be acceptable to you? Alex From cce at clarkevans.com Wed Jan 12 00:38:58 2005 From: cce at clarkevans.com (Clark C. Evans) Date: Wed Jan 12 00:39:00 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5.1.1.6.0.20050111152405.028ec210@mail.telecommunity.com> References: <5.1.1.6.0.20050111144531.0299a220@mail.telecommunity.com> <5.1.1.6.0.20050111103431.03a8fc60@mail.telecommunity.com> <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com> <5.1.1.6.0.20050111103431.03a8fc60@mail.telecommunity.com> <5.1.1.6.0.20050111144531.0299a220@mail.telecommunity.com> <5.1.1.6.0.20050111152405.028ec210@mail.telecommunity.com> Message-ID: <20050111233858.GB88115@prometheusresearch.com> On Tue, Jan 11, 2005 at 03:30:19PM -0500, Phillip J. Eby wrote: | Clark said he didn't want to assume substitutability; I was pointing out | that he could choose to not assume that, if he wished, by implementing an | appropriate __conform__ at the base of his hierarchy. Oh, that's sufficient. If someone making a base class wants to assert that derived classes should check compliance (rather than having it automagic), then they can do this. Good enough! | I don't agree with Clark's use case, but my | proposal supports it as a possibility, and yours does not. It was a straw-man; and I admit, not a particularly compelling one. 
| To implement a Liskov violation with my proposal, you do exactly the same
| as with your proposal, *except* that you can simply return None instead
| of raising an exception, and the logic for adapt() is more
| straightforward.

I think I prefer just returning None rather than raising a specific exception.  The semantics are different: None implies that other adaptation mechanisms (like a registry) could be tried, while LiskovException implies that processing halts and no further adaptation techniques are to be used.  None is the better choice in this particular case, since it would enable third parties to register a wrapper.

Overall, I think both you and Alex are now proposing essentially the same thing... no?

Best, Clark -- Clark C. Evans Prometheus Research, LLC. http://www.prometheusresearch.com/ o office: +1.203.777.2550 ~/ , mobile: +1.203.444.0557 // (( Prometheus Research: Transforming Data Into Knowledge \\ , \/ - Research Exchange Database /\ - Survey & Assessment Technologies ` \ - Software Tools for Researchers ~ *

From pje at telecommunity.com Wed Jan 12 00:40:18 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed Jan 12 00:39:22 2005 Subject: [Python-Dev] copy confusion In-Reply-To: References: <5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com> <5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050111182710.032a10c0@mail.telecommunity.com>

At 02:58 PM 1/11/05 -0800, Guido van Rossum wrote:

>[Phillip]
> > Looks like a bug to me; it breaks the behavior of classic classes, since
> > type(classicInstance) returns InstanceType.
>
>I'm not so sure. I can't seem to break this for classic classes.

Sorry; I was extrapolating from what I thought was Fredrik's description of this behavior as a bug, and from examining the history of the code that he referenced.
I saw that the current version of that code had evolved directly from a version that was retrieving instance.__copy__; I therefore assumed that the loss-of-feature Fredrik was reporting was that. That is, I thought that the problem he was experiencing was that classic classes no longer supported __copy__ because this code had changed. I guess I should have looked at other lines of code besides the ones he pointed out; sorry about that. :( >The only thing this intends to break, and then only for new-style >classes, is the ability to have __copy__ be an instance variable >(whose value should be a callable without arguments) -- it must be a >method on the class. This is the same thing that I've done for all >built-in operations (__add__, __getitem__ etc.). Presumably, this is the actual feature loss that Fredrik's describing; i.e. lack of per-instance __copy__ on new-style classes. That would make more sense. > > However, it also looks like it might have been introduced to fix the > > possibility that calling '__copy__' on a new-style class with a custom > > metaclass would result in ending up with an unbound method. (Similar to > > the "metaconfusion" issue being recently discussed for PEP 246.) > >Sorry, my head just exploded. :-( The issue is that for special attributes (like __copy__, __conform__, etc.) that do not have a corresponding type slot, using getattr() is not sufficient to obtain slot-like behavior. This is because 'aType.__special__' may refer to a __special__ intended for *instances* of 'aType', instead of the __special__ for aType. As Armin points out, the only way to fully emulate type slot behavior for unslotted special attributes is to perform a search of the __dict__ of each type in the MRO of the type of the object for which you wish to obtain the special attribute. 
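The trap can be shown in a few lines; this sketch uses modern class syntax and an invented metaclass, so the __conform__ below is purely illustrative, not PEP 246's reference implementation:

```python
class Meta(type):
    def __conform__(cls, protocol):
        # Meant for adapting *classes* of this metaclass, not instances.
        return cls

class Widget(metaclass=Meta):
    pass

w = Widget()

# getattr on the type is fooled: it finds the metaclass method, bound
# to the class Widget, and would happily call it as if it were the
# instance's __conform__.
assert getattr(type(w), '__conform__', None) is not None

# Searching the __dict__ of each class along the type's MRO emulates
# type-slot behavior and correctly finds nothing:
found = None
for base in type(w).__mro__:
    if '__conform__' in base.__dict__:
        found = base.__dict__['__conform__']
        break
assert found is None
```

The getattr-based lookup and the MRO walk disagree exactly when a metaclass defines the special attribute, which is the "metaconfusion" being described.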
So, in this specific case, __copy__ does not have a type slot, so it is impossible using getattr (or simple attribute access) to guarantee that you are retrieving the correct version of __copy__ in the presence of metaclasses. This is what Alex and I dubbed "metaconfusion" in discussion of the same issue for PEP 246's __adapt__ and __conform__ methods; until they have tp_adapt and tp_conform slots, they can have this same problem. Alex and I also just speculated that perhaps the stdlib should include a function that can do this, so that stdlib modules that define unslotted special attributes (such as __copy__) can ensure they work correctly in the presence of metaclasses. From DavidA at ActiveState.com Wed Jan 12 00:51:33 2005 From: DavidA at ActiveState.com (David Ascher) Date: Wed Jan 12 00:53:17 2005 Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name In-Reply-To: References: <1105471586.41e42862b9a39@mcherm.com> <41E45DA7.1030302@ActiveState.com> Message-ID: <41E46685.2040603@ActiveState.com> Guido van Rossum wrote: >>The point is broader, though -- when I get my turn in the time machine, >>I'll lobby for replacing NameError with UndefinedVariable or something Strange, my blog reading just hit upon http://blogs.zdnet.com/open-source/index.php?p=93 ... "Perhaps as open source developers are making their resolutions for 2005, they could add human-readable error codes to their list? " :-) --david From pje at telecommunity.com Wed Jan 12 01:17:23 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Wed Jan 12 01:16:29 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <20050111233858.GB88115@prometheusresearch.com> References: <5.1.1.6.0.20050111152405.028ec210@mail.telecommunity.com> <5.1.1.6.0.20050111144531.0299a220@mail.telecommunity.com> <5.1.1.6.0.20050111103431.03a8fc60@mail.telecommunity.com> <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com> <5.1.1.6.0.20050111103431.03a8fc60@mail.telecommunity.com> <5.1.1.6.0.20050111144531.0299a220@mail.telecommunity.com> <5.1.1.6.0.20050111152405.028ec210@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050111191422.03b16de0@mail.telecommunity.com> At 06:38 PM 1/11/05 -0500, Clark C. Evans wrote: >| To implement a Liskov violation with my proposal, you do exactly the same >| as with your proposal, *except* that you can simply return None instead >| of raising an exception, and the logic for adapt() is more >| straightforward. > >I think I prefer just returning None rather than raising a >specific exception. The semantics are different: None implies that >other adaptation mechanisms (like a registry) could be tried, while >LiskovException implies that processing halts and no further >adaptation techniques are to be used. In this case, None is >the better choice for this particular case since it would enable >third-parties to register a wrapper. > >Overall, I think both you and Alex are now proposing essentially >the same thing... no? Yes; I'm just proposing shuffling the invocation of things around a bit in order to avoid the need for an exception, and in the process increasing the number of possible customizations a bit. Not that I care about those customizations as such; I just would like to simplify the protocol. 
I suppose there's some educational benefit in making somebody explicitly declare that they're a Liskov violator, but it seems that if we're going to support it, it should be simple. From pje at telecommunity.com Wed Jan 12 01:29:15 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed Jan 12 01:28:19 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> References: <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050111184644.02bf8ce0@mail.telecommunity.com> At 12:33 AM 1/12/05 +0100, Alex Martelli wrote: >But adaptation is not transmission! It's PERFECTLY acceptable for an >adapter to facade: to show LESS information in the adapted object than was >in the original. It's also true that it's acceptable for a router to choose not to forward packets, e.g. for security reasons, QoS, etc. My point was that you seem to be using this to conclude that multihop packet forwarding is a bad idea in the general case, and that's what doesn't make any sense to me. More to the point, the error in your example isn't the filtering-out of information, it's the adding of a NULL back in. If NULLs are questionable for the target interface, this is not in general a candidate for implicit adaptation IMO -- *whether or not transitivity is involved*. Let's look at the reverse of the float-to-int case for a better example. Should I be able to implicitly adapt a float to a Decimal? No, because I might be making up precision that isn't there. May I explicitly convert a float to a decimal, if I know what I'm doing? Yes, of course. Just don't expect Python to guess for you. This is very much like your example; adding a NULL middle name seems to me almost exactly like going from float to Decimal with spurious precision. 
If you know what you're doing, it's certainly allowable to do it explicitly, but Python should not do it implicitly. Thus, my argument is that an adapter like this should never be made part of the adapter system, even if there's no transitivity. However, if you agree that such an adapter shouldn't be implicit, then it logically follows that there is no problem with allowing transitivity, except of course that people may sometimes break the rule. However, I think we actually have an opportunity now to actually codify sensible adaptation practices, such that the violations will be infrequent. It's also possible that we may be able to define some sort of restricted implicitness so that somebody making a noisy adapter can implicitly adapt in a limited context. >What happened here is not that either of the adapters registered is wrong: >each does its job in the best way it can. The programming error, which >transitivity hides (degrading the quality of information resulting from >the system -- a subtle kind of degradation that will be VERY hard to >unearth), is simply that the programmer forgot to register the direct >adapter. Without transitivity, the programmer's mistake emerges easily >and immediately; transitivity hides the mistake. Where we differ is that I believe that if the signal degradation over a path isn't acceptable, it shouldn't be made an implicit part of the network; it should be an explicitly forced route instead. Note by the way, that the signal degradation in your example comes from a broken adapter: it isn't valid to make up data if you want real data. It's PRAGMATIC, as you say, to make up the data when you don't have a choice, but this does not mean it should be AUTOMATIC. So, we both believe in restricting the automatic and implicit nature of adaptation. The difference is that I'm saying you should be explicit when you do something questionable, and you are saying you should be explicit when you're doing something that is *not*. 
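The float/Decimal half of this argument can be checked directly in today's decimal module (note that at the time of this thread Decimal refused float arguments outright; accepting them came later, which makes the spurious precision visible):

```python
from decimal import Decimal

# Implicit conversion trusts the float's binary representation and so
# manufactures digits that were never intended: 0.1 is not exactly
# representable in binary floating point.
assert Decimal(0.1) != Decimal('0.1')
assert len(str(Decimal(0.1))) > len('0.1')

# The explicit conversion, where the programmer vouches for the
# intended value, round-trips cleanly.
assert str(Decimal('0.1')) == '0.1'
```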
Perhaps as you previously suggested, this really is the aspect where we need the BDFL to be the tiebreaker. >At the same time, if I understand correctly, you're ALSO saying that if >two other adapters exist, A -> Z and Z -> C, *THEN* it's an error, because >you don't know when adapting A -> C whether to go via B or via Z. Well, >if you consistently believe what I state in the previous paragraph, then >this is just weird: since you're implicitly asserting that any old A->?->C >transitive adaptation is just as good as a direct A->C, why should you >worry about there being more than one such 2-step adaptation available? Because such ambiguities are usually an indication of some *other* error, often in the area of interface inheritance transitivity; they rarely occur as a direct result of implementing two versions of the same adapter (at least in my experience). >I'm willing to compromise to the extent of letting any given adaptation >somehow STATE, EXPLICITLY, "this adaptation is lossless and perfect, and >can be used as a part of transitive chains of adaptation without any cost >whatsoever". If we do that, though, the adaptation system should trust >this assertion, so if there are two possibilities of equal minimal length, >such as A->B->C or A->Z->C, with all the steps being declared lossless and >perfect, then it SHOULD just pick one by whatever criterion, since both >will be equally perfect anyway -- so maybe my reasoning by absurd wasn't >totally absurd after all;-). Here's the part I don't think you're seeing: interface inheritance transitivity has this *exact* same problem, and it's *far* easier to stumble into it, assuming you don't start out by declaring adapters that we both agree are insane (like filename-to-file). If your argument is valid for adapters, however, then the only logical conclusion is that we cannot permit an adapter for a derived interface to be returned when a base interface is requested. Is this your position as well? 
From python at rcn.com Wed Jan 12 01:27:09 2005 From: python at rcn.com (Raymond Hettinger) Date: Wed Jan 12 01:30:30 2005 Subject: [Python-Dev] Re: [Csv] csv module TODO list In-Reply-To: <20050110233126.GA14363@janus.swcomplete.com> Message-ID: <000401c4f83d$7432db40$e841fea9@oemcomputer>

Would the csv module be a good place to add a DBF reader and writer?  Dbase's dbf file format is one of the oldest, simplest and most common database interchange formats.  It can be a good alternative to CSV as a means of sharing data with pre-existing, non-python apps.  On the plus side, it has a precise spec, can preserve numeric and date types, has guaranteed round-trip equivalence, and does not have weird escape rules.  On the minus side, strings are limited to ASCII without NULs and the fields are fixed length.

I've posted a draft on ASPN.  It interoperates well with the rest of the CSV module because it also accepts/returns a list of fieldnames and a sequence of records.  http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/362715

Raymond Hettinger

From python at rcn.com Wed Jan 12 01:52:53 2005 From: python at rcn.com (Raymond Hettinger) Date: Wed Jan 12 01:56:13 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5.1.1.6.0.20050111184644.02bf8ce0@mail.telecommunity.com> Message-ID: <000501c4f841$0c7874c0$e841fea9@oemcomputer>

> Thus, my argument is that an adapter like this should never be made
> part of the adapter system, even if there's no transitivity.  However,
> if you agree that such an adapter shouldn't be implicit, then it
> logically follows that there is no problem with allowing transitivity,
> except of course that people may sometimes break the rule.

At some point, the PEP should be extended to include a list of best practices and anti-patterns for using adapters.  I don't find issues of transitivity and implicit conversion to be immediately obvious.  Also, it is not clear to me how or if existing manual adaptation practices should change.
For example, if I need a file-like interface to a string, I currently wrap it with StringIO.  How will that change in the future?  By an explicit adapt/conform pair?  Or by strings knowing how to conform to file-like requests?

Raymond Hettinger

From andrewm at object-craft.com.au Wed Jan 12 01:57:16 2005 From: andrewm at object-craft.com.au (Andrew McNamara) Date: Wed Jan 12 01:57:21 2005 Subject: [Python-Dev] Re: [Csv] csv module TODO list In-Reply-To: <000401c4f83d$7432db40$e841fea9@oemcomputer> References: <000401c4f83d$7432db40$e841fea9@oemcomputer> Message-ID: <20050112005716.848393C889@coffee.object-craft.com.au>

>Would the csv module be a good place to add a DBF reader and writer?

I would have thought it would make sense as its own module (in the same way that we have separate modules that present a common interface for the different databases), or am I missing something?  I'd certainly like to see a DBF parser in python - reading and writing odd file formats is bread-and-butter for us contractors... 8-)

-- Andrew McNamara, Senior Developer, Object Craft http://www.object-craft.com.au/

From roeland.rengelink at chello.nl Wed Jan 12 02:48:43 2005 From: roeland.rengelink at chello.nl (Roeland Rengelink) Date: Wed Jan 12 03:00:24 2005 Subject: [Python-Dev] PEP 246, redux Message-ID: <41E481FB.5020003@chello.nl>

I'm trying to understand the relation between Guido's posts on optional static typing and PEP 245 (interfaces) and 246 (adaptation).  I have a couple of questions.

PEP 245 proposes to introduce a fundamental distinction between type and interface.  However, 245 only introduces a syntax for interfaces, and says very little about the semantics of interfaces.  (Basically only that if X implements Y then implements(X, Y) will return True).  The semantics of interfaces are currently only implied by PEP 246, and by Guido's posts referring to 246.  Unfortunately PEP 246 explicitly refuses to decide that protocols are 245-style interfaces.
Therefore, it is not clear to me how acceptance of 245 would impact 246.  Specifically, what would be the difference between:

    x = adapt(obj, a_245_style_interface)
    x = adapt(obj, a_protocol_type)

and, if there is no difference, what would the use-case of interfaces be?

Put another way: explicit interfaces and adaptation based typing seem to be about introducing rigor (dynamic, not static) to Python.  Yet, PEP 245 and 246 seem to go out of their way to give interfaces and adaptation as little baggage as possible.  So, where is the rigor going to come from?

On the one hand this seems very Pythonic - introduce a new feature with as little baggage as possible, and see where it evolves from there.  Let the rigor flow, not from the restrictions of the language, but from the expressive power of the language.

On the other hand: why not, at least:

- explore in 245 how the semantics of interfaces might introduce rigor into the language.  It would be particularly illuminating to find out in what way implementing an interface differs from deriving from an ABC and in what way an interface hierarchy differs semantically from a hierarchy of ABCs

- rewrite 246 under the assumption that 245 (including semantics) has been accepted

I would volunteer, but, for those of you who hadn't noticed yet, I don't know what I'm talking about.

Cheers, Roeland Rengelink

From pje at telecommunity.com Wed Jan 12 04:06:08 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed Jan 12 04:05:14 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <000501c4f841$0c7874c0$e841fea9@oemcomputer> References: <5.1.1.6.0.20050111184644.02bf8ce0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050111215448.039197b0@mail.telecommunity.com>

At 07:52 PM 1/11/05 -0500, Raymond Hettinger wrote:

>Also, it is not clear to me how or if existing manual adaptation practices
>should change.  For example, if I need a file-like interface to a
>string, I currently wrap it with StringIO.
How will that change in the
>future?  By an explicit adapt/conform pair?  Or by strings knowing how
>to conform to file-like requests?

The goal here is to be able to specify that a function parameter takes, e.g. a "readable stream", and that you should be able to either explicitly wrap in a StringIO to satisfy this, or *possibly* that you be able to just pass a string and have it work automatically.

If the latter is the case, there are a variety of possible ways it might work.  str.__conform__ might recognize the "readable stream" interface, or the __adapt__ method of the "readable stream" interface could recognize 'str'.  Or, Alex's new proposed global type registry might contain an entry for 'str,readableStream'.  Which of these is the preferred scenario very much depends on a lot of things, like who defined the "readable stream" interface, and whether anybody has registered an adapter for it!

PyProtocols tries to answer this question by allowing you to register adapters with interfaces, and then the interface's __adapt__ method will do the actual adaptation.  Zope does something similar, at least in that it uses the interface's __adapt__ method, but that method actually uses a global registry.  Neither PyProtocols nor Zope makes much use of actually implementing hand-coded __conform__ or __adapt__ methods, as it's too much trouble for something that's so inherently declarative anyway, and only the creator of the object class or the interface's type has any ability to define adapters that way.

Given that built-in types are often handy sources of adaptation (e.g. str-to-StringIO in your example), it isn't practical in present-day Python to add a __conform__ method to the str type!  Thus, in the general case it just seems easier to use a per-interface or global registry for most normal adaptation, rather than using __conform__.
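A minimal sketch of the registry idea; the register/adapt names and the dict-based lookup are invented for illustration (this is not PyProtocols' or Zope's actual API), and io.StringIO stands in for the "readable stream" target:

```python
from io import StringIO

# (source type, target interface) -> adapter factory; a stand-in for
# the kind of per-interface or global registry discussed above.
_adapters = {}

def register(from_type, interface, factory):
    _adapters[(from_type, interface)] = factory

def adapt(obj, interface):
    if isinstance(obj, interface):      # already conforms
        return obj
    factory = _adapters.get((type(obj), interface))
    if factory is None:
        raise TypeError("can't adapt %r to %s" % (obj, interface.__name__))
    return factory(obj)

# Nobody has to patch a __conform__ onto the built-in str type; a
# third party just registers the str -> readable-stream adapter.
register(str, StringIO, StringIO)

stream = adapt("some,data", StringIO)
assert stream.read() == "some,data"
```

The point of the design is visible in the last three lines: the adapter for a built-in type lives entirely outside that type.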
However, having __conform__ exist is a nice "out" for implementing unusual custom requirements (like easy dynamic conformance), so I don't think it should be removed.

From gvanrossum at gmail.com Wed Jan 12 05:11:16 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Jan 12 05:11:19 2005 Subject: [Python-Dev] copy confusion In-Reply-To: References: <5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com> Message-ID:

> Unfortunately, we do have a problem with the code in copy.py:
>
> class MetaCopyableClass(type):
>     def __copy__(cls):
>         """ code to copy CLASSES of this metaclass """
>         # etc, etc, snipped
>
> class CopyableClass:
>     __metaclass__ = MetaCopyableClass
>     # rest of class snipped
>
> x = CopyableClass()
>
> import copy
> y = copy.copy(x)
>
> kallisti:/tmp alex$ python x.py
> Traceback (most recent call last):
>   File "x.py", line 14, in ?
>     y = copy.copy(x)
>   File "/usr/local/lib/python2.4/copy.py", line 79, in copy
>     return copier(x)
> TypeError: __copy__() takes exactly 1 argument (2 given)
> kallisti:/tmp alex$
>
> See?  copy.copy(x) ends up using MetaCopyableClass.__copy__ -- because
> of a getattr on CopyableClass for '__copy__', which gets the
> BOUND-METHOD defined in the metaclass, with im_self being
> CopyableClass.
>
> I had exactly the same metabug in the pep 246 reference implementation,
> Armin Rigo showed how to fix it in his only recent post.

Don't recall seeing that, but if you or he can fix this without breaking other stuff, it's clear you should go ahead.  (This worked in 2.2, FWIW; it broke in 2.3.)

-- --Guido van Rossum (home page: http://www.python.org/~guido/)

From mdehoon at ims.u-tokyo.ac.jp Wed Jan 12 08:38:30 2005 From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon) Date: Wed Jan 12 08:37:12 2005 Subject: [Python-Dev] PyOS_InputHook and threads Message-ID: <41E4D3F6.4070807@ims.u-tokyo.ac.jp>

I have started writing a patch that replaces PyOS_InputHook with PyOS_AddInputHook and PyOS_RemoveInputHook.
I am a bit confused though on how hook functions are supposed to work with threads. PyOS_InputHook is a pointer to a hook function, which can be defined for example in a C extension module. When Python is running in a single thread, PyOS_InputHook is called ten times per second while Python is waiting for the user to type something. This is achieved by setting readline's rl_event_hook function to PyOS_InputHook. When Python uses multiple threads, each thread has its own PyOS_InputHook (I am not sure if this was intended). However, with IDLE I noticed that the subprocess thread doesn't call its PyOS_InputHook. In IDLE (if I understand correctly how it works), one thread takes care of the GUI and the interaction with the user, while another thread executes the user's commands. If an extension module sets PyOS_InputHook, the PyOS_InputHook belonging to this second thread is set. However, this PyOS_InputHook does not get called. Is this simply an oversight? What would be a suitable place to add the call to PyOS_InputHook? In other words, where does the second thread go idle? --Michiel. > On Thu, Dec 09, 2004, Michiel Jan Laurens de Hoon wrote: > >>> >>> My suggestion is therefore to replace PyOS_InputHook by two functions >>> PyOS_AddInputHook and PyOS_RemoveInputHook, and let Python keep track of >>> which hooks are installed. This way, an extension module can add a hook >>> function without having to worry about other extension modules trying >>> to use the same hook. >>> >>> Any comments? Would I need to submit a PEP for this proposal? > > > Because this is only for the C API, your best bet is to write a patch > and submit it to SF. If people whine or it gets rejected, then write a > PEP. 
-- Michiel de Hoon, Assistant Professor University of Tokyo, Institute of Medical Science Human Genome Center 4-6-1 Shirokane-dai, Minato-ku Tokyo 108-8639 Japan http://bonsai.ims.u-tokyo.ac.jp/~mdehoon

From aleax at aleax.it Wed Jan 12 09:03:59 2005 From: aleax at aleax.it (Alex Martelli) Date: Wed Jan 12 09:04:07 2005 Subject: getting special from type, not instance (was Re: [Python-Dev] copy confusion) In-Reply-To: References: <5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com> Message-ID: <83E2B593-6470-11D9-ADA4-000A95EFAE9E@aleax.it>

Since this bug isn't the cause of Fredrik's problem I'm changing the subject (and keep discussing the specific problem that Fredrik uncovered under the original subject).

On 2005 Jan 12, at 05:11, Guido van Rossum wrote: ...

>> I had exactly the same metabug in the pep 246 reference
>> implementation,
>> Armin Rigo showed how to fix it in his only recent post.
>
> Don't recall seeing that, but if you or he can fix this without
> breaking other stuff, it's clear you should go ahead. (This worked in
> 2.2, FWIW; it broke in 2.3.)

Armin's fix was to change:

    conform = getattr(type(obj), '__conform__', None)

into:

    for basecls in type(obj).__mro__:
        if '__conform__' in basecls.__dict__:
            conform = basecls.__dict__['__conform__']
            break
    else:
        # not found

I have only cursorily examined the rest of the standard library, but it seems to me there may be a few other places where getattr is being used on a type for this purpose, such as pprint.py which has a couple of occurrences of

    r = getattr(typ, "__repr__", None)

Since this very same replacement is needed in more than one place for "get the following special attribute from the type of the object", it seems that a function to do it should be introduced in one place and used from where it's needed:

    def get_from_first_dict(dicts, name, default=None):
        for adict in dicts:
            try:
                return adict[name]
            except KeyError:
                pass
        return default

to be called, e.g. in the above example with '__conform__', as:

    conform = get_from_first_dict(
        (basecls.__dict__ for basecls in type(obj).__mro__),
        '__conform__'
    )

The needed function could of course be made less general, by giving more work to the function and less work to the caller, all the way down to:

    def getspecial(obj, name, default=None):
        for basecls in type(obj).__mro__:
            try:
                return basecls.__dict__[name]
            except KeyError:
                pass
        return default

to be called, e.g. in the above example with '__conform__', as:

    conform = getspecial(obj, '__conform__')

This has the advantage of not needing the genexp, so it's usable to implement the fix in 2.3.5 as well as in 2.4.1.  Moreover, it can specialcase old-style class instances to provide the backwards compatible behavior, if desired -- that doesn't matter (but doesn't hurt) to fix the bug in copy.py, because in that case old-style instances have already been specialcased previously, and might help to avoid breaking anything in other similar bugfixes, so that's what I would suggest:

    def getspecial(obj, name, default=None):
        if isinstance(obj, types.InstanceType):
            return getattr(obj, name, default)
        for basecls in type(obj).__mro__:
            try:
                return basecls.__dict__[name]
            except KeyError:
                pass
        return default

The tradeoff between using type(obj) and obj.__class__ isn't crystal-clear to me, but since the latter might apparently be faked by some proxy to survive isinstance calls type(obj) appears to me to be right.

Where in the standard library to place this function is not clear to me either.  Since it's going into bugfix-only releases, I assume it shouldn't be "published".  Maybe having it as copy._getspecial (i.e. with a private name) is best, as long as it's OK to introduce some coupling by having (e.g.) pprint call copy._getspecial too.
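To see what the helper buys over a plain getattr, here is the simpler getspecial (without the types.InstanceType branch, so it runs on current Python) applied to a metaclass example like the one from the copy.py thread; the class names are invented for illustration:

```python
def getspecial(obj, name, default=None):
    # Look the special attribute up only in the __dict__s along the
    # type's MRO, never in the instance and never in the metaclass.
    for basecls in type(obj).__mro__:
        try:
            return basecls.__dict__[name]
        except KeyError:
            pass
    return default

class MetaCopyable(type):
    def __copy__(cls):
        # Copies *classes* of this metaclass; takes only cls.
        return cls

class Copyable(metaclass=MetaCopyable):
    pass

x = Copyable()

# getattr-style lookup hands back the metaclass method bound to the
# class, which is exactly the method copy.copy(x) ends up miscalling.
assert getattr(type(x), '__copy__')() is Copyable

# getspecial correctly reports that instances define no __copy__.
assert getspecial(x, '__copy__') is None
```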
Performance might be a problem, but the few bugfix locations where a getattr would be replaced by this getspecial don't seem to be hotspots, so maybe we don't need to worry about it for 2.3 and 2.4 (it might be nice to have this functionality "published" in 2.5, of course, and then it should probably be made fast). Feedback welcome -- the actual patch will doubtless be tiny, but it would be nice to have it right the first time (as it needs to go into both the 2.3 and 2.4 bugfix branches and the 2.5 head). Alex From aleax at aleax.it Wed Jan 12 10:52:10 2005 From: aleax at aleax.it (Alex Martelli) Date: Wed Jan 12 10:52:20 2005 Subject: [Python-Dev] Re: copy confusion In-Reply-To: References: <5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com> Message-ID: On 2005 Jan 12, at 00:30, Fredrik Lundh wrote: > Guido van Rossum wrote: > >> The only thing this intends to break /.../ > > it breaks classic C types: True!!! And it only breaks copy, NOT deepcopy, because of the following difference between the two functions in copy.py...:

    def deepcopy(x, memo=None, _nil=[]):
        ...
        cls = type(x)
        copier = _deepcopy_dispatch.get(cls)
        if copier:
            y = copier(x, memo)
        else:
            try:
                issc = issubclass(cls, type)
            except TypeError: # cls is not a class (old Boost; see SF #502085)
                issc = 0
            if issc:
                y = _deepcopy_atomic(x, memo)
            else:
                copier = getattr(x, "__deepcopy__", None)

Now:

    >>> x = cElementTree.Element("tag")
    >>> cls = type(x)
    >>> issubclass(cls, type)
    False

therefore, copy.deepcopy ends up doing the getattr of '__deepcopy__' on x and lives happily ever after. Function copy.copy does NOT do that issubclass check, therefore it breaks Fredrik's code. > (and of course, custom C types is the only case where I've ever used > __copy__; the default behavior has worked just fine for all other > cases) > > for cElementTree, I've worked around this with an ugly __reduce__ hack, > but that doesn't feel right...
I think you're entirely correct and that we SHOULD bugfix copy.py so that function copy, just like function deepcopy, does the getattr from the object when not issubclass(cls, type). The comment suggests that check is there only for strange cases such as "old Boost" (presumably Boost Python in some previous incarnation) but it appears to me that it's working fine for your custom C type and that it would work just as well for __copy__ as it seems to do for __deepcopy__. The fix, again, should be a tiny patch -- and it seems to me that we should have it for 2.3.5 as well as for 2.4.1 and the HEAD. Alex From jim at zope.com Wed Jan 12 13:45:50 2005 From: jim at zope.com (Jim Fulton) Date: Wed Jan 12 13:45:56 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <20050111185020.GA28966@prometheusresearch.com> References: <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com> <5.1.1.6.0.20050111103431.03a8fc60@mail.telecommunity.com> <20050111185020.GA28966@prometheusresearch.com> Message-ID: <41E51BFE.2090008@zope.com> Clark C. Evans wrote: > On Tue, Jan 11, 2005 at 12:54:36PM -0500, Phillip J. Eby wrote: ... > | * In my experience, incorrectly deriving an interface from another is the > | most common source of unintended adaptation side-effects, not adapter > | composition > > It'd be nice if interfaces had a way to specify a test-suite that > could be run against a component which claims to be compliant. For > example, it could provide invalid inputs and assert that the proper > errors are returned, etc. We've tried this in Zope 3 with very limited success. In fact, so far, our attempts have provided more pain than they're worth. The problem is that interfaces are usually abstract enough that it's very difficult to write generic tests.
For example, many objects implement mapping protocols, but place restrictions on the values stored. It's hard to provide generic tests that don't require lots of inconvenient hooks. There are exceptions of course. Our ZODB storage tests use a generic storage-interface test, but this is possible because the ZODB storage interfaces are extremely concrete. Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From p.f.moore at gmail.com Wed Jan 12 14:44:50 2005 From: p.f.moore at gmail.com (Paul Moore) Date: Wed Jan 12 14:44:53 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: <79990c6b05011205445ea4af76@mail.gmail.com> On Wed, 12 Jan 2005 00:33:22 +0100, Alex Martelli wrote: > But adaptation is not transmission! It's PERFECTLY acceptable for an > adapter to facade: to show LESS information in the adapted object than > was in the original. It's PERFECTLY acceptable for an adapter to say > "this piece information is not known" when it's adapting an object for > which that information, indeed, is not known. It's only CONJOINING the > two perfectly acceptable adapters, as transitivity by adapter chain > would do automatically, that you end up with a situation that is > pragmatically undesirable: asserting that some piece of information is > not known, when the information IS indeed available -- just not by the > route automatically taken by the transitivity-system. [Risking putting my head above the parapet here :-)] If you have adaptations A->B, B->C, and A->C, I would assume that the system would automatically use the direct A->C route rather than A->B->C. I understand that this is what PyProtocols does. 
Are you mistakenly thinking that shortest-possible-route semantics aren't used? Maybe the PEP should explicitly require such semantics. If I'm missing the point here, I apologise. But I get the feeling that something's getting lost in the discussions. Paul. From p.f.moore at gmail.com Wed Jan 12 15:00:20 2005 From: p.f.moore at gmail.com (Paul Moore) Date: Wed Jan 12 15:00:23 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: <79990c6b05011206001a5a3805@mail.gmail.com> On Wed, 12 Jan 2005 00:33:22 +0100, Alex Martelli wrote: > By imposing transitivity, you're essentially asserting that, if a > programmer forgets to code and register an A -> C direct adapter, this > is never a problem, as long as A -> B and B -> C adapters are > registered, because A -> B -> C will give results just as good as the > direct A -> C would have, so there's absolutely no reason to trouble > the programmer about the trivial detail that transitivity is being > used. [...] > paragraph, then this is just weird: since you're implicitly asserting > that any old A->?->C transitive adaptation is just as good as a direct > A->C, why should you worry about there being more than one such 2-step > adaptation available? Roll the dice to pick one and just proceed. I know this is out-of-context picking, but I don't think I've ever seen anyone state that A->?->C is "just as good as" a direct A->C. I would have thought it self-evident that a shorter adaptation path is always better. And specifically, I know that Philip has stated that PyProtocols applies a shorter-is-better algorithm. Having pointed this out, I'll go back to lurking. You two are doing a great job of converging on something so far, so I'll let you get on with it. Paul. 
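[Editorial note: the shortest-path behavior Paul describes can be made concrete with a toy registry. Everything below is invented for illustration -- `AdapterRegistry`, `register`, and `adapt` are not the PyProtocols API, just a minimal sketch of the lookup rule under discussion.]

```python
from collections import deque

# Toy adapter registry illustrating shortest-path adaptation lookup.
# Illustrative only; not the PyProtocols implementation.

class AdapterRegistry:
    def __init__(self):
        self.adapters = {}   # (src, dst) -> adapter callable

    def register(self, src, dst, adapter):
        self.adapters[(src, dst)] = adapter

    def find_path(self, src, dst):
        # Breadth-first search returns a shortest chain of adapters,
        # so a direct src->dst registration always beats src->B->dst.
        if src == dst:
            return []
        seen = {src}
        queue = deque([(src, [])])
        while queue:
            node, path = queue.popleft()
            for (a, b), adapter in self.adapters.items():
                if a == node and b not in seen:
                    newpath = path + [adapter]
                    if b == dst:
                        return newpath
                    seen.add(b)
                    queue.append((b, newpath))
        return None

    def adapt(self, obj, src, dst):
        path = self.find_path(src, dst)
        if path is None:
            raise TypeError("no adaptation path from %r to %r" % (src, dst))
        for adapter in path:
            obj = adapter(obj)
        return obj

registry = AdapterRegistry()
registry.register('A', 'B', lambda x: x + '->B')
registry.register('B', 'C', lambda x: x + '->C')

# Only the two-step chain exists, so A->B->C is composed transitively:
assert registry.adapt('a', 'A', 'C') == 'a->B->C'

# Once a direct adapter is registered, the shorter path wins:
registry.register('A', 'C', lambda x: x + '->C-direct')
assert registry.adapt('a', 'A', 'C') == 'a->C-direct'
```

Note what this sketch does *not* decide: whether composing A->B->C when no direct adapter exists is a feature or a bug is exactly the point of contention in the rest of the thread.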
From aleax at aleax.it Wed Jan 12 15:06:49 2005 From: aleax at aleax.it (Alex Martelli) Date: Wed Jan 12 15:06:53 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <79990c6b05011205445ea4af76@mail.gmail.com> References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> Message-ID: <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> On 2005 Jan 12, at 14:44, Paul Moore wrote: > On Wed, 12 Jan 2005 00:33:22 +0100, Alex Martelli > wrote: >> But adaptation is not transmission! It's PERFECTLY acceptable for an >> adapter to facade: to show LESS information in the adapted object than >> was in the original. It's PERFECTLY acceptable for an adapter to say >> "this piece information is not known" when it's adapting an object for >> which that information, indeed, is not known. It's only CONJOINING >> the >> two perfectly acceptable adapters, as transitivity by adapter chain >> would do automatically, that you end up with a situation that is >> pragmatically undesirable: asserting that some piece of information is >> not known, when the information IS indeed available -- just not by the >> route automatically taken by the transitivity-system. > > [Risking putting my head above the parapet here :-)] > > If you have adaptations A->B, B->C, and A->C, I would assume that the > system would automatically use the direct A->C route rather than > A->B->C. I understand that this is what PyProtocols does. Yes, it is. > Are you mistakenly thinking that shortest-possible-route semantics > aren't used? Maybe the PEP should explicitly require such semantics. No, I'm not. 
I'm saying that if, by mistake, the programmer has NOT registered the A->C adapter (which would be easily coded and work perfectly), then thanks to transitivity, instead of a clear and simple error message leading to immediate diagnosis of the error, they'll get a subtle unnecessary degradation of information and resulting reduction in information quality. PyProtocols' author claims this can't happen because if adapters A->B and B->C are registered then each adapter is always invariably claiming to be lossless and perfect. However, inconsistently with that stance, I believe that PyProtocols does give an error message if it finds two adaptation paths of equal minimum length, A->B->C or A->Z->C -- if it is truly believed that each adaptation step is lossless and perfect, it's inconsistent to consider the existence of two equal-length paths an error... either path should be perfect, so just picking either one of them should be a perfectly correct strategy. > If I'm missing the point here, I apologise. But I get the feeling that > something's getting lost in the discussions. The discussions on this subject always and invariably get extremely long (and often somewhat heated, too), so it's quite possible that a lot is getting lost along the way, particularly to any other reader besides the two duelists. Thus, thanks for focusing on one point that might well be missed by other readers (though not by either PJE or me;-) and giving me a chance to clarify it! 
Alex From marktrussell at btopenworld.com Wed Jan 12 15:27:08 2005 From: marktrussell at btopenworld.com (Mark Russell) Date: Wed Jan 12 15:27:11 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: <1105540027.5326.19.camel@localhost> I strongly prefer *not* to have A->B and B->C automatically used to construct A->C. Explicit is better than implicit, if in doubt don't guess, etc etc. So I'd support: - As a first cut, no automatic transitive adaptation - Later, and only if experience shows there's a need for it, add a way to say "this adaptor can be used as part of a transitive chain" Mark Russell From aleax at aleax.it Wed Jan 12 15:48:41 2005 From: aleax at aleax.it (Alex Martelli) Date: Wed Jan 12 15:48:45 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <79990c6b05011206001a5a3805@mail.gmail.com> References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011206001a5a3805@mail.gmail.com> Message-ID: <0CC2F6D6-64A9-11D9-ADA4-000A95EFAE9E@aleax.it> On 2005 Jan 12, at 15:00, Paul Moore wrote: > On Wed, 12 Jan 2005 00:33:22 +0100, Alex Martelli > wrote: >> By imposing transitivity, you're essentially asserting that, if a >> programmer forgets to code and register an A -> C direct adapter, this >> is never a problem, as long as A -> B and B -> C adapters are >> registered, because A -> B -> C will give results just as good as the >> direct A -> C would have, so there's absolutely no reason to trouble >> the programmer about the trivial detail that transitivity is being >> used. > [...] 
>> paragraph, then this is just weird: since you're implicitly asserting >> that any old A->?->C transitive adaptation is just as good as a direct >> A->C, why should you worry about there being more than one such 2-step >> adaptation available? Roll the dice to pick one and just proceed. > > I know this is out-of-context picking, but I don't think I've ever > seen anyone state that A->?->C is "just as good as" a direct A->C. I > would have thought it self-evident that a shorter adaptation path is > always better. And specifically, I know that Philip has stated that > PyProtocols applies a shorter-is-better algorithm. Yes, he has. If A->C is registered as a direct adaptation, it gets used and everybody lives happily ever after. The controversy comes when A->C is *NOT* registered as a direct adaptation. If there is no degradation of information quality, etc, at any intermediate step, picking the shortest path is still sensible because of likely performance consideration. Each adaptation step might put some kind of wrapper/proxy/adapter object in the mix, delegate calls, etc. Of course, it's possible that some such wrappers are coded much more tighter &c, so that in fact some roundabout A -> X1 -> X2 -> C would actually be better performing than either A -> B -> C or A -> Z -> C, but using one of the shortest available paths appears to be a reasonable heuristic for what, if one "assumes away" any degradation, is after all a minor issue. Demanding that the set of paths of minimal available length has exactly one element is strange, though, IF one is assuming that all adaptation paths are exactly equivalent except at most for secondary issues of performance (which are only adjudicated by the simplest heuristic: if those issues were NOT considered minor/secondary, then a more sophisticated scheme would be warranted, e.g. 
by letting the programmer associate a cost to each step, picking the lowest-cost path, AND letting the caller of adapt() also specify the maximal acceptable cost or at least obtain the cost associated with the chosen path). Personally, I disagree with having transitivity at all, unless perhaps it be restricted to adaptations specifically and explicitly stated to be "perfect and lossless"; PJE claims that ALL adaptations MUST, ALWAYS, be "perfect and lossless" -- essentially, it seems to me, he _has_ to claim that, to defend transitivity being applied automatically, relentlessly, NON-optionally, NON-selectively (but then the idea of giving an error when two or more shortest-paths have the same length becomes dubious). Even C++ at least lets a poor beleaguered programmer assert that a conversion (C++ does not have adaptation, but it does have conversion) is _EXPLICIT_, meaning that it only applies as a single isolated step and NOT as a part of one of those automatic transitive conversion-chains which so often produce amazing, hell-to-debug results. That's the wrong default (explicit should be the default, "usable transitively" should need to be asserted outright), explainable by the usual historical and backwards compatibility reason (just like having methods default to non-virtual, etc, etc), but at least it's THERE -- a way to stop transitivity and restore sanity. I have not yet seen PJE willing to compromise on this point -- having two categories or grades of adaptations, one "perfect, lossless, noiseless, impeccable" usable transitively and one "sublunar" NOT usable transitively.
((If he was, we could still argue on which one should be the default;-)) Much the same applies to inheritance, BTW, which as PJE has pointed out a few times also induces transitivity in adaptation, and, according to him, is a more likely cause of bugs than chains of adapters (I have no reason to doubt that any way to induce transitivity without very clearly stating that one WANTS that effect can be extremely bug-prone). So, yes, I'd _also_ love to have two grades of inheritance, one of the "total commitment" kind (implying transitivity and whatever), and one more earthly a la ``I'm just doing some convenient reuse, leave me alone''. Here, regretfully, I'll admit C++ has the advantage, since ``private inheritance'' is exactly that inferior, implementation-only kind (I'm perfectly happy with Python NOT having private methods nor attributes, but private INHERITANCE sometimes I still miss;-). Ah well, can't have everything. While I hope we can offer some lifesaver to those poor practicing programmers whose inheritance-structures aren't always perfect and pristine, if the only way to treat interface-inheritance is the Hyperliskovian one, ah well, we'll survive. BTW, Microsoft's COM's interfaces ONLY have the "inferior" kind of inheritance. You can say that interface ISub inherits from IBas: this means that ISub has all the same methods as IBas with the same signatures, plus it may have other methods; it does *NOT* mean that anything implementing ISub must also implement IBas, nor that a QueryInterface on an ISub asking for an IBas must succeed, or anything of that kind. In many years of COM practice I have NEVER found this issue to be a limitation -- it works just fine. I do not know CORBA anywhere as well as I do COM, but, doesn't CORBA interface inheritance, per OMG's standards, also work that way? Alex From pje at telecommunity.com Wed Jan 12 16:12:26 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Wed Jan 12 16:12:30 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <1105540027.5326.19.camel@localhost> References: <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> At 02:27 PM 1/12/05 +0000, Mark Russell wrote: >I strongly prefer *not* to have A->B and B->C automatically used to >construct A->C. Explicit is better than implicit, if in doubt don't >guess, etc etc. > >So I'd support: > > - As a first cut, no automatic transitive adaptation > > - Later, and only if experience shows there's a need for it, Well, by the experience of the people who use it, there is a need, so it's already "later". :) And at least my experience *also* shows that transitive interface inheritance with adaptation is much easier to accidentally screw up than transitive adapter composition -- despite the fact that nobody objects to the former. But if you'd like to compare the two approaches pragmatically, try using both zope.interface and PyProtocols, and see what sort of issues you run into. They have pretty much identical interface syntax, and you can use the PyProtocols declaration API and 'adapt' function to do interface declarations for either Zope interfaces or PyProtocols interfaces -- and the adaptation semantics follow Zope if you're using Zope interfaces. So, you can literally flip between the two by changing where you import the 'Interface' class from. 
Both Zope and PyProtocols support the previous draft of PEP 246; the new draft adds only two new features: * Ability for a class to opt out of the 'isinstance()' check for a base class (i.e., for a class to say it's not substitutable for its base class, for Alex's "private inheritance" use case) * Ability to have a global type->protocol adapter registry Anyway, I'm honestly curious as to whether anybody can find a real situation where transitive adapter composition is an *actual* problem, as opposed to a theoretical one. I've heard a lot of people talk about what a bad idea it is, but I haven't heard any of them say they actually tried it. Conversely, I've also heard from people who *have* tried it, and liked it. However, at this point I have no way to know if this dichotomy is just a reflection of the fact that people who don't like the idea don't try it, and the people who either like the idea or don't care are open to trying it. The other thing that really blows my mind is that the people who object to the idea don't get that transitive interface inheritance can produce the exact same problem, and it's more likely to happen in actual *practice*, than it is in theory. As for the issue of what should and shouldn't exist in Python, it doesn't really matter; PEP 246 doesn't (and can't!) *prohibit* transitive adaptation. However, I do strongly object to the spreading of theoretical FUD about a practical, useful technique, much as I would object to people saying that "using significant whitespace is braindead" who had never tried actually using Python. The theoretical problems with transitive adapter composition are in my experience just as rare as whitespace-eating nanoviruses from outer space. 
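[Editorial note: to make the disputed failure mode concrete, here is a contrived sketch -- not code from PyProtocols, Zope, or the PEP 246 reference implementation -- of the scenario Alex describes earlier in the thread: two individually reasonable adapters whose automatic composition silently loses information that a forgotten direct adapter would have preserved. All class and function names are invented.]

```python
# Contrived illustration of the lossy-composition worry: each adapter
# is individually a perfectly acceptable facade, but chaining them
# degrades information that was available in the original object.

class Person:                       # rich source object
    def __init__(self, name, email):
        self.name = name
        self.email = email

class NameCard:                     # intermediate interface: name only
    def __init__(self, name):
        self.name = name

class Contact:                      # target interface: name + email
    def __init__(self, name, email):
        self.name = name
        self.email = email

# Adapter 1: a facade -- dropping the email here is acceptable,
# since NameCard simply has no slot for it.
def person_to_namecard(p):
    return NameCard(p.name)

# Adapter 2: for a bare NameCard the email genuinely is unknown,
# so None is an acceptable answer *for that source*.
def namecard_to_contact(c):
    return Contact(c.name, None)

# The transitive chain Person -> NameCard -> Contact "forgets" an
# email that was available all along -- with no error raised:
ann = Person("Ann", "ann@example.org")
chained = namecard_to_contact(person_to_namecard(ann))
assert chained.email is None        # information silently lost

# The direct adapter the programmer forgot to register would keep it:
def person_to_contact(p):
    return Contact(p.name, p.email)

direct = person_to_contact(ann)
assert direct.email == "ann@example.org"
```

Whether this counts as a realistic hazard (Alex's position) or a theoretical one that registration discipline avoids in practice (Phillip's position) is precisely what the surrounding messages argue about.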
From gvanrossum at gmail.com Wed Jan 12 16:26:36 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Jan 12 16:26:40 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: [Alex] > I'm saying that if, by mistake, the programmer has NOT > registered the A->C adapter (which would be easily coded and work > perfectly), then thanks to transitivity, instead of a clear and simple > error message leading to immediate diagnosis of the error, they'll get > a subtle unnecessary degradation of information and resulting reduction > in information quality. I understand, but I would think that there are just as many examples of cases where having to register a trivial A->C adapter is much more of a pain than it's worth; especially if there are a number of A->B pairs and a number of B->C pairs, the number of additional A->C pairs needed could be bewildering. But I would like to see some input from people with C++ experience. C++ goes to great lengths to pick automatic conversions (which perhaps aren't quite the same as adaptations but close enough for this comparison to work) and combine them. *In practice*, is this a benefit or a liability? 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From aleax at aleax.it Wed Jan 12 16:36:35 2005 From: aleax at aleax.it (Alex Martelli) Date: Wed Jan 12 16:36:41 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> References: <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> Message-ID: On 2005 Jan 12, at 16:12, Phillip J. Eby wrote: > At 02:27 PM 1/12/05 +0000, Mark Russell wrote: >> I strongly prefer *not* to have A->B and B->C automatically used to >> construct A->C. Explicit is better than implicit, if in doubt don't >> guess, etc etc. >> >> So I'd support: >> >> - As a first cut, no automatic transitive adaptation >> >> - Later, and only if experience shows there's a need for it, > > Well, by the experience of the people who use it, there is a need, so > it's already "later". :) And at least my experience *also* shows > that transitive interface inheritance with adaptation is much easier > to accidentally screw up than transitive adapter composition -- > despite the fact that nobody objects to the former. A-hem -- I *grumble* about the former (and more generally the fact that inheritance is taken as so deucedly *committal*:-). If it doesn't really count as a "complaint" it's only because I doubt I can do anything about it and I don't like tilting at windmills. 
But, I _DO_ remember Microsoft's COM, with inheritance of interface *NOT* implying anything whatsoever (except the fact that the inheriting one has all the methods of the inherited one with the same signature, w/o having to copy and paste, plus of course you can add more) -- I remember that idea with fondness, as I do many other features of a components-system that, while definitely not without defects, was in many respects a definite improvement over the same respects in its successors. > The other thing that really blows my mind is that the people who > object to the idea don't get that transitive interface inheritance can > produce the exact same problem, and it's more likely to happen in > actual *practice*, than it is in theory. Believe me, I'm perfectly glad to believe that [a] implied transitivity in any form, and [b] hypercommittal inheritance, cause HUGE lots of problems; and to take your word that the combination is PARTICULARLY bug-prone in practice. It's just that I doubt I can do anything much to help the world avoid that particular blight. > As for the issue of what should and shouldn't exist in Python, it > doesn't really matter; PEP 246 doesn't (and can't!) *prohibit* > transitive adaptation. However, I do strongly object to the spreading > of theoretical FUD about a practical, useful technique, much as I > would object to people saying that "using significant whitespace is > braindead" who had never tried actually using Python. The theoretical > problems with transitive adapter composition are in my experience just > as rare as whitespace-eating nanoviruses from outer space. 
Well, I'm not going to start real-life work on a big and complicated system (the kind where such problems would emerge) relying on a technique I'm already dubious about, if I have any say in the matter, so of course I'm unlikely to gain much real-life experience -- I'm quite content, unless somebody should be willing to pay me adequately for my work yet choose to ignore my advice in the matter;-), to rely on imperfect analogies with other experiences based on other kinds of unwanted and unwarranted but uncontrollable and unstoppable applications of transitivity by underlying systems and frameworks. I already know -- you told us so -- that if I had transitivity as you wish it (uncontrollable, unstoppable, always-on) I could not any more write and register a perfectly reasonable adapter which fills in with a NULL an optional field in the adapted-to interface, without facing undetected degradation of information quality by that adapter being invisibly, uncontrollably chained up with another -- no error message, no nothing, no way to stop this -- just because a direct adapter wasn't correctly written and registered. Just this "detail", for me, is reason enough to avoid using any framework that imposes such noncontrollable transitivity, if I possibly can. Alex From theller at python.net Wed Jan 12 16:44:57 2005 From: theller at python.net (Thomas Heller) Date: Wed Jan 12 16:43:35 2005 Subject: getattr and __mro__ (was Re: [Python-Dev] PEP 246, redux) In-Reply-To: <20050111124157.GA16642@vicky.ecs.soton.ac.uk> (Armin Rigo's message of "Tue, 11 Jan 2005 12:41:57 +0000") References: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com> <20050111124157.GA16642@vicky.ecs.soton.ac.uk> Message-ID: Armin Rigo writes: > ... is that the __adapt__() and __conform__() methods should work just > like all other special methods for new-style classes. 
The confusion > comes from the fact that the reference implementation doesn't do that. > It should be fixed by replacing: >
>     conform = getattr(type(obj), '__conform__', None)
>
> with:
>
>     for basecls in type(obj).__mro__:
>         if '__conform__' in basecls.__dict__:
>             conform = basecls.__dict__['__conform__']
>             break
>     else:
>         # not found
>
> and the same for '__adapt__'. > > The point about tp_xxx slots is that when implemented in C with slots, you get > the latter (correct) effect for free. This is how metaconfusion is avoided in > post-2.2 Python. Using getattr() for that is essentially broken. Trying to > call the method and catching TypeErrors seems pretty fragile -- e.g. if you > are calling a __conform__() which is implemented in C you won't get a Python > frame in the traceback either. I'm confused. Do you mean that getattr(obj, "somemethod")(...) does something different than obj.somemethod(...) with new style class instances? Doesn't getattr search the __dict__'s along the __mro__ list? Thomas From gvanrossum at gmail.com Wed Jan 12 16:45:55 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Jan 12 16:46:02 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <0CC2F6D6-64A9-11D9-ADA4-000A95EFAE9E@aleax.it> References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011206001a5a3805@mail.gmail.com> <0CC2F6D6-64A9-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: [Alex] > Of course, it's possible that some such wrappers are coded much > more tighter &c, so that in fact some roundabout A -> X1 -> X2 -> C > would actually be better performing than either A -> B -> C or A -> Z > -> C, but using one of the shortest available paths appears to be a > reasonable heuristic for what, if one "assumes away" any degradation, > is after all a minor issue.
I would think that the main reason for preferring the shortest path is the two degenerate cases, A->A (no adaptation necessary) and A->C (a direct adapter is available). These are always preferable over longer possibilities. > Demanding that the set of paths of minimal available length has exactly > one element is strange, though, I think you're over-emphasizing this point (in several messages); somehow you sound a bit like you're triumphant over having found a bug in your opponent's reasoning. [...] > So, yes, I'd _also_ love to have two grades of inheritance, one of the > "total commitment" kind (implying transitivity and whatever), and one > more earthly a la ``I'm just doing some convenient reuse, leave me > alone''. I'll bet that the list of situations where occasionally you wish you had more control over Python's behavior is a lot longer than that, and I think that if we started implementing that wish list (or anybody's wish list), we would soon find that we had destroyed Python's charming simplicity. My personal POV here: even when you break Liskov in subtle ways, there are lots of situations where assuming substitutability has no ill effects, so I'm happy to pretend that a subclass is always a subtype of all of its base classes, (and their base classes, etc.). If it isn't, you can always provide an explicit adapter to rectify things. As an example where a subclass that isn't a subtype can be used successfully, consider a base class that defines addition to instances of the same class. Now consider a subclass that overrides addition to only handle addition to instances of that same subclass; this is a Liskov violation. Now suppose the base class also has a factory function that produces new instances, and the subclass overrides this to produce new instances of the subclass. 
Then a function designed to take an instance of the base class and return
the sum of the instances produced by calling the factory method a few times
will work perfectly with a subclass instance as argument. Concrete:

    class B:
        def add(self, other: B) -> B: ...
        def factory(self) -> B: ...

    class C(B):
        def add(self, other: C) -> C: ...   # "other: C" violates Liskov
        def factory(self) -> C: ...

    def foo(x: B) -> B:
        x1 = x.factory()
        x2 = x.factory()
        return x1.add(x2)

This code works fine in today's Python if one leaves the type declarations
out. I don't think anybody is served by forbidding it.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gvanrossum at gmail.com Wed Jan 12 16:49:20 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Jan 12 16:49:25 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<1105540027.5326.19.camel@localhost>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
Message-ID: 

[Phillip]
> As for the issue of what should and shouldn't exist in Python, it doesn't
> really matter; PEP 246 doesn't (and can't!) *prohibit* transitive
> adaptation.

Really? Then isn't it underspecified? I'd think that by the time we
actually implement PEP 246 in the Python core, this part of the
semantics should be specified (at least the default behavior, even if
there are hooks to change this).
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From aleax at aleax.it Wed Jan 12 16:50:55 2005 From: aleax at aleax.it (Alex Martelli) Date: Wed Jan 12 16:51:00 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: On 2005 Jan 12, at 16:26, Guido van Rossum wrote: ... > [Alex] >> I'm saying that if, by mistake, the programmer has NOT >> registered the A->C adapter (which would be easily coded and work >> perfectly), then thanks to transitivity, instead of a clear and simple >> error message leading to immediate diagnosis of the error, they'll get >> a subtle unnecessary degradation of information and resulting >> reduction >> in information quality. > > I understand, but I would think that there are just as many examples > of cases where having to register a trivial A->C adapter is much more > of a pain than it's worth; especially if there are a number of A->B > pairs and a number of B->C pairs, the number of additional A->C pairs > needed could be bewildering. Hm? For any A and B there can be only one A->B adapter registered. Do you mean a number of A->B1, B1->C1 ; A->B2, B2->C2; etc? Because if it was B1->C and B2->C, as I understand the transitivity of PyProtocols, it would be considered an error. > But I would like to see some input from people with C++ experience. Here I am, at your service. I've done, taught, mentored, etc, much more C++ than Python in my life. 
I was technical leader for the whole C -> C++ migration of a SW house which at that time had more than 100 programmers (just as I had earlier been for the Fortran -> C migration back a few years previously, with around 30 programmers): I taught internal courses, seminars and workshops on C++, its differences from C, OO programming and design, Design Patterns, and later generic programming, the STL, and so on, and so forth. I mentored a lot of people (particularly small groups of people that would later go and teach/mentor the others), pair-programmed in the most critical migrations across the breadth of that SW house's software base, etc, etc. FWIW, having aced Brainbench's C++ tests (I was evaluating them to see if it would help us select among candidates claiming C++ skills), I was invited to serve for a while as one of their "Most Valued Professionals" (MVPs) for C++, and although I had concluded that for that SW house's purposes the tests weren't all that useful, I did, trying to see if I could help make them better (more suitable to test _real-world_ skills and less biased in favour of people with that "language-lawyer" or "library-packrat" kind of mentality I have, which is more useful in tests than out in the real world). I hope I can qualify as a C++ expert by any definition. > C++ goes to great lengths to pick automatic conversions (which perhaps > aren't quite the same as adaptations but close enough for this > comparison to work) I agree with you, though I believe PJE doesn't (he doesn't accept my experience with such conversions as a valid reason for me to be afraid of "close enough for this comparison" adaptations). > and combine them. *In practice*, is this a benefit > or a liability? 
It's in the running for the coveted "Alex's worst nightmare" prize, with a few other features of C++ - alternatively put, the prize for "reason making Alex happiest to have switched to Python and _almost_ managed to forget C++ save when he wakes up screaming in the middle of the night";-). Alex From gvanrossum at gmail.com Wed Jan 12 16:52:15 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Jan 12 16:52:18 2005 Subject: getattr and __mro__ (was Re: [Python-Dev] PEP 246, redux) In-Reply-To: References: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com> <20050111124157.GA16642@vicky.ecs.soton.ac.uk> Message-ID: [Armin] > > ... is that the __adapt__() and __conform__() methods should work just > > like all other special methods for new-style classes. The confusion > > comes from the fact that the reference implementation doesn't do that. > > It should be fixed by replacing: > > > > conform = getattr(type(obj), '__conform__', None) > > > > with: > > > > for basecls in type(obj).__mro__: > > if '__conform__' in basecls.__dict__: > > conform = basecls.__dict__['__conform__'] > > break > > else: > > # not found > > > > and the same for '__adapt__'. > > > > The point about tp_xxx slots is that when implemented in C with slots, you get > > the latter (correct) effect for free. This is how metaconfusion is avoided in > > post-2.2 Python. Using getattr() for that is essentially broken. Trying to > > call the method and catching TypeErrors seems pretty fragile -- e.g. if you > > are calling a __conform__() which is implemented in C you won't get a Python > > frame in the traceback either. [Thomas] > I'm confused. Do you mean that > > getattr(obj, "somemethod")(...) > > does something different than > > obj.somemethod(...) > > with new style class instances? Doesn't getattr search the __dict__'s > along the __mro__ list? 
No, he's referring to the (perhaps not widely advertised) fact that obj[X] is not quite the same as obj.__getitem__(X) since the explicit method invocation will find obj.__dict__["__getitem__"] if it exists but the operator syntax will start the search with obj.__class__.__dict__. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From aleax at aleax.it Wed Jan 12 16:52:53 2005 From: aleax at aleax.it (Alex Martelli) Date: Wed Jan 12 16:52:59 2005 Subject: getattr and __mro__ (was Re: [Python-Dev] PEP 246, redux) In-Reply-To: References: <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110113559.02bc2350@mail.telecommunity.com> <5.1.1.6.0.20050110150424.039f0590@mail.telecommunity.com> <20050111124157.GA16642@vicky.ecs.soton.ac.uk> Message-ID: <052FE6C7-64B2-11D9-ADA4-000A95EFAE9E@aleax.it> On 2005 Jan 12, at 16:44, Thomas Heller wrote: ... >> conform = getattr(type(obj), '__conform__', None) ... > I'm confused. Do you mean that > > getattr(obj, "somemethod")(...) > > does something different than > > obj.somemethod(...) > > with new style class instances? Doesn't getattr search the __dict__'s > along the __mro__ list? Yes, but getattr(obj, ... ALSO searches obj itself, which is what we're trying to avoid here. getattr(type(obj), ... OTOH has a DIFFERENT problem -- it ALSO searches type(type(obj)), the metaclass, which we DON'T want. 
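The three lookup behaviours being contrasted here can be seen side by side in a short sketch (modern class syntax; the marker strings and the helper `find_conform` are illustrative, not part of any real implementation):

```python
class Meta(type):
    def __conform__(cls, protocol):
        return "from metaclass"

class A(object, metaclass=Meta):
    def __conform__(self, protocol):
        return "from class"

class B(object, metaclass=Meta):
    pass  # no __conform__ of its own

a, b = A(), B()
a.__dict__["__conform__"] = lambda protocol: "from instance"

# getattr(obj, ...) also searches the instance itself -- what PEP 246
# wants to avoid:
assert getattr(a, "__conform__")(None) == "from instance"

# getattr(type(obj), ...) skips the instance but ALSO finds a metaclass
# __conform__ -- the "metaconfusion" Alex mentions:
assert getattr(type(b), "__conform__")(None) == "from metaclass"

# Walking type(obj).__mro__ consults only the class and its bases:
def find_conform(obj):
    for basecls in type(obj).__mro__:
        if "__conform__" in basecls.__dict__:
            return basecls.__dict__["__conform__"]
    return None

assert find_conform(a)(a, None) == "from class"
assert find_conform(b) is None
```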
Alex From aleax at aleax.it Wed Jan 12 17:02:11 2005 From: aleax at aleax.it (Alex Martelli) Date: Wed Jan 12 17:02:16 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011206001a5a3805@mail.gmail.com> <0CC2F6D6-64A9-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: <51767A6A-64B3-11D9-ADA4-000A95EFAE9E@aleax.it> On 2005 Jan 12, at 16:45, Guido van Rossum wrote: > My personal POV here: even when you break Liskov in subtle ways, there > are lots of situations where assuming substitutability has no ill > effects, so I'm happy to pretend that a subclass is always a subtype > of all of its base classes, (and their base classes, etc.). If it > isn't, you can always provide an explicit adapter to rectify things. Ah, this is the crucial point: an explicit adapter must take precedence over substitutability that is assumed by subclassing. From my POV this does just as well as any other kind of explicit control about whether subclassing implies substitutability. In retrospect, that's the same strategy as in copy.py: *FIRST*, check the registry -- if something is found in the registry, THAT takes precedence. *THEN*, only for cases where the registry doesn't give an answer, proceed with other steps and checks and sub-strategies. So, I think PEP 246 should specify that the step now called (e) [checking the registry] comes FIRST; then, an isinstance step [currently split between (a) and (d)], then __conform__ and __adapt__ steps [currently called (b) and (c)]. Checking the registry is after all very fast: make the 2-tuple (type(obj), protocol), use it to index into the registry -- period. So, it's probably not worth complicating the semantics at all just to "fast path" the common case. 
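A registry-first adapt() along those lines might look like the following sketch. All names here (`registry`, `register`, `adapt`, the sample classes) are illustrative only; this shows the proposed ordering, not the PEP's reference implementation:

```python
registry = {}  # (type, protocol) -> adapter callable

def register(tp, protocol, adapter):
    registry[tp, protocol] = adapter

def adapt(obj, protocol):
    # Step (e), the registry check, comes FIRST: index with the
    # 2-tuple (type(obj), protocol) -- period.
    adapter = registry.get((type(obj), protocol))
    if adapter is not None:
        return adapter(obj)
    # Then the isinstance step (the old (a)/(d)):
    if isinstance(protocol, type) and isinstance(obj, protocol):
        return obj
    # The __conform__ / __adapt__ steps (the old (b)/(c)) would go here.
    raise TypeError("cannot adapt %r to %r" % (obj, protocol))

class Person(object): pass
class FullName(object): pass

# An explicit registration takes precedence over everything else:
register(Person, FullName, lambda p: "adapted")
assert adapt(Person(), FullName) == "adapted"

# With no registry entry, isinstance substitutability still applies:
assert isinstance(adapt(Person(), Person), Person)
```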
I intend to restructure pep246 at next rewrite to reflect this "obvious once thought of" idea, and thanks, Guido, for providing it. Alex From aleax at aleax.it Wed Jan 12 17:04:42 2005 From: aleax at aleax.it (Alex Martelli) Date: Wed Jan 12 17:04:48 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <1105540027.5326.19.camel@localhost> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> Message-ID: On 2005 Jan 12, at 16:49, Guido van Rossum wrote: > [Phillip] >> As for the issue of what should and shouldn't exist in Python, it >> doesn't >> really matter; PEP 246 doesn't (and can't!) *prohibit* transitive >> adaptation. > > Really? Then isn't it underspecified? I'd think that by the time we > actually implement PEP 246 in the Python core, this part of the > semantics should be specified (at least the default behavior, even if > there are hooks to change this). Very good point -- thanks Phillip and Guido jointly for pointing this out. Alex From dw at botanicus.net Wed Jan 12 17:26:06 2005 From: dw at botanicus.net (David Wilson) Date: Wed Jan 12 17:26:10 2005 Subject: [Python-Dev] PATCH/RFC for AF_NETLINK support In-Reply-To: <20050111141523.32125.1902928401.divmod.quotient.6074@ohm> References: <20050111013252.GA216@thailand.botanicus.net> <20050111141523.32125.1902928401.divmod.quotient.6074@ohm> Message-ID: <20050112162606.GA48911@thailand.botanicus.net> On Tue, Jan 11, 2005 at 02:15:23PM +0000, Jp Calderone wrote: > > I would like to see (optional?) support for this before your patch is > > merged. I have a long-term interest in a Python-based service control / > > init replacement / system management application, for use in specialised > > environments. I could definately use this. 
:)

> Useful indeed, but I'm not sure why basic NETLINK support should be
> held up for it?

Point taken. I don't recall why I thought special code would be required
for this.

I was thinking a little more about how support might be added for older
kernels. No harm can be done by compiling in the constant, and it doesn't
cost much. How about:

    #include ...

    #ifndef NETLINK_KOBJECT_UEVENT
    #define NETLINK_KOBJECT_UEVENT 15
    #endif

    /* Code assuming build host supports KOBJECT_UEVENT. */

Type thing.

Cheers,

David.

-- 
... do you think I'm going to waste my time trying to pin physical
interpretations upon every optical illusion of our instruments? Since when
is the evidence of our senses any match for the clear light of reason?
    -- Cutie, in Asimov's Reason

From pje at telecommunity.com Wed Jan 12 17:40:43 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Jan 12 17:39:55 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To: 
References: <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com>
	<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
	<2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it>
	<79990c6b05011205445ea4af76@mail.gmail.com>
	<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
	<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050112113929.037a5260@mail.telecommunity.com>

At 04:36 PM 1/12/05 +0100, Alex Martelli wrote:
>I already know -- you told us so -- that if I had transitivity as you wish
>it (uncontrollable, unstoppable, always-on) I could not any more write and
>register a perfectly reasonable adapter which fills in with a NULL an
>optional field in the adapted-to interface, without facing undetected
>degradation of information quality by that adapter being invisibly,
>uncontrollably chained up with another -- no error message, no nothing, no
>way to stop this -- just because a direct adapter wasn't
correctly written >and registered. But why would you *want* to do this, instead of just explicitly converting? That's what I don't understand. If I were writing such a converter, I wouldn't want to register it for ANY implicit conversion, even if it was non-transitive! From aleax at aleax.it Wed Jan 12 18:18:47 2005 From: aleax at aleax.it (Alex Martelli) Date: Wed Jan 12 18:18:54 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5.1.1.6.0.20050112113929.037a5260@mail.telecommunity.com> References: <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <5.1.1.6.0.20050112113929.037a5260@mail.telecommunity.com> Message-ID: <04EFD92E-64BE-11D9-ADA4-000A95EFAE9E@aleax.it> On 2005 Jan 12, at 17:40, Phillip J. Eby wrote: > At 04:36 PM 1/12/05 +0100, Alex Martelli wrote: >> I already know -- you told us so -- that if I had transitivity as you >> wish it (uncontrollable, unstoppable, always-on) I could not any more >> write and register a perfectly reasonable adapter which fills in with >> a NULL an optional field in the adapted-to interface, without facing >> undetected degradation of information quality by that adapter being >> invisibly, uncontrollably chained up with another -- no error >> message, no nothing, no way to stop this -- just because a direct >> adapter wasn't correctly written and registered. > > But why would you *want* to do this, instead of just explicitly > converting? That's what I don't understand. If I were writing such a > converter, I wouldn't want to register it for ANY implicit conversion, > even if it was non-transitive! 
Say I have an SQL DB with a table such as:

    CREATE TABLE fullname (
        first VARCHAR(50) NOT NULL,
        middle VARCHAR(50),
        last VARCHAR(50) NOT NULL,
        -- snipped other information fields
    )

Now, I need to record a lot of names of people, which I get from a vast
variety of sources, so they come in as different types. No problem: I'll
just adapt each person-holding type to an interface which offers first,
middle and last names (as well as other information fields, here snipped),
using None to mean I don't know the middle name for a given person (that's
what NULL means, after all: "information unknown" or the like; the fact
that fullname.middle is allowed to be NULL indicates that, while it's of
course BETTER to have that information, it's not a semantic violation if
that information just can't be obtained nohow).

All of my types which hold info on people can at least supply first and
last names; some but not all can supply middle names. Fine, no problem: I
can adapt them all with suitable adapters anyway, noninvasively, without
having to typecheck, typeswitch, or any other horror. Ah, the magic of
adaptation!

So, I define an interface -- say with arbitrary syntax:

    interface IFullname:
        first: str
        middle: str or None
        last: str
        # snipped other information fields

and my function to write a data record is just:

    def writeFullname(person: IFullname):
        # do the writing

So, I have another interface in a similar vein, perhaps to map to/from some
LDAP and similar servers which provide a slightly different set of info
fields:

    interface IPerson:
        firstName: str
        lastName: str
        userid: str
        # snipped other stuff

I have some data about people coming in from LDAP and the like, which I
want to record in that SQL DB -- the incoming data is held in types that
implement IPerson, so I write an adapter IPerson -> IFullname for the
purpose. If the datatypes are immutable, conversion is as good as
adaptation here, as I mentioned ever since the first mail in which I
sketched this case, many megabytes back.
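A minimal runnable sketch of this scenario, with plain classes standing in for the interfaces (all names here -- `FullName`, `PersonFromLdap`, the adapter and `writeFullname` stand-in -- are illustrative, not a real adaptation framework):

```python
class FullName:
    # the IFullname shape: middle may honestly be None (SQL NULL)
    def __init__(self, first, last, middle=None):
        self.first, self.middle, self.last = first, middle, last

class PersonFromLdap:
    # the IPerson shape: no middle name available at all
    def __init__(self, firstName, lastName, userid):
        self.firstName, self.lastName, self.userid = firstName, lastName, userid

def person_to_fullname(p):
    # the IPerson -> IFullname adapter: None is the honest value
    # for the unavailable middle name
    return FullName(first=p.firstName, last=p.lastName, middle=None)

def writeFullname(fn):
    # stand-in for the INSERT into the 'fullname' table
    return (fn.first, fn.middle, fn.last)

row = writeFullname(person_to_fullname(PersonFromLdap("Ada", "Lovelace", "ada")))
assert row == ("Ada", None, "Lovelace")
```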
But adaptation I can get automatically WITHOUT typechecking on what exactly
is the concrete type I'm having to write (send to LDAP, whatever) this time
-- a crucial advantage of adaptation, as you mention in the PyProtocols
docs. Besides, maybe in some cases some of those attributes are in fact
properties that get computed at runtime, fetched from a slow link if and
only if they're first required, whatever, or even, very simply, some
datatype is mutable and I need to ensure I'm dealing with the current state
of the object/record.

So, I'm not sure why you appear to argue for conversion against adaptation,
or explicit typechecking against the avoidance thereof which is such a big
part of adapt's role in life.

Alex

From mcherm at mcherm.com Wed Jan 12 18:46:46 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Wed Jan 12 18:47:07 2005
Subject: [Python-Dev] PEP 246, redux
Message-ID: <1105552006.41e562862bf84@mcherm.com>

This is a collection of responses to various things that don't appear to
have been resolved yet:

Phillip writes:
> if a target protocol has optional aspects, then lossy adaptation to it is
> okay by definition. Conversely, if the aspect is *not* optional, then
> lossy adaptation to it is not acceptable. I don't think there can really
> be a middle ground; you have to decide whether the information is required
> or not.

I disagree. To belabor Alex's example, suppose LotsOfInfo has first,
middle, and last names; PersonName has first and last, and FullName has
first, middle initial and last. FullName's __doc__ specifically states
that if the middle name is not available or the individual does not have a
middle name, then "None" is acceptable. Converting LotsOfInfo to FullName
via PersonName results in None for the middle name. But that's just not
right... it _should_ be filling in the middle initial because that
information IS available.
It's technically correct in a theoretical sort of a way (FullName never
PROMISES that middle_initial will be available), but it's wrong in a
my-program-doesn't-work-right sort of way, because it HAS the information
available yet doesn't use it.

You're probably going to say "okay, then register a LotsOfInfo->FullName
converter", and I agree. But if no such converter is registered, I would
rather have a TypeError than an automatic conversion which produces
incorrect results. I can explicitly silence it by registering a trivial
converter:

    def adapt_LotsOfInfo_to_FullName(lots_of_info):
        person_name = adapt(lots_of_info, PersonName)
        return adapt(person_name, FullName)

but if it just worked, I could only discover that by being clever enough to
think of writing a unit test for the middle name.

------------------

Elsewhere, Phillip writes:
> If you derive an interface from another interface, this is supposed to mean
> that your derived interface promises to uphold all the promises of the base
> interface. That is, your derived interface is always usable where the base
> interface is required.
>
> However, oftentimes one mistakenly derives an interface from another while
> meaning that the base interface is *required* by the derived interface,
> which is similar in meaning but subtly different. Here, you mean to say,
> "IDerived has all of the requirements of IBase", but you have instead said,
> "You can use IDerived wherever IBase is desired".

Okay, that's beginning to make sense to me.

> it's difficult because intuitively an interface defines a *requirement*, so
> it seems logical to inherit from an interface in order to add requirements!

Yes... I would fall into this trap as well until I'd been burned a few
times.
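The LotsOfInfo example above can be made concrete in a few lines (an illustrative sketch with plain classes and hand-rolled adapters standing in for the real adaptation machinery): the transitive chain through PersonName silently drops a middle name that a direct adapter would keep.

```python
class LotsOfInfo:
    def __init__(self, first, middle, last):
        self.first, self.middle, self.last = first, middle, last

class PersonName:                    # first + last only
    def __init__(self, first, last):
        self.first, self.last = first, last

class FullName:                      # middle_initial may be None
    def __init__(self, first, last, middle_initial=None):
        self.first, self.last, self.middle_initial = first, last, middle_initial

def info_to_person(info):
    return PersonName(info.first, info.last)      # legitimately lossy

def person_to_fullname(p):
    return FullName(p.first, p.last, middle_initial=None)  # lossy but "legal"

def info_to_fullname(info):          # the direct adapter one SHOULD register
    return FullName(info.first, info.last, middle_initial=info.middle[0])

info = LotsOfInfo("Grace", "Brewster", "Hopper")
chained = person_to_fullname(info_to_person(info))
direct = info_to_fullname(info)

assert chained.middle_initial is None   # chain silently lost information
assert direct.middle_initial == "B"     # available info preserved
```

Both results satisfy FullName's letter; only the direct adapter satisfies its spirit, which is exactly why an unregistered chain producing the first result with no error is worrying.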
------------------

Alex summarizes nicely:
> Personally, I disagree with having transitivity at all, unless perhaps
> it be restricted to adaptations specifically and explicitly stated to
> be "perfect and lossless"; PJE claims that ALL adaptations MUST,
> ALWAYS, be "perfect and lossless" -- essentially, it seems to me, he
> _has_ to claim that, to defend transitivity being applied
> automatically, relentlessly, NON-optionally, NON-selectively
[...]
> Much the same applies to inheritance, BTW, which as PJE has pointed out
> a few times also induces transitivity in adaptation, and, according to
> him, is a more likely cause of bugs than chains of adapters

But Alex goes on to say that perhaps we need two grades of adaptations
(perfect and real-world) and two grades of interface inheritance (perfect
and otherwise) so that the transitivity can be (automatically) invoked only
for the perfect ones. That feels to me like excessive complexity: why not
just prohibit transitivity?

What, after all, is the COST of prohibiting transitivity? For the first
case (adapter chains) the cost is an N^2 explosion in the number of
adapters needed. I said I thought that N would be small, but Phillip (who
knows what he's talking about, don't get me wrong here) says that it's big
enough to be mildly annoying at times to Twisted and Eclipse developers.

For the second case (interface inheritance), I haven't yet thought through
clearly how it affects things... in fact, it sort of seems like there's no
good way to _prevent_ "transitivity" in this case short of prohibiting
interface inheritance entirely. And, as Phillip points out to me (see
above) this is a more common type of error.

Gee... I'm understanding the problem a little better, but elegant solutions
are still escaping me.
-- Michael Chermside From Scott.Daniels at Acm.Org Wed Jan 12 18:52:54 2005 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Wed Jan 12 18:51:49 2005 Subject: [Python-Dev] Recent IBM Patent releases Message-ID: IBM has recently released 500 patents for use in opensource code. http://www.ibm.com/ibm/licensing/patents/pledgedpatents.pdf "...In order to foster innovation and avoid the possibility that a party will take advantage of this pledge and then assert patents or other intellectual property rights of its own against Open Source Software, thereby limiting the freedom of IBM or any other Open Source developer to create innovative software programs, the commitment not to assert any of these 500 U.S. patents and all counterparts of these patents issued in other countries is irrevocable except that IBM reserves the right to terminate this patent pledge and commitment only with regard to any party who files a lawsuit asserting patents or other intellectual property rights against Open Source Software." Since this includes patents on compression and encryption stuff, we will definitely be faced with deciding on whether to allow use of these patents in the main Python library. Somebody was worried about BSD-style licenses on Groklaw, and said, "Yes, you can use this patent in the free version... but if you close the code, you're violating IBM's Patents, and they WILL come after you. Think of what would have happened if IBM had held a patent that was used in the FreeBSD TCP/IP stack? Microsoft used it as the base of the Windows NT TCP/IP stack. IBM could then sue Microsoft for patent violations." To which he got the following reply: "Sorry, but that's not correct. That specific question was asked on the IBM con-call about this announcement. i.e. if there were a commercial product that was a derived work of an open source project that used these royalty-free patents, what would happen? 
"IBM answered that, so long as the commercial derived work followed the terms of the open source license agreement, there was no problem. (So IBM is fine with a commercial product based on an open source BSD project making use of these patents.)" This means to me we can put these in Python's library, but it is definitely something to start deciding now. -- Scott David Daniels Scott.Daniels@Acm.Org From vinay_sajip at yahoo.co.uk Wed Jan 12 18:21:34 2005 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Wed Jan 12 18:53:19 2005 Subject: [Python-Dev] Re: logging class submission In-Reply-To: <1490506001@web.de> References: <1490506001@web.de> Message-ID: <41E55C9E.1050409@yahoo.co.uk> There is already a TimedRotatingFileHandler which will do backups on a schedule, including daily. The correct way of doing what you want is to submit a patch via SourceForge. If the patch is accepted, then your code will end up in Python. Thanks, Vinay Sajip From pje at telecommunity.com Wed Jan 12 18:58:38 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed Jan 12 18:57:51 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <04EFD92E-64BE-11D9-ADA4-000A95EFAE9E@aleax.it> References: <5.1.1.6.0.20050112113929.037a5260@mail.telecommunity.com> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <5.1.1.6.0.20050112113929.037a5260@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050112125438.03d48270@mail.telecommunity.com> At 06:18 PM 1/12/05 +0100, Alex Martelli wrote: >On 2005 Jan 12, at 17:40, Phillip J. 
Eby wrote: > >>At 04:36 PM 1/12/05 +0100, Alex Martelli wrote: >>>I already know -- you told us so -- that if I had transitivity as you >>>wish it (uncontrollable, unstoppable, always-on) I could not any more >>>write and register a perfectly reasonable adapter which fills in with a >>>NULL an optional field in the adapted-to interface, without facing >>>undetected degradation of information quality by that adapter being >>>invisibly, uncontrollably chained up with another -- no error message, >>>no nothing, no way to stop this -- just because a direct adapter wasn't >>>correctly written and registered. >> >>But why would you *want* to do this, instead of just explicitly >>converting? That's what I don't understand. If I were writing such a >>converter, I wouldn't want to register it for ANY implicit conversion, >>even if it was non-transitive! > >[snip lots of stuff] >I have some data about people coming in from LDAP and the like, which I >want to record in that SQL DB -- the incoming data is held in types that >implement IPerson, so I write an adapter IPerson -> IFullname for the purpose. This doesn't answer my question. Obviously it makes sense to adapt in this fashion, but not IMPLICITLY and AUTOMATICALLY. That's the distinction I'm trying to make. I have no issue with writing an adapter like 'PersonAsFullName' for this use case; I just don't think you should *register* it for automatic use any time you pass a Person to something that takes a FullName. > So, I'm not sure why you appear to argue for conversion against > adaptation, or explicit typechecking against the avoidance thereof which > is such a big part of adapt's role in life. Okay, I see where we are not communicating; where I've been saying "conversion", you are taking this to mean, "don't write an adapter", but what I mean is "don't *register* the adapter for implicit adaptation; explicitly use it in the place where you need it. 
From skip at pobox.com Wed Jan 12 02:48:50 2005 From: skip at pobox.com (Skip Montanaro) Date: Wed Jan 12 18:59:04 2005 Subject: [Python-Dev] Re: [Csv] csv module TODO list In-Reply-To: <000401c4f83d$7432db40$e841fea9@oemcomputer> References: <20050110233126.GA14363@janus.swcomplete.com> <000401c4f83d$7432db40$e841fea9@oemcomputer> Message-ID: <16868.33282.507253.969557@montanaro.dyndns.org> Raymond> Would the csv module be a good place to add a DBF reader and Raymond> writer? Not really. Raymond> I've posted a draft on ASPN. It interoperates well with the Raymond> rest of the CSV module because it also accepts/returns a list Raymond> of fieldnames and a sequence of records. Raymond> http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/362715 Just clean it up (do the doco thing) and check it in as dbf.py with reader and writer functions. I see your modus operandi at work: code something in Python then switch to C when nobody's looking. ;-) Skip From gvanrossum at gmail.com Wed Jan 12 18:59:13 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Jan 12 18:59:16 2005 Subject: getting special from type, not instance (was Re: [Python-Dev] copy confusion) In-Reply-To: <83E2B593-6470-11D9-ADA4-000A95EFAE9E@aleax.it> References: <5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com> <83E2B593-6470-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: [Alex] > Armin's fix was to change: > > conform = getattr(type(obj), '__conform__', None) > > into: > > for basecls in type(obj).__mro__: > if '__conform__' in basecls.__dict__: > conform = basecls.__dict__['__conform__'] > break > else: > # not found > > I have only cursorily examined the rest of the standard library, but it > seems to me there may be a few other places where getattr is being used > on a type for this purpose, such as pprint.py which has a couple of > occurrences of > r = getattr(typ, "__repr__", None) [And then proceeds to propose a new API to improve the situation] I wonder if the following 
solution wouldn't be more useful (since less code will have to be changed).

The descriptor for __getattr__ and other special attributes could claim to
be a "data descriptor" which means that it gets first pick *even if there's
also a matching entry in the instance __dict__*. Quick illustrative
example:

    >>> class C(object):
    ...     foo = property(lambda self: 42)  # a property is always a "data descriptor"
    ...
    >>> a = C()
    >>> a.foo
    42
    >>> a.__dict__["foo"] = "hello"
    >>> a.foo
    42
    >>>

Normal methods are not data descriptors, so they can be overridden by
something in __dict__; but it makes some sense that for methods
implementing special operations like __getitem__ or __copy__, where the
instance __dict__ is already skipped when the operation is invoked using
its special syntax, it should also be skipped by explicit attribute access
(whether getattr(x, "__getitem__") or x.__getitem__ -- these are entirely
equivalent).

We would need to introduce a new decorator so that classes overriding these
methods can also make those methods "data descriptors", and so that users
can define their own methods with this special behavior (this would be
needed for __copy__, probably).

I don't think this will cause any backwards compatibility problems -- since
putting a __getitem__ in an instance __dict__ doesn't override the x[y]
syntax, it's unlikely that anybody would be using this. "Ordinary" methods
will still be overridable.

PS. The term "data descriptor" now feels odd, perhaps we can say "hard
descriptors" instead. Hard descriptors have a __set__ method in addition to
a __get__ method (though the __set__ method may always raise an exception,
to implement a read-only attribute).
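The contrast between ordinary (non-data-descriptor) methods and data descriptors described above can be shown in a short runnable sketch:

```python
class C(object):
    def __getitem__(self, key):
        return "class"

c = C()
c.__dict__["__getitem__"] = lambda key: "instance"

# Explicit attribute access finds the instance entry, because a plain
# method is a non-data descriptor and CAN be shadowed:
assert c.__getitem__(0) == "instance"

# Operator syntax starts the search at the class, skipping the instance:
assert c[0] == "class"

# A property is a data descriptor: the instance __dict__ never wins,
# even for explicit attribute access:
class D(object):
    foo = property(lambda self: 42)

d = D()
d.__dict__["foo"] = "hello"
assert d.foo == 42
```

This is exactly the asymmetry the proposal would remove for special methods: explicit access would behave like the operator syntax.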
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Wed Jan 12 02:59:22 2005 From: skip at pobox.com (Skip Montanaro) Date: Wed Jan 12 18:59:40 2005 Subject: [Python-Dev] Re: [Csv] csv module and universal newlines In-Reply-To: <20050110044441.250103C889@coffee.object-craft.com.au> References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> <20050110044441.250103C889@coffee.object-craft.com.au> Message-ID: <16868.33914.837771.954739@montanaro.dyndns.org>

Andrew> The csv parser consumes lines from an iterator, but it also has
Andrew> its own idea of end-of-line conventions, which are currently
Andrew> only used by the writer, not the reader, which is a source of
Andrew> much confusion. The writer, by default, also attempts to emit a
Andrew> \r\n sequence, which results in more confusion unless the file
Andrew> is opened in binary mode.

Andrew> I'm looking for suggestions for how we can mitigate these
Andrew> problems (without breaking things for existing users).

You can argue that reading csv data from, or writing csv data to, a file on Windows is an error if the file isn't opened in binary mode. Perhaps we should enforce that in situations where it matters. Would this be a start?

    terminators = {"darwin": "\r", "win32": "\r\n"}

    if (dialect.lineterminator != terminators.get(sys.platform, "\n")
        and "b" not in getattr(f, "mode", "b")):
        raise IOError, ("%s not opened in binary mode" %
                        getattr(f, "name", "???"))

The elements of the postulated terminators dictionary may already exist somewhere within the sys or os modules (if not, perhaps they should be added). The idea of the check is to enforce binary mode on those objects that support a mode if the desired line terminator doesn't match the platform's line terminator.
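Skip's fragment can be made runnable as a small self-contained function. This is only a sketch: `check_binary_mode`, the fake file classes, and the `lineterminator` argument are stand-ins invented here; real code would consult `dialect.lineterminator` and an actual file object.

```python
import sys

# Sketch of the proposed check; TERMINATORS is the postulated table from
# the message above (it does not actually exist in sys or os).
TERMINATORS = {"darwin": "\r", "win32": "\r\n"}

def check_binary_mode(f, lineterminator):
    """Raise IOError if f should have been opened in binary mode."""
    if (lineterminator != TERMINATORS.get(sys.platform, "\n")
            and "b" not in getattr(f, "mode", "b")):
        raise IOError("%s not opened in binary mode"
                      % getattr(f, "name", "???"))

class TextFile(object):      # stand-in for a file opened in text mode
    mode = "r"
    name = "data.csv"

class BinaryFile(object):    # stand-in for a file opened in binary mode
    mode = "rb"
    name = "data.csv"

check_binary_mode(BinaryFile(), "\r\n")   # binary mode: always acceptable

err = None
try:
    check_binary_mode(TextFile(), "|")    # "|" can't match any platform terminator
except IOError as exc:
    err = exc
print(err)   # data.csv not opened in binary mode
```

The "|" terminator is chosen so the mismatch triggers on every platform, which shows the check rather than any one OS convention.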
Skip From skip at pobox.com Wed Jan 12 02:39:00 2005 From: skip at pobox.com (Skip Montanaro) Date: Wed Jan 12 18:59:44 2005 Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name In-Reply-To: <41E45DA7.1030302@ActiveState.com> References: <1105471586.41e42862b9a39@mcherm.com> <41E45DA7.1030302@ActiveState.com> Message-ID: <16868.32692.589702.66263@montanaro.dyndns.org> >>> Terminology point: I know that LiskovViolation is technically >>> correct, but I'd really prefer it if exception names (which are >>> sometimes all users get to see) were more informative for people w/o >>> deep technical background. Would that be possible? >> >> I don't see how. Googling on Liskov immediately brings up clear and >> understandable descriptions of the principle that's being violated. >> I can't imagine summarizing the issue more concisely than that! What >> would you suggest? Including better explanations in the documentation >> is a must, but "LiskovViolation" in the exception name seems >> unbeatably clear and concise. David> Clearly, I disagree. I had never heard the term before and consulted the Google oracle as well. I found this more readable definition: Functions that use pointers or references to base classes must be able to use objects of derived classes without knowing it. here: http://www.compulink.co.uk/~querrid/STANDARD/lsp.htm Of course, the situations in which a Liskov violation can occur can be a bit subtle. David> My point is that it'd be nice if we could come up with an David> exception name which could be grokkable without requiring 1) David> Google, 2) relatively high-level understanding of type theory. I suspect if there was a succinct way to convey the concept in two or three words it would already be in common use. 
The alternative seems to be to make sure it's properly docstringed and added to the tutorial's glossary:

    >>> help(lv.LiskovViolation)
    Help on class LiskovViolation in module lv:

    class LiskovViolation(exceptions.Exception)
    | Functions that use pointers or references to base classes must be
    | able to use objects of derived classes without knowing it.
    | ...

I suspect there's something to be said for exposing the user base to a little bit of software engineering terminology every now and then. A couple years ago I suspect most of us had never heard of list comprehensions, and we all bat the term about without a second thought now.

Skip

From skip at pobox.com Wed Jan 12 19:02:36 2005 From: skip at pobox.com (Skip Montanaro) Date: Wed Jan 12 19:02:57 2005 Subject: [Python-Dev] Recent IBM Patent releases In-Reply-To: References: Message-ID: <16869.26172.317064.537812@montanaro.dyndns.org>

Scott> Since this includes patents on compression and encryption stuff,
Scott> we will definitely be faced with deciding on whether to allow use
Scott> of these patents in the main Python library.

Who is going to decide if a particular library would be affected by one or more of the 500 patents IBM released?
Skip From aleax at aleax.it Wed Jan 12 19:04:04 2005 From: aleax at aleax.it (Alex Martelli) Date: Wed Jan 12 19:04:11 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5.1.1.6.0.20050112125438.03d48270@mail.telecommunity.com> References: <5.1.1.6.0.20050112113929.037a5260@mail.telecommunity.com> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <5.1.1.6.0.20050112113929.037a5260@mail.telecommunity.com> <5.1.1.6.0.20050112125438.03d48270@mail.telecommunity.com> Message-ID: <5841288E-64C4-11D9-ADA4-000A95EFAE9E@aleax.it> On 2005 Jan 12, at 18:58, Phillip J. Eby wrote: ... >> I have some data about people coming in from LDAP and the like, which >> I want to record in that SQL DB -- the incoming data is held in types >> that implement IPerson, so I write an adapter IPerson -> IFullname >> for the purpose. > > This doesn't answer my question. Obviously it makes sense to adapt in > this fashion, but not IMPLICITLY and AUTOMATICALLY. That's the > distinction I'm trying to make. I have no issue with writing an > adapter like 'PersonAsFullName' for this use case; I just don't think > you should *register* it for automatic use any time you pass a Person > to something that takes a FullName. I'm adapting incoming data that can be of any of a huge variety of concrete types with different interfaces. *** I DO NOT WANT TO TYPECHECK THE INCOMING DATA *** to know what adapter or converter to apply -- *** THAT'S THE WHOLE POINT *** of PEP 246. I can't believe we're misunderstanding each other about this -- there MUST be miscommunication going on! 
>> So, I'm not sure why you appear to argue for conversion against
>> adaptation, or explicit typechecking against the avoidance thereof
>> which is such a big part of adapt's role in life.
>
> Okay, I see where we are not communicating; where I've been saying
> "conversion", you are taking this to mean, "don't write an adapter",
> but what I mean is "don't *register* the adapter for implicit
> adaptation; explicitly use it in the place where you need it.

"Adaptation is not conversion" is how I THOUGHT we had agreed to rephrase my unfortunate "adaptation is not casting" -- so if you're using conversion to mean adaptation, I'm nonplussed.

Needing to be explicit, and therefore to typecheck/typeswitch to pick which adapter to apply, is just what I don't *WANT* to do, what I don't want *ANYBODY* to have to do EVER, and the very reason I'm spending time and energy on PEP 246. So, how would you propose I know which adapter I need, without spreading typechecks all over my bedraggled *CODE*?!?!

Alex

From mcherm at mcherm.com Wed Jan 12 19:08:20 2005 From: mcherm at mcherm.com (Michael Chermside) Date: Wed Jan 12 19:08:24 2005 Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name Message-ID: <1105553300.41e56794d1fc5@mcherm.com>

I wrote:
> I don't see how [LiskovViolation could have a more descriptive name].
> Googling on Liskov immediately brings up clear and understandable
> descriptions of the principle

David writes:
> Clearly, I disagree. [...]

Skip writes:
> I had never heard the term before and consulted the Google oracle as well.

This must be one of those cases where I am misled by my background... I thought of the Liskov substitution principle as a piece of basic CS background that everyone learned in school (or from the net, or wherever they learned programming). Clearly, that's not true.

Guido writes:
> How about SubstitutabilityError?

It would be less precise and informative to ME but apparently more so to a beginner.
Obviously, we should support the beginner! -- Michael Chermside From Scott.Daniels at Acm.Org Wed Jan 12 19:13:12 2005 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Wed Jan 12 19:11:55 2005 Subject: [Python-Dev] Re: Recent IBM Patent releases In-Reply-To: <16869.26172.317064.537812@montanaro.dyndns.org> References: <16869.26172.317064.537812@montanaro.dyndns.org> Message-ID: Skip Montanaro wrote: > Who is going to decide if a particular library would be affected by one or > more of the 500 patents IBM released? > > Skip I am thinking more along the lines of, "our policy on accepting new code [will/will not] be to allow new submissions which use some of that patented code." I believe our current policy is that the author warrants that the code is his/her own work and not encumbered by any patent. -- Scott David Daniels Scott.Daniels@Acm.Org From gvanrossum at gmail.com Wed Jan 12 19:16:14 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Jan 12 19:16:17 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: > > [Alex] > >> I'm saying that if, by mistake, the programmer has NOT > >> registered the A->C adapter (which would be easily coded and work > >> perfectly), then thanks to transitivity, instead of a clear and simple > >> error message leading to immediate diagnosis of the error, they'll get > >> a subtle unnecessary degradation of information and resulting > >> reduction > >> in information quality. 
[Guido]
> > I understand, but I would think that there are just as many examples
> > of cases where having to register a trivial A->C adapter is much more
> > of a pain than it's worth; especially if there are a number of A->B
> > pairs and a number of B->C pairs, the number of additional A->C pairs
> > needed could be bewildering.

[Alex]
> Hm?

I meant if there were multiple A's. For every Ai that has an Ai->B you would also have to register a trivial Ai->C. And if there were multiple C's (B->C1, B->C2, ...) then the number of extra adaptors to register would be the number of A's *times* the number of C's, in addition to the sum of those numbers for the "atomic" adaptors (Ai->B, B->Cj).

> > But I would like to see some input from people with C++ experience.
>
> Here I am, at your service. [...]
> It's in the running for the coveted "Alex's worst nightmare" prize,

Aha. This explains why you feel so strongly about it.

But now, since I am still in favor of automatic "combined" adaptation *as a last resort*, I ask you to consider that Python is not C++, and that perhaps we can make the experience in Python better than it was in C++. Perhaps allowing more control over when automatic adaptation is acceptable?

For example, interface B (or perhaps this should be a property of the adapter for B->C?) might be marked so as to allow or disallow its consideration when looking for multi-step adaptations. We could even make the default "don't consider", so only people who have to deal with the multiple A's and/or multiple C's all adaptable via the same B could save themselves some typing by turning it on.
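To make the opt-in rule concrete, here is a toy registry along the lines Guido sketches: each adapter carries a flag, and automatically composed chains may only pass through adapters that set it. Everything here (the function names, the string-based protocol labels) is invented for illustration and is not part of PEP 246's actual API:

```python
from collections import deque

registry = {}   # (src, dst) -> (adapter function, ok-for-chains flag)

def register(src, dst, func, transitive=False):
    """Register an adapter; transitive=True opts it in to composed chains."""
    registry[(src, dst)] = (func, transitive)

def adapt(obj, dst):
    """Adapt obj (labelled by its class name) to dst, shortest chain first."""
    src = type(obj).__name__
    queue = deque([(src, obj)])
    seen = {src}
    while queue:
        proto, value = queue.popleft()
        if proto == dst:
            return value
        for (s, d), (func, transitive) in registry.items():
            if s != proto or d in seen:
                continue
            # A non-transitive adapter may only serve as the single,
            # direct step -- never as part of a longer chain.
            if not transitive and not (proto == src and d == dst):
                continue
            seen.add(d)
            queue.append((d, func(value)))
    raise TypeError("cannot adapt %r to %s" % (obj, dst))

class A(object):
    pass

register("A", "B", lambda a: "B-view", transitive=True)
register("B", "C", lambda b: "C-view", transitive=True)
register("B", "D", lambda b: "D-view")          # did NOT opt in

print(adapt(A(), "C"))      # A->B->C composes: prints C-view
try:
    adapt(A(), "D")         # A->B->D is refused: B->D never opted in
except TypeError:
    print("no chain to D")
```

With the default left at "don't consider", registering a noisy adapter stays safe: it can still be invoked directly, but it never silently becomes a middle link in a bewildering multi-step chain.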
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From walter at livinglogic.de Wed Jan 12 19:18:04 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Wed Jan 12 19:18:07 2005 Subject: [Python-Dev] Feed style codec API Message-ID: <41E569DC.2000407@livinglogic.de>

Now that Python 2.4 is out the door (and the problems with StreamReader.readline() are hopefully fixed), I'd like to bring up the topic of a feed style codec API again. A feed style API would make it possible to use stateful encoding/decoding where the data is not available as a stream. Two examples:

- xml.sax.xmlreader.IncrementalParser: Here the client passes raw XML data to the parser in multiple calls to the feed() method. If the parser wants to use Python codecs machinery, it has to wrap a stream interface around the data passed to the feed() method.

- WSGI (PEP 333) specifies that the web application returns the fragments of the resulting webpage as an iterator. If this result is encoded unicode we have the same problem: it must be wrapped in a stream interface.

The simplest solution is to add a feed() method both to StreamReader and StreamWriter that takes the state of the codec into account, but doesn't use the stream. This can be done by simply moving a few lines of code into separate methods. I've uploaded a patch to Sourceforge: #1101097.

There are other open issues with the codec changes: unicode-escape, UTF-7, the CJK codecs and probably a few others don't support decoding incomplete input yet (although AFAICR the functionality is mostly there in the CJK codecs).

Bye,
Walter Dörwald

From pje at telecommunity.com Wed Jan 12 19:30:01 2005 From: pje at telecommunity.com (Phillip J.
Eby) Date: Wed Jan 12 19:29:15 2005 Subject: getting special from type, not instance (was Re: [Python-Dev] copy confusion) In-Reply-To: References: <83E2B593-6470-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com> <83E2B593-6470-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: <5.1.1.6.0.20050112132041.02f92e60@mail.telecommunity.com>

At 09:59 AM 1/12/05 -0800, Guido van Rossum wrote:
>We would need to introduce a new decorator so that classes overriding
>these methods can also make those methods "data descriptors", and so
>that users can define their own methods with this special behavior
>(this would be needed for __copy__, probably).

I have used this technique as a workaround in PyProtocols for __conform__ and __adapt__, so I know it works. In fact, here's the implementation:

    def metamethod(func):
        """Wrapper for metaclass method that might be confused w/instance method"""
        return property(lambda ob: func.__get__(ob, ob.__class__))

Basically, if I implement a non-slotted special method (like __copy__ or __getstate__) on a metaclass, I usually wrap it with this decorator in order to avoid the "metaconfusion" issue. I didn't mention it because the technique had gotten somehow tagged in my brain as a temporary workaround until __conform__ and __adapt__ get slots of their own, rather than a general fix for metaconfusion.

Also, I wrote the above when Python 2.2 was still pretty new, so its applicability was somewhat limited by the fact that 2.2 didn't allow super() to work with data descriptors. So, if you used it, you couldn't use super. This probably isn't a problem any more, although I'm still using the super-alike that I wrote as a workaround, so I don't know for sure. :)

>PS. The term "data descriptor" now feels odd, perhaps we can say "hard
>descriptors" instead. Hard descriptors have a __set__ method in
>addition to a __get__ method (though the __set__ method may always
>raise an exception, to implement a read-only attribute).
I think I'd prefer "Override descriptor" to "Hard descriptor", but either is better than data descriptor. Presumably there will need to be backward compatibility macros in the C API, though for e.g. PyDescriptor_IsData or whatever it's currently called. From pje at telecommunity.com Wed Jan 12 19:33:20 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed Jan 12 19:32:35 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: References: <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: <5.1.1.6.0.20050112105656.03a7e180@mail.telecommunity.com> At 07:26 AM 1/12/05 -0800, Guido van Rossum wrote: >[Alex] > > I'm saying that if, by mistake, the programmer has NOT > > registered the A->C adapter (which would be easily coded and work > > perfectly), then thanks to transitivity, instead of a clear and simple > > error message leading to immediate diagnosis of the error, they'll get > > a subtle unnecessary degradation of information and resulting reduction > > in information quality. > >I understand, but I would think that there are just as many examples >of cases where having to register a trivial A->C adapter is much more >of a pain than it's worth; especially if there are a number of A->B >pairs and a number of B->C pairs, the number of additional A->C pairs >needed could be bewildering. Alex has suggested that there be a way to indicate that an adapter is noiseless and therefore suitable for transitivity. I, on the other hand, would prefer having a way to declare that an adapter is *unsuitable* for transitivity, in the event that you feel the need to introduce an "imperfect" implicit adapter (versus using an explicit conversion). 
So, in principle we agree, but we differ regarding what's a better *default*, and this is probably where you will be asked to Pronounce, because as Alex says, this *is* controversial. However, I think there is another way to look at it, in which *both* can be the default... Look at it this way. If you create an adapter class or function to do some kind of adaptation, it is not inherently transitive, and you have to explicitly invoke it. (This is certainly the case today!) Now, let us say that you then register that adapter to perform implicit adaptation -- and by default, such a registration says that you are happy with it being used implicitly, so it will be used whenever you ask for its target interface or some further adaptation thereof. So, here we have a situation where in some sense, BOTH approaches are the "default", so in theory, both sides should be happy. However, as I understand it, Alex is *not* happy with this, because he wants to be able to register a noisy adapter for *implicit* use, but only in the case where it is used directly. The real question, then, is "What is that use case good for?" And I don't have an answer to that question, because it's Alex's use case. I'm proposing an approach that has two useful extremes: be noisy and explicit, or clean and implicit. Alex seems to want to also add the middle ground of "noisy but implicit", and I think this is a bad idea because it will lead to precisely to the same problems as it does in C++! Python as it exists today tends to support the proposition that noisy adaptation or conversion should not be implicit, since trying to do 'someList[1.2]' raises a TypeError, rather than silently truncating. The number of steps between 'float' and 'int' in some adaptation graph has absolutely nothing to do with it; even if there is only one step, doing this kind of conversion or adaptation implicitly is just a bad idea. 
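The 'someList[1.2]' point above is easy to verify at the interpreter; a two-step illustration:

```python
# Python refuses the *implicit* lossy conversion, but the explicit
# one-call conversion is always available to the programmer.
some_list = ["a", "b", "c"]

err = None
try:
    some_list[1.2]              # implicit float->int truncation: refused
except TypeError as exc:
    err = exc
print(type(err).__name__)       # TypeError

print(some_list[int(1.2)])      # explicit conversion: prints b
```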
From pje at telecommunity.com Wed Jan 12 19:46:22 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed Jan 12 19:45:37 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5841288E-64C4-11D9-ADA4-000A95EFAE9E@aleax.it> References: <5.1.1.6.0.20050112125438.03d48270@mail.telecommunity.com> <5.1.1.6.0.20050112113929.037a5260@mail.telecommunity.com> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <5.1.1.6.0.20050112113929.037a5260@mail.telecommunity.com> <5.1.1.6.0.20050112125438.03d48270@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050112133516.02f961c0@mail.telecommunity.com> At 07:04 PM 1/12/05 +0100, Alex Martelli wrote: >On 2005 Jan 12, at 18:58, Phillip J. Eby wrote: > ... >>>I have some data about people coming in from LDAP and the like, which I >>>want to record in that SQL DB -- the incoming data is held in types that >>>implement IPerson, so I write an adapter IPerson -> IFullname for the purpose. >> >>This doesn't answer my question. Obviously it makes sense to adapt in >>this fashion, but not IMPLICITLY and AUTOMATICALLY. That's the >>distinction I'm trying to make. I have no issue with writing an adapter >>like 'PersonAsFullName' for this use case; I just don't think you should >>*register* it for automatic use any time you pass a Person to something >>that takes a FullName. > >I'm adapting incoming data that can be of any of a huge variety of >concrete types with different interfaces. *** I DO NOT WANT TO TYPECHECK >THE INCOMING DATA *** to know what adapter or converter to apply -- *** >THAT'S THE WHOLE POINT *** of PEP 246. 
I can't believe we're >misunderstanding each other about this -- there MUST be miscommunication >going on! Indeed! Let me try and be more specific about my assumptions. What *I* would do in your described scenario is *not* register a general-purpose IPerson -> IFullName adapter, because in the general case this is a lossy adaptation. However, for some *concrete* types an adaptation to IFullName must necessarily have a NULL middle name; therefore, I would define a *concrete adaptation* from those types to IFullName, *not* an adaptation from IPerson -> IFullName. This still allows for transitive adapter composition from IFullName on to other interfaces, if need be. IOW, the standard for "purity" in adapting from a concrete type to an interface can be much lower than for adapting from interface to interface. An interface-to-interface adapter is promising that it can adapt any possible implementation of the first interface to the second interface, and that it's always suitable for doing so. IMO, if you're not willing to make that commitment, you shouldn't define an interface-to-interface adapter. Hm. Maybe we actually *agree*, and are effectively only arguing terminology? That would be funny, yet sad. :) >>> So, I'm not sure why you appear to argue for conversion against >>> adaptation, or explicit typechecking against the avoidance thereof >>> which is such a big part of adapt's role in life. >> >>Okay, I see where we are not communicating; where I've been saying >>"conversion", you are taking this to mean, "don't write an adapter", but >>what I mean is "don't *register* the adapter for implicit adaptation; >>explicitly use it in the place where you need it. > >"Adaptation is not conversion" is how I THOUGHT we had agreed to rephrase >my unfortunate "adaptation is not casting" -- so if you're using >conversion to mean adaptation, I'm nonplussed. Sorry; I think of "noisy adaptation" as being "conversion" -- i.e. 
I have one mental bucket for all conversion/adaptation scenarios that aren't "pure as-a" relationships. >Needing to be explicit and therefore to typechecking/typeswitching to pick >which adapter to apply is just what I don't *WANT* to do, what I don't >want *ANYBODY* to have to do EVER, and the very reason I'm spending time >and energy on PEP 246. So, how would you propose I know which adapter I >need, without spreading typechecks all over my bedraggled *CODE*?!?! See above. I thought this was obvious; sorry for the confusion. I think we may be getting close to a breakthrough, though; so let's hang in there a bit longer. From foom at fuhm.net Wed Jan 12 19:47:30 2005 From: foom at fuhm.net (James Y Knight) Date: Wed Jan 12 19:47:34 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <0CC2F6D6-64A9-11D9-ADA4-000A95EFAE9E@aleax.it> References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011206001a5a3805@mail.gmail.com> <0CC2F6D6-64A9-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: <69F7EF0F-64CA-11D9-86E3-000A95A50FB2@fuhm.net> I'd just like to share a use case for transitive adaption that I've just run into (I'm using Zope.Interface which does not support it). Make of this what you will, I just thought an actual example where transitive adaption is actually necessary might be useful to the discussion. I have a framework with an interface 'IResource'. A number of classes implement this interface. I introduce a new version of the API, with an 'INewResource', so I register a wrapping adapter from IResource->INewResource in the global adapter registry. This all works fine as long as all the old code returns objects that directly provide IResource. However, the issue is that code working with the old API *doesn't* just do that, it may also have returned something that is adaptable to IResource. 
Therefore, calling INewResource(obj) will fail, because there is no direct adapter from obj to INewResource, only obj->IResource and IResource->INewResource. Now, I can't just add the extra adapters from obj->INewResource myself, because the adapter from obj->IResource is in client code, compatibility with which is needed. So, as far as I can tell, I have two options:

1) everywhere I want to adapt to INewResource, do a dance:

       resource = INewResource(result, None)
       if resource is not None:
           return resource
       resource = IResource(result, None)
       if resource is not None:
           return INewResource(resource)
       else:
           raise TypeError("blah")

2) Make a custom __adapt__ on INewResource to do a similar thing. This seems somewhat difficult with zope.interface (need two classes) but does work.

       class INewResource(zope.interface.Interface):
           pass

       class SpecialAdaptInterfaceClass(zope.interface.InterfaceClass):
           def __adapt__(self, result):
               resource = zope.interface.InterfaceClass.__adapt__(self, result)
               if resource is not None:
                   return resource
               resource = IResource(result, None)
               if resource is not None:
                   return INewResource(resource)

       INewResource.__class__ = SpecialAdaptInterfaceClass

I chose #2. In any case, it certainly looks doable, even with a non-transitive adaptation system, but it's somewhat irritating. Especially if you end up needing to do that kind of thing often.

James

From aahz at pythoncraft.com Wed Jan 12 19:47:38 2005 From: aahz at pythoncraft.com (Aahz) Date: Wed Jan 12 19:47:40 2005 Subject: getting special from type, not instance (was Re: [Python-Dev] copy confusion) In-Reply-To: References: <5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com> <83E2B593-6470-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: <20050112184738.GB26104@panix.com> On Wed, Jan 12, 2005, Guido van Rossum wrote: > > PS. The term "data descriptor" now feels odd, perhaps we can say "hard > descriptors" instead.
Hard descriptors have a __set__ method in > addition to a __get__ method (though the __set__ method may always > raise an exception, to implement a read-only attribute).

I'd prefer "property descriptor" since that's the primary use case. "Getset descriptor" also works for me.

-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/
"19. A language that doesn't affect the way you think about programming, is not worth knowing." --Alan Perlis

From pje at telecommunity.com Wed Jan 12 20:06:51 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed Jan 12 20:06:06 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: <5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com>

At 10:16 AM 1/12/05 -0800, Guido van Rossum wrote:
>For example, interface B (or perhaps this should be a property of the
>adapter for B->C?) might be marked so as to allow or disallow its
>consideration when looking for multi-step adaptations. We could even
>make the default "don't consider", so only people who have to deal
>with the multiple A's and/or multiple C's all adaptable via the same B
>could save themselves some typing by turning it on.

Another possibility: I've realized from Alex's last mail that there's a piece of my reasoning that I haven't been including, and now I can actually explain it clearly (I hope). In my view, there are at least two kinds of adapters, with different fidelity requirements/difficulty:

    class -> interface        ("lo-fi" is okay)
    interface -> interface    (It better be perfect!)
If you cannot guarantee that your interface-to-interface adapter is the absolute best way to adapt *any* implementation of the source interface, you should *not* treat it as an interface-to-interface adapter, but rather as a class-to-interface adapter for the specific classes that need it. And, if transitivity exists, it is now restricted to a sensible subset of the possible paths. I believe that this difference is why I don't run into Alex's problems in practice; when I encounter a use case like his, I may write the same adapter, but I'll usually register it as an adapter from class-to-interface, if I need to register it for implicit adaptation at all. Also note that the fact that it's harder to write a solid interface-to-interface adapter naturally leads to my experience that transitivity problems occur more often via interface inheritance. This is because interface inheritance as implemented by both Zope and PyProtocols is equivalent to defining an interface-to-interface adapter, but with no implementation! Obviously, if it is already hard to write a good interface-to-interface adapter, then it must be even harder when you have to do it with no actual code! :) I think maybe this gets us a little bit closer to having a unified (or at least unifiable) view on the problem area. If Alex agrees that class-to-interface adaptation is an acceptable solution for limiting the transitivity of noisy adaptation while still allowing some degree of implicitness, then maybe we have a winner. (Btw, the fact that Zope and Twisted's interface systems initially implemented *only* interface-to-interface adaptation may have also led to their conceptualizing transitivity as unsafe, since they didn't have the option of using class-to-interface adapters as a way to deal with more narrowly-applicable adaptations.) 
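The class-to-interface idea above can be sketched with a plain dictionary keyed by concrete class, looked up along the MRO. All names here are invented for illustration (this is neither PyProtocols' nor zope.interface's API): the point is only that a lossy adapter scoped to one class makes no promise about every implementation of the source interface.

```python
# Toy "class -> interface" registration: the adapter is tied to concrete
# classes (found via the MRO), not promised for a whole source interface.
_adapters = {}   # (class, target interface label) -> adapter function

def register_for_class(cls, target, func):
    _adapters[(cls, target)] = func

def adapt_to(obj, target):
    for base in type(obj).__mro__:          # most specific class wins
        func = _adapters.get((base, target))
        if func is not None:
            return func(obj)
    raise TypeError("no adapter from %s to %s"
                    % (type(obj).__name__, target))

class Person(object):
    def __init__(self, name):
        self.name = name

# A lossy adaptation (no middle name) is acceptable here precisely
# because it is scoped to Person, not to every IPerson implementation.
register_for_class(Person, "IFullName", lambda p: (p.name, None))

print(adapt_to(Person("Alice"), "IFullName"))   # ('Alice', None)
```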
From aleax at aleax.it Wed Jan 12 20:31:25 2005 From: aleax at aleax.it (Alex Martelli) Date: Wed Jan 12 20:31:32 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: <8C3AB4AC-64D0-11D9-ADA4-000A95EFAE9E@aleax.it> On 2005 Jan 12, at 19:16, Guido van Rossum wrote: ... > [Alex] >> Hm? > > I meant if there were multiple A's. For every Ai that has an Ai->B you > would also have to register a trivial Ai->C. And if there were > multiple C's (B->C1, B->C2, ...) then the number of extra adaptors to > register would be the number of A's *times* the number of C's, in > addition to the sum of those numbers for the "atomic" adaptors (Ai->B, > B->Cj). Ah, OK, I get it now, thanks. > But now, since I am still in favor of automatic "combined" adaptation > *as a last resort*, I ask you to consider that Python is not C++, and > that perhaps we can make the experience in Python better than it was > in C++. Perhaps allowing more control over when automatic adaptation > is acceptable? Yes, that would be necessary to achieve parity with C++, which does now have the 'explicit' keyword (to state that a conversion must not be used as a step in a chain automatically constructed) -- defaults to "acceptable in automatic chains" for historical and backwards compatibility reasons. > For example, inteface B (or perhaps this should be a property of the > adapter for B->C?) might be marked so as to allow or disallow its > consideration when looking for multi-step adaptations. We could even > make the default "don't consider", so only people who have to deal > with the multiple A's and/or multiple C's all adaptable via the same B > could save themselves some typing by turning it on. 
Yes, this idea you propose seems to me to be a very reasonable compromise: one can get the convenience of automatic chains of adaptations but only when the adaptations involved are explicitly asserted to be OK for that. I think that the property (of being OK for automatic/implicit/chained/transitive use) should definitely be one of the adaptation rather than of an interface, btw. Alex From pje at telecommunity.com Wed Jan 12 20:35:02 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed Jan 12 20:34:17 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: References: <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <1105540027.5326.19.camel@localhost> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050112142027.03d54910@mail.telecommunity.com> At 07:49 AM 1/12/05 -0800, Guido van Rossum wrote: >[Phillip] > > As for the issue of what should and shouldn't exist in Python, it doesn't > > really matter; PEP 246 doesn't (and can't!) *prohibit* transitive > > adaptation. > >Really? Then isn't it underspecified? No; I just meant that: 1. With its current hooks, implementing transitivity is trivial; PyProtocols' interface objects have an __adapt__ that does the transitive lookup. So, as currently written, this is perfectly acceptable in PEP 246. 2. Given Python's overall flexibility, there's really no way to *stop* anybody from implementing it short of burying the whole thing in C and providing no way to access it from Python. And then somebody can still implement an extension module. 
;) > I'd think that by the time we >actually implement PEP 246 in the Python core, this part of the >semantics should be specified (at least the default behavior, even if >there are hooks to change this). The default behavior *is* specified: it's just specified as "whatever you want". :) What Alex and I are really arguing about is what should be the "one obvious way to do it", and implicitly, what Python interfaces should do. Really, the whole transitivity argument is moot for PEP 246 itself; PEP 246 doesn't really care, because anybody can do whatever they want with it. It's Python's "standard" interface implementation that cares; should its __adapt__ be transitive, and if so, how transitive? (PEP 246's global registry could be transitive, I suppose, but it's only needed for adaptation to a concrete type, and I only ever adapt to interfaces, so I don't have any experience with what somebody might or might not want for that case.) Really, the only open proposals remaining (i.e. not yet accepted/rejected by Alex) for actually *changing* PEP 246 that I know of at this point are: 1. my suggestion for how to handle the LiskovViolation use case by returning None instead of raising a special exception 2. that classic classes be supported, since the old version of PEP 246 supported them and because it would make exceptions unadaptable otherwise. The rest of our discussion at this point is just pre-arguing a not-yet-written PEP for how Python interfaces should handle adaptation. :) From pje at telecommunity.com Wed Jan 12 20:39:09 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Wed Jan 12 20:38:23 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <1105552006.41e562862bf84@mcherm.com> Message-ID: <5.1.1.6.0.20050112140929.03d52e30@mail.telecommunity.com> At 09:46 AM 1/12/05 -0800, Michael Chermside wrote: >This is a collection of responses to various things that don't appear >to have been resolved yet: > >Phillip writes: > > if a target protocol has optional aspects, then lossy adaptation to it is > > okay by definition. Conversely, if the aspect is *not* optional, then > > lossy adaptation to it is not acceptable. I don't think there can really > > be a middle ground; you have to decide whether the information is required > > or not. > >I disagree. To belabor Alex's example, suppose LotsOfInfo has first, middle, >and last names; PersonName has first and last, and FullName has first, >middle initial and last. FullName's __doc__ specifically states that if >the middle name is not available or the individual does not have a middle >name, then "None" is acceptable. The error, IMO, is in registering an interface-to-interface adapter from PersonName to FullName; at best, it should be explicitly registered only for concrete classes that lack some way to provide a middle name. If you don't want to lose data implicitly, don't register an implicit adaptation that loses data. >You're probably going to say "okay, then register a LotsOfInfo->FullName >converter", and I agree. But if no such converter is registered, I >would rather have a TypeError than an automatic conversion which produces >incorrect results. Then don't register a data-losing adapter for implicit adaptation for any possible input source; only the specific input sources that you need it for. > > it's difficult because intuitively an interface defines a *requirement*, so > > it seems logical to inherit from an interface in order to add requirements! > >Yes... I would fall into this trap as well until I'd been burned a few times.
It's burned me more than just a few times, and I *still* sometimes make it if I'm not paying attention. It's just too easy to make the mistake. So, I'm actually open to considering dropping interface inheritance. For adapters, I think it's much harder to make this mistake because you have more time to think about whether your adapter is universal or not, and you can always err on the safe side. In truth, I believe I much more frequently implement class-to-interface adapters than interface-to-interface ones. I can always go back later and declare the adapter as interface-to-interface if I want, so there's no harm in starting them out as class-to-interface adapters. >Gee... I'm understanding the problem a little better, but elegant >solutions are still escaping me. My solution is to use class-to-interface adaptation for most adaptation, and interface-to-interface adaptation only when the adaptation can be considered "correct by definition". It seems to work for me. From pje at telecommunity.com Wed Jan 12 20:51:16 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed Jan 12 20:50:38 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <0CC2F6D6-64A9-11D9-ADA4-000A95EFAE9E@aleax.it> References: <79990c6b05011206001a5a3805@mail.gmail.com> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011206001a5a3805@mail.gmail.com> Message-ID: <5.1.1.6.0.20050112101324.03a77090@mail.telecommunity.com> At 03:48 PM 1/12/05 +0100, Alex Martelli wrote: >Demanding that the set of paths of minimal available length has exactly >one element is strange, though, IF one is assuming that all adaptation >paths are exactly equivalent except at most for secondary issues of >performance (which are only adjudicated by the simplest heuristic: if >those issues were NOT considered minor/secondary, then a more >sophisticated scheme would be warranted, e.g. 
by letting the programmer >associate a cost to each step, picking the lowest-cost path, AND letting >the caller of adapt() also specify the maximal acceptable cost or at least >obtain the cost associated with the chosen path). There's a very simple reason. If one is using only non-noisy adapters, there is absolutely no reason to ever define more than one adapter between the *same* two points. If you do, then somebody is doing something redundant, and there is a possibility for error. In practice, a package or library that declares two interfaces should provide the adapter between them, if a sensible one can exist. For two separate packages, ordinarily one is the client and needs to adapt from one of its own implementations or interfaces to a foreign interface, or vice versa, and in either case the client should be the registrant for the adapters. Bridging between two foreign packages is the only case in which there is an actual possibility of having two packages attempt to bridge the exact same interfaces or implementations, and this case is very rare, at least at present. Given the extreme rarity of this legitimate situation where two parties have independently created adapter paths of the same length and number of adapters between two points, I considered it better to consider the situation an error, because in the majority of these bridging cases, the current author is the one who created at least one of the bridges, in which case he now knows that he is creating a redundant adapter that he need no longer maintain. The very rarest of all scenarios would be that the developer is using two different packages that both bridge the same items between two *other* packages. 
This is the only scenario I can think of where there would be a duplication that the current developer could not easily control, and the only one where PyProtocols' current policy would create a problem for the developer, requiring them to explicitly work around the issue by declaring an artificially "better" adapter path to resolve the ambiguity. As far as I can tell, this scenario will remain entirely theoretical until there are at least two packages out there with interfaces that need bridging, and two more packages exist that do the bridging, that someone might want to use at the same time. I think that this will take a while. :) In the meantime, all other adapter ambiguities are suggestive of a possible programming or design error, such as using interface inheritance to denote what an interface requires instead of what it provides, incorrectly claiming that something is a universal (interface-to-interface) adapter when it is only suitable for certain concrete classes, etc. >Personally, I disagree with having transitivity at all, unless perhaps it >be restricted to adaptations specifically and explicitly stated to be >"perfect and lossless"; PJE claims that ALL adaptations MUST, ALWAYS, be >"perfect and lossless" -- essentially, it seems to me, he _has_ to claim >that, to defend transitivity being applied automatically, relentlessly, >NON-optionally, NON-selectively (but then the idea of giving an error when >two or more shortest-paths have the same length becomes dubious). No, it follows directly from the premise. If adapters are non-noisy, why do you need more than one adapter chain of equal length between two points? If you have such a condition, you have a redundancy at the least, and more likely a programming error -- surely BOTH of those adapters are not correct, unless you have that excruciatingly-rare case I mentioned above. >BTW, Microsoft's COM's interfaces ONLY have the "inferior" kind of >inheritance.
You can say that interface ISub inherits from IBas: this >means that ISub has all the same methods as IBas with the same signatures, >plus it may have other methods; it does *NOT* mean that anything >implementing ISub must also implement IBas, nor that a QueryInterface on >an ISub asking for an IBas must succeed, or anything of that kind. In >many years of COM practice I have NEVER found this issue to be a >limitation -- it works just fine. I'm actually open to at least considering dropping interface inheritance transitivity, due to its actual problems in practice. Fewer than half of the interfaces in PEAK do any inheritance, so having to explicitly declare that one interface implies another isn't a big deal. Such a practice might seem very strange to Java programmers, however, since it means that if you declare (in Python) a method to take IBas, it will not accept an ISub, unless the object has explicitly declared that it supports both. (Whereas in Java it suffices for the class to declare that it supports ISub.) From cce at clarkevans.com Wed Jan 12 20:57:11 2005 From: cce at clarkevans.com (Clark C. Evans) Date: Wed Jan 12 20:57:15 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: <20050112195711.GA1813@prometheusresearch.com> On Wed, Jan 12, 2005 at 10:16:14AM -0800, Guido van Rossum wrote: | But now, since I am still in favor of automatic "combined" adaptation | *as a last resort*, I ask you to consider that Python is not C++, and | that perhaps we can make the experience in Python better than it was | in C++. Perhaps allowing more control over when automatic adaptation | is acceptable? | | For example, interface B (or perhaps this should be a property of the | adapter for B->C?)
might be marked so as to allow or disallow its | consideration when looking for multi-step adaptations. We could even | make the default "don't consider", so only people who have to deal | with the multiple A's and/or multiple C's all adaptable via the same B | could save themselves some typing by turning it on. How about not allowing transitive adaptation, by default, and then providing two techniques to help the user cope: - raise an AdaptIsTransitive(AdaptationError) exception when an adaptation has failed, but there exists an A->C pathway using an intermediate B - add a flag to adapt, allowTransitive, which defaults to False This way new developers don't accidentally shoot their foot off, as Alex warns; however, the price for doing this sort of thing is cheap. The AdaptIsTransitive error could even explain the problem with a dynamic error message like: "You've tried to adapt a LDAPName to a FirstName, but no direct translation exists. There is an indirect translation using FullName: LDAPName -> FullName -> FirstName. If you'd like to use this intermediate object, simply call adapt() with allowTransitive = True" Clark From aleax at aleax.it Wed Jan 12 20:59:28 2005 From: aleax at aleax.it (Alex Martelli) Date: Wed Jan 12 20:59:34 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com> References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com> Message-ID: <777FF1CC-64D4-11D9-ADA4-000A95EFAE9E@aleax.it> On 2005 Jan 12, at 20:06, Phillip J. Eby wrote: > At 10:16 AM 1/12/05 -0800, Guido van Rossum wrote: >> For example, interface B (or perhaps this should be a property of the >> adapter for B->C?)
might be marked so as to allow or disallow its >> consideration when looking for multi-step adaptations. We could even >> make the default "don't consider", so only people who have to deal >> with the multiple A's and/or multiple C's all adaptable via the same B >> could save themselves some typing by turning it on. > > Another possibility; I've realized from Alex's last mail that there's > a piece of my reasoning that I haven't been including, and now I can > actually explain it clearly (I hope). In my view, there are at least > two kinds of adapters, with different fidelity > requirements/difficulty: > > class -> interface ("lo-fi" is okay) > interface -> interface (It better be perfect!) > > If you cannot guarantee that your interface-to-interface adapter is > the absolute best way to adapt *any* implementation of the source > interface, you should *not* treat it as an interface-to-interface > adapter, but rather as a class-to-interface adapter for the specific > classes that need it. And, if transitivity exists, it is now > restricted to a sensible subset of the possible paths. Even though Guido claimed I have been belaboring the following point, I do think it's crucial and I still haven't seen you answer it. If *any* I1->I2 adapter, by the very fact of its existence, asserts it's the *absolute best way* to adapt ANY implementation of I1 into I2; then why should the existence of two equal-length shortest paths A->I1->I2 and A->I3->I2 be considered a problem in any sense? Pick either, at random or by whatever rule: you trust that they're BOTH the absolute best, so they must be absolutely identical anyway. 
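Alex's argument above can be made concrete with a toy shortest-path lookup over a registry of one-step adapters. Everything here is a hypothetical sketch (register, find_paths, transitive_adapt are illustrative names, not PyProtocols' implementation): with two equal-length paths such as A->I1->I2 and A->I3->I2, both composites exist, and under the "absolute best way" premise either could be returned.

```python
from collections import deque

# Toy transitive adapter lookup (hypothetical names, not PyProtocols).
# edges maps a protocol name to {target: one_step_adapter_function}.
edges = {}

def register(src, dst, func):
    edges.setdefault(src, {})[dst] = func

def find_paths(src, dst):
    """Return all shortest chains of one-step adapters from src to dst.
    Assumes an acyclic adapter graph, as in the diamond example."""
    paths, queue, best = [], deque([(src, [])]), None
    while queue:
        node, chain = queue.popleft()
        if best is not None and len(chain) > best:
            break  # longer than an already-found shortest path
        if node == dst and chain:
            paths.append(chain)
            best = len(chain)
            continue
        for nxt, func in edges.get(node, {}).items():
            queue.append((nxt, chain + [func]))
    return paths

def transitive_adapt(obj, src, dst):
    paths = find_paths(src, dst)
    if not paths:
        raise TypeError("no adapter path from %r to %r" % (src, dst))
    # PyProtocols treats len(paths) > 1 as an error; under the
    # "absolute best way" premise, any shortest path is equally valid.
    for step in paths[0]:
        obj = step(obj)
    return obj
```

With four lossless one-step adapters registered in a diamond, find_paths reports two shortest chains, and picking either would give equivalent results, which is exactly the tension with raising a TypeError on ties.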
If you agree that this is the only sensible behavior, and PyProtocols' current behavior (TypeError for two paths of equal length save in a few special cases), then I guess I can accept your stance that providing adaptation between interfaces implies the strongest possible degree of commitment to perfection, and that this new conception of *absolute best way* entirely and totally replaces previous weaker and more sensible descriptions, such as for example in that shorter chains "are less likely to be a ``lossy'' conversion". ``less likely'' and ``absolute best way'' just can't coexist. Two "absolute best ways" to do the same thing are exactly equally likely to be ``lossy'': that likelihood is ZERO, if "absolute" means anything. ((Preferring shorter chains as a heuristic for faster ones may be a very reasonable approach if performance is a secondary consideration, as I've already mentioned; if performance were more important than that, then other ``costs'' besides the extreme NO_ADAPTER_NEEDED [[0 cost]] and DOES_NOT_SUPPORT [[infinite cost]] should be accepted, and the minimal-cost path ensured -- I do not think any such complication is warranted)). > I think maybe this gets us a little bit closer to having a unified (or > at least unifiable) view on the problem area. If Alex agrees that > class-to-interface adaptation is an acceptable solution for limiting > the transitivity of noisy adaptation while still allowing some degree > of implicitness, then maybe we have a winner.
If you agree that it cannot be an error to have two separate paths of "absolute best ways" (thus equally perfect) of equal length, then I can accept your stance that one must ensure the "absolute best way" each time one codes and registers an I -> I adapter (and each time one interface inherits another interface, apparently); I can then do half the rewrite of the PEP 246 draft (the changes already mentioned and roughly agreed) and turn it over to you as new first author to complete with the transitivity details &c. If there is any doubt whatsoever marring that perfection, that "absolute best way", then I fear we're back at square one. Alex From pje at telecommunity.com Wed Jan 12 21:03:28 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed Jan 12 21:02:44 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <51767A6A-64B3-11D9-ADA4-000A95EFAE9E@aleax.it> References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011206001a5a3805@mail.gmail.com> <0CC2F6D6-64A9-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: <5.1.1.6.0.20050112145313.02eb26b0@mail.telecommunity.com> At 05:02 PM 1/12/05 +0100, Alex Martelli wrote: >So, I think PEP 246 should specify that the step now called (e) [checking >the registry] comes FIRST; then, an isinstance step [currently split >between (a) and (d)], then __conform__ and __adapt__ steps [currently >called (b) and (c)]. One question, and one suggestion. The question: should the registry support explicitly declaring that a particular adaptation should *not* be used, thus pre-empting later phases entirely? This would allow for the possibility of speeding lookups by caching, as well as the option to "opt out" of specific adaptations, which some folks seem to want. ;) The suggestion: rather than checking isinstance() in adapt(), define object.__conform__ such that it does the isinstance() check. 
Then, Liskov violation is simply a matter of returning 'None' from __conform__ instead of raising a special error. > Checking the registry is after all very fast: make the 2-tuple > (type(obj), protocol), use it to index into the registry -- period. So, > it's probably not worth complicating the semantics at all just to "fast > path" the common case. Okay, one more suggestion/idea: $ timeit -s "d={}; d[1,2]=None" "d[1,2]" 1000000 loops, best of 3: 1.65 usec per loop $ timeit -s "d={}; d[1]={2:None}" "d[1][2]" 1000000 loops, best of 3: 0.798 usec per loop This seems to suggest that using nested dictionaries could be faster under some circumstances than creating the two-tuple to do the lookup. Of course, these are trivially-sized dictionaries and this is also measuring Python bytecode speed, not what would happen in C. But it suggests that more investigation might be in order. From skip at pobox.com Wed Jan 12 21:03:30 2005 From: skip at pobox.com (Skip Montanaro) Date: Wed Jan 12 21:03:59 2005 Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name In-Reply-To: <1105553300.41e56794d1fc5@mcherm.com> References: <1105553300.41e56794d1fc5@mcherm.com> Message-ID: <16869.33426.883395.345417@montanaro.dyndns.org> Michael> This must be one of those cases where I am misled by my Michael> background... I thought of Liskov substitution principle as a Michael> piece of basic CS background that everyone learned in school Michael> (or from the net, or wherever they learned Michael> programming). Clearly, that's not true. Note that some of us were long out of school by the time Barbara Liskov first published the idea (in 1988 according to http://c2.com/cgi/wiki?LiskovSubstitutionPrinciple). Also, since it pertains to OO programming it was probably not taught widely until the mid-90s. That means a fair number of people will have never heard about it. Michael> Guido writes: >> How about SubstitutabilityError? I don't think that's any better.
At the very least, people can Google for "Liskov violation" to educate themselves. I'm not sure that the results of a Google search for "Substitutability Error" will be any clearer. Michael> It would be less precise and informative to ME but apparently Michael> more so to a beginner. Obviously, we should support the Michael> beginner! I don't think that's appropriate in this case. Liskov violation is something precise. I don't think that changing what you call it will help beginners understand it any better in this case. I say leave it as is and make sure it's properly documented. Skip From aleax at aleax.it Wed Jan 12 21:05:54 2005 From: aleax at aleax.it (Alex Martelli) Date: Wed Jan 12 21:05:59 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5.1.1.6.0.20050112140929.03d52e30@mail.telecommunity.com> References: <5.1.1.6.0.20050112140929.03d52e30@mail.telecommunity.com> Message-ID: <5DAAD430-64D5-11D9-ADA4-000A95EFAE9E@aleax.it> On 2005 Jan 12, at 20:39, Phillip J. Eby wrote: ... >> > it's difficult because intuitively an interface defines a >> *requirement*, so >> > it seems logical to inherit from an interface in order to add >> requirements! >> >> Yes... I would fall into this trap as well until I'd been burned a >> few times. > It's burned me more than just a few times, and I *still* sometimes > make it if I'm not paying attention. It's just too easy to make the > mistake. So, I'm actually open to considering dropping interface > inheritance. What about accepting Microsoft's QueryInterface precedent for this? I know that "MS" is a dirty word to many, but I did like much of what they did in COM, personally. The QI precedent would be: you can inherit interface from interface, but that does NOT intrinsically imply substitutability -- it just means the inheriting interface has all the methods of the one being subclassed, with the same signatures, without having to do a nasty copy-and-paste.
Of course, one presumably could use NO_ADAPTER_NEEDED to easily (but explicitly: that makes a difference!) implement the common case in which the inheriting interface DOES want to assert that it's perfectly / losslessly / etc substitutable for the one being inherited. Alex From pje at telecommunity.com Wed Jan 12 21:14:08 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed Jan 12 21:13:25 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5DAAD430-64D5-11D9-ADA4-000A95EFAE9E@aleax.it> References: <5.1.1.6.0.20050112140929.03d52e30@mail.telecommunity.com> <5.1.1.6.0.20050112140929.03d52e30@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050112150933.03378630@mail.telecommunity.com> At 09:05 PM 1/12/05 +0100, Alex Martelli wrote: >On 2005 Jan 12, at 20:39, Phillip J. Eby wrote: >>It's burned me more than just a few times, and I *still* sometimes make >>it if I'm not paying attention. It's just too easy to make the >>mistake. So, I'm actually open to considering dropping interface inheritance. > >What about accepting Microsoft's QueryInterface precedent for this? I >know that "MS" is a dirty word to many, but I did like much of what they >did in COM, personally. The QI precedent would be: you can inherit >interface from interface, but that does NOT intrinsically imply >substitutability -- it just means the inheriting interface has all the >methods of the one being subclassed, with the same signatures, without >having to do a nasty copy-and-paste. Of course, one presumably could use >NO_ADAPTER_NEEDED to easily (but explicitly: that makes a difference!) >implement the common case in which the inheriting interface DOES want to >assert that it's perfectly / losslessly / etc substitutable for the one >being inherited. Well, you and I may agree to this, but we can't agree on behalf of everybody else who hasn't been bitten by this problem, I'm afraid. 
I checked PEAK and about 62 out of 150 interfaces inherited from anything else; it would not be a big deal to explicitly do the NO_ADAPTER_NEEDED thing, especially since PyProtocols has an 'advise' keyword that does the declaration, anyway; inheritance is just a shortcut for that declaration when you are using only one kind of interface, so the "explicit" way of defining NO_ADAPTER_NEEDED between two interfaces has only gotten used when mixing Zope or Twisted interfaces w/PyProtocols. Anyway, I'm at least +0 on dropping this; the reservation is just because I don't think everybody else will agree with this, and don't want to be appearing to imply that consensus between you and me implies any sort of community consensus on this point. That is, the adaptation from "Alex and Phillip agree" to "community agrees" is noisy at best! ;) From carribeiro at gmail.com Wed Jan 12 21:21:38 2005 From: carribeiro at gmail.com (Carlos Ribeiro) Date: Wed Jan 12 21:21:41 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: <864d370905011212213cf3aa16@mail.gmail.com> On Wed, 12 Jan 2005 10:16:14 -0800, Guido van Rossum wrote: > But now, since I am still in favor of automatic "combined" adaptation > *as a last resort*, I ask you to consider that Python is not C++, and > that perhaps we can make the experience in Python better than it was > in C++. Perhaps allowing more control over when automatic adaptation > is acceptable? > > For example, interface B (or perhaps this should be a property of the > adapter for B->C?) might be marked so as to allow or disallow its > consideration when looking for multi-step adaptations.
We could even > make the default "don't consider", so only people who have to deal > with the multiple A's and/or multiple C's all adaptable via the same B > could save themselves some typing by turning it on. +1. BTW, I _do_ use adaptation, including the 'lossy' one described in this scenario (where the mapping is imperfect, or incomplete). So having some way to tell the adaptation framework that a particular adapter is not suited to use in a transitive chain is a good thing IMHO. Generically speaking, anything that puts some control in the hands of the programmer - as long as it does not stand in the way between him and the problem - is good. -- Carlos Ribeiro Consultoria em Projetos blog: http://rascunhosrotos.blogspot.com blog: http://pythonnotes.blogspot.com mail: carribeiro@gmail.com mail: carribeiro@yahoo.com From kbk at shore.net Wed Jan 12 21:26:33 2005 From: kbk at shore.net (Kurt B. Kaiser) Date: Wed Jan 12 21:27:05 2005 Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name In-Reply-To: <16869.33426.883395.345417@montanaro.dyndns.org> (Skip Montanaro's message of "Wed, 12 Jan 2005 14:03:30 -0600") References: <1105553300.41e56794d1fc5@mcherm.com> <16869.33426.883395.345417@montanaro.dyndns.org> Message-ID: <87oefu1jie.fsf@hydra.bayview.thirdcreek.com> Skip Montanaro writes: > I don't think that's appropriate in this case. Liskov violation is > something precise. I don't think that changing what you call it will help > beginners understand it any better in this case. I say leave it as is and > make sure it's properly documented. +1 -- KBK From just at letterror.com Wed Jan 12 21:27:25 2005 From: just at letterror.com (Just van Rossum) Date: Wed Jan 12 21:27:43 2005 Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name In-Reply-To: <16869.33426.883395.345417@montanaro.dyndns.org> Message-ID: Skip Montanaro wrote: > > Michael> This must be one of those cases where I am misled by my > Michael> background...
I thought of Liskov substitution principle > Michael> as a piece of basic CS background that everyone learned > Michael> in school (or from the net, or wherever they learned > Michael> programming). Clearly, that's not true. > > Note that some of us were long out of school by the time Barbara Liskov > first published the idea (in 1988 according to > http://c2.com/cgi/wiki?LiskovSubstitutionPrinciple). Also, since it > pertains to OO programming it was probably not taught widely until > the mid-90s. That means a fair number of people will have never > heard about it. ...and then there are those Python users who have no formal CS background at all. Python is used quite a bit by people whose main job is not programming. I'm one of those, and whatever I know about CS, I owe it mostly to the Python community. I learned an awful lot just by hanging out on various Python mailing lists. > Michael> Guido writes: > >> How about SubstitutabilityError? > > I don't think that's any better. At the very least, people can > Google for "Liskov violation" to educate themselves. I'm not sure > that the results of a Google search for "Substitutability Error" will > be any clearer. Well, with a bit of luck Google will point to the Python documentation then... Just From aleax at aleax.it Wed Jan 12 21:30:58 2005 From: aleax at aleax.it (Alex Martelli) Date: Wed Jan 12 21:31:08 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5.1.1.6.0.20050112101324.03a77090@mail.telecommunity.com> References: <79990c6b05011206001a5a3805@mail.gmail.com> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011206001a5a3805@mail.gmail.com> <5.1.1.6.0.20050112101324.03a77090@mail.telecommunity.com> Message-ID: On 2005 Jan 12, at 20:51, Phillip J. Eby wrote: ... > There's a very simple reason.
If one is using only non-noisy > adapters, there is absolutely no reason to ever define more than one > adapter between the *same* two points. If you do, ...but there's no harm whatsoever done, either. If I have four interfaces I use regularly, A, B, C, D, and I have the need to adapt A->B, A->C, B->D, C->D, with every one of these four adaptations being "the absolute best way" (as you stated all interface adaptations must be), then why should that be at all a problem? Maybe sometimes someone will need to adapt A->D, fine -- again, no harm whatsoever, IF everything is as perfect as it MUST be for transitivity to apply unconditionally. Put it another way: say I have the first three of these adaptations, only, so everything is hunky-dory. Now I come upon a situation where I need C->D, fine, I add it: where's the error, if every one of the four adaptations is just perfect? I admit I can't sharply follow your gyrations about what's in what package, who wrote what, and why the fact that interfaces and adaptation (particularly transitive adaptation) are NOT widespread at all so far (are only used by early adopters, on average heads and shoulders above the average Python coder) makes it MORE important to provide an error in a case that, by the premises, cannot be an error (would it be LESS important to provide the error if everybody and their cousin were interfacing and adapting with exuberance...?). All I can see is: 1. if an interface adapter must ABSOLUTELY be perfect, transitivity is fine, but the error makes no sense 2. if the error makes sense (or the assertion about "less likely to be lossy" makes any sense, etc etc), then transitivity is NOT fine -- adapters can be imperfect, and there is NO way to state that they are, one just gets an error message if one is SO DEUCEDLY LUCKY as to have created in the course of one's bumbling two shortest-paths of the same length. I suspect [2] holds.
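The four-adapter diamond described above can be sketched concretely. The record shapes and conversion functions below are hypothetical stand-ins for interfaces A, B, C, D; each single step is lossless (for simple single-word names), so the two composed routes A->B->D and A->C->D agree on every input, which is the point about picking either being harmless.

```python
# Hypothetical lossless conversions between four record shapes,
# standing in for the A->B, A->C, B->D, C->D adapters above.

def a_to_b(p):   # A: ("First", "Last") -> B: "First Last"
    return "%s %s" % p

def a_to_c(p):   # A -> C: ("Last", "First")
    return (p[1], p[0])

def b_to_d(s):   # B -> D: {"first": ..., "last": ...}
    first, last = s.split(" ", 1)
    return {"first": first, "last": last}

def c_to_d(p):   # C -> D
    return {"first": p[1], "last": p[0]}

def compose(*steps):
    """Chain one-step adapters into a composite adapter."""
    def chain(value):
        for step in steps:
            value = step(value)
        return value
    return chain

path1 = compose(a_to_b, b_to_d)   # A -> B -> D
path2 = compose(a_to_c, c_to_d)   # A -> C -> D
```

If every step really is "the absolute best way", path1 and path2 are interchangeable and an ambiguity error reports nothing actionable; only if some step may be lossy does the ambiguity carry information.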
But you're the one with experience, so if you stake that on [1], and the "absolute best way" unconditional assertion, then, fine, I guess, as per my previous message. But the combination of "absolute best way" _AND_ an error when somebody adds C->D is, in my opinion, self-contradictory: experience or not, I can't support asserting something and its contrary at the same time. > then somebody is doing something redundant, and there is a possibility > for error. In Not at all: each of the four above-listed adaptations may be needed to perform an unrelated adapt(...) operation. How can you claim that set of four adaptations is REDUNDANT, when adding a FIFTH one (a direct A->D) would make it fine again per your rules? This is the first time I've heard an implied claim that redundancy is something that can be eliminated by ADDING something, without taking anything away. >> Personally, I disagree with having transitivity at all, unless >> perhaps it be restricted to adaptations specifically and explicitly >> stated to be "perfect and lossless"; PJE claims that ALL adaptations >> MUST, ALWAYS, be "perfect and lossless" -- essentially, it seems to >> me, he _has_ to claim that, to defend transitivity being applied >> automatically, relentlessly, NON-optionally, NON-selectively (but >> then the idea of giving an error when two or more shortest-paths have >> the same length becomes dubious). > > No, it follows directly from the premise. If adapters are non-noisy, > why do you need more than one adapter chain of equal length between > two points? If you have such a condition, you I don't NEED the chain, but I may well need each step; and by the premise of "absolute best way" which you maintain, it must be innocuous if the separate steps I need end up producing more than one chain -- what difference can it make?! 
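Alex's four-adapter diamond can be played out with a toy registry -- a from-scratch sketch, not PEP 246 or PyProtocols code, with interfaces reduced to plain strings. Breadth-first search collects every shortest adapter chain, and A->D turns up two equally short candidates, which is exactly the tie the disputed error is about:

```python
from collections import deque

# Toy transitive adapter registry -- a sketch for illustration only,
# not the PyProtocols implementation.
class Registry:
    def __init__(self):
        self.adapters = {}  # (src, dst) -> adapter callable

    def register(self, src, dst, adapter):
        self.adapters[src, dst] = adapter

    def shortest_paths(self, src, dst):
        # Breadth-first search, collecting *all* chains of minimal
        # length so that ties (ambiguities) can be detected.
        best, solutions = None, []
        queue = deque([(src, [])])
        while queue:
            node, chain = queue.popleft()
            if best is not None and len(chain) >= best:
                continue  # cannot beat or tie an already-found chain
            for (s, d), adapter in self.adapters.items():
                if s != node or any(step[1] == d for step in chain):
                    continue  # not from here, or would revisit a node
                extended = chain + [(s, d, adapter)]
                if d == dst:
                    best = len(extended)
                    solutions.append(extended)
                else:
                    queue.append((d, extended))
        return solutions

    def adapt(self, obj, src, dst):
        paths = self.shortest_paths(src, dst)
        if not paths:
            raise TypeError("no adapter chain from %s to %s" % (src, dst))
        if len(paths) > 1:
            raise TypeError("%d equally short chains from %s to %s"
                            % (len(paths), src, dst))
        for _, _, adapter in paths[0]:
            obj = adapter(obj)
        return obj

# The diamond: four individually "perfect" (identity) adapters...
reg = Registry()
for src, dst in [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]:
    reg.register(src, dst, lambda x: x)

# ...and A->D now has two equally short chains: A->B->D and A->C->D.
assert len(reg.shortest_paths("A", "D")) == 2
```

Whether `adapt` should raise on the tie (Phillip's position) or pick either chain (Alex's, since both are by premise perfect) is precisely what the thread is arguing about.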
> have a redundancy at the least, and more likely a programming error -- > surely BOTH of those adapters are not correct, unless you have that > excruciatingly-rare case I mentioned above. Each of the FOUR adapters coded can be absolutely perfect. Thus, the composite adapters which your beloved transitivity builds will also be perfect, and it will be absolutely harmless to pick one of them at random. >> BTW, Microsoft's COM's interfaces ONLY have the "inferior" kind of >> inheritance. You can say that interface ISub inherits from IBas: >> this means that ISub has all the same methods as IBas with the same >> signatures, plus it may have other methods; it does *NOT* mean that >> anything implementing ISub must also implement IBas, nor that a >> QueryInterface on an ISub asking for an IBas must succeed, or >> anything of that kind. In many years of COM practice I have NEVER >> found this issue to be a limitation -- it works just fine. > > I'm actually open to at least considering dropping interface > inheritance transitivity, due to its actual problems in practice. > Fewer than half of the interfaces in PEAK do any inheritance, so > having to explicitly declare that one interface implies another isn't > a big deal. Now that is something I'd really love, as per my previous msg. > Such a practice might seem very strange to Java programmers, however, > since it means that if you declare (in Python) a method to take IBas, > it will not accept an ISub, unless the object has explicitly declared > that it supports both. (Whereas in Java it suffices for the class to > declare that it supports ISub.) Often the author of ISub will be able to declare support for IBas as well as inheriting (widening) of it; when that is not possible, the Java programmer, although surprised, will most likely be better off for having to be a tad more explicit. Alex From pje at telecommunity.com Wed Jan 12 21:42:32 2005 From: pje at telecommunity.com (Phillip J.
Eby) Date: Wed Jan 12 21:41:51 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <777FF1CC-64D4-11D9-ADA4-000A95EFAE9E@aleax.it> References: <5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050112151418.0337cd00@mail.telecommunity.com> At 08:59 PM 1/12/05 +0100, Alex Martelli wrote: >Even though Guido claimed I have been belaboring the following point, I do >think it's crucial and I still haven't seen you answer it. My post on that probably crossed with this post of yours; it contains an excruciating analysis of why I chose to consider such paths dubious. However, I'll briefly answer your specific questions here. (Well, briefly for ME! ;) ) > If *any* I1->I2 adapter, by the very fact of its existence, asserts > it's the *absolute best way* to adapt ANY implementation of I1 into I2; > then why should the existence of two equal-length shortest paths > A->I1->I2 and A->I3->I2 be considered a problem in any sense? Pick > either, at random or by whatever rule: you trust that they're BOTH the > absolute best, so they must be absolutely identical anyway. Because if you have asserted that it is the absolute best, why did you write *another* one that's equally good? This suggests that at least one of the paths you ended up with was unintentional: created, for example, via inappropriate use of interface inheritance. 
Anyway, the other post has a detailed analysis for all the circumstances I can think of where you *might* have such a set of ambiguous adapter paths, and why it's excruciatingly rare that you would not in fact care when such a situation existed, and why the error is therefore valuable in pointing out the (almost certainly unintended) duplication. >If you agree that this is the only sensible behavior, and PyProtocols' >current behavior (TypeError for two paths of equal length save in a few >special cases), then I guess I can accept your stance that providing >adaptation between interfaces implies the strongest possible degree of >commitment to perfection, and that this new conception of *absolute best >way* entirely and totally replaces previous weaker and more sensible >descriptions, such as for example in > that >shorter chains "are less likely to be a ``lossy'' conversion". >``less likely'' and ``absolute best way'' just can't coexist. Two >"absolute best ways" to do the same thing are exactly equally likely to be >``lossy'': that likelihood is ZERO, if "absolute" means anything. First off, as a result of our conversations here, I'm aware that the PyProtocols documentation needs updating; it was based on some of my *earliest* thinking about adaptation, before I realized how critical the distinction between class-to-interface and interface-to-interface adaptation really was. And, my later thinking has only really been properly explained (even to my satisfaction!) during this discussion. Indeed, there are lots of things I know now about when to adapt and when not to, that I had only the faintest idea of when I originally wrote the documentation. Second, if the error PyProtocols produces became a problem in practice, it could potentially be downgraded to a warning, or even disabled entirely.
However, my experience with it has been that the *real* reason to flag adapter ambiguities is that they usually reveal some *other* problem, that would be much harder to find otherwise. >((Preferring shorter chains as a heuristic for faster ones may be very >reasonable approach if performance is a secondary consideration, as I've >already mentioned; if performance were more important than that, then >other ``costs'' besides the extreme NO_ADAPTER_NEEDED [[0 cost]] and >DOES_NOT_SUPPORT [[infinite cost]] should be accepted, and the >minimal-cost path ensured -- I do not think any such complication is >warranted)). Actually, the nature of the transitive algorithm PyProtocols uses is that it must track these running costs and pass them around anyway, so it is always possible to call one of its primitive APIs to force a certain cost consideration. However, I have never actually had to use it, and I discourage others from playing with it, because I think the need to use it would be highly indicative of some other problem, like inappropriate use of adaptation or at least of I-to-I relationships. >If you agree that it cannot be an error to have two separate paths of >"absolute best ways" (thus equally perfect) of equal length, then I can >accept your stance that one must ensure the "absolute best way" each time >one codes and registers an I -> I adapter (and each time one interface >inherits another interface, apparently); I can then do half the rewrite of >the PEP 246 draft (the changes already mentioned and roughly agreed) and >turn it over to you as new first author to complete with the transitivity >details &c. > >If there is any doubt whatsoever marring that perfection, that "absolute >best way", then I fear we're back at square one. The only doubt is that somebody may have *erroneously* created a duplicate adapter, or an unintended duplicate path via a NO_ADAPTER_NEEDED link (e.g. by declaring that a class implements an interface directly, or interface inheritance). 
Thus, even though such an adapter is technically correct and acceptable, it is only so if that's what you really *meant* to do. But, if the adapters are *really* perfect, then by definition you are wasting your time defining "more than one way to do it", so it probably means you are making *some* kind of mistake, even if it's only the mistake of duplicating effort needlessly! More likely, however, it means you have made some other mistake, like inappropriate interface inheritance. At least 9 times out of 10, when I receive an ambiguous adapter path error, it's because I just added some kind of NO_ADAPTER_NEEDED link: either class-implements-interface, or interface-subclasses-interface, and I did so without having thought about the consequences of having that path. The error tells me, "hey, you need to think about what you're doing here and be more explicit about what is going on, because there are some broader implications to what you just did." It's not always immediately obvious how to fix it, but it's almost always obvious that I actually *have* done something wrong, as soon as I think about it; it's not just a spurious error. Anyway, hopefully this post and the other one will be convincing that considering ambiguity to be an error *reinforces* the idea of I-to-I perfection, rather than undermining it. (After all, if you've written a perfect one, and there's already one there, then either one of you is mistaken, or you are wasting your time writing one!) 
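The running-cost bookkeeping Phillip alludes to (NO_ADAPTER_NEEDED at zero cost, DOES_NOT_SUPPORT at infinite cost) amounts to a weighted shortest-path search. A from-scratch sketch -- not the actual PyProtocols machinery, where those names are adapter factories rather than plain numbers -- using Dijkstra's algorithm:

```python
import heapq
import itertools

# Sketch only: costs reduced to plain numbers for illustration.
NO_ADAPTER_NEEDED = 0          # "already implements it" links are free
DEFAULT_COST = 1               # an ordinary adapter step
DOES_NOT_SUPPORT = float("inf")

def cheapest_chain(edges, src, dst):
    """edges maps (src, dst) -> (adapter, cost); returns (cost, chain)."""
    tie_breaker = itertools.count()  # keeps heap entries comparable
    heap = [(0, next(tie_breaker), src, [])]
    seen = set()
    while heap:
        cost, _, node, chain = heapq.heappop(heap)
        if node == dst:
            return cost, chain
        if node in seen:
            continue
        seen.add(node)
        for (s, d), (adapter, c) in edges.items():
            if s == node and c != DOES_NOT_SUPPORT:
                heapq.heappush(
                    heap, (cost + c, next(tie_breaker), d, chain + [adapter]))
    return DOES_NOT_SUPPORT, None

# A zero-cost "already implements it" edge makes the two-step route
# cheaper than a costly direct adapter:
edges = {
    ("A", "B"): (lambda x: x, NO_ADAPTER_NEEDED),
    ("B", "C"): (lambda x: x, DEFAULT_COST),
    ("A", "C"): (lambda x: x, 3 * DEFAULT_COST),
}
cost, chain = cheapest_chain(edges, "A", "C")
assert cost == 1 and len(chain) == 2
```

This also shows why shortest-chain-length is only a heuristic for cheapest-cost: the two notions pick different paths as soon as the edge costs are not uniform.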
From gvanrossum at gmail.com Wed Jan 12 22:15:20 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Jan 12 22:15:26 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <20050112195711.GA1813@prometheusresearch.com> References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <20050112195711.GA1813@prometheusresearch.com> Message-ID: [Clark] > - add a flag to adapt, allowTransitive, which defaults to False That wouldn't work very well when most adapt() calls are invoked implicitly through signature declarations (per my blog's proposal). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Wed Jan 12 22:50:52 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed Jan 12 22:50:11 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: References: <5.1.1.6.0.20050112101324.03a77090@mail.telecommunity.com> <79990c6b05011206001a5a3805@mail.gmail.com> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011206001a5a3805@mail.gmail.com> <5.1.1.6.0.20050112101324.03a77090@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050112154647.0327ca40@mail.telecommunity.com> At 09:30 PM 1/12/05 +0100, Alex Martelli wrote: >On 2005 Jan 12, at 20:51, Phillip J. Eby wrote: > ... >>There's a very simple reason. If one is using only non-noisy adapters, >>there is absolutely no reason to ever define more than one adapter >>between the *same* two points. If you do, > >...but there's no harm whatsoever done, either. 
If I have four interfaces >I use regularly, A, B, C, D, and I have the need to adapt A->B, A->C, >B->D, C->D, with every one of these four adaptations being "the absolute >best way" (as you stated all interface adaptations must be), then why >should that be at all a problem? It isn't a problem, but *only* if A is an interface. If it's a concrete class, then A->B and A->C are not "perfect" adapters, so it *can* make a difference which one you pick, and you should be explicit. However, implementing an algorithm to ignore only interface-to-interface ambiguity is more complex than just hollering whenever *any* ambiguity is found. Also, people make mistakes and may have declared something they didn't mean to. The cost to occasionally be a bit more explicit is IMO outweighed by the benefit of catching bugs that might otherwise go unnoticed, but produce an ambiguity as a side-effect of the buggy part. It's *possible* that you'd still catch almost as many bugs if you ignored pure I-to-I diamonds, but I don't feel entirely comfortable about giving up that extra bit of protection, especially since it would make the checker *more* complex to try to *not* warn about that situation. Also, in general I'm wary of introducing non-determinism into a system's behavior. I consider keeping e.g. the first path declared or the last path declared to be a form of non-determinism because it makes the system sensitive to trivial things like the order of import statements. The current algorithm alerts you to this non-determinism. Perhaps it would be simplest for Python's interface system to issue a warning about ambiguities, but allow execution to proceed? >(would it be LESS important to provide the error if everybody and their >cousin were interfacing and adapting with exhuberance...?) Only in the use case where two people might legitimately create the same adapter, but neither of them can stop using their adapter in favor of the other person's, thus forcing them to work around the error. 
Or, in the case where lots of people try to define adapter diamonds and don't want to go to the trouble of having their program behave deterministically. :) >1. if an interface adapter must ABSOLUTELY be perfect, transitivity is >fine, but the error makes no sense The error only makes no sense if we assume that the human(s) really *mean* to be ambiguous. Ambiguity suggests, however, that something *else* may be wrong. >I suspect [2] holds. But you're the one with experience, so if you stake >that on [1], and the "absolute best way" unconditional assertion, then, >fine, I guess, as per my previous message. But the combination of >"absolute best way" _AND_ an error when somebody adds C->D is, in my >opinion, self-contradictory: experience or not, I can't support asserting >something and its contrary at the same time. It's not contrary; it's a warning that "Are you sure you want to waste time writing another way to do the same thing when there's already a perfectly valid way to do it with a comparable number of adaptation steps involved? Maybe your adapter is better-performing or less buggy in some way, but I'm just a machine so how would I know? Please tell me which of these adapters is the *really* right one to use, thanks." (Assuming that the machine is tactful enough to leave out mentioning that maybe you just made a mistake and declared the adapter between the wrong two points, you silly human you.) >How can you claim that set of four adaptations is REDUNDANT, when adding a >FIFTH one (a direct A->D) would make it fine again per your rules? This >is the first time I've heard an implied claim that redundancy is something >that can be eliminated by ADDING something, without taking anything away. PyProtocols doesn't say the situation is redundant, it says it's *ambiguous*, which implies a *possible* redundancy. I'm also saying that the ambiguity is nearly always (for me) an indicator of a *real* problem, not merely a not-so-explicit diamond. 
>I don't NEED the chain, but I may well need each step; and by the premise >of "absolute best way" which you maintain, it must be innocuous if the >separate steps I need end up producing more than one chain -- what >difference can it make?! Fair enough; however I think that in the event that the system must make such a choice, it must at *least* warn about non-deterministic behavior. Even if you are claiming perfect adaptation, that doesn't necessarily mean you are correct in your claim! >Each of the FOUR adapters coded can be absolutely perfect. Thus, the >composite adapters which your beloved transitivity builds will also be >perfect, and it will be absolutely harmless to pick one of them at random. Right, but the point of my examples was that in all but one extremely rare scenario, a *real* ambiguity of this type is trivial to fix by being explicit. *But*, it's more often the case that this ambiguity reflects an actual problem or error of some kind (at least IME to date), than that it indicates a harmless adapter diamond. And when you attempt to make the path more explicit, you then discover what that other mistake was. So, sometimes you are "wasting time" declaring that extra explicitness, and sometimes you save time because of it. Whether this tradeoff is right for everybody, I can't say; it's a little bit like static typing, but OTOH ISTM that it comes up much less often than static typing errors do, and it only has to be fixed once for each diamond. (i.e., it doesn't propagate into every possible aspect of your program.) 
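Phillip's aside about downgrading the error -- issue a warning about ambiguities, but allow execution to proceed -- could look roughly like this. It is a hypothetical sketch: `resolve` and the warning class are invented names, not part of any existing interface system:

```python
import warnings

class AmbiguousAdapterWarning(UserWarning):
    """Two or more equally short adapter chains were found."""

def resolve(paths, strict=False):
    # paths: the equally short adapter chains a registry discovered.
    if len(paths) > 1:
        msg = ("%d equally short adapter chains found; "
               "arbitrarily using the first one declared" % len(paths))
        if strict:
            raise TypeError(msg)  # PyProtocols-style hard error
        warnings.warn(msg, AmbiguousAdapterWarning, stacklevel=2)
    return paths[0]
```

Callers who want the hard-error behavior pass strict=True; everyone else gets a visible, filterable warning about exactly the non-determinism (import-order sensitivity) being debated, instead of a silent arbitrary choice.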
From ianb at colorstudy.com Wed Jan 12 23:07:37 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Wed Jan 12 23:07:59 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> References: <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> Message-ID: <41E59FA9.4050605@colorstudy.com> Phillip J. Eby wrote: > Anyway, I'm honestly curious as to whether anybody can find a real > situation where transitive adapter composition is an *actual* problem, > as opposed to a theoretical one. I've heard a lot of people talk about > what a bad idea it is, but I haven't heard any of them say they actually > tried it. Conversely, I've also heard from people who *have* tried it, > and liked it. However, at this point I have no way to know if this > dichotomy is just a reflection of the fact that people who don't like > the idea don't try it, and the people who either like the idea or don't > care are open to trying it. I haven't read through the entire thread yet, so forgive me if I'm redundant. One case occurred to me with the discussion of strings and files, i.e., adapting from a string to a file. Let's say an IReadableFile, since files are too ambiguous. Consider the case where we are using a path object, like Jason Orendorff's or py.path. It seems quite reasonable and unambiguous that a string could be adapted to such a path object. It also seems quite reasonable and unambiguous that a path object could be adapted to a IReadableFile by opening the file at the given path. It's also quite unambiguous that a string could be adapted to a StringIO object, though I'm not sure it's reasonable. 
In fact, it seems like an annoying but entirely possible case that some library would register such an adapter, and mess things up globally for everyone who didn't want such an adaptation to occur! But that's an aside. The problem is with the first example, where two seemingly innocuous adapters (string->path, path->IReadableFile) allow a new adaptation that could cause all sorts of problems (string->IReadableFile). Ideally, if I had code that was looking for a file object and I wanted to accept filenames, I'd want to try to adapt to file, and if that failed I'd try to adapt to the path object and then from there to the file object. Or if I wanted it to take strings (that represented content) or file-like objects, I'd adapt to a file object and if that failed I'd adapt to a string, then convert to a StringIO object. A two-step adaptation encodes specific intention that it seems transitive adaptation would be blind to. As I think these things through, I'm realizing that registered adapters really should be 100% accurate (i.e., no information loss, complete substitutability), because a registered adapter that seems pragmatically useful in one place could mess up unrelated code, since registered adapters have global effects. Perhaps transitivity seems dangerous because that has the potential to dramatically increase the global effects of those registered adapters.
-- Ian Bicking / ianb@colorstudy.com / http://blog.ianbicking.org From steven.bethard at gmail.com Wed Jan 12 23:19:09 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Wed Jan 12 23:19:11 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <41E59FA9.4050605@colorstudy.com> References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <41E59FA9.4050605@colorstudy.com> Message-ID: On Wed, 12 Jan 2005 16:07:37 -0600, Ian Bicking wrote: > One case occurred to me with the discussion of strings and files, i.e., > adapting from a string to a file. Let's say an IReadableFile, since > files are too ambiguous. > > Consider the case where we are using a path object, like Jason > Orendorff's or py.path. It seems quite reasonable and unambiguous that > a string could be adapted to such a path object. It also seems quite > reasonable and unambiguous that a path object could be adapted to a > IReadableFile by opening the file at the given path. This strikes me as a strange use of adaptation -- I don't see how a string can act-as-a path object, or how a path object can act-as-a file. I see that you might be able to *create* a path object from-a string, or a file from-a path object, but IMHO this falls more into the category of object construction than object adaptation... Are these the sorts of things we can expect people to be doing with adaptation? Or is in really intended mainly for the act-as-a behavior that I had assumed...? Steve -- You can wordify anything if you just verb it. 
--- Bucky Katt, Get Fuzzy From carribeiro at gmail.com Wed Jan 12 23:26:09 2005 From: carribeiro at gmail.com (Carlos Ribeiro) Date: Wed Jan 12 23:26:11 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <41E59FA9.4050605@colorstudy.com> References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <41E59FA9.4050605@colorstudy.com> Message-ID: <864d370905011214267edda37e@mail.gmail.com> On Wed, 12 Jan 2005 16:07:37 -0600, Ian Bicking wrote: > As I think these things through, I'm realizing that registered > adapters really should be 100% accurate (i.e., no information loss, > complete substitutability), because a registered adapter that seems > pragmatically useful in one place could mess up unrelated code, since > registered adapters have global effects. Perhaps transitivity seems > dangerous because that has the potential to dramatically increase the > global effects of those registered adapters. To put it quite bluntly: many people never bother to implement the _full_ interface of something if all they need is a half-baked implementation. For example, I may get away with a sequence-like object in many situations without slice support in getitem, or a dict with some of the iteration methods. Call it laziness, but this is known to happen quite often, and Python is quite forgiving in this respect. Add the global scope of the adapter registry & transitivity to this and things may become much harder to debug... ...but on the other hand, transitivity is a powerful tool in the hands of an expert programmer, and allows one to write much shorter & cleaner code. Some balance is needed.
-- Carlos Ribeiro Consultoria em Projetos blog: http://rascunhosrotos.blogspot.com blog: http://pythonnotes.blogspot.com mail: carribeiro@gmail.com mail: carribeiro@yahoo.com From cce at clarkevans.com Wed Jan 12 23:54:46 2005 From: cce at clarkevans.com (Clark C. Evans) Date: Wed Jan 12 23:54:49 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <41E59FA9.4050605@colorstudy.com> References: <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <41E59FA9.4050605@colorstudy.com> Message-ID: <20050112225446.GA43203@prometheusresearch.com> On Wed, Jan 12, 2005 at 04:07:37PM -0600, Ian Bicking wrote: | A two-step adaptation encodes specific intention that it seems transitive | adaptation would be blind to. Exactly. Nice example, Ian. To parrot your example a bit more concretely, the problem happens when you get two different adaptation paths:

    String -> PathName -> File
    String -> StringIO -> File

Originally, Python may ship with the String->StringIO and StringIO->File adapters pre-loaded, and if my code was reliant upon this transitive chain, the following will work just wonderfully,

    def parse(file: File):
        ...
    parse("helloworld")

by parsing "helloworld" content via a StringIO intermediate object. But then, let's say a new component "pathutils" registers another adapter pair: String->PathName and PathName->File. This ambiguity causes a few problems:

- How does one determine which adapter path to use?
- If a different path is picked, what sort of subtle bugs occur?
- If the default path isn't what you want, how do you specify the other path?
I think Phillip's suggestion is the only reasonable one here: ambiguous cases are an error; ask the user to register the adapter they need, or do a specific cast when calling parse(). | As I think these things through, I'm realizing that registered | adapters really should be 100% accurate (i.e., no information loss, | complete substitutability), because a registered adapter that seems | pragmatically useful in one place could mess up unrelated code, since | registered adapters have global effects. I think this isn't all that useful; it's unrealistic to assume that adapters are always perfect. If transitive adaptation is even permitted, it should be unambiguous. Demanding that adaptation is 100% perfect is a matter of perspective. I think String->StringIO and StringIO->File are perfectly pure. | Perhaps transitivity seems dangerous because that has the potential to | dramatically increase the global effects of those registered adapters. I'd prefer:

1. adaptation to _not_ be transitive (be explicit)

2. a simple mechanism for a user to register an explicit adaptation path from a source to a destination: adapt.path(String,PathName,File) to go from String->File, using PathName as an intermediate.

3. an error message, AdaptationError, to list all possible adaptation paths:

       Could not convert 'String' object to 'File' because there is not
       a suitable adapter.  Please consider an explicit conversion, or
       register a composite adapter with one of the following paths:
           adapt.path(String,PathName,File)
           adapt.path(String,StringIO,File)

4. raise an exception when _registering_ a 'path' which would conflict with any existing adapter:

       "Could not complete adapt.path(String,PathName,File) since an
       existing direct adapter from String to Path already exists."

       "Could not complete adapt.path(String,PathName,File) since an
       existing path String->StringIO->File is already registered".

I'd rather have the latter error occur when "importing" modules rather than at run-time.
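Clark's proposed registration API might be sketched like this. It is hypothetical throughout: adapt.path is his proposal, not an existing function, and the interface names are reduced to plain strings:

```python
# Hypothetical sketch of the proposed adapt.path() registration, with
# the conflict check performed at registration (import) time rather
# than when adapt() is later called.
class AdaptPathRegistry:
    def __init__(self):
        self.paths = {}  # (src, dst) -> the full explicit chain

    def path(self, *chain):
        src, dst = chain[0], chain[-1]
        if (src, dst) in self.paths:
            raise TypeError(
                "Could not complete adapt.path(%s) since an existing "
                "path %s is already registered."
                % (",".join(chain), "->".join(self.paths[src, dst])))
        self.paths[src, dst] = chain

adapt = AdaptPathRegistry()
adapt.path("String", "PathName", "File")  # first registration wins
```

A later `adapt.path("String", "StringIO", "File")` now raises immediately, at the point where the conflicting library is imported.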
This way, the exception is pinned on the correct library developer. Best, Clark From andrewm at object-craft.com.au Wed Jan 12 23:55:25 2005 From: andrewm at object-craft.com.au (Andrew McNamara) Date: Wed Jan 12 23:55:32 2005 Subject: [Python-Dev] Re: [Csv] csv module and universal newlines In-Reply-To: <16868.33914.837771.954739@montanaro.dyndns.org> References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> <20050110044441.250103C889@coffee.object-craft.com.au> <16868.33914.837771.954739@montanaro.dyndns.org> Message-ID: <20050112225525.236BE3C889@coffee.object-craft.com.au>

>You can argue that reading csv data from/writing csv data to a file on
>Windows if the file isn't opened in binary mode is an error. Perhaps we
>should enforce that in situations where it matters. Would this be a start?
>
>    terminators = {"darwin": "\r",
>                   "win32": "\r\n"}
>
>    if (dialect.lineterminator != terminators.get(sys.platform, "\n") and
>        "b" not in getattr(f, "mode", "b")):
>        raise IOError, ("%s not opened in binary mode" %
>                        getattr(f, "name", "???"))
>
>The elements of the postulated terminators dictionary may already exist
>somewhere within the sys or os modules (if not, perhaps they should be
>added). The idea of the check is to enforce binary mode on those objects
>that support a mode if the desired line terminator doesn't match the
>platform's line terminator.

Where that falls down, I think, is where you want to read an alien file - in fact, under unix, most of the CSV files I read use \r\n for end-of-line. Also, I *really* don't like the idea of looking for a mode attribute on the supplied iterator - it feels like a layering violation. We've advertised the fact that it's an iterator, so we shouldn't be using anything but the iterator protocol.
-- Andrew McNamara, Senior Developer, Object Craft http://www.object-craft.com.au/ From Jack.Jansen at cwi.nl Thu Jan 13 00:02:39 2005 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Thu Jan 13 00:02:23 2005 Subject: [Python-Dev] Re: [Csv] csv module and universal newlines In-Reply-To: <16868.33914.837771.954739@montanaro.dyndns.org> References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> <20050110044441.250103C889@coffee.object-craft.com.au> <16868.33914.837771.954739@montanaro.dyndns.org> Message-ID: <0E6093F4-64EE-11D9-B7C6-000D934FF6B4@cwi.nl> On 12-jan-05, at 2:59, Skip Montanaro wrote:

>    terminators = {"darwin": "\r",
>                   "win32": "\r\n"}
>
>    if (dialect.lineterminator != terminators.get(sys.platform, "\n") and
>        "b" not in getattr(f, "mode", "b")):
>        raise IOError, ("%s not opened in binary mode" %
>                        getattr(f, "name", "???"))

On MacOSX you really want universal newlines. CSV files produced by older software (such as AppleWorks) will have \r line terminators, but lots of other programs will have files with normal \n terminators. -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From pje at telecommunity.com Thu Jan 13 00:06:34 2005 From: pje at telecommunity.com (Phillip J.
Eby) Date: Thu Jan 13 00:05:53 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: References: <41E59FA9.4050605@colorstudy.com> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <41E59FA9.4050605@colorstudy.com> Message-ID: <5.1.1.6.0.20050112180414.02f886a0@mail.telecommunity.com> At 03:19 PM 1/12/05 -0700, Steven Bethard wrote: >On Wed, 12 Jan 2005 16:07:37 -0600, Ian Bicking wrote: > > One case occurred to me with the discussion of strings and files, i.e., > > adapting from a string to a file. Let's say an IReadableFile, since > > files are too ambiguous. > > > > Consider the case where we are using a path object, like Jason > > Orendorff's or py.path. It seems quite reasonable and unambiguous that > > a string could be adapted to such a path object. It also seems quite > > reasonable and unambiguous that a path object could be adapted to a > > IReadableFile by opening the file at the given path. > >This strikes me as a strange use of adaptation -- I don't see how a >string can act-as-a path object, or how a path object can act-as-a >file. I see the former, but not the latter. A string certainly can act-as-a path object; there are numerous stdlib functions that take a string and then use it "as a" path object. In principle, a future version of Python might take path objects for these operations, and automatically adapt strings to them. But a path can't act as a file; that indeed makes no sense. From pje at telecommunity.com Thu Jan 13 00:09:31 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Thu Jan 13 00:08:52 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <41E59FA9.4050605@colorstudy.com> References: <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com> At 04:07 PM 1/12/05 -0600, Ian Bicking wrote: >It also seems quite reasonable and unambiguous that a path object could be >adapted to an IReadableFile by opening the file at the given path. Not if you think of adaptation as an "as-a" relationship, like using a screwdriver "as a" hammer (really an IPounderOfNails, or some such). It makes no sense to use a path "as a" readable file, so this particular adaptation is bogus. >The problem is with the first example, where two seemingly innocuous >adapters (string->path, path->IReadableFile) allow a new adaptation that >could cause all sorts of problems (string->IReadableFile). Two problems with this thought: 1) path->IReadableFile is bogus 2) even if you had path->IReadableFile, you're not broken unless you extend transitivity to pass through concrete target types (which I really don't recommend) >Ideally, if I had code that was looking for a file object and I wanted to >accept filenames, I'd want to try to adapt to file, and if that failed I'd >try to adapt to the path object and then from there to the file object. There are two reasonable ways to accomplish this. You can have code that expects an open stream -- in which case what's the harm in wrapping "open()" around the value you pass if you want it to be opened?
OR, you can have code that expects an "openable stream", in which case you can pass it any of these: 1. an already-open stream (that then adapts to an object with a trivial 'open()' method), 2. a path object that implements "openable stream" 3. a string that adapts to "openable stream" by conversion to a path object The only thing you can't implicitly pass in that case is a string-to-be-a-StringIO; you have to explicitly make it a StringIO. In *either* case, you can have a string adapt to either a path object or to a StringIO; you just can't have both then come back to a common interface. >As I think these things through, I'm realizing that registered adapters >really should be 100% accurate (i.e., no information loss, complete >substitutability), because a registered adapter that seems pragmatically >useful in one place could mess up unrelated code, since registered >adapters have global effects. Perhaps transitivity seems dangerous >because that has the potential to dramatically increase the global effects >of those registered adapters. However, if you: 1) have transitivity only for interface-to-interface relationships (allowing only one class-to-interface link at the start of the path), and 2) use adaptation only for "as a" relationships, not to represent operations on objects you avoid these problems. For example, avoiding the one adapter you presented that's not "as a", the adapter diamond becomes a triangle. The longer the discussion goes on, however, the more I realize that like the internet, transitivity depends on the continued goodwill of your neighbors, and it only takes one fool to ruin things for a lot of people. On the other hand, I also hate the idea of having to kludge workarounds like the one James Knight was doing, in order to get a simple adaptation to work.
From sxanth at cs.teiath.gr Thu Jan 13 10:26:51 2005 From: sxanth at cs.teiath.gr (stelios xanthakis) Date: Thu Jan 13 00:19:13 2005 Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name In-Reply-To: <16869.33426.883395.345417@montanaro.dyndns.org> References: <1105553300.41e56794d1fc5@mcherm.com> <16869.33426.883395.345417@montanaro.dyndns.org> Message-ID: <41E63EDB.40008@cs.teiath.gr> Skip Montanaro wrote: > Michael> Guido writes: > >> How about SubstitutabilityError? > >I don't think that's any better. At the very least, people can Google for >"Liskov violation" to educate themselves. I'm not sure that the results of >a Google search for "Substitutability Error" will be any clearer > ... > >I don't think that's appropriate in this case. Liskov violation is >something precise. I don't think that changing what you call it will help >beginners understand it any better in this case. I say leave it as is and >make sure it's properly documented. > > > Yes, but in order to fall into a Liskov violation, one will have to use extreme OOP features (as I understand from the ongoing discussion for which, honestly, I understand nothing:). So it's not like it will happen often, and when it happens it will make sense to the architects who made such complex things. +1 on SubstitutabilityError or something easier, moreover because some people really don't care who Liskov is and what he/she discovered, and whether that same thing would have been discovered anyway 2 months later by somebody else if the Liskov person wasn't there. St. From tjreedy at udel.edu Thu Jan 13 00:22:47 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Thu Jan 13 00:23:01 2005 Subject: [Python-Dev] Re: Recent IBM Patent releases References: Message-ID: "Scott David Daniels" > IBM has recently released 500 patents for use in opensource code.
> > http://www.ibm.com/ibm/licensing/patents/pledgedpatents.pdf > > "...In order to foster innovation and avoid the possibility that a > party will take advantage of this pledge and then assert patents or > other intellectual property rights of its own against Open Source > Software, thereby limiting the freedom of IBM or any other Open > Source developer to create innovative software programs, the > commitment not to assert any of these 500 U.S. patents and all > counterparts of these patents issued in other countries is > irrevocable except that IBM reserves the right to terminate this > patent pledge and commitment only with regard to any party who files > a lawsuit asserting patents or other intellectual property rights > against Open Source Software." The exception is, of course, aimed for now at SCO and their ridiculous lawsuit against Linux and IBM with respect to Linux. from another post > I believe our current policy is that the author warrants that the code > is his/her own work and not encumbered by any patent. Without a qualifier such as 'To the best of my knowledge', the latter is an impossible warrant both practically, for an individual author without $1000s to spend on a patent search, and legally. Legally, there is no answer until the statute of limitations runs out or until there is an after-the-fact final answer provided by the court system. Terry J. Reedy From fredrik at pythonware.com Thu Jan 13 00:32:14 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu Jan 13 00:32:12 2005 Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name References: <16869.33426.883395.345417@montanaro.dyndns.org> Message-ID: Just van Rossum wrote: > ...and then there are those Python users who have no formal CS > background at all. Python is used quite a bit by people whose main job > is not programming.
...and among us who do programming as a main job, I can assure you that I'm not the only one who, if told by a computer that something I did was a LSP violation, would take that computer out in the backyard and shoot it. Or at least hit it with a shovel, or something. From pje at telecommunity.com Thu Jan 13 01:49:06 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu Jan 13 01:48:27 2005 Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name In-Reply-To: <16869.33426.883395.345417@montanaro.dyndns.org> References: <1105553300.41e56794d1fc5@mcherm.com> <1105553300.41e56794d1fc5@mcherm.com> Message-ID: <5.1.1.6.0.20050112194512.0307cc40@mail.telecommunity.com> At 02:03 PM 1/12/05 -0600, Skip Montanaro wrote: >I don't think that's appropriate in this case. Liskov violation is >something precise. I don't think that changing what you call it will help >beginners understand it any better in this case. I say leave it as is and >make sure it's properly documented. Actually, the whole discussion is kind of backwards; you should never *get* a Liskov violation error, because it's raised strictly for control flow inside of __conform__ and caught by adapt(). So the *only* way you can see this error is if you call __conform__ directly, and somebody added code like this: raise LiskovViolation So, it's not something you need to worry about a newbie seeing. The *real* problem with the name is knowing that you need to use it in the first place! IMO, it's simpler to handle this use case by letting __conform__ return None, since this allows people to follow the One Obvious Way to not conform to a particular protocol. Then, there isn't a need to even worry about the exception name in the first place, either...
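[The return-None alternative can be sketched concretely. This is not the PEP 246 reference implementation -- `adapt` and `__conform__` follow the PEP's names, but `default_conform` and `AdaptationError` are stand-ins for illustration. A type declines a protocol simply by returning None from `__conform__`, which suppresses the default isinstance-based conformance with no `LiskovViolation` exception in sight:]

```python
class AdaptationError(TypeError):
    pass

def default_conform(obj, protocol):
    # Default conformance: being an instance of the protocol counts.
    if isinstance(protocol, type) and isinstance(obj, protocol):
        return obj
    return None

def adapt(obj, protocol):
    # Simplified PEP 246-style adapt(): ask the object first, then
    # the protocol; None means "decline".
    conform = getattr(type(obj), '__conform__', default_conform)
    result = conform(obj, protocol)
    if result is not None:
        return result
    adapt_hook = getattr(protocol, '__adapt__', None)
    if adapt_hook is not None:
        result = adapt_hook(obj)
        if result is not None:
            return result
    raise AdaptationError("can't adapt %r to %r" % (obj, protocol))

class File:
    pass

class FakeFile(File):
    # A subclass that is *not* substitutable for File: instead of
    # raising LiskovViolation, it just returns None to decline.
    def __conform__(self, protocol):
        return None
```

[With this sketch, `adapt(File(), File)` hands back the instance itself, while `adapt(FakeFile(), File)` fails with `AdaptationError` even though `isinstance` would have succeeded.]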
From steven.bethard at gmail.com Thu Jan 13 01:54:41 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Thu Jan 13 01:54:44 2005 Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name In-Reply-To: <5.1.1.6.0.20050112194512.0307cc40@mail.telecommunity.com> References: <1105553300.41e56794d1fc5@mcherm.com> <16869.33426.883395.345417@montanaro.dyndns.org> <5.1.1.6.0.20050112194512.0307cc40@mail.telecommunity.com> Message-ID: On Wed, 12 Jan 2005 19:49:06 -0500, Phillip J. Eby wrote: > So the *only* way you can see > this error is if you call __conform__ directly, and somebody added code > like this: > > raise LiskovViolation > > So, it's not something you need to worry about a newbie seeing. The *real* > problem with the name is knowing that you need to use it in the first place! > > IMO, it's simpler to handle this use case by letting __conform__ return > None, since this allows people to follow the One Obvious Way to not conform > to a particular protocol. Not that my opinion counts for much =), but returning None does seem much simpler to me. I also haven't seen any arguments against this route of handling protocol nonconformance... Is there a particular advantage to the exception-raising scheme? Steve -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy From pje at telecommunity.com Thu Jan 13 02:18:48 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu Jan 13 02:18:09 2005 Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name In-Reply-To: References: <5.1.1.6.0.20050112194512.0307cc40@mail.telecommunity.com> <1105553300.41e56794d1fc5@mcherm.com> <16869.33426.883395.345417@montanaro.dyndns.org> <5.1.1.6.0.20050112194512.0307cc40@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050112201718.02f85760@mail.telecommunity.com> At 05:54 PM 1/12/05 -0700, Steven Bethard wrote: >Not that my opinion counts for much =), but returning None does seem >much simpler to me. 
I also haven't seen any arguments against this >route of handling protocol nonconformance... Is there a particular >advantage to the exception-raising scheme? Only if there's any objection to giving the 'object' type a default __conform__ method that returns 'self' if 'isinstance(protocol,ClassTypes) and isinstance(self,protocol)'. From skip at pobox.com Thu Jan 13 03:36:54 2005 From: skip at pobox.com (Skip Montanaro) Date: Thu Jan 13 03:46:27 2005 Subject: [Python-Dev] Re: [Csv] csv module and universal newlines In-Reply-To: <20050112225525.236BE3C889@coffee.object-craft.com.au> References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> <20050110044441.250103C889@coffee.object-craft.com.au> <16868.33914.837771.954739@montanaro.dyndns.org> <20050112225525.236BE3C889@coffee.object-craft.com.au> Message-ID: <16869.57030.306263.612202@montanaro.dyndns.org> >> The idea of the check is to enforce binary mode on those objects that >> support a mode if the desired line terminator doesn't match the >> platform's line terminator. Andrew> Where that falls down, I think, is where you want to read an Andrew> alien file - in fact, under unix, most of the CSV files I read Andrew> use \r\n for end-of-line. Well, you can either require 'b' in that situation or "know" that 'b' isn't needed on Unix systems. Andrew> Also, I *really* don't like the idea of looking for a mode Andrew> attribute on the supplied iterator - it feels like a layering Andrew> violation. We've advertised the fact that it's an iterator, so Andrew> we shouldn't be using anything but the iterator protocol. The fundamental problem is that the iterator protocol on files is designed for use only with text mode (or universal newline mode, but that's just as much of a problem in this context). I think you either have to abandon the iterator protocol or peek under the iterator's covers to make sure it reads and writes in binary mode. 
Right now, people on windows create writers like this writer = csv.writer(open("somefile", "w")) and are confused when their csv files contain blank lines. I think the reader and writer objects have to at least emit a warning when they discover a source or destination that violates the requirements. Skip From skip at pobox.com Thu Jan 13 03:39:41 2005 From: skip at pobox.com (Skip Montanaro) Date: Thu Jan 13 03:46:35 2005 Subject: [Python-Dev] Re: [Csv] csv module and universal newlines In-Reply-To: <0E6093F4-64EE-11D9-B7C6-000D934FF6B4@cwi.nl> References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> <20050110044441.250103C889@coffee.object-craft.com.au> <16868.33914.837771.954739@montanaro.dyndns.org> <0E6093F4-64EE-11D9-B7C6-000D934FF6B4@cwi.nl> Message-ID: <16869.57197.95323.656027@montanaro.dyndns.org> Jack> On MacOSX you really want universal newlines. CSV files produced Jack> by older software (such as AppleWorks) will have \r line Jack> terminators, but lots of other programs will have files with Jack> normal \n terminators. Won't work. You have to be able to write a Windows csv file on any platform. Binary mode is the only way to get that. Skip From bob at redivi.com Thu Jan 13 03:56:05 2005 From: bob at redivi.com (Bob Ippolito) Date: Thu Jan 13 03:56:11 2005 Subject: [Python-Dev] Re: [Csv] csv module and universal newlines In-Reply-To: <16869.57197.95323.656027@montanaro.dyndns.org> References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> <20050110044441.250103C889@coffee.object-craft.com.au> <16868.33914.837771.954739@montanaro.dyndns.org> <0E6093F4-64EE-11D9-B7C6-000D934FF6B4@cwi.nl> <16869.57197.95323.656027@montanaro.dyndns.org> Message-ID: On Jan 12, 2005, at 21:39, Skip Montanaro wrote: > Jack> On MacOSX you really want universal newlines. CSV files > produced > Jack> by older software (such as AppleWorks) will have \r line > Jack> terminators, but lots of other programs will have files with > Jack> normal \n terminators. 
> > Won't work. You have to be able to write a Windows csv file on any > platform. Binary mode is the only way to get that. Isn't universal newlines only used for reading? I have had no problems using the csv module for reading files with universal newlines by opening the file myself or providing an iterator. Unicode, on the other hand, I have had problems with. -bob From pje at telecommunity.com Thu Jan 13 03:57:07 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu Jan 13 03:56:33 2005 Subject: [Python-Dev] Son of PEP 246, redux In-Reply-To: <20050112225446.GA43203@prometheusresearch.com> References: <41E59FA9.4050605@colorstudy.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <41E59FA9.4050605@colorstudy.com> Message-ID: <5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com> This is a pretty long post; it starts out as discussion of educational issues highlighted by Clark and Ian, but then it takes the motivation for PEP 246 in an entirely new direction -- possibly one that could be more intuitive than interfaces and adapters as they are currently viewed in Zope/Twisted/PEAK etc., and maybe one that could be a much better fit with Guido's type declaration ideas. OTOH, everybody may hate the idea and think it's stupid, or if they like it, then Alex may want to strangle me for allowing doubt about PEP 246 to re-enter Guido's head. Either way, somebody's going to be unhappy. At 05:54 PM 1/12/05 -0500, Clark C. Evans wrote: > String -> PathName -> File > String -> StringIO -> File Okay, after reading yours and Ian's posts and thinking about them some more, I've learned some really interesting things. 
First, adapter abuse is *extremely* attractive to someone new to the concept -- so from here on out I'm going to forget about the idea that we can teach people to avoid this solely by telling them "the right way to do it" up front. The second, much subtler point I noticed from your posts, was that *adapter abuse tends to sooner or later result in adapter diamonds*. And that is particularly interesting because the way that I learned how NOT to abuse adapters, was by getting slapped upside the head by PyProtocols pointing out when adapter diamonds had resulted! Now, that's not because I'm a genius who put the error in because I realized that adapter abuse causes diamonds. I didn't really understand adapter abuse until *after* I got enough errors to be able to have a good intuition about what "as a" really means. Now, I'm not claiming that adapter abuse inevitably results in a detectable ambiguity, and certainly not that it does so instantaneously. I'm also not claiming that some ambiguities reported by PyProtocols might not be perfectly harmless. So, adaptation ambiguity is a lot like a PyChecker warning: it might be a horrible problem, or it might be that you are just doing something a little unusual. But the thing I find interesting is that, even with just the diamonds I ended up creating on my own, I was able to infer an intuitive concept of "as a", even though I hadn't fully verbalized the concepts prior to this lengthy debate with Alex forcing me to single-step through my thought processes. What that suggests to me is that it might well be safe enough in practice to let new users of adaptation whack their hand with the mallet now and then, given that *now* it's possible to give a much better explanation of "as a" than it was before. Also, consider this... The larger an adapter network there is, the *greater* the probability that adapter abuse will create an ambiguity -- which could mean faster learning. 
If the ambiguity error is easily looked up in documentation that explains the as-a concept and the intended working of adaptation, so much the better. But in the worst case of a false alarm (the ambiguity was harmless), you just resolve the ambiguity and move on. >Originally, Python may ship with the String->StringIO and >StringIO->File adapters pre-loaded, and if my code was reliant upon >this transitive chain, the following will work just wonderfully, > > def parse(file: File): > ... > > parse("helloworld") > >by parsing "helloworld" content via a StringIO intermediate object. But >then, let's say a new component "pathutils" registers another adapter pair: > > String->PathName and PathName->File > >This ambiguity causes a few problems: > > - How does one determine which adapter path to use? > - If a different path is picked, what sort of subtle bugs occur? > - If the default path isn't what you want, how do you specify > the other path? The *real* problem here isn't the ambiguity, it's that Pathname->File is "adapter abuse". However, the fact that it results in an ambiguity is a useful clue to fixing the problem. Each time I sat down with one of these detected ambiguities, I learned better how to define sensible interfaces and meaningful adaptation. I would not have learned these things by simply not having transitive adaptation. >| As I think these things through, I'm realizing that registered >| adaptators really should be 100% accurate (i.e., no information loss, >| complete substitutability), because a registered adapter that seems >| pragmatically useful in one place could mess up unrelated code, since >| registered adapters have global effects. > >I think this isn't all that useful; it's unrealistic to assume that >adapters are always perfect. If transitive adaptation is even >permitted, it should be unambiguous. Demanding that adaption is >100% perfect is a matter of perspective. I think String->StringIO >and StringIO->File are perfectly pure. 
The next thing that I realized from your posts is that there's another education issue for people who haven't used adaptation, and that's just how precisely interfaces need to be specified. For example, we've all been talking about StringIO like it means something, but really we need to talk about whether it's being used to read or write or both. There's a reason why PEAK and Zope tend to have interface names like 'IComponentFactory' and 'IStreamSource' and other oddball names you'd normally not give to a concrete class. An interface has to be really specific -- in the degenerate case an interface can end up being just one method. In fact, I think that something like 10-15% of interfaces in PEAK have only one method; I don't know if it's that high for Zope and Twisted, although I do know that small interfaces (5 or fewer methods) are pretty normal. What this also suggests to me is that maybe adaptation and interfaces are the wrong solution to the problems we've been trying to solve with them -- adding more objects to solve the problems created by having lots of objects. :) As a contrasting example, consider the Dylan language. The Dylan concept of a "protocol" is a set of generic functions that can be called on any number of object types. This is just like an interface, but inside-out... maybe you could call it an "outerface". :) The basic idea is that a file protocol would consist of functions like 'read(stream,byteCount)'. If you implement a new file-like type, you "add a method" to the 'read' generic function that implements 'read' for your type. If a type already exists that you'd like to use 'read' with, you can implement the new method yourself. There are some important ramifications there. First, there's no requirement to implement a complete interface; the system is already reduced to *operations* rather than interfaces. Second, a different choice of method names isn't a reason to need more interfaces and adapters. 
As more implementations of some basic idea (like stream-ness) exist, it becomes more and more natural to *share* common generic functions and put them in the stdlib, even without any concrete implementation for them, because they now form a standard "meeting point" for other libraries. Third, Ka-Ping Yee has been arguing that Python should be able to define interfaces that contain abstract implementation. Well, generic functions can actually *do* this in a straightforward fashion; just define the default implementation of that operation as delegating to other operations. There still needs to be some way to "bottom out" so you don't end up with endless recursive delegation -- although you could perhaps just catch the recursion error and inspect the traceback to tell the user, "must implement one of these operations for type X". (And this could perhaps be done automatically if you can declare that this delegating implementation is an "abstract method".) Fourth, and this is *really* interesting (but also rather lengthy to explain)... if all functions are generic (just using a fast-path for the nominal case of only one implementation), then you can actually construct adapters automatically, knowing precisely when an operation is "safe". Let me explain. Suppose that we have a type, SomeType. It doesn't matter if this type is concrete or an interface, we really don't care. The point is that this type defines some operations, and there is an outside operation 'foo' that relies on some set of those operations. We then have OtherType, a concrete type we want to pass to 'foo'. All we need in order to make it work, is *extend the generic functions in SomeType with methods that take a different 'self' type*! Then, the operation 'adapt(instOfOtherType,SomeType)' can assemble a simple proxy containing methods for just the generic functions that have an implementation available for OtherType. 
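[The "extend the generic function with a different 'self' type" mechanism maps fairly directly onto single-dispatch generic functions. A minimal sketch using functools.singledispatch -- which did not exist in 2005, and is single-dispatch only rather than the predicate dispatch described here; `read` and `MyBuffer` are invented names for illustration:]

```python
from functools import singledispatch

@singledispatch
def read(stream, byte_count):
    # The generic 'read' operation: the shared "meeting point" that
    # other libraries register their implementations against.
    raise TypeError("no read() implementation for %r" % type(stream))

class MyBuffer:
    """A third-party type that has never heard of 'file'."""
    def __init__(self, data):
        self.data = data

@read.register(MyBuffer)
def _(stream, byte_count):
    # Implementing the shared operation for our own type -- no adapter
    # object, no interface declaration, just a method for the operation.
    chunk, stream.data = stream.data[:byte_count], stream.data[byte_count:]
    return chunk
```

[Now `read(MyBuffer("helloworld"), 5)` dispatches on the stream's type, and an unsupported type falls through to the default implementation's TypeError.]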
The result of this is that now any type can be the basis for an interface, which is very intuitive. That is, I can say, "implement file.read()" for my object, and somebody who has an argument declared as "file" will be able to use my object as long as they only need the operations I've implemented. However, unlike using method names alone, we have unambiguous semantics, because all operations are grounded in some fixed type or location of definition that specifies the *meaning* of that operation. Another benefit of this approach is that it lessens the need for transitive adaptation, because over time people converge towards using common operations, rather than continually reinventing new ones. In this approach, all "adaptation" is endpoint to endpoint, but there are rarely any actual adapters involved, unless a set of related operations actually requires keeping some state. Instead, you simply define an implementation of an operation for some concrete type. I'm running out of time to explore this idea further, alas. Up to this point, what I'm proposing would work *beautifully* for adaptations that don't require the adapter to add state to the underlying object, and ought to be intuitively obvious, given an appropriate syntax. E.g.: class StringIO: def read(self, bytes) implements file.read: # etc... could be used to indicate the simple case where you are conforming to an existing operation definition. A third-party definition, of the same thing might look like this: def file.read(self: StringIO, bytes): return self.read(bytes) Assuming, of course, that that's the syntax for adding an implementation to an existing operation. Hm. You know, I think the stateful adapter problem could be solved too, if *properties* were also operations. 
For example, if 'file.fileno' was implemented as a set of three generic functions (get/set/delete), then you could maybe do something like: class socket: # internally declare that our fileno has the semantics # of file.fileno: fileno: int implements file.fileno or maybe just: class socket implements file: ... could be shorthand for saying that anything with the same name as what's in 'file' has the same semantics. OTOH, that could break between Python versions if a new operation were added to 'file', so maybe as verbose as the blow-by-blow declarations are, they'd be safer semantically. Anyway, if we were a third party externally declaring the correspondence between socket.fileno and file.fileno, we could say: # declare how to get a file.fileno for a socket instance def file.fileno.__get__(self: socket): return self.fileno Now, there isn't any need to have a separate "adapter" to store additional state; with appropriate name mangling it can be stored in the unadapted object, if you like. This isn't a fully thought-out proposal; it's all a fairly spur-of-the-moment idea. I've been playing with generic functions for a while now, but only recently started doing any "heavy lifting" with them. However, in one instance, I refactored a PEAK module from being 400+ lines of implementation (plus 8 interfaces and lots of adaptation) down to just 140 lines implementation and one interface -- with the interface being pure documentation. And the end result was more flexible than the original code. So since then I've been considering whether adaptation is really the be-all end-all for this sort of thing, and Clark and Ian's posts made me start thinking about it even more seriously. (One interesting data point: the number of languages with some kind of pattern matching, "guards" or other generic function variants seems to be growing, while Java (via Eclipse) is the only other system I know of that has anything remotely like PEP 246.) 
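[The property-as-operation idea sketched above can be modeled the same way: a "getter" operation dispatched on the owner's type, standing in for the proposed `file.fileno.__get__` declaration. `GenericProperty` and `FakeSocket` are invented names; real sockets obviously expose their descriptor differently:]

```python
class GenericProperty:
    """A property implemented as a generic 'get' operation: third
    parties register a getter per concrete type."""
    def __init__(self, name):
        self.name = name
        self.getters = {}

    def register(self, typ, fn):
        self.getters[typ] = fn

    def get(self, obj):
        # Walk the MRO so a getter registered on a base class applies
        # to subclasses as well.
        for typ in type(obj).__mro__:
            if typ in self.getters:
                return self.getters[typ](obj)
        raise TypeError("no %s getter for %r" % (self.name, type(obj)))

fileno = GenericProperty("file.fileno")

class FakeSocket:
    def __init__(self, fd):
        self._fd = fd

# Third-party declaration: "how to get a file.fileno for a socket".
fileno.register(FakeSocket, lambda sock: sock._fd)
```

[No separate adapter object is needed; any extra state can live on the unadapted object itself, which is the point being made here.]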
So maybe the *real* answer here is that we should be looking at solutions that might prevent the problems that adapters are meant to solve, from arising in the first place! Generic functions might be a good place to look for one, although the downside is that they might make Python look like a whole new language. OTOH, type declarations might do that anyway. A big plus, by the way, of the generic function approach is that it does away with the requirement for interfaces altogether, except as a semantic grouping of operations. Lots of people dislike interfaces, and after all this discussion about how perfect interface-to-interface adaptation has to be, I'm personally becoming a lot less enamored with interfaces too! In general, Python seems to like to let "natural instinct" prevail. What could be more natural than saying "this is how to implement a such-and-such method like what that other guy's got"? It ain't transitive, but if everybody tends to converge on a common "other guy" to define stuff in terms of (like 'file' in the stdlib), then you don't *need* transitivity in the long run, except for fairly specialized situations like pluggable IDE's (e.g. Eclipse) that need to dynamically connect chains between different plugins. Even there, the need could be minimized by most operations grounding in "official" abstract types. And abstract methods -- like a 'file.readline()' implementation for any object that supports 'file.read()' -- could possibly take care of most of the rest. Generic functions are undoubtedly more complex to implement than PEP 246 adaptation. My generic function implementation comprises 3323 lines of Python, and it actually *uses* PEP 246 adaptation internally for many things, although with more work it could probably do without it. 
However, almost half of those lines of code are consumed by a mini-compiler and mini-interpreter for Python expressions; a built-in implementation of generic functions might be able to get away without having those parts, or at least not so many of them. Also, my implementation supports full predicate dispatch, not just multimethod dispatch, so there's probably even more code that could be eliminated if it was decided not to do the whole nine yards.

Back on the downside, this looks like an invitation to another "language vs. stdlib" debate, since PEP 246 in and of itself is pure library. OTOH, Guido's changing the language to add type declarations anyway, and generic functions are an excellent use case for them. Since he's going to be flamed for changing the language anyway, he might as well be hanged for a sheep as for a goat. :)

Oh, and back on the upside again, it *might* be easier to implement actual type checking with this technique than with PEP 246, because if I write a method expecting a 'file' and somebody calls it with a 'Foo' instance, I can maybe now look at the file operations actually used by the method, and then see if there's an implementation for e.g. 'file.read' defined anywhere for 'Foo'. And, comparable type checking algorithms are more likely to already exist for other languages that include generic functions, than to exist for PEP 246-style adaptation.

Okay, I'm really out of time now. Hate to dump this in as a possible spoiler on PEP 246, because I was just as excited as Alex about the possibility of it going in. But this whole debate has made me even less enamored of adaptation, and more interested in finding a cleaner, more intuitive way to do it.
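For readers following the thread, the PEP 246 protocol under debate boils down to roughly the following. This is a simplified sketch, not the PEP's reference implementation (which also distinguishes Liskov violations and other error cases):

```python
# Simplified sketch of PEP 246 adapt(): isinstance shortcut, then the
# object's __conform__ hook, then the protocol's __adapt__ hook.

def adapt(obj, protocol, default=None):
    # 1. trivial case: the object already satisfies the protocol
    if isinstance(protocol, type) and isinstance(obj, protocol):
        return obj
    # 2. ask the object: "can you conform to this protocol?"
    conform = getattr(type(obj), '__conform__', None)
    if conform is not None:
        result = conform(obj, protocol)
        if result is not None:
            return result
    # 3. ask the protocol: "can you adapt this object?"
    adapt_hook = getattr(protocol, '__adapt__', None)
    if adapt_hook is not None:
        result = adapt_hook(obj)
        if result is not None:
            return result
    if default is not None:
        return default
    raise TypeError("can't adapt %r to %r" % (obj, protocol))
```

The point of contention in the thread is not this dispatch order, which is uncontroversial, but what registered adapters (reached through these hooks) are allowed to do -- and whether chains of them may be composed transitively.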
From andrewm at object-craft.com.au Thu Jan 13 04:21:41 2005 From: andrewm at object-craft.com.au (Andrew McNamara) Date: Thu Jan 13 04:21:44 2005 Subject: [Python-Dev] Re: [Csv] csv module and universal newlines In-Reply-To: References: <20050105070643.5915B3C8E5@coffee.object-craft.com.au> <20050110044441.250103C889@coffee.object-craft.com.au> <16868.33914.837771.954739@montanaro.dyndns.org> <0E6093F4-64EE-11D9-B7C6-000D934FF6B4@cwi.nl> <16869.57197.95323.656027@montanaro.dyndns.org> Message-ID: <20050113032141.78EB13C889@coffee.object-craft.com.au>

>Isn't universal newlines only used for reading?

That's right. And the CSV reader has its own version of universal newlines anyway (from the py1.5 days).

>I have had no problems using the csv module for reading files with
>universal newlines by opening the file myself or providing an iterator.

Neither have I, funnily enough.

>Unicode, on the other hand, I have had problems with.

Ah, so somebody does want it then? Good to hear. Hard to get motivated to make radical changes without feedback.

-- Andrew McNamara, Senior Developer, Object Craft http://www.object-craft.com.au/

From ianb at colorstudy.com Thu Jan 13 04:50:14 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Thu Jan 13 04:50:15 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com> References: <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com> Message-ID: <41E5EFF6.9090408@colorstudy.com>

Phillip J.
Eby wrote:
> At 04:07 PM 1/12/05 -0600, Ian Bicking wrote:
>
>> It also seems quite reasonable and unambiguous that a path object
>> could be adapted to a IReadableFile by opening the file at the given
>> path.
>
> Not if you think of adaptation as an "as-a" relationship, like using a
> screwdriver "as a" hammer (really an IPounderOfNails, or some such). It
> makes no sense to use a path "as a" readable file, so this particular
> adaptation is bogus.

I started to realize that in a now-aborted reply to Steven, when my defense of the path->IReadableFile adaptation started making less sense. It's *still* not intuitively incorrect to me, but there's a couple things I can think of...

(a) After you adapted the path to the file, and have a side-effect of opening a file, it's unclear who is responsible for closing it.

(b) The file object clearly has state the path object doesn't have, like a file position.

(c) You can't go adapting the path object to a file whenever you wanted, because of those side effects.

So those are some more practical reasons that it *now* seems bad to me, but that wasn't my immediate intuition, and I could have happily written out all the necessary code without countering that intuition. In fact, I've misused adaptation before (I think) though in different ways, and those mistakes haven't particularly improved my intuition on the matter. If you can't learn from mistakes, how can you learn? One way is with principles and rules, even if they are flawed or incomplete. Perhaps avoiding adaptation diamonds is one such rule; it may not be necessarily and absolutely a bad thing that there is a diamond, but it is often enough a sign of problems elsewhere that it may be best to internalize that belief anyway. Avoiding diamonds alone isn't enough of a rule, but maybe it's a start.

-- Ian Bicking / ianb@colorstudy.com / http://blog.ianbicking.org

From pje at telecommunity.com Thu Jan 13 05:19:32 2005 From: pje at telecommunity.com (Phillip J.
Eby) Date: Thu Jan 13 05:17:58 2005 Subject: [Python-Dev] Son of PEP 246, redux In-Reply-To: <5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com> References: <20050112225446.GA43203@prometheusresearch.com> <41E59FA9.4050605@colorstudy.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <41E59FA9.4050605@colorstudy.com> Message-ID: <5.1.1.6.0.20050112225358.0307b160@mail.telecommunity.com>

At 09:57 PM 1/12/05 -0500, Phillip J. Eby wrote:

>    class StringIO:
>
>        def read(self, bytes) implements file.read:
>            # etc...
>
>could be used to indicate the simple case where you are conforming to an
>existing operation definition. A third-party definition of the same
>thing might look like this:
>
>    def file.read(self: StringIO, bytes):
>        return self.read(bytes)
>
>Assuming, of course, that that's the syntax for adding an implementation
>to an existing operation.

After some more thought, I think this approach:

1. Might not actually need generic functions to be implemented. I need to think some more about properties and Ka-Ping Yee's abstract method idea, to make sure they can be made to work without "real" generic functions, but a basic version of this approach should be implementable with just a handful of dictionaries and decorators.

2. Can be prototyped in today's Python, whether generic functions are used or not (but the decorator syntax might be ugly, and the decorator implementations might be hacky)

3. May still have some rough bits with respect to subclassing & Liskov; I need to work through that part some more. My preliminary impression is that it might be safe to consider inherited (but not overridden) methods as being the same logical operation.
That imposes some burden on subclassers to redeclare compatibility on overridden methods, but OTOH would be typesafe by default.

4. Might be somewhat more tedious to declare adaptations with, than it currently is with tools like PyProtocols.

Anyway, the non-generic-function implementation would be to have 'adapt()' generate (and cache!) an adapter class by going through all the methods of the target class and then looking them up in its 'implements' registry, while walking up the source class' __mro__ to find the most-specific implementation for that type (while checking for overridden-but-not-declared methods along the way). There would be no __conform__ or __adapt__ hooks needed.

Interestingly, C# requires you to declare when you are intentionally overriding a base class method, in order to avoid accidentally overriding a new method added to a base class later. This concept actually contains a germ of the same idea, requiring overrides to specify that they still conform to the base class' operations.

Maybe this weekend I'll be able to spend some time on whipping up some sort of prototype, and hopefully that will answer some of my open questions. It'll also be interesting to see if I can actually use the technique directly on existing interfaces and adaptation, i.e. get some degree of PyProtocols backward-compatibility. It might also be possible to get backward-compatibility for Zope too. In each case, the backward compatibility mechanism would be to change the adapter/interface declaration APIs to be equivalent to assertions about all the operations defined in a particular interface, against the concrete class you're claiming implements the interface. However, for both PEAK and Zope, it would likely be desirable to migrate any interfaces like "mapping object" to be based off of operations in e.g. the 'dict' type rather than rolling their own IReadMapping and such.

From cce at clarkevans.com Thu Jan 13 05:26:06 2005 From: cce at clarkevans.com (Clark C.
Evans) Date: Thu Jan 13 05:26:08 2005 Subject: [Python-Dev] Son of PEP 246, redux In-Reply-To: <5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com> References: <41E59FA9.4050605@colorstudy.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <41E59FA9.4050605@colorstudy.com> <5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com> Message-ID: <20050113042605.GA58003@prometheusresearch.com>

Phillip,

In my mind, the driving use-case for PEP 246 was to allow casual programmers to plug components together and have it 'just work'; it does this by enabling the component vendors to carry on a discussion via __adapt__ and __conform__ to work together. I was not picturing that your average developer would be using this sort of thing.

On Wed, Jan 12, 2005 at 09:57:07PM -0500, Phillip J. Eby wrote:
| First, adapter abuse is *extremely* attractive to someone new to the
| concept -- so from here on out I'm going to forget about the idea that we
| can teach people to avoid this solely by telling them "the right way to
| do it" up front.
|
| The second, much subtler point I noticed from your posts, was that
| *adapter abuse tends to sooner or later result in adapter diamonds*.

However, I'd like to assert that these cases emerge when you have a registry with automatic transitive adaptation.
These problems can be avoided quite easily by:

- not doing transitive adaptation automatically

- making it an error to register more than one adapter from A to Z at any given time; in effect, ban diamonds from ever being created

- making it easy for a user to construct and register an adapter from A to Z, via an intermediate X: adapt.registerTransitive(A,X,Z)

- if an adaptation from A to Z isn't possible, giving a very meaningful error listing the possible pathways from which one could build a 'transitive adaptation path', perhaps even showing the command that will do it:

      adapt.registerTransitive(A,B,C,Z)
      adapt.registerTransitive(A,Q,Z)
      adapt.registerTransitive(A,X,Z)

The results of this operation:

- most component vendors will use __adapt__ and __conform__ rather than use the 'higher-precedence' registry; therefore, transitive adaptation isn't that common to start with

- if two libraries register incompatible adapter chains during the 'import' of the module, then it will be an error that the casual developer will associate with the module, and not with their code

- casual users are given a nice message, like "Cannot automatically convert a String to a File. Perhaps you should do a manual conversion of your String to a File. Alternatively, there happen to be two adaptation paths which could do this for you, but you have to explicitly enable the pathway which matches your intent:

      To convert a String to a File via StringIO, call:
          adapt.registerTransitive(String,StringIO,File)
      To convert a String to a File via FileName, call:
          adapt.registerTransitive(String,FileName,File)"

| What that suggests to me is that it might well be safe enough in practice
| to let new users of adaptation whack their hand with the mallet now and
| then, given that *now* it's possible to give a much better explanation of
| "as a" than it was before.

By disabling (the quite dangerous?)
transitive adaptation, one could guide the user along to the result they require without having them shoot themselves in the foot first.

| What this also suggests to me is that maybe adaptation and interfaces are
| the wrong solution to the problems we've been trying to solve with them
| -- adding more objects to solve the problems created by having lots of
| objects. :)

I didn't see how your remaining post, in particular Dylan's protocols, was much different from a mixin/abstract-base-class. Regardless, getting back to the main goal I had when writing PEP 246 -- your alternative proposal still doesn't seem to provide a mechanism for component developers to have a dialogue with one another to connect components without involving the application programmer.

Cheers!

Clark

From pje at telecommunity.com Thu Jan 13 05:48:47 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu Jan 13 05:47:13 2005 Subject: [Python-Dev] Son of PEP 246, redux In-Reply-To: <20050113042605.GA58003@prometheusresearch.com> References: <5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com> <41E59FA9.4050605@colorstudy.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <41E59FA9.4050605@colorstudy.com> <5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050112234334.03cbebd0@mail.telecommunity.com>

At 11:26 PM 1/12/05 -0500, Clark C. Evans wrote:
>Regardless,
>getting back to the main goal I had when writing PEP 246 -- your
>alternative proposal still doesn't seem to provide a mechanism for
>component developers to have a dialogue with one another to connect
>components without involving the application programmer.

Eh?
You still have adapt(); you still have adapters. The only difference is that I've specified a way to not need "interfaces" - instead interfaces can be defined in terms of individual operations, and those operations can be initially defined by an abstract base, concrete class, or an "interface" object. Oh, and you don't have to write adapter *classes* - you write adapting *methods* for individual operations. This can be done by the original author of a class or by a third party -- just like with PEP 246.

From michael.walter at gmail.com Thu Jan 13 06:01:14 2005 From: michael.walter at gmail.com (Michael Walter) Date: Thu Jan 13 06:01:17 2005 Subject: [Python-Dev] Son of PEP 246, redux In-Reply-To: <5.1.1.6.0.20050112234334.03cbebd0@mail.telecommunity.com> References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <41E59FA9.4050605@colorstudy.com> <5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com> <20050113042605.GA58003@prometheusresearch.com> <5.1.1.6.0.20050112234334.03cbebd0@mail.telecommunity.com> Message-ID: <877e9a1705011221013c9de8f7@mail.gmail.com>

> instead interfaces can be defined in terms of individual operations, and
> those operations can be initially defined by an abstract base, concrete
> class, or an "interface" object.

I think this is quite problematic in the sense that it will force many dummy interfaces to be created. At least without type inference, this is a no-no. Consider: In order to type a function like:

    def f(x):
        # ...
        x.foo()
        # ...

...so that type violations can be detected before the real action takes place, you would need to create a dummy interface as in:

    interface XAsFUsesIt:
        def foo(): pass

    def f(x : XAsFUsesIt):
        # ...
...or you would want type inference (which at compile time types x as "a thing which has a 'nullary' foo() function") and a type system like System CT. Former appears cumbersome (as it should really be done for every function), latter too NIMPY-ish. What am I missing?

Sleepingly yours,
Michael

On Wed, 12 Jan 2005 23:48:47 -0500, Phillip J. Eby wrote:
> At 11:26 PM 1/12/05 -0500, Clark C. Evans wrote:
> >Regardless,
> >getting back to the main goal I had when writing PEP 246 -- your
> >alternative proposal still doesn't seem to provide a mechanism for
> >component developers to have a dialogue with one another to connect
> >components without involving the application programmer.
>
> Eh? You still have adapt(); you still have adapters. The only difference
> is that I've specified a way to not need "interfaces" - instead interfaces
> can be defined in terms of individual operations, and those operations can
> be initially defined by an abstract base, concrete class, or an "interface"
> object. Oh, and you don't have to write adapter *classes* - you write
> adapting *methods* for individual operations. This can be done by the
> original author of a class or by a third party -- just like with PEP 246.
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/michael.walter%40gmail.com
>

From pje at telecommunity.com Thu Jan 13 07:04:01 2005 From: pje at telecommunity.com (Phillip J.
Eby) Date: Thu Jan 13 07:02:28 2005 Subject: [Python-Dev] Son of PEP 246, redux In-Reply-To: <877e9a1705011221013c9de8f7@mail.gmail.com> References: <5.1.1.6.0.20050112234334.03cbebd0@mail.telecommunity.com> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <41E59FA9.4050605@colorstudy.com> <5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com> <20050113042605.GA58003@prometheusresearch.com> <5.1.1.6.0.20050112234334.03cbebd0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050113002459.02bb61c0@mail.telecommunity.com>

At 12:01 AM 1/13/05 -0500, Michael Walter wrote:
>What am I missing?

The fact that this is a type-declaration issue, and has nothing to do with *how* types are checked. Note that I'm only proposing:

1) a possible replacement for PEP 246 that leaves 'adapt()' as a function, but uses a different internal implementation,

2) a very specific notion of what an operation is, that doesn't require an interface to exist if there is already some concrete type that the interface would be an abstraction of,

3) a strawman syntax for declaring the relationship between operations

In other words, compared to the previous state of things, this should actually require *fewer* interfaces to accomplish the same use cases, and it doesn't require Python to have a built-in notion of "interface", because the primitive notion is an operation, not an interface.

Oh, and I think I've now figured out how to define a type-safe version of Ping's "abstract operations" concept that can play in the non-generic-function implementation, but I really need some sleep, so I might be hallucinating the solution.
:) Anyway, so far it seems like it can all be done with a handful of decorators:

    @implements(base_operation, for_type=None)

(for_type is the "adapt *from*" type, defaulting to the enclosing class if used inside a class body)

    @override

(means the method is overriding the one in a base class, keeping the same operation correspondence(s) defined for the method in the base class)

    @abstract(base_operation, *required_operations)

(indicates that this implementation of base_operation requires the ability to use the specified required_operations on a target instance. The adapter machinery can then "safely fail" if the operations aren't available, or if it detects a cycle between mutually-recursive abstract operations that don't have a non-abstract implementation. An abstract method can be used to perform the operation on any object that provides the required operations, however.)

Anyway, from the information provided by these decorators, you can generate adapter classes for any operation-based interfaces. I don't have a planned syntax or API for defining attribute correspondences as yet, but it should be possible to treat them internally as a get/set/del operation triplet, and then just wrap them in a descriptor on the adapter class.

By the way, saying "generate" makes it sound more complex than it is: just a subclass of 'object' with a single slot that points to the wrapped source object, and contains simple descriptors for each available operation of the "protocol" type that call the method implementations, passing in the wrapped object. So really "generate" means, "populate a dictionary with descriptors and then call 'type(name,(object,),theDict)'".

A side effect of this approach, by the way, is that since adapters are *never* composed (transitively or otherwise), we can *always* get back to the "original" object. So, in theory we could actually have 'adapt(x,object)' always convert back to the original unwrapped object, if we needed it.
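The "populate a dictionary and call 'type(name,(object,),theDict)'" step described above might look something like the following. This is a hypothetical sketch -- the function name `make_adapter_class` and its registry format are invented here for illustration:

```python
# Hypothetical sketch of generating (and caching) an adapter class from a
# mapping of operation name -> implementation function.

_adapter_cache = {}

def make_adapter_class(name, implementations):
    """Each implementation takes the wrapped source object as its
    first argument, plus any further call arguments."""
    key = (name, tuple(sorted(implementations)))
    if key in _adapter_cache:
        return _adapter_cache[key]          # reuse the generated class

    def make_method(func):
        def method(self, *args):
            # every operation receives the wrapped ("unadapted") object
            return func(self._subject, *args)
        return method

    the_dict = {'__slots__': ('_subject',)}  # single slot: the source object
    for op_name, func in implementations.items():
        the_dict[op_name] = make_method(func)

    def __init__(self, subject):
        self._subject = subject
    the_dict['__init__'] = __init__

    cls = type(name, (object,), the_dict)
    _adapter_cache[key] = cls
    return cls
```

Because the adapter keeps only a reference to the unwrapped subject, getting back to the "original" object is always just an attribute access, matching the never-composed property described above.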
Likewise, adapting an already-adapted object can be safe because the adapter machinery knows when it's dealing with one of its own adapters, and can unwrap it before rewrapping it with a new adapter.

Oh, btw, it should at least produce a warning to declare multiple implementations for the same operation and source type, if not an outright error. Since there's no implicit transitivity in this system (either there's a registered implementation for something or there isn't), there's no other form of ambiguity besides dual declarations of a point-to-point adaptation.

Hm. You know, this also solves the interface inheritance problem; under this scheme, if you inherit an operation from a base interface, it doesn't mean that you provide the base interface. Oh, actually, you can still also do interface adaptation in a somewhat more restrictive form; you can declare abstract operations for the target interface in terms of operations in the base interface. But it's much more controlled because you never stack adapters on adapters, and the system can tell at adaptation time what operations are and aren't actually available.

Even more interesting: Alex's "loss of middle name" example can't be recreated in this system as a problem, at least if I'm still thinking clearly. But I'm probably not, so I'm going to bed now.
:)

From michael.walter at gmail.com Thu Jan 13 07:23:38 2005 From: michael.walter at gmail.com (Michael Walter) Date: Thu Jan 13 07:23:41 2005 Subject: [Python-Dev] Son of PEP 246, redux In-Reply-To: <5.1.1.6.0.20050113002459.02bb61c0@mail.telecommunity.com> References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <41E59FA9.4050605@colorstudy.com> <5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com> <20050113042605.GA58003@prometheusresearch.com> <5.1.1.6.0.20050112234334.03cbebd0@mail.telecommunity.com> <877e9a1705011221013c9de8f7@mail.gmail.com> <5.1.1.6.0.20050113002459.02bb61c0@mail.telecommunity.com> Message-ID: <877e9a17050112222376178511@mail.gmail.com>

On Thu, 13 Jan 2005 01:04:01 -0500, Phillip J. Eby wrote:
> At 12:01 AM 1/13/05 -0500, Michael Walter wrote:
> >What am I missing?
>
> The fact that this is a type-declaration issue, and has nothing to do with
> *how* types are checked.

I was talking about how you declare such types, sir :] (see the interface pseudo code sample -- maybe my reference to type inference led you to think the opposite.)

> In other words, compared to the previous state of things, this should
> actually require *fewer* interfaces to accomplish the same use cases, and
> it doesn't require Python to have a built-in notion of "interface", because
> the primitive notion is an operation, not an interface.

Yepyep, but *how* do you declare types now? Can you quickly type the function

    def f(x):
        x.read()

without needing an interface

    interface x_of_f:
        def read(): pass

or a decorator like @foo(x.read)?
I've no idea what you mean, really :o) Michael From aleax at aleax.it Thu Jan 13 08:50:54 2005 From: aleax at aleax.it (Alex Martelli) Date: Thu Jan 13 08:50:59 2005 Subject: [Python-Dev] Son of PEP 246, redux In-Reply-To: <5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com> References: <41E59FA9.4050605@colorstudy.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <41E59FA9.4050605@colorstudy.com> <5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com> Message-ID: On 2005 Jan 13, at 03:57, Phillip J. Eby wrote: > Okay, I'm really out of time now. Hate to dump this in as a possible > spoiler on PEP 246, because I was just as excited as Alex about the > possibility of it going in. But this whole debate has made me even > less enamored of adaptation, and more interested in finding a cleaner, > more intuitive way to do it. Perfectly reasonable, of course. Doubts about the class / inheritance / interface / instance / method / ... "canon" as the OTW to do OOP are almost as old as that canon itself, and have evolved along the years, producing many interesting counterexamples and variations, and I fully share your interest in them. Adaptation is rather ``ensconced'' in that canon, and the conceptual and practical issues of IS-A which pervade the canon are all reflected in the new ``(can be automatically adapted to be used) AS-A'' which adaptation introduces. If adaptation cannot survive some vigorous critical appraisal, it's much better to air the issues now than later. Your proposals are novel and interesting. 
They also go WAY deeper into a critical reappraisal of the whole object model of Python, which has always been quite reasonably close to the above-mentioned "canon" and indeed has been getting _more_ so, rather than less, since 2.2 (albeit in a uniquely Pythonical way, as is Python's wont -- but not conceptually, nor, mostly, practically, all that VERY far from canonic OOP). Moreover, your proposals are at a very early stage and no doubt need a lot more experience, discussion, maturation, and give-and-take. Further, you have indicated that, far from _conflicting_ with PEP 246, your new ideas can grow alongside and on top of it -- if I read you correctly, you have prototyped some variations of them using PEP 246 for implementation, you have some ideas of how 'adapt' could in turn be recast by using your new ideas as conceptual and practical foundations, etc, etc. So, I think the best course of action at this time might be for me to edit PEP 246 to reflect some of this enormously voluminous discussion, including points of contention (it's part of a PEP's job to also indicate points of dissent, after all); and I think you should get a new PEP number to use for your new ideas, and develop them on that separate PEP, say PEP XYZ. Knowing that a rethink of the whole object-model and related canon is going on at the same time should help me keep PEP 246 reasonably minimal and spare, very much in the spirit of YAGNI -- as few features as possible, for now. If Guido, in consequence, decides to completely block 246's progress while waiting for the Copernican Revolution of your new PEP XYZ to mature, so be it -- his ``nose'' will no doubt be the best guide to him on the matter. 
But I hope that, in the same pragmatic and minimalist spirit as his "stop the flames" Artima post -- proposing minimalistic interfaces and adaptation syntax as a starting point, while yet keeping as a background reflection the rich and complicated possibilities of parameterized types &c as discussed in his previous Artima entries -- he'll still give a minimalistic PEP 246 the go-ahead so that widespread, real-world experimentation with adaptation and his other proposals can proceed, and give many Pythonistas some practical experience which will make future discussions and developments much more soundly based and productive.

So, what do you think -- does this new plan of action sound reasonable to you?

Alex

From aleax at aleax.it Thu Jan 13 09:00:19 2005 From: aleax at aleax.it (Alex Martelli) Date: Thu Jan 13 09:00:25 2005 Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name In-Reply-To: <5.1.1.6.0.20050112201718.02f85760@mail.telecommunity.com> References: <5.1.1.6.0.20050112194512.0307cc40@mail.telecommunity.com> <1105553300.41e56794d1fc5@mcherm.com> <16869.33426.883395.345417@montanaro.dyndns.org> <5.1.1.6.0.20050112194512.0307cc40@mail.telecommunity.com> <5.1.1.6.0.20050112201718.02f85760@mail.telecommunity.com> Message-ID: <2AFC1C53-6539-11D9-ADA4-000A95EFAE9E@aleax.it>

On 2005 Jan 13, at 02:18, Phillip J. Eby wrote:
> At 05:54 PM 1/12/05 -0700, Steven Bethard wrote:
>> Not that my opinion counts for much =), but returning None does seem
>> much simpler to me. I also haven't seen any arguments against this
>> route of handling protocol nonconformance... Is there a particular
>> advantage to the exception-raising scheme?
>
> Only if there's any objection to giving the 'object' type a default
> __conform__ method that returns 'self' if
> 'isinstance(protocol,ClassTypes) and isinstance(self,protocol)'.
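The default __conform__ described in the quote just above would amount to something like the following. Since the built-in 'object' type can't actually be patched, this sketch puts the method on an ordinary base class for illustration (the class names are invented here):

```python
# Sketch of the proposed default __conform__, shown on a user-defined base
# class because built-in 'object' itself cannot be modified.

class Conformable:
    def __conform__(self, protocol):
        # default: conform to any class/type we are already an instance of
        if isinstance(protocol, type) and isinstance(self, protocol):
            return self
        return None   # no opinion; adapt() would try __adapt__ next

class Point(Conformable):
    pass
```

Plugged into a PEP 246-style adapt(), this makes the common "already an instance of the protocol" case succeed with no registration at all, while returning None (rather than raising) signals nonconformance to the caller.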
In the spirit of minimalism in which I propose to rewrite PEP 246 (as per my latest post: make a simple, noninvasive, unassuming PEP 246 while new ``copernican revolution'' ideas which you proposed mature in another PEP), I'd rather not make a change to built-in ``object'' a prereq for PEP 246; so, I think the reference implementation should avoid assuming such changes, if it's at all possible to avoid them (while, no doubt, indicating the desirability of such changes for simplification and acceleration).

Incidentally, "get this specialmethod from the type (with specialcasing for classic classes &c)" is a primitive that PEP 246 needs as much as, say, copy.py needs it. In the light of the recent discussions of how to fix copy.py etc, I'm unsure about what to assume there, in a rewrite of PEP 246: that getattr(obj, '__aspecial__', None) always does the right thing via special descriptors, that I must spell everything out, or, what else...? If anybody has advice or feedback on these points, it will be welcome!

Alex

From arigo at tunes.org Thu Jan 13 11:16:33 2005 From: arigo at tunes.org (Armin Rigo) Date: Thu Jan 13 11:28:04 2005 Subject: getting special from type, not instance (was Re: [Python-Dev] copy confusion) In-Reply-To: References: <5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com> <83E2B593-6470-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: <20050113101633.GA5193@vicky.ecs.soton.ac.uk>

Hi Guido,

On Wed, Jan 12, 2005 at 09:59:13AM -0800, Guido van Rossum wrote:
> The descriptor for __getattr__ and other special attributes could
> claim to be a "data descriptor"

This has the nice effect that x[y] and x.__getitem__(y) would again be equivalent, which looks good. On the other hand, I fear that if there is a standard "metamethod" decorator (named after Phillip's one), it will be misused.
Reading the documentation will probably leave most programmers with the feeling "it's something magical to put on methods with __ in their names", and it won't be long before someone notices that you can put this decorator everywhere in your classes (because it won't break most programs) and gain a tiny performance improvement. I guess that a name-based hack in type_new() to turn all __*__() methods into data descriptors would be even more obscure? Finally, I wonder if turning all methods whatsoever into data descriptors (ouch! don't hit!) would be justifiable by the feeling that it's often bad style and confusing to override a method in an instance (as opposed to defining a method in an instance when there is none on the class). (Supporting this claim: Psyco makes this simplifying assumption for performance reasons and I haven't yet seen a bug report about it.) In all cases, I'm +1 on seeing built-in method objects (PyMethodDescr_Type) become data descriptors ("classy descriptors?" :-). Armin From aleax at aleax.it Thu Jan 13 11:31:30 2005 From: aleax at aleax.it (Alex Martelli) Date: Thu Jan 13 11:31:39 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5.1.1.6.0.20050112151418.0337cd00@mail.telecommunity.com> References: <5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com> <5.1.1.6.0.20050112151418.0337cd00@mail.telecommunity.com> Message-ID: <49A128D8-654E-11D9-ADA4-000A95EFAE9E@aleax.it> On 2005 Jan 12, at 21:42, Phillip J. Eby wrote: ... > Anyway, hopefully this post and the other one will be convincing that > considering ambiguity to be an error *reinforces* the idea of I-to-I > perfection, rather than undermining it.
> (After all, if you've written > a perfect one, and there's already one there, then either one of you > is mistaken, or you are wasting your time writing one!) I'd just like to point out, as apparently conceded in your "fair enough" sentence in another mail, that all of this talk of "wasting your time writing" is completely unfounded. Since that "fair enough" of yours was deeply buried somewhere inside this huge conversation, some readers might miss the fact that your numerous repetitions of the similar concept in different words are just invalid, because, to recap: Given four interfaces A, B, C, D, there may be need of each of the single steps A->B, A->C, B->D, C->D. Writing each of these four adapters can IN NO WAY be considered "wasting your time writing one", because there is no way a set of just three out of the four can be used to produce the fourth one. The only "redundancy" comes strictly because of transitivity being imposed automatically: at the moment the fourth one of these four needed adapters gets registered, there appear to be two same-length minimal paths A->x->D (x in {B, C}). But inferring _from this consequence of transitivity_ that there's ANYTHING wrong with any of the four needed adapters is a big unwarranted logical jump -- IF one really trusted all interface->interface adapters to be perfect, as is needed to justify transitivity and as you here claim gets "reinforced" (?!). Thinking of it as "redundancy" is further shown to be fallacious because the only solution, if each of those 4 adapters is necessary, is to write and register a FIFTH one, A->D directly, even if one has no interest whatsoever in A->D adaptation, just to shut up the error or warning (as you say, there may be some vague analogy to static typing here, albeit in a marginal corner of the stage rather than smack in the spotlight;-).
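[Editorial sketch: Alex's four-adapter diamond can be made concrete. All the class names and the `edges` registry below are illustrative, not part of PEP 246; a naive breadth-first search over single-step adapters shows that, the moment the fourth adapter is registered, two equally short chains A->B->D and A->C->D exist -- which is exactly the ambiguity automatic transitivity introduces.]

```python
class A: pass
class B: pass
class C: pass
class D: pass

# The four genuinely needed single-step adapters, as an adjacency map.
edges = {A: [B, C], B: [D], C: [D]}

def shortest_paths(src, dst):
    """Breadth-first search returning every minimal adapter chain."""
    paths, frontier = [], [[src]]
    while frontier and not paths:
        next_frontier = []
        for path in frontier:
            for step in edges.get(path[-1], []):
                if step is dst:
                    paths.append(path + [step])
                else:
                    next_frontier.append(path + [step])
        frontier = next_frontier
    return paths

# shortest_paths(A, D) finds two minimal chains: A->B->D and A->C->D.
```

Neither chain is "redundant" in itself; the ambiguity only appears once a transitivity rule tries to pick one automatically.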
Yes, there is (lato sensu) "non-determinism" involved, just like in, say: for k in d: print k for a Python dictionary d - depending on how d was constructed and modified during its lifetime (which may in turn depend on what order modules were imported, etc), this produces different outputs. Such non-determinism may occasionally give some problems to unwary programmers (who could e.g. expect d1==d2 <--> repr(d1)==repr(d2) when keys and values have unique repr's: the right-pointing half of this implication doesn't hold, so using repr(d) to stand in for d when you need, e.g., a set of dictionaries, is not quite sufficient); such problems at the margin appear to be generally considered acceptable, though. It seems to me that you do at least feel some unease at the whole arrangement, given that you say "this whole debate has made me even less enamored of adaptation", as it's not clear to me that any _other_ aspect of "this whole debate" was quite as problematic (e.g. issues such as "how to best get a special method from class rather than instance" -- while needing to be resolved for adaptation just as much as for copy.py etc -- hardly seem likely to have been the ones prompting you to go looking for "a cleaner, more intuitive way to do it" outside of the canonical, widespread approach to OOP). Anyway -- I'm pointing out that what to put in a rewrite of PEP 246 as a result of all this is anything but obvious at this point, at least to me.
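[Editorial note: Alex's point that repr(d) cannot stand in for d can be seen directly. In today's Python, dict order depends on insertion order rather than on hashing as it did in 2005, but the implication fails in exactly the same direction:]

```python
# Two equal dictionaries built in different key orders.
d1 = {'a': 1, 'b': 2}
d2 = {'b': 2, 'a': 1}

assert d1 == d2              # the dicts compare equal...
assert repr(d1) != repr(d2)  # ...but their reprs differ, so repr(d)
                             # is not a usable stand-in for d itself
```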
Alex From p.f.moore at gmail.com Thu Jan 13 11:35:39 2005 From: p.f.moore at gmail.com (Paul Moore) Date: Thu Jan 13 11:35:42 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <41E5EFF6.9090408@colorstudy.com> References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com> <41E5EFF6.9090408@colorstudy.com> Message-ID: <79990c6b05011302352cbd41de@mail.gmail.com> On Wed, 12 Jan 2005 21:50:14 -0600, Ian Bicking wrote: > Phillip J. Eby wrote: > > At 04:07 PM 1/12/05 -0600, Ian Bicking wrote: > > > >> It also seems quite reasonable and unambiguous that a path object > >> could be adapted to a IReadableFile by opening the file at the given > >> path. > > > > > > Not if you think of adaptation as an "as-a" relationship, like using a > > screwdriver "as a" hammer (really an IPounderOfNails, or some such). It > > makes no sense to use a path "as a" readable file, so this particular > > adaptation is bogus. > > I started to realize that in a now-aborted reply to Steven, when my > defense of the path->IReadableFile adaptation started making less sense. I think I'm getting a clearer picture here (at last!) One thing I feel is key is the fact that adaptation is a *tool*, and as such will be used in different ways by different people. That is not a bad thing, even if it does mean that some people will abuse the tool. Now, a lot of the talk has referred to "implicit" adaptation. I'm still struggling to understand how that concept applies in practice, beyond the case of adaptation chains - at some level, all adaptation is "explicit", insofar as it is triggered by an adapt() call. 
James Knight's example (which seemed to get lost in the discussion, or at least no-one commented on it) brought up a new point for me, namely the fact that it's the library writer who creates interfaces, and calls adapt(), but it's the library *user* who says what classes support (can be adapted to) what interface. I hadn't focused on the different people involved before this point. Now, if we have a transitive case A->B->C, where A is written by "the user", and C is part of "the library" and library code calls adapt(x,C) where x is a variable which the user supplies as an object of type A, then WHO IS RESPONSIBLE FOR B???? And does it matter, and if it does, then what are the differences? As I write this, being careful *not* to talk in terms of "interfaces" and "classes", I start to see Phillip's point - in my mind, A (written by the user) is a class, and C (part of the library) is an "interface". So the answer to the question above about B is that it depends on whether B is an interface or a class - and the sensible transitivity rules could easily (I don't have the experience to decide) depend on whether B is a class or an interface. BUT, and again, Phillip has made this point, I can't reason about interfaces in the context of PEP 246, because interfaces aren't defined there. So PEP 246 can't make a clear statement about transitivity, precisely because it doesn't define interfaces. But does this harm PEP 246? I'm not sure. > It's *still* not intuitively incorrect to me, but there's a couple > things I can think of... > > (a) After you adapted the path to the file, and have a side-effect of > opening a file, it's unclear who is responsible for closing it. > (b) The file object clearly has state the path object doesn't have, like > a file position. > (c) You can't go adapting the path object to a file whenever you > wanted, because of those side effects. In the context of my example above, I was assuming that C was an "interface" (whatever that might be).
Here, you're talking about adapting to a file (a concrete class), which I find to be a much muddier concept. This is very much a "best practices" type of issue, though. I don't see PEP 246 mandating that you *cannot* adapt to concrete classes, but I can see that it's a dangerous thing to do. Even the string->path adaptation could be considered suspect. Rather, you "should" be defining an IPath *interface*, with operations such as join, basename, and maybe open. Then, the path class would have a trivial adaptation to IPath, and adapting a string to an IPath would likely do so by constructing a path object from the string. From a practical point of view, the IPath interface adds nothing over adapting direct to the path class, but for the purposes of clarity, documentation, separation of concepts, etc, I can see the value. > So those are some more practical reasons that it *now* seems bad to me, > but that wasn't my immediate intuition, and I could have happily written > out all the necessary code without countering that intuition. In fact, > I've misused adaptation before (I think) though in different ways, and > it those mistakes haven't particularly improved my intuition on the > matter. If you can't learn from mistakes, how can you learn? > > One way is with principles and rules, even if they are flawed or > incomplete. Perhaps avoiding adaptation diamonds is one such rule; it > may not be necessarily and absolutely a bad thing that there is a > diamond, but it is often enough a sign of problems elsewhere that it may > be best to internalize that belief anyway. Avoiding diamonds alone > isn't enough of a rule, but maybe it's a start. Some mistakes are easier to avoid if you have the correct conceptual framework. I suspect that interfaces are the conceptual framework which make adaptation fall into place. If so, then PEP 246, and adaptation per se, is always going to be hard to reason about for people without a background in interfaces. Hmm. 
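[Editorial sketch: Paul's IPath arrangement might look like the code below. The names IPath, join and basename come from his description; the minimal __adapt__/adapt machinery is only an illustrative stand-in for whatever PEP 246 would actually provide.]

```python
class Path:
    """Concrete path class, as in Paul's example."""
    def __init__(self, value):
        self.value = value
    def join(self, other):
        return Path(self.value.rstrip('/') + '/' + other)
    def basename(self):
        return self.value.rsplit('/', 1)[-1]

class IPath:
    """The interface: a path adapts trivially; a string adapts
    by constructing a Path from it."""
    @staticmethod
    def __adapt__(obj):
        if isinstance(obj, Path):
            return obj            # trivial adaptation
        if isinstance(obj, str):
            return Path(obj)      # string -> IPath by construction

def adapt(obj, protocol):
    # Minimal stand-in for the PEP 246 protocol: ask the interface.
    result = protocol.__adapt__(obj)
    if result is None:
        raise TypeError('cannot adapt %r to %s' % (obj, protocol.__name__))
    return result
```

With this, adapt('/tmp/x.txt', IPath) yields a Path, and library code written against IPath never needs to know whether it was handed a string or a path.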
I think I just disqualified myself from making any meaningful comments :-) Paul. From aleax at aleax.it Thu Jan 13 11:47:59 2005 From: aleax at aleax.it (Alex Martelli) Date: Thu Jan 13 11:48:03 2005 Subject: getting special from type, not instance (was Re: [Python-Dev] copy confusion) In-Reply-To: References: <5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com> <83E2B593-6470-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: <97350DCE-6550-11D9-ADA4-000A95EFAE9E@aleax.it> On 2005 Jan 12, at 18:59, Guido van Rossum wrote: ... > [Alex] >> Armin's fix was to change: ... > [And then proceeds to propose a new API to improve the situation] > > I wonder if the following solution wouldn't be more useful (since less > code will have to be changed). > > The descriptor for __getattr__ and other special attributes could > claim to be a "data descriptor" which means that it gets first pick > *even if there's also a matching entry in the instance __dict__*. ... > Normal methods are not data descriptors, so they can be overridden by > something in __dict__; but it makes some sense that for methods > implementing special operations like __getitem__ or __copy__, where > the instance __dict__ is already skipped when the operation is invoked > using its special syntax, it should also be skipped by explicit > attribute access (whether getattr(x, "__getitem__") or x.__getitem__ > -- these are entirely equivalent). A very nice idea for how to proceed in the future, and I think definitely the right solution for Python 2.5. But maybe we need to think about a bugfix for 2.3/2.4, too. > We would need to introduce a new decorator so that classes overriding > these methods can also make those methods "data descriptors", and so > that users can define their own methods with this special behavior > (this would be needed for __copy__, probably). 
> > I don't think this will cause any backwards compatibility problems -- > since putting a __getitem__ in an instance __dict__ doesn't override > the x[y] syntax, it's unlikely that anybody would be using this. ...in new-style classes, yes. And classic types and old-style classes would keep behaving the old way (with per-instance override) so the bug that bit the effbot would disappear... in Python 2.5. But the bug is there in 2.3 and 2.4, and it seems to me we should still find a fix that is applicable there, even though the fix won't need to get into the 2.5 head, just the 2.3 and 2.4 bugfix branches. > "Ordinary" methods will still be overridable. > > PS. The term "data descriptor" now feels odd, perhaps we can say "hard > descriptors" instead. Hard descriptors have a __set__ method in > addition to a __get__ method (though the __set__ method may always > raise an exception, to implement a read-only attribute). Good terminology point, and indeed explaining the ``data'' in "data descriptor" has always been a problem. "Hard" or "Get-Set" descriptors or other terminology yet will make explanation easier; to pick the best terminology we should also think of the antonym, since ``non-data'' won't apply any more ("soft descriptors", "get-only descriptors", ...). ``strong'' descriptors having a __set__, and ``weak'' ones not having it, is another possibility. But back to the bugfix for copy.py (and I believe at least pprint.py too, though of course that's more marginal than copy.py!) in 2.3 and 2.4: am I correct that this new descriptor idea is too big/invasive for this bugfix, and thus we should still be considering localized changes (to copy.py and pprint.py) via a function copy._get_special (or whatever) in 2.3.5 and 2.4.1?
This small, local, minimally invasive change to copy.py would go well with the other one we need (as per my latest post with Subject: Re: [Python-Dev] Re: copy confusion Date: 2005 January 12 10:52:10 CET ) -- having the check for issubclass(cls, type) in copy.copy() just as we have it in copy.deepcopy() and for the same reason (which is a bit wider than the comment in copy.deepcopy about old versions of Boost might suggest). Alex From ncoghlan at iinet.net.au Thu Jan 13 13:18:27 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Thu Jan 13 13:18:32 2005 Subject: [Python-Dev] Son of PEP 246, redux In-Reply-To: <877e9a17050112222376178511@mail.gmail.com> References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <41E59FA9.4050605@colorstudy.com> <5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com> <20050113042605.GA58003@prometheusresearch.com> <5.1.1.6.0.20050112234334.03cbebd0@mail.telecommunity.com> <877e9a1705011221013c9de8f7@mail.gmail.com> <5.1.1.6.0.20050113002459.02bb61c0@mail.telecommunity.com> <877e9a17050112222376178511@mail.gmail.com> Message-ID: <41E66713.2060305@iinet.net.au> Michael Walter wrote: > Yepyep, but *how* you declare types now? Can you quickly type the function > def f(x): x.read()? without needing an interface interface x_of_f: def > read(): pass or a decorator like @foo(x.read)? I've no idea what you > mean, really :o) Why would something like def f(x): x.read() do any type checking at all? Cheers, Nick. 
-- Nick Coghlan | ncoghlan@email.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From skip at pobox.com Thu Jan 13 03:45:41 2005 From: skip at pobox.com (Skip Montanaro) Date: Thu Jan 13 13:59:57 2005 Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name In-Reply-To: <41E63EDB.40008@cs.teiath.gr> References: <1105553300.41e56794d1fc5@mcherm.com> <16869.33426.883395.345417@montanaro.dyndns.org> <41E63EDB.40008@cs.teiath.gr> Message-ID: <16869.57557.795447.53311@montanaro.dyndns.org> stelios> Yes but in order to fall into a Liskov Violation, one will have stelios> to use extreme OOP features (as I understand from the ongoing stelios> discussion for which, honestly, I understand nothing:). The first example here: http://www.compulink.co.uk/~querrid/STANDARD/lsp.htm Looks pretty un-extreme to me. It may not be detectable without the pep 246 stuff, but I suspect it's pretty common. Skip From pje at telecommunity.com Thu Jan 13 14:52:21 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Thu Jan 13 14:50:47 2005 Subject: [Python-Dev] Son of PEP 246, redux In-Reply-To: <41E66713.2060305@iinet.net.au> References: <877e9a17050112222376178511@mail.gmail.com> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <41E59FA9.4050605@colorstudy.com> <5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com> <20050113042605.GA58003@prometheusresearch.com> <5.1.1.6.0.20050112234334.03cbebd0@mail.telecommunity.com> <877e9a1705011221013c9de8f7@mail.gmail.com> <5.1.1.6.0.20050113002459.02bb61c0@mail.telecommunity.com> <877e9a17050112222376178511@mail.gmail.com> Message-ID: <5.1.1.6.0.20050113084947.020f9020@mail.telecommunity.com> At 10:18 PM 1/13/05 +1000, Nick Coghlan wrote: >Michael Walter wrote: >>Yepyep, but *how* you declare types now? Can you quickly type the function >>def f(x): x.read()? without needing an interface interface x_of_f: def >>read(): pass or a decorator like @foo(x.read)? I've no idea what you >>mean, really :o) > >Why would something like > > def f(x): > x.read() > >do any type checking at all? It wouldn't. The idea is to make this: def f(x:file): x.read() automatically find a method declared '@implements(file.read,X)' where X is in x.__class__.__mro__ (or the equivalent of MRO if x.__class__ is classic). From ncoghlan at iinet.net.au Thu Jan 13 15:30:17 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Thu Jan 13 15:30:22 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5.1.1.6.0.20050112150933.03378630@mail.telecommunity.com> References: <5.1.1.6.0.20050112140929.03d52e30@mail.telecommunity.com> <5.1.1.6.0.20050112140929.03d52e30@mail.telecommunity.com> <5.1.1.6.0.20050112150933.03378630@mail.telecommunity.com> Message-ID: <41E685F9.2010606@iinet.net.au> Phillip J. 
Eby wrote: > Anyway, I'm at least +0 on dropping this; the reservation is just > because I don't think everybody else will agree with this, and don't > want to be appearing to imply that consensus between you and me implies > any sort of community consensus on this point. That is, the adaptation > from "Alex and Phillip agree" to "community agrees" is noisy at best! ;) You seem to be doing a pretty good job of covering the bases, though. . . Anyway, I'd like to know if the consensus I think you've reached is the one the pair of you think you've reached :) That is, with A being our starting class, C being a target class, and F being a target interface, the legal adaptation chains are:

# Class to class
A->C

# Class to interface, possibly via other interfaces
A(->F)*->F

With a lookup sequence of:

1. Check the global registry for direct adaptations
2. Ask the object via __conform__
3a. Check using isinstance() unless 2 raised LiskovViolation
3b. Nothing, since object.__conform__ does an isinstance() check
4. Ask the interface via __adapt__
5. Look for transitive chains of interfaces in the global registry.

3a & 3b are the current differing answers to the question of who should be checking for inheritance - the adaptation machinery or the __conform__ method. If classes wish to adapt to things which their parents adapt to, they must delegate to their parent's __conform__ method as needed (or simply not override __conform__). The ONLY automatic adaptation links are those that allow a subtype to be used in place of its parent type, and this can be overridden using __conform__. (FWIW, this point about 'adapting to things my parent can adapt to' by delegating in __conform__ inclines me in favour of option 3b for handling subtyping. However, I can appreciate wanting to keep the PEP free of proposing any changes to the core - perhaps mention both, and leave the decision to the BDFL?)
One question - is the presence of __adapt__ enough to mark something as an interface in the opinion of the adaptation machinery (for purposes of step 5)? Second question - will there be something like __conformant__ and __conforming__ to allow classes and interfaces to provide additional information in the transitive search in step 5? Or are both of these questions more in PEP 245 territory? Cheers, Nick. Almost sent this to c.l.p by mistake . . . -- Nick Coghlan | ncoghlan@email.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From pje at telecommunity.com Thu Jan 13 15:32:49 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu Jan 13 15:31:15 2005 Subject: [Python-Dev] Son of PEP 246, redux In-Reply-To: References: <5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com> <41E59FA9.4050605@colorstudy.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <41E59FA9.4050605@colorstudy.com> <5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050113085248.020ffec0@mail.telecommunity.com> At 08:50 AM 1/13/05 +0100, Alex Martelli wrote: >Your proposals are novel and interesting. They also go WAY deeper into a >critical reappraisal of the whole object model of Python, which has always >been quite reasonably close to the above-mentioned "canon" and indeed has >been getting _more_ so, rather than less, since 2.2 (albeit in a uniquely >Pythonical way, as is Python's wont -- but not conceptually, nor, mostly, >practically, all that VERY far from canonic OOP). 
Actually, the whole generic function thing was just a way to break out of the typing problems the Python community has struggled with for years. Every attempt to bring typing or interfaces to Python has run aground on simple, practical concepts like, "what is the abstract type of dict? file?" In essence, the answer is that Python's object model *already* has a concept of type... duck typing! Basically, I'm proposing a way to "formalize" duck typing... when you say you want a 'file', then when you call the object's read method, it "reads like a file". This isn't a critical reappraisal of Python's current object model at all! It's a reappraisal of the ways we've been trying (and largely failing) to make it fit into the mold of other languages' object models. > Moreover, your proposals are at a very early stage and no doubt need a > lot more experience, discussion, maturation, and give-and-take. Agreed. In particular, operations on numeric types are the main area that needs conceptual work; most of the rest is implementation details. I hope to spend some time prototyping an implementation this weekend (I start a new contract today, so I'll be busy with "real work" till then). Nonetheless, the idea is so exciting I could barely sleep, and even as I woke this morning my head was spinning with comparisons of adaptation networks and operation networks, finding them isomorphic no matter what I tried. One really exciting part is that this concept basically allows you to write "good" adapters, while making it very difficult or impossible to write "bad" ones! Since it forces adapters to have no per-adapter state, it almost literally forces you to only create "as-a" adapters. For example, if you try to define file operations on a string, you're dead in the water before you even start: strings have no place to *store* any state. So, you can't adapt an immutable into a mutable. 
You *can*, however, add extra state to a mutable, but it has to be per-object state, not per-adapter state. (Coincidentally, this eliminates the need for PyProtocols' somewhat kludgy concept of "sticky" adapters.) As a result of this, this adapter model is shaped like a superset of COM - you can adapt an object as many times as you like, and the adapters you get are basically "pointers to interfaces" on the same object, with no adapter composition. And adapt(x,object) can always give you back the "original" object. This model also has some potential to improve performance: adapter classes have all their methods in one dictionary, so there's no __mro__ scan, and they have no instance dictionary, so there's no __dict__ lookup. This should mean faster lookups of methods that would otherwise be inherited, even without any special interpreter support. And it also allows a possible fast-path opcode to be used for calling methods on a type-declared parameter or variable, possibly eventually streamlined to a vtable-like structure eliminating any dictionary lookups at all (except at function definition time, to bind names in the code object to vtable offsets obtained from the types being bound to the function signature). But of course all that is much further down the road. Anyway, I suppose the *really* exciting thing about all this is how *many* different problems the approach seems to address. :) (Like automatically detecting certain classes of Liskov violations, for example. And not having to create adapter classes by hand. And being able to support Ping's abstract operations. Etc., etc., etc.) >So, I think the best course of action at this time might be for me to edit >PEP 246 to reflect some of this enormously voluminous discussion, >including points of contention (it's part of a PEP's job to also indicate >points of dissent, after all); and I think you should get a new PEP number >to use for your new ideas, and develop them on that separate PEP, say PEP >XYZ. 
Knowing that a rethink of the whole object-model and related canon >is going on at the same time should help me keep PEP 246 reasonably >minimal and spare, very much in the spirit of YAGNI -- as few features as >possible, for now. Sounds good to me. >If Guido, in consequence, decides to completely block 246's progress while >waiting for the Copernican Revolution of your new PEP XYZ to mature, so be >it -- his ``nose'' will no doubt be the best guide to him on the matter. It's not that big of a revolution, really. This should make Python *more* like Python, not less. > But I hope that, in the same pragmatic and minimalist spirit as his > "stop the flames" Artima post -- proposing minimalistic interfaces and > adaptation syntax as a starting point, while yet keeping as a background > reflection the rich and complicated possibilities of parameterized types > &c as discussed in his previous Artima entries -- he'll still give a > minimalistic PEP 246 the go-ahead so that widespread, real-world > experimentation with adaptation and his other proposals can proceed, and > give many Pythonistas some practical experience which will make future > discussions and developments much sounder-based and productive. Well, as a practical matter, every current interface package for Python already implements the "old" PEP 246, so nothing stops people from experimenting with it now, any more than in the past. But, since I think that my approach can simply be implemented as a more sophisticated version of PEP 246's new global adapter registry, there is no real reason for the PEPs to actually *conflict*. The only area of potential conflict is that I think __conform__ and __adapt__ might be able to wither away completely in the presence of a suitably powerful registry. >So, what do you think -- does this new plan of action sound reasonable to you? Yes. I'll prototype and PEP, unless somebody else gets as excited as I am and does it first. 
;) I'll probably call it "Duck Typing and Adaptation" or some such. From michael.walter at gmail.com Thu Jan 13 15:33:17 2005 From: michael.walter at gmail.com (Michael Walter) Date: Thu Jan 13 15:33:20 2005 Subject: [Python-Dev] Son of PEP 246, redux In-Reply-To: <5.1.1.6.0.20050113084947.020f9020@mail.telecommunity.com> References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <41E59FA9.4050605@colorstudy.com> <5.1.1.6.0.20050112194955.03081ec0@mail.telecommunity.com> <20050113042605.GA58003@prometheusresearch.com> <5.1.1.6.0.20050112234334.03cbebd0@mail.telecommunity.com> <877e9a1705011221013c9de8f7@mail.gmail.com> <5.1.1.6.0.20050113002459.02bb61c0@mail.telecommunity.com> <877e9a17050112222376178511@mail.gmail.com> <41E66713.2060305@iinet.net.au> <5.1.1.6.0.20050113084947.020f9020@mail.telecommunity.com> Message-ID: <877e9a17050113063374b133ca@mail.gmail.com> Ahhh, there we go, so "file" is the type you declare. All I was asking for, I thought you were thinking in a different/"more sophisticated" direction (because what "f" actually wants is not a file, but a "thing which has a read() like file" -- I thought one would like to manifest that in the type instead of implicitly by the code). Your concept is cool, tho :-) Michael On Thu, 13 Jan 2005 08:52:21 -0500, Phillip J. Eby wrote: > At 10:18 PM 1/13/05 +1000, Nick Coghlan wrote: > >Michael Walter wrote: > >>Yepyep, but *how* you declare types now? Can you quickly type the function > >>def f(x): x.read()? without needing an interface interface x_of_f: def > >>read(): pass or a decorator like @foo(x.read)? I've no idea what you > >>mean, really :o) > > > >Why would something like > > > > def f(x): > > x.read() > > > >do any type checking at all? > > It wouldn't. The idea is to make this: > > def f(x:file): > x.read() > > automatically find a method declared '@implements(file.read,X)' where X is > in x.__class__.__mro__ (or the equivalent of MRO if x.__class__ is classic).
From cce at clarkevans.com Thu Jan 13 15:34:21 2005 From: cce at clarkevans.com (Clark C. Evans) Date: Thu Jan 13 15:34:28 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <79990c6b05011302352cbd41de@mail.gmail.com> References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com> <41E5EFF6.9090408@colorstudy.com> <79990c6b05011302352cbd41de@mail.gmail.com> Message-ID: <20050113143421.GA39649@prometheusresearch.com> On Thu, Jan 13, 2005 at 10:35:39AM +0000, Paul Moore wrote: | One thing I feel is key is the fact that adaptation is a *tool*, and | as such will be used in different ways by different people. That is | not a bad thing, even if it does mean that some people will abuse the tool. | | Now, a lot of the talk has referred to "implicit" adaptation. I'm | still struggling to understand how that concept applies in practice, | beyond the case of adaptation chains - at some level, all adaptation | is "explicit", insofar as it is triggered by an adapt() call. The 'implicit' adaptation refers to the automagical construction of composite adapters assuming that a 'transitive' property holds.
I've seen nothing in this thread to explain why this is so valuable, why it shouldn't be explicit, and on the contrary, most of the "problems with adapt()" seem to stem from this aggressive extension of what was proposed: Automatic construction of adapter chains is _not_ part of the original PEP 246 and I hope it remains that way. I've outlined in several posts how this case could be made easy for an application developer to do: - transitive adapters should always be explicit - it should be an error to have more than one adapter from A to Z in the registry - when adaptation fails, an informative error message can tell the application developer of possible "chains" which could work - registration of transitive adapters can be a simple command application developers use: adapt.transitive(from=A,to=Z,via=M) | James Knight's example (which seemed to get lost in the discussion, or | at least no-one commented on it) brought up a new point for me, namely | the fact that it's the library writer who creates interfaces, and | calls adapt(), but it's the library *user* who says what classes | support (can be adapted to) what interface. I hadn't focused on the | different people involved before this point. I'd say the more common pattern is three players: the framework builder, the component builder, and the application designer. Adapt provides a mechanism for the framework builder (via __adapt__) and the component builder (via __conform__) to work together without involving the application designer. The 'registry' idea (which was not explored in the PEP) emerges from the need, albeit limited, for the application developer who is plugging a component into a framework, to have some say in the process. I think that any actions taken by the user, by registering an adapter, should be explicit. The 'diamond' problem discussed by Phillip has only confirmed this belief. You don't want the adapt() system going around assuming transitivity. 
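[Clark's explicit-registration idea might look roughly like the sketch below. All names are invented; note also that `from` is a reserved word in Python, so a real API would need a spelling other than his `adapt.transitive(from=A,to=Z,via=M)`.]

```python
# Sketch of an explicit registry: one adapter per (src, dst) pair, and
# transitive adapters only when the application developer asks for them.
class AdapterRegistry:
    def __init__(self):
        self._reg = {}

    def register(self, src, dst, adapter):
        # more than one adapter from src to dst is an error
        if (src, dst) in self._reg:
            raise ValueError("duplicate adapter %s->%s" % (src, dst))
        self._reg[(src, dst)] = adapter

    def transitive(self, src, dst, via):
        # explicit, developer-requested composition of two registered steps
        first, second = self._reg[(src, via)], self._reg[(via, dst)]
        self.register(src, dst, lambda obj: second(first(obj)))

    def adapt(self, obj, src, dst):
        try:
            return self._reg[(src, dst)](obj)
        except KeyError:
            raise TypeError("no adapter %s->%s; register one explicitly"
                            % (src, dst))

reg = AdapterRegistry()
reg.register("A", "M", lambda o: o + "->M")
reg.register("M", "Z", lambda o: o + "->Z")
reg.transitive("A", "Z", via="M")   # the developer opts in
assert reg.adapt("a", "A", "Z") == "a->M->Z"
```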
However, if the application developer is certain that a conversion path from A to Z going through B and/or Y will work, then it should be easy for them to specify this adaptation path. | Now, if we have a transitive case A->B->C, where A is written by "the | user", and C is part of "the library" and library code calls | adapt(x,C) where x is a variable which the user supplies as an object | of type A, then WHO IS RESPONSIBLE FOR B???? And does it matter, and | if it does, then what are the differences? Great question. But I'd like to rephrase: C is probably a framework, A and B are probably components; and we assume that either the framework or component developers have enabled A->B and B->C. If the user wishes to make an adapter from A->C assuming no (or acceptable for his purposes) information loss from A->C through B, then this is his/her choice. However, it shouldn't be done by the framework or component developers unless it is a perfect adaptation, and it certainly shouldn't be automagic. I don't think who owns B is particularly more important than A or C. | As I write this, being careful *not* to talk in terms of "interfaces" | and "classes", I start to see Phillip's point - in my mind, A (written | by the user) is a class, and C (part of the library) is an | "interface". So the answer to the question above about B is that it | depends on whether B is an interface or a class - and the sensible | transitivity rules could easily (I don't have the experience to | decide) depend on whether B is a class or an interface. I'd like to say that _any_ transitivity rule should be explicit; there is a point where you make it easy for the programmer, but for heaven's sake, let's not try to do their job. | BUT, and again, Phillip has made this point, I can't reason about | interfaces in the context of PEP 246, because interfaces aren't | defined there. So PEP 246 can't make a clear statement about | transitivity, precisely because it doesn't define interfaces. 
But does | this harm PEP 246? I'm not sure. Well, PEP 246 should be edited, IMHO, to assert that all 'implicit' adaptions are out-of-scope, and if they are supported should be done so under the direct control of the application developer. -- Clark C. Evans Prometheus Research, LLC. http://www.prometheusresearch.com/ o office: +1.203.777.2550 ~/ , mobile: +1.203.444.0557 // (( Prometheus Research: Transforming Data Into Knowledge \\ , \/ - Research Exchange Database /\ - Survey & Assessment Technologies ` \ - Software Tools for Researchers ~ * From pje at telecommunity.com Thu Jan 13 15:36:38 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu Jan 13 15:35:03 2005 Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name In-Reply-To: <2AFC1C53-6539-11D9-ADA4-000A95EFAE9E@aleax.it> References: <5.1.1.6.0.20050112201718.02f85760@mail.telecommunity.com> <5.1.1.6.0.20050112194512.0307cc40@mail.telecommunity.com> <1105553300.41e56794d1fc5@mcherm.com> <16869.33426.883395.345417@montanaro.dyndns.org> <5.1.1.6.0.20050112194512.0307cc40@mail.telecommunity.com> <5.1.1.6.0.20050112201718.02f85760@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050113093338.03e58c20@mail.telecommunity.com> At 09:00 AM 1/13/05 +0100, Alex Martelli wrote: >Incidentally, "get this specialmethod from the type (with specialcasing >for classic classes &c)" is a primitive that PEP 246 needs as much as, >say, copy.py needs it. In the light of the recent discussions of how to >fix copy.py etc, I'm unsure about what to assume there, in a rewrite of >PEP 246: that getattr(obj, '__aspecial__', None) always does the right >thing via special descriptors, that I must spell everything out, or, what >else...? I think you can make it a condition that metaclasses with __conform__ or __adapt__ must use a data descriptor like my "metamethod" decorator. 
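[A rough sketch of the condition Phillip describes, using a hypothetical reimplementation of his "metamethod" decorator (the real one is from PEAK). Because the descriptor defines `__set__`, it is a data descriptor, so a `__conform__` on the metaclass cannot be shadowed by an entry in the class's own namespace — the "metaconfusion" case:]

```python
class metamethod:
    """Hypothetical stand-in for the 'metamethod' decorator: wrapping a
    method in a data descriptor (it defines __set__)."""
    def __init__(self, func):
        self.func = func
    def __get__(self, ob, typ=None):
        if ob is None:
            return self.func
        return self.func.__get__(ob)
    def __set__(self, ob, value):
        raise AttributeError("metamethods are read-only")

class Meta(type):
    @metamethod
    def __conform__(cls, protocol):
        return "metaclass __conform__"

class X(metaclass=Meta):
    def __conform__(self, protocol):   # intended for X *instances*
        return "instance __conform__"

# The data descriptor on the metaclass wins for the class object itself,
# so there is no metaconfusion; instances still get their own method:
assert X.__conform__("P") == "metaclass __conform__"
assert X().__conform__("P") == "instance __conform__"
```

[Without the decorator, a plain `Meta.__conform__` would be a non-data descriptor, and `getattr(X, '__conform__')` would find the instance-oriented method in `X`'s namespace instead.]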
Then, there is no metaconfusion since metaconfusion requires a metaclass to exist, and you're requiring that in that case, they must use a descriptor to avoid the problem. From pje at telecommunity.com Thu Jan 13 15:41:31 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu Jan 13 15:39:56 2005 Subject: getting special from type, not instance (was Re: [Python-Dev] copy confusion) In-Reply-To: <20050113101633.GA5193@vicky.ecs.soton.ac.uk> References: <5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com> <83E2B593-6470-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: <5.1.1.6.0.20050113093750.03e5a2d0@mail.telecommunity.com> At 10:16 AM 1/13/05 +0000, Armin Rigo wrote: >On the other hand, I fear that if there is a standard "metamethod" decorator >(named after Phillip's one), it will be misused. Reading the documentation >will probably leave most programmers with the feeling "it's something magical >to put on methods with __ in their names", Possible solution: have it break when it's used in a non-subtype of 'type'. That is to say, when it's not used in a metaclass. >Finally, I wonder if turning all methods whatsoever into data descriptors >(ouch! don't hit!) would be justifiable by the feeling that it's often bad >style and confusing to override a method in an instance (as opposed to >defining a method in an instance when there is none on the class). Hm. I look at this the opposite way: sometimes it's nice to provide a default version of a callable that's supposed to be stuck on the object later, just like it's nice to have a default initial value for a variable supplied by the type. I don't think that doing away with this feature for non-special methods is a step forwards. >In all cases, I'm +1 on seeing built-in method objects (PyMethodDescr_Type) >become data descriptors ("classy descriptors?" :-). Heh. :) From pje at telecommunity.com Thu Jan 13 16:08:10 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Thu Jan 13 16:06:36 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <49A128D8-654E-11D9-ADA4-000A95EFAE9E@aleax.it> References: <5.1.1.6.0.20050112151418.0337cd00@mail.telecommunity.com> <5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com> <5.1.1.6.0.20050112151418.0337cd00@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050113094204.03e5b180@mail.telecommunity.com> At 11:31 AM 1/13/05 +0100, Alex Martelli wrote: >On 2005 Jan 12, at 21:42, Phillip J. Eby wrote: > ... >>Anyway, hopefully this post and the other one will be convincing that >>considering ambiguity to be an error *reinforces* the idea of I-to-I >>perfection, rather than undermining it. (After all, if you've written a >>perfect one, and there's already one there, then either one of you is >>mistaken, or you are wasting your time writing one!) > >I'd just like to point out, as apparently conceded in your "fair enough" >sentence in another mail, that all of this talk of "wasting your time >writing" is completely unfounded. The above refers to two I-to-I adapters between the same two points, not the addition of an adapter that creates an adaptation diamond. I have indeed agreed that an "innocent" adapter diamond can trigger a "false alarm" from PyProtocols with respect to duplication. > Since that "fair enough" of yours was deeply buried somewhere inside > this huge conversation, some readers might miss the fact that your > numerous repetitions of the similar concept in different words are just > invalid, because, to recap: > >Given four interfaces A, B, C, D, there may be need of each of the single >steps A->B, A->C, B->D, C->D. 
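[Alex's diamond can be made concrete with a toy registry: all four single-step adapters are independently useful, yet a transitivity-assuming adapt() would find two equally valid chains for A->D and have to pick one arbitrarily. The names are illustrative only.]

```python
# Single-step adapters for the diamond: A->B, A->C, B->D, C->D.
registry = {("A", "B"): "a2b", ("A", "C"): "a2c",
            ("B", "D"): "b2d", ("C", "D"): "c2d"}

def chains(src, dst):
    """All two-step compositions src -> mid -> dst in the registry."""
    return sorted((registry[(src, mid)], registry[(mid, dst)])
                  for (s, mid) in registry
                  if s == src and (mid, dst) in registry)

# No three of the four adapters can replace the fourth, so none is
# "wasted" -- but automatic composition of A->D is ambiguous:
assert chains("A", "D") == [("a2b", "b2d"), ("a2c", "c2d")]
```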
Writing each of these four adapters can IN >NO WAY be considered "wasting your time writing one", because there is no >way a set of just three out of the four can be used to produce the fourth one. Right, I agreed to this. However, the fact that a test can produce false positives does not in and of itself mean that it's "just invalid" -- the question is how often the result is *useful*. >I seems to me that you do at least feel some unease at the whole >arrangement, given that you say "this whole debate has made me even less >enamored of adaptation", as Not exactly, because my experience to date has been that false alarms are exceedingly rare and I have yet to experience a *useful* adapter diamond of the type you've described in an actual real-life interface, as opposed to a made up group of A,B,C,D. So, my unease in relation to the adapter diamond issue is only that I can't say for absolutely certain that if PEP 246 use became widespread, the problem of accidental I-to-I adapter diamonds might not become much more common than it is now. >it's not clear to me that any _other_ aspect of "this whole debate" was >quite as problematic (e.g. issues such as "how to best get a special >method from class rather than instance" -- while needing to be resolved >for adaptation just as much as for copy.py etc -- hardly seem likely to >have been the ones prompting you to go looking for "a cleaner, more >intuitive way to do it" outside of the canonical, widespread approach to OOP). No, the part that made me seek another solution is the reactions of people who were relatively fresh to the debate and the concepts of adaptation, interfaces, etc. in Python. 
The fact that virtually every single one of them immediately reached for what developers who were more "seasoned" in this concept thought of as "adapter abuse" meant to me that:

1) Any solution that relied on people doing the right thing right out of the gate wasn't going to work

2) The true "Pythonic" solution would be one that requires the least learning of interfaces, covariance, contravariance, Liskov, all that other stuff

3) And to be Pythonic, it would have to provide only one "obvious way to do it", and that way should be in some sense the "right" way, making it at least a little harder to do something silly.

In other words, I concluded that we "seasoned" developers might be right about what adaptation is supposed to be, but that our mere presentation of the ideas wasn't going to sway real users. So no, my goal wasn't to fix the adapter diamond problem per se, although I believe that the equivalent concept in duck adaptation (overlapping abstract methods) will *also* be able to trivially ignore adapter diamonds and still warn about meaningful ambiguities. Instead, the goal was to make it so that people who try to abuse adapters will quickly discover that they *can't*, and second, as soon as they ask how, the obvious answer will be, "well, that's because you're doing a type conversion. You need to create an instance of the thing you want, because you can't use that thing "as a" such-and-such." And then they will go, "Ah, yes, I see... that makes sense," and then go and sin no more. With the previous PEP, people could create all sorts of subtle problems in their code (with or without transitivity!) and have no direct indicator of a problem. Clark and Ian made me realize this with their string/file/path discussions -- *nobody* is safe from implicit adaptation if adaptation actually creates new objects with independent state! An adapter's state needs to be kept with the original object, or not at all, and most of the time "not at all" is the correct answer. 
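[The string/file hazard Phillip refers to is easy to demonstrate with a toy `adapt()` that builds a stateful adapter on every call. All names here are illustrative:]

```python
# A stateful "file" adapter for strings: the read position lives on the
# adapter, not on the original string.
class StringReader:
    def __init__(self, s):
        self.s, self.pos = s, 0
    def read(self, n):
        chunk = self.s[self.pos:self.pos + n]
        self.pos += n
        return chunk

def adapt(obj, protocol):
    # naive adapt(): builds a *fresh* adapter per call
    return StringReader(obj)

s = "hello world"
a = adapt(s, "file")
b = adapt(s, "file")   # a second, independent adaptation of the same object
a.read(5)
# b never sees a's progress -- the state went with the adapter, so code
# that re-adapts the same object silently starts over from position 0:
assert b.read(5) == "hello"
```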
>Anyway -- I'm pointing out that what to put in a rewrite of PEP 246 as a >result of all this is anything but obvious at this point, at least to me. LOL. Me either! From carribeiro at gmail.com Thu Jan 13 16:13:51 2005 From: carribeiro at gmail.com (Carlos Ribeiro) Date: Thu Jan 13 16:13:55 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5.1.1.6.0.20050113094204.03e5b180@mail.telecommunity.com> References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com> <5.1.1.6.0.20050112151418.0337cd00@mail.telecommunity.com> <49A128D8-654E-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050113094204.03e5b180@mail.telecommunity.com> Message-ID: <864d3709050113071350454789@mail.gmail.com> On Thu, 13 Jan 2005 10:08:10 -0500, Phillip J. Eby wrote: > With the previous PEP, people could create all sorts of subtle problems in > their code (with or without transitivity!) and have no direct indicator of > a problem. Clark and Ian made me realize this with their string/file/path > discussions -- *nobody* is safe from implicit adaptation if adaptation > actually creates new objects with independent state! An adapter's state > needs to be kept with the original object, or not at all, and most of the > time "not at all" is the correct answer. +1, especially for the last sentence. An adapter with local state is not an adapter anymore! It's funny how difficult it is to get this... but it's obvious once stated. -- Carlos Ribeiro Consultoria em Projetos blog: http://rascunhosrotos.blogspot.com blog: http://pythonnotes.blogspot.com mail: carribeiro@gmail.com mail: carribeiro@yahoo.com From pje at telecommunity.com Thu Jan 13 16:17:49 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Thu Jan 13 16:16:15 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <79990c6b05011302352cbd41de@mail.gmail.com> References: <41E5EFF6.9090408@colorstudy.com> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com> <41E5EFF6.9090408@colorstudy.com> Message-ID: <5.1.1.6.0.20050113100908.020fc920@mail.telecommunity.com> At 10:35 AM 1/13/05 +0000, Paul Moore wrote: >Now, a lot of the talk has referred to "implicit" adaptation. I'm >still struggling to understand how that concept applies in practice, >beyond the case of adaptation chains - at some level, all adaptation >is "explicit", insofar as it is triggered by an adapt() call. It's "implicit" in that the caller of the code that contains the adapt() call carries no visible indication that adaptation will take place. > > It's *still* not intuitively incorrect to me, but there's a couple > > things I can think of... > > > > (a) After you adapted the path to the file, and have a side-effect of > > opening a file, it's unclear who is responsible for closing it. > > (b) The file object clearly has state the path object doesn't have, like > > a file position. > > (c) You can't go adapting the path object to a file whenever you > > wanted, because of those side effects. > >In the context of my example above, I was assuming that C was an >"interface" (whatever that might be). Here, you're talking about >adapting to a file (a concrete class), which I find to be a much >muddier concept. > >This is very much a "best practices" type of issue, though. I don't >see PEP 246 mandating that you *cannot* adapt to concrete classes, but >I can see that it's a dangerous thing to do. 
> >Even the string->path adaptation could be considered suspect. Rather, >you "should" be defining an IPath *interface*, with operations such as >join, basename, and maybe open. Then, the path class would have a >trivial adaptation to IPath, and adapting a string to an IPath would >likely do so by constructing a path object from the string. From a >practical point of view, the IPath interface adds nothing over >adapting directly to the path class, but for the purposes of clarity, >documentation, separation of concepts, etc, I can see the value. This confusion was another reason for the "Duck-Typing Adaptation" proposal; it's perfectly fine to take a 'path' class and "duck-type" an interface from it: i.e. when you adapt to 'path', then if you call 'basename' on the object, you will either: 1. Invoke a method that someone has claimed is semantically equivalent to path.basename, OR 2. Get a TypeError indicating that the object you're using doesn't have such an operation available. In effect, this is the duck-typing version of a Java cast: it's more dynamic because it doesn't require you to implement all operations "up front", and also because third parties can implement the operations and add them, and because you can define abstract operations that can implement operations "in terms of" other operations. >Some mistakes are easier to avoid if you have the correct conceptual >framework. I suspect that interfaces are the conceptual framework >which make adaptation fall into place. If so, then PEP 246, and >adaptation per se, is always going to be hard to reason about for >people without a background in interfaces. Exactly, and that's a problem -- so, I think I've invented (or reinvented, one never knows) the concept of a "duck interface", that requires no special background to understand or use, because (for example) it has no inheritance except normal inheritance, and involves no "adapter classes" anywhere. 
Therefore, the reasoning you already apply to ordinary Python classes "just works". (Versus e.g. the interface-logic of Zope and PyProtocols, which is *not* ordinary Python inheritance.) >Hmm. I think I just disqualified myself from making any meaningful >comments :-) And I just requalified you. Feel free to continue commenting. :) From pje at telecommunity.com Thu Jan 13 16:26:54 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu Jan 13 16:25:20 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <20050113143421.GA39649@prometheusresearch.com> References: <79990c6b05011302352cbd41de@mail.gmail.com> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com> <41E5EFF6.9090408@colorstudy.com> <79990c6b05011302352cbd41de@mail.gmail.com> Message-ID: <5.1.1.6.0.20050113102018.020f8860@mail.telecommunity.com> At 09:34 AM 1/13/05 -0500, Clark C. Evans wrote: >On Thu, Jan 13, 2005 at 10:35:39AM +0000, Paul Moore wrote: >| One thing I feel is key is the fact that adaptation is a *tool*, and >| as such will be used in different ways by different people. That is >| not a bad thing, even if it does mean that some people will abuse the tool. >| >| Now, a lot of the talk has referred to "implicit" adaptation. I'm >| still struggling to understand how that concept applies in practice, >| beyond the case of adaptation chains - at some level, all adaptation >| is "explicit", insofar as it is triggered by an adapt() call. > >The 'implicit' adaptation refers to the automagical construction of >composite adapters assuming that a 'transitive' property holds. 
Maybe some folks are using the term that way; I use it to mean that in this code: someOb.something(aFoo) 'aFoo' may be "implicitly adapted" because the 'something' method has a type declaration on the parameter. Further, 'something' might call another method with another type declaration, passing the adapted version of 'foo', which results in you possibly getting implicit transitive adaptation *anyway*, without having intended it. Also, if adapters have per-adapter state, and 'someOb.something()' is expecting 'aFoo' to keep some state it puts there across calls to methods of 'someOb', then this code won't work correctly. All of these things are "implicit adaptation" issues, IMO, and exist even without PyProtocols-style transitivity. "Duck adaptation" solves these issues by prohibiting per-adapter state and by making adaptation order-insensitive. (I.e. adapt(adapt(a,B),C) should always produce the same result as adapt(a,C).) From mchermside at ingdirect.com Thu Jan 13 16:42:44 2005 From: mchermside at ingdirect.com (Chermside, Michael) Date: Thu Jan 13 16:42:50 2005 Subject: [Python-Dev] Re: PEP 246: LiskovViolation as a name Message-ID: <0CFFADBB825C6249A26FDF11C1772AE101F6861E@ingdexj1.ingdirect.com> Phillip writes: > IMO, it's simpler to handle this use case by letting __conform__ return > None, since this allows people to follow the One Obvious Way to not conform > to a particular protocol. > > Then, there isn't a need to even worry about the exception name in the > first place, either... +1. Writing a default __conform__ for object is reasonable. Alex writes: > I'd rather not make a change to built-in ``object'' a prereq for PEP 246 Why not? Seems quite reasonable. Before __conform__ existed, there wasn't one for object; now that it exists, object needs one. -- Michael Chermside This email may contain confidential or privileged information. 
If you believe you have received the message in error, please notify the sender and delete the message without copying or disclosing it. From ncoghlan at iinet.net.au Thu Jan 13 17:03:57 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Thu Jan 13 17:04:02 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <864d3709050113071350454789@mail.gmail.com> References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com> <5.1.1.6.0.20050112151418.0337cd00@mail.telecommunity.com> <49A128D8-654E-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050113094204.03e5b180@mail.telecommunity.com> <864d3709050113071350454789@mail.gmail.com> Message-ID: <41E69BED.9050508@iinet.net.au> Carlos Ribeiro wrote: > On Thu, 13 Jan 2005 10:08:10 -0500, Phillip J. Eby > wrote: > >>With the previous PEP, people could create all sorts of subtle problems in >>their code (with or without transitivity!) and have no direct indicator of >>a problem. Clark and Ian made me realize this with their string/file/path >>discussions -- *nobody* is safe from implicit adaptation if adaptation >>actually creates new objects with independent state! An adapter's state >>needs to be kept with the original object, or not at all, and most of the >>time "not at all" is the correct answer. > > > +1, specially for the last sentence. An adapter with local state is > not an adapter anymore! It's funny how difficult it's to get this... > but it's obvious once stated. +lots Now that it's been stated, I think this is similar to where implicit type conversions in C++ go wrong, and to the extent that PEP 246 aligns with those. . . *shudder*. I've also learned from this discussion just how wrong my own ideas about how to safely use adaptation were. 
Most Python programmers aren't going to have the benefit of listening to some smart people work through the various issues in public. Cheers, Nick. -- Nick Coghlan | ncoghlan@email.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From p.f.moore at gmail.com Thu Jan 13 17:19:20 2005 From: p.f.moore at gmail.com (Paul Moore) Date: Thu Jan 13 17:19:24 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5.1.1.6.0.20050113102018.020f8860@mail.telecommunity.com> References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com> <41E5EFF6.9090408@colorstudy.com> <79990c6b05011302352cbd41de@mail.gmail.com> <20050113143421.GA39649@prometheusresearch.com> <5.1.1.6.0.20050113102018.020f8860@mail.telecommunity.com> Message-ID: <79990c6b050113081924bcf274@mail.gmail.com> On Thu, 13 Jan 2005 10:26:54 -0500, Phillip J. Eby wrote: > At 09:34 AM 1/13/05 -0500, Clark C. Evans wrote: > >On Thu, Jan 13, 2005 at 10:35:39AM +0000, Paul Moore wrote: > >| One thing I feel is key is the fact that adaptation is a *tool*, and > >| as such will be used in different ways by different people. That is > >| not a bad thing, even if it does mean that some people will abuse the tool. > >| > >| Now, a lot of the talk has referred to "implicit" adaptation. I'm > >| still struggling to understand how that concept applies in practice, > >| beyond the case of adaptation chains - at some level, all adaptation > >| is "explicit", insofar as it is triggered by an adapt() call. > > > >The 'implicit' adaptation refers to the automagical construction of > >composite adapters assuming that a 'transitive' property holds. 
> > Maybe some folks are using the term that way; I use it to mean that in this > code: > > someOb.something(aFoo) > > 'aFoo' may be "implicitly adapted" because the 'something' method has a > type declaration on the parameter. Whoa! At this point in time, parameters do not have type declarations, and PEP 246 does NOTHING to change that. In terms of Python *now* you are saying that if someOb.something is defined like so: def someOb.something(f): adapted_f = adapt(f, ISomethingOrOther) then aFoo is being "implicitly adapted". I'm sorry, but this seems to me to be a completely bogus argument. The caller of someOb.something has no right to know what goes on internal to the method. I would assume that the documented interface of someOb.something would state that its parameter "must be adaptable to ISomethingOrOther" - but there's nothing implicit going on here, beyond the entirely sensible presumption that if a method requires an argument to be adaptable, it's because it plans on adapting it! And as a user of someOb.something, I would be *entirely* comfortable with being given the responsibility of ensuring that relevant adaptations exist. > Further, 'something' might call another method with another type > declaration, passing the adapted version of 'foo', which results in you > possibly getting implicit transitive adaptation *anyway*, without having > intended it. So you think it's reasonable for someOb.something to pass adapted_f on to another function? I don't - it should pass f on. OK, that's a major disadvantage of Guido's type declaration proposal - the loss of the original object - but raise that with Guido, not with PEP 246. I'd suspect that one answer from the POV of Guido's proposal would be a way of introspecting the original object - but even that would be horribly dangerous because of the significant change in semantics of an argument when a type declaration is added. 
(This may kill type-declaration-as-adaptation, so it's certainly serious, but *not* in terms of PEP 246). > Also, if adapters have per-adapter state, and 'someOb.something()' is > expecting 'aFoo' to keep some state it puts there across calls to methods > of 'someOb', then this code won't work correctly. That one makes my brain hurt. But you've already said that per-adapter state is bad, so maybe the pain is good for me :-) > All of these things are "implicit adaptation" issues, IMO, and exist even > withoutPyProtocols-style transitivity. I'd certainly not characterise them as "implicit adaptation" issues (except for the type declaration one, which doesn't apply to Python as it is now), and personally (as I hope I've explained above) I don't see them as PEP 246 issues, either. > ... "Duck adaptation" solves these issues ... ... and may indeed be a better way of using Guido's type declarations than making them define implicit adaptation. So I do support a separate PEP for this. But I suspect its implementation timescale will be longer than that of PEP 246... Paul. From cce at clarkevans.com Thu Jan 13 17:57:02 2005 From: cce at clarkevans.com (Clark C. Evans) Date: Thu Jan 13 17:57:04 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <41E69BED.9050508@iinet.net.au> References: <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com> <5.1.1.6.0.20050112151418.0337cd00@mail.telecommunity.com> <49A128D8-654E-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050113094204.03e5b180@mail.telecommunity.com> <864d3709050113071350454789@mail.gmail.com> <41E69BED.9050508@iinet.net.au> Message-ID: <20050113165701.GC14084@prometheusresearch.com> On Fri, Jan 14, 2005 at 02:03:57AM +1000, Nick Coghlan wrote: | Carlos Ribeiro wrote: | > On Thu, 13 Jan 2005 10:08:10 -0500, Phillip J. 
Eby wrote: | > > With the previous PEP, people could create all sorts of subtle problems | > > in their code (with or without transitivity!) and have no direct | > > indicator of a problem. Clark and Ian made me realize this with their | > > string/file/path discussions -- *nobody* is safe from implicit | > > adaptation if adaptation actually creates new objects with independent | > > state! An adapter's state needs to be kept with the original object, | > > or not at all, and most of the time "not at all" is the correct answer. | > | >+1, especially for the last sentence. An adapter with local state is | >not an adapter anymore! It's funny how difficult it is to get this... | >but it's obvious once stated. | | +lots -1 There is nothing wrong with an adapter from String to File, one which adds the current read position in its state. No adapter is a perfect translation -- or you wouldn't need them in the first place. An adapter, by default, does just that: it wraps the object to make it compliant with another interface. To disallow it from having local state is like taking the wheels off a car and expecting it to be useful (in some cases it is, for ice fishing, but that's another and rather oblique story). Ian stated the issue properly: adapters bring with them an intent, which cannot, in a general way, be expressed in code. Therefore, combining adapters haphazardly will, of course, get you into trouble. The solution is simple -- don't do that. PEP 246 should at the very least remain silent on this issue; it should not encourage or specify automagic transitive adaptation. If a user blows their foot off, by their own actions, they will be able to track it down and learn; if the system shoots their foot off, by some automatic transitive adaptation, well, that's another issue. | Now that it's been stated, I think this is similar to where implicit type | conversions in C++ go wrong, and to the extent that PEP 246 aligns with | those. . . *shudder*. 
PEP 246 doesn't align at all with this problem. Best, Clark -- Clark C. Evans Prometheus Research, LLC. http://www.prometheusresearch.com/ o office: +1.203.777.2550 ~/ , mobile: +1.203.444.0557 // (( Prometheus Research: Transforming Data Into Knowledge \\ , \/ - Research Exchange Database /\ - Survey & Assessment Technologies ` \ - Software Tools for Researchers ~ * From gvanrossum at gmail.com Thu Jan 13 18:02:10 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu Jan 13 18:02:12 2005 Subject: getting special from type, not instance (was Re: [Python-Dev] copy confusion) In-Reply-To: <20050113101633.GA5193@vicky.ecs.soton.ac.uk> References: <5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com> <83E2B593-6470-11D9-ADA4-000A95EFAE9E@aleax.it> <20050113101633.GA5193@vicky.ecs.soton.ac.uk> Message-ID: > > The descriptor for __getattr__ and other special attributes could > > claim to be a "data descriptor" > > This has the nice effect that x[y] and x.__getitem__(y) would again be > equivalent, which looks good. > > On the other hand, I fear that if there is a standard "metamethod" decorator > (named after Phillip's one), it will be misused. Reading the documentation > will probably leave most programmers with the feeling "it's something magical > to put on methods with __ in their names", and it won't be long before someone > notices that you can put this decorator everywhere in your classes (because it > won't break most programs) and gain a tiny performance improvement. > > I guess that a name-based hack in type_new() to turn all __*__() methods into > data descriptors would be even more obscure? To the contrary, I just realized that this would in fact be the right approach. In particular, any *descriptor* named __*__ would be considered a "data descriptor". Non-descriptors with such names can still be overridden in the instance __dict__ (I believe this is used by Zope).
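The distinction Guido is relying on is standard descriptor behavior: a data descriptor (one defining __set__) on the type takes precedence over the instance __dict__, while a non-data descriptor (only __get__) can be shadowed by it. A minimal sketch with today's semantics (class names are illustrative):

```python
class NonDataDescr:
    """__get__ only: an instance __dict__ entry shadows it."""
    def __get__(self, obj, objtype=None):
        return "from class"

class DataDescr:
    """__get__ plus __set__: it takes precedence over the instance __dict__."""
    def __get__(self, obj, objtype=None):
        return "from class"
    def __set__(self, obj, value):
        raise AttributeError("read-only")

class C:
    plain = NonDataDescr()
    guarded = DataDescr()

c = C()
# Write into the instance __dict__ directly, bypassing __set__:
c.__dict__["plain"] = "from instance"
c.__dict__["guarded"] = "from instance"

print(c.plain)    # from instance -- non-data descriptor is shadowed
print(c.guarded)  # from class -- data descriptor wins
```

Treating every descriptor named `__*__` as a data descriptor would thus make `x[y]` and `x.__getitem__(y)` agree again, while plain callables stuffed into the instance __dict__ would remain overridable.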
> Finally, I wonder if turning all methods whatsoever into data descriptors > (ouch! don't hit!) would be justifiable by the feeling that it's often bad > style and confusing to override a method in an instance (as opposed to > defining a method in an instance when there is none on the class). > (Supporting this claim: Psyco makes this simplifying hypothesis for performance > reasons and I haven't yet seen a bug report for this.) Alas, it's a documented feature that you can override a (regular) method by placing an appropriate callable in the instance __dict__. > In all cases, I'm +1 on seeing built-in method objects (PyMethodDescr_Type) > become data descriptors ("classy descriptors?" :-). Let's do override descriptors. And please, someone fix copy.py in 2.3 and 2.4. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From cce at clarkevans.com Thu Jan 13 18:09:15 2005 From: cce at clarkevans.com (Clark C. Evans) Date: Thu Jan 13 18:09:18 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <41E685F9.2010606@iinet.net.au> References: <5.1.1.6.0.20050112140929.03d52e30@mail.telecommunity.com> <5.1.1.6.0.20050112140929.03d52e30@mail.telecommunity.com> <5.1.1.6.0.20050112150933.03378630@mail.telecommunity.com> <41E685F9.2010606@iinet.net.au> Message-ID: <20050113170915.GD14084@prometheusresearch.com> On Fri, Jan 14, 2005 at 12:30:17AM +1000, Nick Coghlan wrote: | Anyway, I'd like to know if the consensus I think you've reached is the | one the pair of you think you've reached :) This stated position is not the current PEP status, and it's not a consensus position I share.
| That is, with A being our starting class, C being a target class, and F | being a target interface, the legal adaptation chains are: | # Class to class | A->C | # Class to interface, possibly via other interfaces | A(->F)*->F PEP 246 should not talk about legal or illegal adaptation chains; all adaptation chains should be explicit, and if they are explicit, the programmer who specified them has made them legal. | With a lookup sequence of: | 1. Check the global registry for direct adaptations | 2. Ask the object via __conform__ | 3a. Check using isinstance() unless 2 raised LiskovViolation | 3b. Nothing, since object.__conform__ does an isinstance() check | 4. Ask the interface via __adapt__ These are OK up to 4. | 5. Look for transitive chains of interfaces in the global registry. No! No! No! Perhaps... 5. Raise an AdaptationFailed error, which includes the protocol which is being asked for. This error message _could_ also include a list of possible adaptation chains from the global registry, but this is just a suggestion. | 3a & 3b are the current differing answers to the question of who should | be checking for inheritance - the adaptation machinery or the __conform__ | method. Correct. I think either method is OK, and prefer Phillip's approach. Best, Clark -- Clark C. Evans Prometheus Research, LLC.
http://www.prometheusresearch.com/ o office: +1.203.777.2550 ~/ , mobile: +1.203.444.0557 // (( Prometheus Research: Transforming Data Into Knowledge \\ , \/ - Research Exchange Database /\ - Survey & Assessment Technologies ` \ - Software Tools for Researchers ~ * From aleax at aleax.it Thu Jan 13 18:27:08 2005 From: aleax at aleax.it (Alex Martelli) Date: Thu Jan 13 18:27:14 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <20050113143421.GA39649@prometheusresearch.com> References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com> <41E5EFF6.9090408@colorstudy.com> <79990c6b05011302352cbd41de@mail.gmail.com> <20050113143421.GA39649@prometheusresearch.com> Message-ID: <5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it> On 2005 Jan 13, at 15:34, Clark C. Evans wrote: ... > The 'implicit' adaptation refers to the automagical construction of > composite adapters assuming that a 'transitive' property holds. I've > seen nothing in this thread to explain why this is so valuable, why Let me play devil's advocate: I _have_ seen explanations of why transitive adaptation can be convenient -- the most direct one being an example by Guido which came in two parts, the second one a clarification which came in response to my request about the first one. To summarize it: say we have N concrete classes A1, A2, ... AN which all implement interface I. Now we want to integrate into the system function f1, which requires an argument with interface J1, i.e. def f1(x): x = adapt(x, J1) ... or in Guido's new notation equivalently def f1(x: J1): ... and also f2, ..., fM, requiring an argument with interface J2, ..., JM respectively.
Without transitivity, we need to code and register M*N adapters. WITH transitivity, we only need M: I->J1, I->J2, ..., I->JM. The convenience of this is undeniable; and (all other things being equal) convenience raises productivity and thus is valuable. James Knight gave a real-life example, although, since no combinatorial explosion was involved, the extra convenience that he missed in transitivity was minor compared to the potential for it when the N*M issue should arise. > it shouldn't be explicit, On this point I'm partly with you: I do not see any real loss of convenience in requiring that an adapter which is so perfect and lossless as to be usable in transitivity chains be explicitly so registered/defined/marked. E.g., provide a registerAdapter_TRANSITIVITY_SUITABLE(X, Y) entry in addition to the standard registerAdapter which does not supply transitivity (or equivalently an optional suitable_for_transitivity argument to registerAdapter defaulting to False, etc, etc). In terms of "should" as opposed to convenience, though, the argument is that interface to interface adapters SHOULD always, inherently be suitable for transitive chains because there is NO reason, EVER, under ANY circumstances, to have such adapters be less than perfect, lossless, noiseless, etc, etc. I am not entirely convinced of this but "devil's advocacy wise" I could probably argue for it: for the hapless programmers' own good, they should be forced to think very VERY carefully about what they're doing, etc, etc. Yeah, I know, I don't sound convincing because I'm not all that convinced myself;-).
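The M*N argument above can be sketched concretely. Nothing below is PEP 246 API -- `register`, `adapt`, and the string "interface" tokens are illustrative inventions -- but it shows how N class->I registrations plus M I->Jk registrations cover the whole N*M grid via one-hop composition:

```python
# Toy registry: (source, target) -> adapter callable.  With transitivity,
# A1..AN each register once to I, and I registers once to each of J1..JM.
_adapters = {}

def register(source, target, adapter):
    _adapters[(source, target)] = adapter

def adapt(obj, target):
    key = (type(obj), target)
    if key in _adapters:                       # direct adapter, if any
        return _adapters[key](obj)
    # one-hop transitive chain: type(obj) -> mid -> target
    for (source, mid), first in _adapters.items():
        if source is type(obj) and (mid, target) in _adapters:
            return _adapters[(mid, target)](first(obj))
    raise TypeError("cannot adapt %r to %r" % (type(obj).__name__, target))

class A1:                                      # one of the N concrete classes
    def __init__(self, n):
        self.n = n

register(A1, "I", lambda a: {"value": a.n})            # class -> interface
register("I", "J1", lambda i: "J1(%d)" % i["value"])   # interface -> interface

print(adapt(A1(7), "J1"))  # J1(7) -- composed A1->I->J1, never registered directly
```

Whether such chains should be built automatically (as here) or only after an explicit opt-in is exactly the disagreement in this thread.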
> and on the contrary, most of the "problems > with adapt()" seem to stem from this aggressive extension of what > was proposed: Automatic construction of adapter chains is _not_ part Fair enough, except that it's not just chains of explicitly registered adapters: interface inheritance has just the same issues, indeed, in PJE's experience, MORE so, because no code is interposed -- if by inheriting an interface you're asserting 100% no-problem substitutability, the resulting "transitivity" may thus well give problems (PJE and I even came close to agreeing that MS COM's QueryInterface idea that interface inheritance does NOT implicitly and undeniably assert substitutability is very practical, nice, and usable...). > of the original PEP 246 and I hope it remains that way. I've > outlined in several posts how this case could be made easy for an > application developer to do: > > - transitive adapters should always be explicit What about "registered explicitly as being suitable for transitivity", would that suffice? > - it should be an error to have more than one adapter > from A to Z in the registry OK, I think. There has to be a way to replace an adapter with another, I think, but it might be fair to require that this be done in two steps: unregister the old adapter, THEN immediately register the new one so that trying to register an adapter for some A->Z pair which already has one is an error. Replacing adapters feels like a rare enough operation that the tiny inconvenience should not be a problem, it appears to me. > - when adaptation fails, an informative error message can > tell the application developer of possible "chains" > which could work Nice, if not too much work. > - registration of transitive adapters can be a simple command > application developers use: adapt.transitive(from=A,to=Z,via=M) > error message can tell an application developer OK, convenient if feasible.
Considering all of your proposals, I'm wondering: would you be willing to author the next needed draft of the PEP in the minimal spirit I was proposing? Since I wrote the last round, you might do a better job of editing the next, and if more rounds are needed we could keep alternating (perhaps using private mail to exchange partially edited drafts, too)... > The 'registry' idea (which was not explored in the PEP) emerges from > the need, albeit limited, for the application developer who is > plugging a component into a framework, to have some say in the > process. I think that any actions taken by the user, by registering > an adapter, should be explicit. Jim Fulton is VERY keen to have registration of adapters happen "behind the scenes" at startup, starting from some kind of declarative form (a configuration textfile or the like), given the deployment needs of Zope3 -- that shouldn't be a problem, it seems to me (if we have a way to do explicit registrations, zope3 can have a startup component that finds configuration files and does the registration calls based on that 'declarative' form). This potentially opens the door to N-players scenarios for N>3, but, like going from 3-tier to N-tier applications, that's not as huge a jump as that from N==2 to N==3;-). > | BUT, and again, Philip has made this point, I can't reason about > | interfaces in the context of PEP 246, because interfaces aren't > | defined there. So PEP 246 can't make a clear statement about > | transitivity, precisely because it doesn't define interfaces. But > does > | this harm PEP 246? I'm not sure. > > Well, PEP 246 should be edited, IMHO, to assert that all 'implicit' > adaptations are out-of-scope, and if they are supported should be done > so under the direct control of the application developer. So, are you willing to do that round of editing to PEP 246...? I'll then do the NEXT one which will still undoubtedly be needed...
Alex From aleax at aleax.it Thu Jan 13 18:35:52 2005 From: aleax at aleax.it (Alex Martelli) Date: Thu Jan 13 18:35:57 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5.1.1.6.0.20050113094204.03e5b180@mail.telecommunity.com> References: <5.1.1.6.0.20050112151418.0337cd00@mail.telecommunity.com> <5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com> <5.1.1.6.0.20050112151418.0337cd00@mail.telecommunity.com> <5.1.1.6.0.20050113094204.03e5b180@mail.telecommunity.com> Message-ID: <928A1C83-6589-11D9-ADA4-000A95EFAE9E@aleax.it> On 2005 Jan 13, at 16:08, Phillip J. Eby wrote: > this with their string/file/path discussions -- *nobody* is safe from > implicit adaptation if adaptation actually creates new objects with > independent state! An adapter's state needs to be kept with the > original object, or not at all, and most of the time "not at all" is > the correct answer. So, no way to wrap a str with a StringIO to adapt to "IReadableFile"...? Ouch:-( That was one of my favourite trivial use cases... >> Anyway -- I'm pointing out that what to put in a rewrite of PEP 246 >> as a result of all this is anything but obvious at this point, at >> least to me. > > LOL. Me either! ...so let's hope Clark has clearer ideas, as it appears he does: as per a previous msg, I've asked him if he could be the one doing the next round of edits, instead of me... Alex From pje at telecommunity.com Thu Jan 13 18:38:46 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Thu Jan 13 18:37:13 2005 Subject: getting special from type, not instance (was Re: [Python-Dev] copy confusion) In-Reply-To: References: <20050113101633.GA5193@vicky.ecs.soton.ac.uk> <5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com> <83E2B593-6470-11D9-ADA4-000A95EFAE9E@aleax.it> <20050113101633.GA5193@vicky.ecs.soton.ac.uk> Message-ID: <5.1.1.6.0.20050113123715.030839b0@mail.telecommunity.com> At 09:02 AM 1/13/05 -0800, Guido van Rossum wrote: >[Armin] > > I guess that a name-based hack in type_new() to turn all __*__() > methods into > > data descriptors would be even more obscure? > >To the contrary, I just realized that this would in fact be the right >approach. In particular, any *descriptor* named __*__ would be >considered a "data descriptor". Non-descriptors with such names can >still be overridden in the instance __dict__ (I believe this is used >by Zope). It should check that the __*__-named thing isn't *already* an override descriptor, though. From aleax at aleax.it Thu Jan 13 18:43:16 2005 From: aleax at aleax.it (Alex Martelli) Date: Thu Jan 13 18:43:24 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <864d3709050113071350454789@mail.gmail.com> References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com> <5.1.1.6.0.20050112151418.0337cd00@mail.telecommunity.com> <49A128D8-654E-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050113094204.03e5b180@mail.telecommunity.com> <864d3709050113071350454789@mail.gmail.com> Message-ID: <9AD5E532-658A-11D9-ADA4-000A95EFAE9E@aleax.it> On 2005 Jan 13, at 16:13, Carlos Ribeiro wrote: ... > +1, especially for the last sentence. An adapter with local state is > not an adapter anymore! It's funny how difficult it is to get this... > but it's obvious once stated. ...?
A StringIO instance adapting a string to be used as a readable file is not an adapter?! It's definitely a pristine example of the Adapter Design Pattern (per GoF), anyway... and partly because of that I think it SHOULD be just fine as an ``adapter''... honestly I fail to see what's wrong with the poor StringIO instance keeping the "we have read as far as HERE" index as its "local state" (imagine a readonlyStringIO if you want, just to make for simpler concerns). Or, consider a View in a Model-View-Controller arrangement; can't we get THAT as an adapter either, because (while getting most data from the Model) it must still record some few presentation-only details locally, such as, say, the font to use? I'm not sure I'm all that enthusiastic about this crucial aspect of PJE's new "more pythonic than Python" [r]evolution, if it's being presented correctly here. Alex From cce at clarkevans.com Thu Jan 13 18:46:43 2005 From: cce at clarkevans.com (Clark C. Evans) Date: Thu Jan 13 18:46:45 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <20050112195711.GA1813@prometheusresearch.com> Message-ID: <20050113174643.GB35655@prometheusresearch.com> On Wed, Jan 12, 2005 at 01:15:20PM -0800, Guido van Rossum wrote: | [Clark] | > - add a flag to adapt, allowTransitive, which defaults to False | | That wouldn't work very well when most adapt() calls are invoked | implicitly through signature declarations (per my blog's proposal). Understood. This was a side-suggestion -- not the main thrust of my response. I'm writing to convince you that automatic "combined" adaptation, even as a last resort, is a bad idea.
It should be manual, but we can provide mechanisms that make it easy for application developers to specify combined adapters. On Wed, Jan 12, 2005 at 02:57:11PM -0500, Clark C. Evans wrote: | On Wed, Jan 12, 2005 at 10:16:14AM -0800, Guido van Rossum wrote: | | But now, since I am still in favor of automatic "combined" adaptation | | *as a last resort* A few problems with automatic "combined" adaptation: 1. Handling the case of multiple adaptation pathways is one issue; how do you choose? There isn't a good cost algorithm since the goodness of an adapter depends largely on the programmer's need. 2. Importing or commenting out the import of a module that may seem to have little bearing on a given chunk of code could cause subtle changes in behavior or adaptation errors, as a new path becomes available, or a previously working path is disabled. 3. The technique causes people to want to say what is and isn't an adapter -- when this choice should be solely up to the appropriate developers. I'd rather not have to standardize that FileName -> File is a _bad_ adaptation, but File -> String is a good adaptation. Or whatever is in vogue that year. 4. It's overly complicated for what it does. I assert that this is a very minor use case. When transitive adaptation is needed, an explicit registration of an adapter can be made simple. My current suggestion to make 'transitive adaptation' easy for an application builder (one putting together components) has a few small parts: - If an adaptation is not found, raise an error, but list in the error message two additional things: (a) what possible adaptation paths exist, and (b) how to register one of these paths in their module. - A simple method to register an adaptation path; the error message above can even give the exact line needed, adapt.registerPath(from=A,to=C,via=B) - Make it an error to register more than one adapter from A to C, so that conflicts can be detected.
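Clark's three suggestions compose naturally. A hedged sketch -- `register_path` and the string tokens are hypothetical names, not anything PEP 246 defines -- of explicit path registration, duplicate-registration errors, and a failure message that spells out the exact opt-in line:

```python
# Hypothetical sketch of the suggestions above: adaptation paths must be
# registered explicitly, duplicates are rejected, and a failed adapt()
# names the one-hop chains the developer could choose to register.

_direct = {}  # (source, target) -> adapter callable
_paths = {}   # (source, target) -> tuple of (source, target) hops

def register(source, target, adapter):
    if (source, target) in _direct:
        raise ValueError("adapter %s->%s already registered" % (source, target))
    _direct[(source, target)] = adapter

def register_path(*hops):
    # hops are (source, target) pairs already registered in _direct
    _paths[(hops[0][0], hops[-1][1])] = hops

def adapt(obj, source, target):
    if (source, target) in _direct:
        return _direct[(source, target)](obj)
    if (source, target) in _paths:
        for hop in _paths[(source, target)]:
            obj = _direct[hop](obj)
        return obj
    hints = ["register_path((%s, %s), (%s, %s))" % (source, mid, mid, target)
             for (s, mid) in _direct if s == source and (mid, target) in _direct]
    raise TypeError("cannot adapt %s to %s; possible paths:\n  %s"
                    % (source, target, "\n  ".join(hints)))

register("A1", "I1", lambda s: s + ">I1")
register("I1", "J1", lambda s: s + ">J1")

try:
    adapt("obj", "A1", "J1")     # no explicit path yet: fails loudly...
except TypeError as exc:
    print(exc)                   # ...and suggests register_path((A1, I1), (I1, J1))

register_path(("A1", "I1"), ("I1", "J1"))   # the developer opts in
print(adapt("obj", "A1", "J1"))             # obj>I1>J1
```

The key property is that the chain only exists once a human has approved it; the machinery merely makes the approval a one-liner.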
Also, registrations could be 'module specific', or local, so that adapters used by a library need not necessarily be global. In general, I think registries suffer all sorts of namespace and scoping issues, which is why I had proposed __conform__ and __adapt__. Extending the registry mechanism with automatic 'transitive' adapters makes things even worse. Cheers, Clark -- Clark C. Evans Prometheus Research, LLC. http://www.prometheusresearch.com/ o office: +1.203.777.2550 ~/ , mobile: +1.203.444.0557 // (( Prometheus Research: Transforming Data Into Knowledge \\ , \/ - Research Exchange Database /\ - Survey & Assessment Technologies ` \ - Software Tools for Researchers ~ * From aleax at aleax.it Thu Jan 13 18:58:36 2005 From: aleax at aleax.it (Alex Martelli) Date: Thu Jan 13 18:58:43 2005 Subject: getting special from type, not instance (was Re: [Python-Dev] copy confusion) In-Reply-To: References: <5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com> <83E2B593-6470-11D9-ADA4-000A95EFAE9E@aleax.it> <20050113101633.GA5193@vicky.ecs.soton.ac.uk> Message-ID: On 2005 Jan 13, at 18:02, Guido van Rossum wrote: ... >> In all cases, I'm +1 on seeing built-in method objects >> (PyMethodDescr_Type) >> become data descriptors ("classy descriptors?" :-). > > Let's do override descriptors. A Pronouncement!!! > And please, someone fix copy.py in 2.3 and 2.4. Sure -- what way, though? The way I proposed in my last post about it? Alex From cce at clarkevans.com Thu Jan 13 19:21:42 2005 From: cce at clarkevans.com (Clark C.
Evans) Date: Thu Jan 13 19:21:46 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it> References: <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com> <41E5EFF6.9090408@colorstudy.com> <79990c6b05011302352cbd41de@mail.gmail.com> <20050113143421.GA39649@prometheusresearch.com> <5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: <20050113182142.GC35655@prometheusresearch.com> On Thu, Jan 13, 2005 at 06:27:08PM +0100, Alex Martelli wrote: | >The 'implicit' adaptation refers to the automagical construction of | >composite adapters assuming that a 'transitive' property holds. I've | >seen nothing in this thread to explain why this is so valuable, why | | Let me play devil's advocate: I _have_ seen explanations of why | transitive adaptation can be convenient -- the most direct one being an | example by Guido which came in two parts, the second one a | clarification which came in response to my request about the first one. hypothetical pseudocode ;) | To summarize it: say we have N concrete classes A1, A2, ... AN which | all implement interface I. | Now we want to integrate into the system function f1, which requires an | argument with interface J1, i.e. | def f1(x): | x = adapt(x, J1) | ... | or in Guido's new notation equivalently | def f1(x: J1): | ... | and also f2, ..., fM, requiring an argument with interface J2, ..., JM | respectively. | | Without transitivity, we need to code and register M*N adapters. Are you _sure_ you have M*N adapters here? But even so, for j in (J1, J2, J3, J4, ..., JM): for i in (I1, I2, ..., IN): register(j, i) | WITH transitivity, we only need M: I->J1, I->J2, ..., I->JM.
Without transitivity, a given programmer, in a given module, will probably only use a few of these permutations; and in each case, you can argue that the developer should be aware of the 'automatic' conversions that are going on. Imagine an application developer plugging a component into a framework and getting this error: """Adaptation Error Could not convert A1 to a J1. There are two adaptation pathways which you could register to do this conversion for you: # A1->I1 followed by I1->J1 adapt.registerPath((A1,I1),(I1,J1)) # A1->X3 followed by X3 -> PQ followed by PQ -> J1 adapt.registerPath((A1,X3),(X3,PQ),(PQ,J1)) """ The other issue with registries (and why I avoided them in the original PEP) is that they often require a scoping; in this case, the path taken by one module might be different from the one needed by another. | The convenience of this is undeniable; and (all other things being | equal) convenience raises productivity and thus is valuable. It also hides assumptions. If you are doing adaptation paths | James Knight gave a real-life example, although, since no combinatorial | explosion was involved, the extra convenience that he missed in | transitivity was minor compared to the potential for it when the N*M | issue should arise. Right. And that's more like it. | >it shouldn't be explicit, | | On this point I'm partly with you: I do not see any real loss of | convenience in requiring that an adapter which is so perfect and | lossless as to be usable in transitivity chains be explicitly so | registered/defined/marked. E.g., provide a | registerAdapter_TRANSITIVITY_SUITABLE(X, Y) | entry in addition to the standard registerAdapter which does not supply | transitivity (or equivalently an optional suitable_for_transitivity | argument to registerAdapter defaulting to False, etc, etc). Ok.
I just think you all are solving a problem that doesn't exist, and in the process hurting the more common use case: A component developer X and a framework developer Y both have stuff that an application developer A is putting together. The goal is for A to not worry about _how_ the components and the framework fit; to automatically "find" the glue code. The assertion that you can layer glue... is, well, tenuous at best. | In terms of "should" as opposed to convenience, though, the argument is | that interface to interface adapters SHOULD always, inherently be | suitable for transitive chains because there is NO reason, EVER, under | ANY circumstances, to have such adapters be less than perfect, | lossless, noiseless, etc, etc. I strongly disagree; the most useful adapters are the ones that discard unneeded information. The big picture above, where you're plugging components into the framework, will in most cases be lossy -- or the frameworks / components would be identical and you wouldn't want to hook them up. Frankly, I think the whole idea of "perfect adapters" is just, well, arrogant. | > and on the contrary, most of the "problems | >with adapt()" seem to stem from this aggressive extension of what | >was proposed: Automatic construction of adapter chains is _not_ part | | Fair enough, except that it's not just chains of explicitly registered | adapters: interface inheritance has just the same issues, indeed, in | PJE's experience, MORE so, because no code is interposed -- if by | inheriting an interface you're asserting 100% no-problem | substitutability, the resulting "transitivity" may thus well give | problems (PJE and I even came close to agreeing that MS COM's | QueryInterface idea that interface inheritance does NOT implicitly and | undeniably assert substitutability is very practical, nice, and | usable...). | | >of the original PEP 246 and I hope it remains that way.
I've | >outlined in several posts how this case could be made easy for an | >application developer to do: | > | > - transitive adapters should always be explicit | | What about "registered explicitly as being suitable for transitivity", | would that suffice? I suppose so. But I think it is a bad idea for a few reasons: 1. it seems to add complexity without a real-world justification; let's go without it, and add it in a later version if it turns out to be as valuable as people think 2. different adapters have different intents, and I think that while a given adapter may be perfect in one situation, it may royally screw up in another; users of systems often break interfaces to meet immediate needs. In your strawman I can think of several such twists-and-turns that an "obviously perfect" adapter would fail to handle: - In the 'structure' variety (where the middle name is not necessarily placed in the middle), someone decides to store one's title... because, well, the slot is there and they need to store this information - In the 'ordered' variety, "John P. Smith", you might have "Murata Makoto". If you thought Makoto was the last name... you'd be wrong. In short, unless a human is giving the 'ok' to an adapter's use, be it the application, framework, or component developer, then I'd expect wacko bugs.
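The "Murata Makoto" trap above can be made concrete. In this hedged sketch (OrderedName and both adapter functions are inventions for illustration, not anything proposed in the thread), two equally reasonable adapters from the same class to the same "family name" target encode different human intents, so no registry could safely pick between them:

```python
# Two plausible adapters disagree because each encodes an intent the
# code itself cannot express -- exactly why haphazard chaining is risky.

class OrderedName:
    def __init__(self, text):
        self.parts = text.split()

def western_family_name(name):
    # intent: the LAST token is the family name ("John P. Smith" -> Smith)
    return name.parts[-1]

def family_first_family_name(name):
    # intent: the FIRST token is the family name ("Murata Makoto" -> Murata)
    return name.parts[0]

smith = OrderedName("John P. Smith")
makoto = OrderedName("Murata Makoto")

print(western_family_name(smith))        # Smith
print(western_family_name(makoto))       # Makoto -- plausible, but wrong
print(family_first_family_name(makoto))  # Murata
```

Both adapters type-check, both "work", and only a human who knows which convention the data follows can approve the right one.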
| | Jim Fulton is VERY keen to have registration of adapters happen "behind | the scenes" at startup, starting from some kind of declarative form (a | configuration textfile or the like), given the deployment needs of | Zope3 -- that shouldn't be a problem, it seems to me (if we have a way | to do explicit registrations, zope3 can have a startup component that | finds configuration files and does the registration calls based on that | 'declarative' form). That's fine. | This potentially opens the door to N-players scenarios for N>3, but, | like going from 3-tier to N-tier applications, that's not as huge a | jump as that from N==2 to N==3;-). The problem with registries is that oftentimes a scope is needed; just because my module wants to use this adaptation path doesn't mean your module will make the same choice. I avoided registries in the first pass of the draft to avoid this issue. So, if we are going to add registries, then namespaces for the registries need to also be discussed. | So, are you willing to do that round of editing to PEP 246...? I'll | then do the NEXT one which will still undoubtedly be needed... I could take a whack at it this weekend. Best, clark From Scott.Daniels at Acm.Org Thu Jan 13 19:51:17 2005 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Thu Jan 13 19:49:56 2005 Subject: [Python-Dev] Re: Recent IBM Patent releases In-Reply-To: References: Message-ID: Terry Reedy wrote: > "Scott David Daniels" >>I believe our current policy is that the author warrants that the code >>is his/her own work and not encumbered by any patent. > > Without a qualifier such as 'To the best of my knowledge', the latter is an > impossible warrant both practically, for an individual author without > $1000s to spend on a patent search, and legally. Legally, there is no > answer until the statute of limitations runs out or until there is an > after-the-fact final answer provided by the court system. Absolutely.
I should have written that in the first place. I was trying to generate a little discussion about a particular case (the released IBM patents) where we might want to say, "for these patents, feel free to include code based on them." My understanding is that we will remove patented code if we get notice that it _is_ patented, and that we strive to not put any patented code into the source. --Scott David Daniels Scott.Daniels@Acm.Org From aleax at aleax.it Thu Jan 13 20:40:56 2005 From: aleax at aleax.it (Alex Martelli) Date: Thu Jan 13 20:41:01 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <20050113182142.GC35655@prometheusresearch.com> References: <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com> <41E5EFF6.9090408@colorstudy.com> <79990c6b05011302352cbd41de@mail.gmail.com> <20050113143421.GA39649@prometheusresearch.com> <5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it> <20050113182142.GC35655@prometheusresearch.com> Message-ID: <0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it> On 2005 Jan 13, at 19:21, Clark C. Evans wrote: ... > Are you _sure_ you have M*N adapters here? But even so, Yep. > for j in (J1,J2,J3,J4,...,JM) > for i in (I1,I2,...,IN): > register(j,i) Uh? WHAT are you registering for each j->i...? > The other issue with registries (and why I avoided them in the > origional > PEP) is that they often require a scoping; in this case, the path taken > by one module might be different from the one needed by another. I think that should not be supported, just like different modules cannot register different ways to copy.copy(X) for the same X. One size had better fit all, be it a single specific adapter or potentially a path thereof. 
> | The convenience of this is undeniable; and (all other things being > | equal) convenience
raises productivity and thus is valuable. > > It also hides assumptions. If you are doing
adaptation paths Not sure if it hides them very deeply, but yes, there may be some aspects of
"information hiding" -- which is not necessarily a bad thing. > Ok. I just think you all are
solving a problem that doesn't exist, Apparently, the existence of the problem is testified by the
experience of the Eclipse developers (who are, specifically, adapting plugins: Eclipse being among
the chief examples of plugin-based architecture... definitely an N-players scenario). > and in the
process hurting the more common use case: > > A component developer X and a framework developer Y
both > have stuff that an application developer A is putting > together. The goal is for A to not
worry about _how_ the > components and the framework fit; to automatically "find" > the glue
code. > > The assertion that you can layer glue... is well, tenuous at best. If you ever did any
gluing (of the traditional kind, e.g. in historical-furniture restoration, as opposed to relatively
new miracle glues) you know you typically DO layer glue -- one layer upon one of the pieces of wood
you're gluing; one layer on the other; let those two dry a bit; then, you glue the two together
with a third layer in the middle. Of course it takes skill (which is why although I know the theory
when I have any old valuable piece of furniture needing such restoration I have recourse to
professionals;-) to avoid the easy-to-make mistake of getting the glue too thick (or uneven, etc).
I'm quite ready to consider the risk of having too-thick combined layers of glue resulting from
adaptation (particularly but not exclusively with transitivity): indeed PJE's new ideas may be seen
as a novel way to restart-from-scratch and minimize glue thickness in the overall resulting
concoction.
But the optional ability for particularly skilled glue-layers to have that extra layer which makes
everything better should perhaps not be discounted. Although, considering PJE's new just-started
effort, it may well be wisest for PEP 246 to stick to a minimalist attitude -- leave open the
possibility of future additions or alterations but only specify that minimal core of functionality
which we all _know_ is needed. > | In terms of "should" as opposed to convenience, though, the
argument > is > | that interface to interface adapters SHOULD always, inherently be > | suitable
for transitive chains because there is NO reason, EVER, > under > | ANY circumstances, to have such
adapters be less than perfect, > | lossless, noiseless, etc, etc. > > I strongly disagree; the most
useful adapters are the ones that > discard unneeded information. The Facade design pattern? It's
useful, but I disagree that it's "most useful" when compared to general Adapter. My favourite
example is wrapping a str into a StringIO to make a filelike readable object -- that doesn't
discard anything, it *adds* a "current reading point" state variable (which makes me dubious of the
new "no per-state adapter" craze, which WOULD be just fine if it was true that discarding unneeded
info -- facading -- is really the main use case). > The big picture above, where you're > plugging
components into the framework will in most cases be lossy > -- or the frameworks / components would
be identical and you wouldn't > want to hook them up. Frankly, I think the whole idea of "perfect >
adapters" is just, well, arrogant. So please explain what's imperfect in wrapping a str into a
StringIO? > | What about "registered explicitly as being suitable for > transitivity", | would that
suffice? > > I suppose so. But I think it is a bad idea for a few reasons: > > 1.
it seems to add complexity without a real-world justification, > let's go without it; and add it in
a later version if it turns > out to be as valuable as people think Particularly in the light of
PJE's newest ideas, being spare and minimal in PEP 246 does sound good, as long as we're not
shutting and bolting doors against future improvements. > 2. different adapters have different
intents, and a given > adapter may be perfect in one situation yet royally > screw up in another;
users of systems often break interfaces > to meet immediate needs. In your strawman I can think
of > several such twists-and-turns that an "obviously perfect" > adapter would fail to handle: > >
- In the 'structure' variety (where the middle name is > not necessarily placed in the middle),
someone decides > to store one's title... because, well, the slot is > there and they need to store
this information > > - In the 'ordered' variety, "John P. Smith", you might > have "Murata Makoto".
If you thought Makoto was the > last name... you'd be wrong. If you've ever looked into "quality of
data" issues in huge databases, you know that these are two (out of thousands) typical problems --
but not problems in _adaptation_, in fact. > In short, unless a human is giving the 'ok' to an
adapter's > use, be it the application, framework, or component developer, > then I'd expect wacko
bugs. A lot of the data quality problems in huge databases come exactly from humans -- data entry
issues, form design issues, ... all the way to schema-design issues. I don't see why, discussing a
data-quality problem, you'd think that having a human OK whatsoever would help wrt having a
formalized rule (e.g. a database constraint) do it. > | This potentially opens the door to
N-players scenarios for N>3, but, > | like going from 3-tier to N-tier applications, that's not as
huge a > | jump as that from N==2 to N==3;-).
> > The problem with registries is that oftentimes scope is needed; > just because my module wants
to use this adaptation path, doesn't > mean your module will make the same choice. I avoided
registries > in the first pass of the draft to avoid this issue. So, if we > are going to add
registries, then namespaces for the registries > need to also be discussed. If _one_ registry of
how to copy/serialize things is good enough for copy.py / copy_reg.py / pickle.py / ..., in the
light of minimalism we should specify only one for PEP 246 too. > | So, are you willing to do that
round of editing to PEP 246...? I'll > | then do the NEXT one which will still undoubtedly be
needed... > > I could take a whack at it this weekend. Great! I assume you have copies of all
relevant mails since they all went around this mailing list, but if you need anything just holler,
including asking me privately about anything that might be unclear or ambiguous or whatever -- I'll
be around all weekend except Sunday night (Italian time -- afternoon US time;-). Thanks, Alex From
shane.holloway at ieee.org Thu Jan 13 21:42:58 2005 From: shane.holloway at ieee.org (Shane
Holloway (IEEE)) Date: Thu Jan 13 21:43:27 2005 Subject: [Python-Dev] frame.f_locals is writable
Message-ID: <41E6DD52.2080109@ieee.org> For a little background, I'm working on making
edit-and-continue support in Python a little more robust. So, in replacing references to
unmodifiable types like tuples and bound-methods (instance or class), I iterate over
gc.get_referrers. So, I'm working on frame types, and wrote this code::

    def replaceFrame(self, ref, oldValue, newValue):
        for name, value in ref.f_locals.items():
            if value is oldValue:
                ref.f_locals[name] = newValue
                assert ref.f_locals[name] is newValue

But unfortunately, the assert fires. f_locals is writable, but not modifiable.
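Shane's observation is easy to reproduce directly. The sketch below is illustrative rather than
his actual code; on the CPython of this era, `frame.f_locals` is a freshly built snapshot dict on
each access, so writes to it are silently dropped (much later, PEP 667 in CPython 3.13 made it a
write-through proxy, so the behaviour differs there):

```python
import sys

def demo():
    x = 1
    frame = sys._getframe()
    # Writing to the f_locals dict appears to succeed, but under
    # snapshot semantics the fast local `x` is left untouched --
    # and the next f_locals access rebuilds the dict, which is why
    # Shane's assert fires even though this assignment raises nothing.
    frame.f_locals['x'] = 2
    return x

result = demo()  # 1 under snapshot semantics; 2 under PEP 667 write-through
```

Note the double surprise: not only does `x` keep its old value inside the frame, but re-reading
`frame.f_locals` yields a brand-new dict, so the write is invisible even through `f_locals` itself.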
I did a bit of searching on Google Groups, and found references to a desire for Smalltalk-like
"swap" functionality using a similar approach, but no further ideas or solutions. While I am fully
expecting the smack of "don't do that", this functionality would be very useful for debugging
long-running applications. Is this possible to implement in CPython and ports? Is there an
optimization reason to not do this? At worst, if this isn't possible, direct references in the
stack will be wrong above the reload call, and corrected on the invocation of the function. This is
a subtle issue with reloading code, and can be documented. And at best, if there is an effective
way to replace it, the system can be changed to a consistent state even in the stack, and I can
rejoice. Even if I have to wait until 2.5. ;) Thanks for your time! -Shane Holloway From cce at
clarkevans.com Thu Jan 13 22:08:01 2005 From: cce at clarkevans.com (Clark C. Evans) Date: Thu Jan
13 22:08:05 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To:
<0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it> References:
<79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
<5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com> <41E5EFF6.9090408@colorstudy.com>
<79990c6b05011302352cbd41de@mail.gmail.com> <20050113143421.GA39649@prometheusresearch.com>
<5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it> <20050113182142.GC35655@prometheusresearch.com>
<0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID:
<20050113210801.GA49652@prometheusresearch.com> On Thu, Jan 13, 2005 at 08:40:56PM +0100, Alex
Martelli wrote: | >The other issue with registries (and why I avoided them in the | >original) is
that they often require a scoping; in this case, | >the path taken by one module might be different
from the one | >needed by another.
| | I think that should not be supported, just like different modules | cannot register different
ways to copy.copy(X) for the same X. One | size had better fit all, be it a single specific adapter
or potentially | a path thereof. Sounds good. | >Ok. I just think you all are solving a problem
that doesn't exist, | | Apparently, the existence of the problem is testified by the experience |
of the Eclipse developers (who are, specifically, adapting plugins: | Eclipse being among the chief
examples of plugin-based architecture... | definitely an N-players scenario). Some specific
examples from Eclipse developers would help then, especially ones that argue strongly for
automagical transitive adaptation. That is, ones where an alternative approach that is not
automatic is clearly inferior. | > A component developer X and a framework developer Y both > have
stuff that an application developer A is putting | > together. The goal is for A to not worry
about _how_ the | > components and the framework fit; to automatically "find" | > the glue code.
... | I'm quite ready to consider the risk of having too-thick combined | layers of glue resulting
from adaptation (particularly but not | exclusively with transitivity): indeed PJE's new ideas may
be seen as a | novel way to restart-from-scratch and minimize glue thickness in the | overall
resulting concoction. I do like PJE's idea, since it seems to focus on declaring individual
functions rather than on sets of functions; but I'm still unclear what problem it is trying to
solve. | But the optional ability for | particularly skilled glue-layers to have that extra layer
which makes | everything better should perhaps not be discounted. Although, | considering PJE's
new just-started effort, it may well be wisest for | PEP 246 to stick to a minimalist attitude --
leave open the possibility | of future additions or alterations but only specify that minimal core
| of functionality which we all _know_ is needed.
I'd rather not be pushing for a powerful registry mechanism unless we have solid evidence that the
value it provides outweighs the costs that it incurs. | >I strongly disagree; the most useful
adapters are the ones that | >discard unneeded information. | | The Facade design pattern? It's
useful, but I disagree that it's "most | useful" when compared to general Adapter. My qualification
was not very well placed. That said, I don't see any reason why a facade can't also be asked for
via the adapt() mechanism. | So please explain what's imperfect in wrapping a str into a StringIO?
It adds information, and it implies a mutability which the underlying object does not have. In
short, it's quite a different animal from a String, which is why String->StringIO is a great
example for an adapter. | >| What about "registered explicitly as being suitable for
transitivity", | >| would that suffice? | > | >I suppose so. But I think it is a bad idea for a few
reasons: | > | > 1. it seems to add complexity without a real-world justification, | > let's go
without it; and add it in a later version if it turns | > out to be as valuable as people think | |
Particularly in the light of PJE's newest ideas, being spare and | minimal in PEP 246 does sound
good, as long as we're not shutting and | bolting doors against future improvements. Agreed! | > 2.
different adapters have different intents... | | If you've ever looked into "quality of data"
issues in huge databases, | you know that these are two (out of thousands) typical problems -- but
| not problems in _adaptation_, in fact. I deal with these issues all of the time; but what I'm
trying to express with the example is that someone may _think_ that they are writing a perfect
adapter; but they may be wrong, on a number of levels. It's not so much to say what is good, but
rather to challenge the notion of a 'perfect adapter'.
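Clark's "it adds information" point about str->StringIO -- the added information being a current
read position -- can be shown concretely. A small sketch, using the modern `io.StringIO` spelling
rather than the `StringIO` module of the thread's era:

```python
from io import StringIO  # modern spelling; the thread predates the io module

s = "line 1\nline 2"

# A fresh wrapper per call "rewinds": both reads see the first line.
first = StringIO(s).readline()
second = StringIO(s).readline()

# A single shared wrapper carries its own read-position state,
# so successive reads advance through the string.
f = StringIO(s)
third = f.readline()
fourth = f.readline()
```

The first pair both yield `"line 1\n"` while the shared wrapper yields the two lines in turn --
the extra read-position state is exactly what makes the adapter "quite a different animal" from
the string it wraps.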
| >In short, unless a human is giving the 'ok' to an adapter's | >use, be it the application,
framework, or component developer, | >then I'd expect wacko bugs. | | A lot of the data quality
problems in huge databases come exactly from | humans -- data entry issues, form design issues, ...
all the way to | schema-design issues. I don't see why, discussing a data-quality | problem, you'd
think that having a human OK whatsoever would help wrt | having a formalized rule (e.g. a database
constraint) do it. The point I was trying to make is that automatically constructing adapters isn't
a great idea unless you have someone who can vouch for the usefulness. In other words, I picture
this as a physics story problem, where a bunch of numbers are given with units. While the units may
keep you "in check", just randomly combining figures with the right units can give you the wrong
answer. | >| So, are you willing to do that round of editing to PEP 246...? I'll | >| then do the
NEXT one which will still undoubtedly be needed... | > | >I could take a whack at it this weekend.
| | Great! I assume you have copies of all relevant mails since they all | went around this mailing
list, but if you need anything just holler, | including asking me privately about anything that
might be unclear or | ambiguous or whatever -- I'll be around all weekend except Sunday night |
(Italian time -- afternoon US time;-). Ok. Best, Clark -- Clark C. Evans Prometheus Research, LLC.
http://www.prometheusresearch.com/ o office: +1.203.777.2550 ~/ , mobile: +1.203.444.0557 // (( Prometheus Research: Transforming Data Into Knowledge \\ , \/ - Research Exchange Database /\ - Survey & Assessment Technologies ` \ - Software Tools for Researchers ~ * From psoberoi at gmail.com Thu Jan 13 22:43:53 2005 From: psoberoi at gmail.com (Paramjit Oberoi) Date: Thu Jan 13 22:43:56 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it> References: <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com> <41E5EFF6.9090408@colorstudy.com> <79990c6b05011302352cbd41de@mail.gmail.com> <20050113143421.GA39649@prometheusresearch.com> <5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it> <20050113182142.GC35655@prometheusresearch.com> <0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: On Thu, 13 Jan 2005 20:40:56 +0100, Alex Martelli wrote: > > So please explain what's imperfect in wrapping a str into a StringIO? If I understand Philip's argument correctly, the problem is this: def print_next_line(f: file): print f.readline() s = "line 1\n" "line 2" print_next_line(s) print_next_line(s) This will print "line 1" twice. -param From bac at OCF.Berkeley.EDU Thu Jan 13 22:50:24 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Thu Jan 13 22:50:31 2005 Subject: [Python-Dev] frame.f_locals is writable In-Reply-To: <41E6DD52.2080109@ieee.org> References: <41E6DD52.2080109@ieee.org> Message-ID: <41E6ED20.50103@ocf.berkeley.edu> Shane Holloway (IEEE) wrote: > For a little background, I'm working on making an edit and continue > support in python a little more robust. So, in replacing references to > unmodifiable types like tuples and bound-methods (instance or class), I > iterate over gc.get_referrers. 
> > So, I'm working on frame types, and wrote this code::
> >
> > def replaceFrame(self, ref, oldValue, newValue):
> >     for name, value in ref.f_locals.items():
> >         if value is oldValue:
> >             ref.f_locals[name] = newValue
> >             assert ref.f_locals[name] is newValue
> >
> > But unfortunately, the assert fires. f_locals is writable, but not > modifiable. I did a bit
of searching on Google Groups, and found > references to a desire for Smalltalk-like "swap"
functionality using a > similar approach, but no further ideas or solutions. > > While I am fully
expecting the smack of "don't do that", this > functionality would be very useful for debugging
long-running > applications. Is this possible to implement in CPython and ports? Is > there an
optimization reason to not do this? > So it would be doable, but it is not brain-dead simple if you
want to keep the interface of a dict. Locals, in the frame, are an array of PyObjects (see
PyFrameObject->f_localsplus). When you request f_locals, you get back a dict created by a function
that takes the array, traverses it, and creates a dict with the proper names (using
PyFrameObject->f_code->co_varnames for the array offset -> name mapping). The resulting dict gets
stored in PyFrameObject->f_locals. So it is writable as you discovered since it is just a dict, but
it is not used in Python/ceval.c except for IMPORT_STAR; changes are just never even considered.
The details for all of this can be found in Objects/frameobject.c:PyFrame_FastToLocals() . The
interesting thing is that there is a corresponding PyFrame_LocalsToFast() function that seems to do
what you want; it takes the dict in PyFrameObject->f_locals and propagates the changes into
PyFrameObject->f_localsplus (or at least seems to; don't have time to stare at the code long enough
to make sure it does that exactly). So the functionality is there (and is in the API even). It just
isn't called explicitly except in two points in Python/ceval.c where you can't get at it.
=) Making changes to f_locals actually matter would require either coming up with a proxy object
that is stored in f_locals instead of a dict and dynamically grabs everything from f_localsplus as
needed. That would suck for performance and be a pain to keep the dict API. So you can count that
out. The other option would be to add a function that either directly modified single values in
f_localsplus, a function that takes a dict and propagates the values, or a function that just calls
PyFrame_LocalsToFast() . Personally I am against this, but that is because you would
single-handedly ruin my master's thesis and invalidate any possible type inferencing one can do in
Python without some semantic change. But then again my thesis shows that amount of type inferencing
is not worth the code complexity so it isn't totally devastating. =) And you are right, "don't do
that". =) Back to the putrid, boggy marsh of JavaLand for me... -Brett From aleax at aleax.it Thu
Jan 13 22:59:53 2005 From: aleax at aleax.it (Alex Martelli) Date: Thu Jan 13 22:59:59 2005
Subject: [Python-Dev] PEP 246, redux In-Reply-To: References:
<5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com>
<338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it>
<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
<5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com> <41E5EFF6.9090408@colorstudy.com>
<79990c6b05011302352cbd41de@mail.gmail.com> <20050113143421.GA39649@prometheusresearch.com>
<5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it> <20050113182142.GC35655@prometheusresearch.com>
<0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID:
<7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it> On 2005 Jan 13, at 22:43, Paramjit Oberoi wrote: >
On Thu, 13 Jan 2005 20:40:56 +0100, Alex Martelli > wrote: >> >> So please explain what's imperfect
in wrapping a str into a StringIO?
> > If I understand Philip's argument correctly, the problem is this: > > def print_next_line(f: file): > print f.readline() > > s = "line 1\n" "line 2" > > print_next_line(s) > print_next_line(s) > > This will print "line 1" twice. Ah! A very clear example, thanks. Essentially equivalent to saying that adapting a list to an iterator ``rewinds'' each time the ``adaptation'' is performed, if one mistakenly thinks of iter(L) as providing an _adapter_: def print_next_item(it: iterator): print it.next() L = ['item 1', 'item 2'] print_next_item(L) print_next_item(L) Funny that the problem was obvious to me for the list->iterator issue and yet I was so oblivious to it for the str->readablefile one. OK, this does show that (at least some) classical cases of Adapter Design Pattern are unsuitable for implicit adaptation (in a language with mutation -- much like, say, a square IS-A rectangle if a language does not allow mutation, but isn't if the language DOES allow it). Thanks! Alex From p.f.moore at gmail.com Thu Jan 13 23:10:49 2005 From: p.f.moore at gmail.com (Paul Moore) Date: Thu Jan 13 23:10:52 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: References: <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com> <41E5EFF6.9090408@colorstudy.com> <79990c6b05011302352cbd41de@mail.gmail.com> <20050113143421.GA39649@prometheusresearch.com> <5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it> <20050113182142.GC35655@prometheusresearch.com> <0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: <79990c6b05011314102399f2a3@mail.gmail.com> On Thu, 13 Jan 2005 13:43:53 -0800, Paramjit Oberoi wrote: > On Thu, 13 Jan 2005 20:40:56 +0100, Alex Martelli wrote: > > > > So please explain what's imperfect in wrapping a str into a StringIO? 
> > If I understand Philip's argument correctly, the problem is this: > > def print_next_line(f: file): > print f.readline() > > s = "line 1\n" "line 2" > > print_next_line(s) > print_next_line(s) > > This will print "line 1" twice. Nice example! The real subtlety here is that f = adapt(s, StringIO) print_next_line(f) print_next_line(f) *does* work - the implication is that for the original example to work, adapt(s, StringIO) needs to not only return *a* wrapper, but to return *the same wrapper* every time. Which may well break *other* uses, which expect a "new" wrapper each time. But the other thing that this tends to make me believe even more strongly is that using Guido's type notation for adaptation is a bad thing. def print_next_line(f): ff = adapt(f, file) print ff.readline() Here, the explicit adaptation step in the definition of the function feels to me a little more obviously a "wrapping" operation which may reinitialise the adapter - and would raise warning bells in my mind if I thought of it in terms of a string->StringIO adapter. Add this to the inability to recover the original object (for readaptation, or passing on as an argument to another function), and I'm very concerned about Guido's type notation being used as an abbreviation for adaptation... Paul. From gvanrossum at gmail.com Fri Jan 14 00:11:43 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Jan 14 00:11:46 2005 Subject: getting special from type, not instance (was Re: [Python-Dev] copy confusion) In-Reply-To: References: <5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com> <83E2B593-6470-11D9-ADA4-000A95EFAE9E@aleax.it> <20050113101633.GA5193@vicky.ecs.soton.ac.uk> Message-ID: > > Let's do override descriptors. > > A Pronouncement!!! > > > And please, someone fix copy.py in 2.3 and 2.4. > > Sure -- what way, though? The way I proposed in my last post about it? This would do it, right? 
(From your first post in this conversation according to gmail:) > Armin's fix was to change: > > conform = getattr(type(obj), '__conform__', None) > > into: > > for basecls in type(obj).__mro__: > if '__conform__' in basecls.__dict__: > conform = basecls.__dict__['__conform__'] > break > else: > # not found -- --Guido van Rossum (home page: http://www.python.org/~guido/) From aleax at aleax.it Fri Jan 14 00:26:02 2005 From: aleax at aleax.it (Alex Martelli) Date: Fri Jan 14 00:26:07 2005 Subject: getting special from type, not instance (was Re: [Python-Dev] copy confusion) In-Reply-To: References: <5.1.1.6.0.20050111172656.0306fa10@mail.telecommunity.com> <83E2B593-6470-11D9-ADA4-000A95EFAE9E@aleax.it> <20050113101633.GA5193@vicky.ecs.soton.ac.uk> Message-ID: <7D1C69B7-65BA-11D9-ADA4-000A95EFAE9E@aleax.it> On 2005 Jan 14, at 00:11, Guido van Rossum wrote: >>> Let's do override descriptors. >> >> A Pronouncement!!! >> >>> And please, someone fix copy.py in 2.3 and 2.4. >> >> Sure -- what way, though? The way I proposed in my last post about >> it? > > This would do it, right? (From your first post in this conversation > according to gmail:) > >> Armin's fix was to change: >> >> conform = getattr(type(obj), '__conform__', None) >> >> into: >> >> for basecls in type(obj).__mro__: >> if '__conform__' in basecls.__dict__: >> conform = basecls.__dict__['__conform__'] >> break >> else: >> # not found Yes, the code could be expanded inline each time it's needed (for __copy__, __getstate__, and all other special methods copy.py needs to get-from-the-type). It does seem better to write it once as a private function of copy.py, though. Plus, to fix the effbot's bug, we need to have in function copy() a test about object type that currently is in deepcopy() [[for the commented purpose of fixing a problem with Boost's old version -- but it works to make deepcopy work in the effbot's case too]] but not in copy(). 
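The practical difference between the two lookups is that `getattr` on the type can also find
attributes supplied by the *metaclass*, while walking `type(obj).__mro__` confines the search to
the type's own inheritance chain. A small sketch of the distinction (`lookup_special` is an
illustrative name, not copy.py's actual helper, and modern class syntax is used for brevity):

```python
def lookup_special(obj, name):
    # Armin's fix: search only the type's MRO, so per-instance
    # attributes and metaclass attributes are both ignored.
    for basecls in type(obj).__mro__:
        if name in basecls.__dict__:
            return basecls.__dict__[name]
    return None

class Meta(type):
    def __conform__(cls, protocol):   # defined on the *metaclass*
        return 'from metaclass'

class A(metaclass=Meta):
    pass

a = A()

# getattr on the type reaches up into the metaclass and finds Meta's
# method, even though A itself defines no __conform__...
found_by_getattr = getattr(type(a), '__conform__', None)

# ...while the MRO walk correctly reports that A defines none.
found_by_mro = lookup_special(a, '__conform__')
```

Factoring this loop into one private helper, as Alex suggests, keeps the subtlety in exactly one
place for `__conform__`, `__copy__`, `__getstate__`, and the rest.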
Lastly, the tests should also be enriched to make sure they catch the bug (no doc change needed, it
seems to me). I can do it this weekend if the general approach is OK, since Clark has kindly agreed
to do the next rewrite of PEP 246;-). Alex From cce at clarkevans.com Fri Jan 14 02:03:07 2005
From: cce at clarkevans.com (Clark C. Evans) Date: Fri Jan 14 02:03:10 2005 Subject: [Python-Dev]
PEP 246: lossless and stateless In-Reply-To: <7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com>
<5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com> <41E5EFF6.9090408@colorstudy.com>
<79990c6b05011302352cbd41de@mail.gmail.com> <20050113143421.GA39649@prometheusresearch.com>
<5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it> <20050113182142.GC35655@prometheusresearch.com>
<0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it> <7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <20050114010307.GA51446@prometheusresearch.com> Ok. I think we have identified two
sorts of restrictions on the sorts of adaptations one may want to have:

    `stateless'  the adaptation may only provide a result which
                 does not maintain its own state

    `lossless'   the adaptation preserves all information available
                 in the original object, it may not discard state

If we determined that these were the 'big-ones', we could possibly allow for the signature of the
adapt request to be parameterized with these two designations, with the default to accept any sort
of adapter:

    adapt(object, protocol, alternative = None,
          stateless = False, lossless = False)

    __conform__(self, protocol, stateless, lossless)

    __adapt__(self, object, stateless, lossless)

Then, Guido's 'Optional Static Typing',

    def f(X: Y):
        pass

would be equivalent to

    def f(X):
        X = adapt(X, Y, stateless=True, lossless=True)

In other words, while calling adapt directly would allow for any adapter; using the 'Static Typing'
short-cut one would be asking for adapters which are both stateless and lossless.
Since __conform__ and __adapt__ would sprout two new arguments, it would make those writing adapters think a bit more about the kind of adapter that they are providing. Furthermore, perhaps composite adapters can be automatically generated from 'transitive' adapters (that is, those which are both stateless and lossless). But adaptations which were not stateless and lossless would not be used (by default) in an automatic adapter construction. Your thoughts? Clark -- Clark C. Evans Prometheus Research, LLC. http://www.prometheusresearch.com/ o office: +1.203.777.2550 ~/ , mobile: +1.203.444.0557 // (( Prometheus Research: Transforming Data Into Knowledge \\ , \/ - Research Exchange Database /\ - Survey & Assessment Technologies ` \ - Software Tools for Researchers ~ * From bob at redivi.com Fri Jan 14 02:08:43 2005 From: bob at redivi.com (Bob Ippolito) Date: Fri Jan 14 02:08:47 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <20050114010307.GA51446@prometheusresearch.com> References: <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com> <41E5EFF6.9090408@colorstudy.com> <79990c6b05011302352cbd41de@mail.gmail.com> <20050113143421.GA39649@prometheusresearch.com> <5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it> <20050113182142.GC35655@prometheusresearch.com> <0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it> <7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it> <20050114010307.GA51446@prometheusresearch.com> Message-ID: On Jan 13, 2005, at 20:03, Clark C. Evans wrote: > Ok. 
I think we have identified two sorts of restrictions on the > sorts of adaptations one may want to have: > > `stateless' the adaptation may only provide a result which > does not maintain its own state > > `lossless' the adaptation preserves all information available > in the original object, it may not discard state > > If we determined that these were the 'big-ones', we could possibly > allow for the signature of the adapt request to be parameterized with > these two designations, with the default to accept any sort of adapter: > > adapt(object, protocol, alternative = None, > stateless = False, lossless = False) > > __conform__(self, protocol, stateless, lossless) > > __adapt__(self, object, stateless, lossless) > > Then, Guido's 'Optional Static Typing', > > def f(X: Y): > pass > > would be equivalent to > > def f(X): > X = adapt(Y, True, True) > > In other words, while calling adapt directly would allow for any > adapter; > using the 'Static Typing' short-cut one would be asking for adapters > which are both stateless and lossless. Since __conform__ and __adapt__ > would sprout two new arguments, it would make those writing adapters > think a bit more about the kind of adapter that they are providing. > > Furthermore, perhaps composite adapters can be automatically generated > from 'transitive' adapters (that is, those which are both stateless > and lossless). But adaptations which were not stateless and lossless > would not be used (by default) in an automatic adapter construction. > > Your thoughts? In some cases, such as when you plan to consume the whole thing in one function call, you wouldn't care so much if it's stateless. 
-bob From gvanrossum at gmail.com Fri Jan 14 02:52:10 2005 From: gvanrossum at gmail.com (Guido van
Rossum) Date: Fri Jan 14 02:52:14 2005 Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <20050114010307.GA51446@prometheusresearch.com> References:
<5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <41E5EFF6.9090408@colorstudy.com>
<79990c6b05011302352cbd41de@mail.gmail.com> <20050113143421.GA39649@prometheusresearch.com>
<5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it> <20050113182142.GC35655@prometheusresearch.com>
<0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it> <7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it>
<20050114010307.GA51446@prometheusresearch.com> Message-ID: > Then, Guido's 'Optional Static
Typing', > > def f(X: Y): > pass > > would be equivalent to > > def f(X): > X = adapt(X, Y,
stateless=True, lossless=True) > > In other words, while calling adapt directly would allow for any
adapter; > using the 'Static Typing' short-cut one would be asking for adapters > which are both
stateless and lossless. Since __conform__ and __adapt__ > would sprout two new arguments, it would
make those writing adapters > think a bit more about the kind of adapter that they are providing.
This may solve the current raging argument, but IMO it would make the optional signature
declaration less useful, because there's no way to accept other kinds of adapters. I'd be happier
if def f(X: Y) implied X = adapt(X, Y). -- --Guido van Rossum (home page:
http://www.python.org/~guido/) From python at rcn.com Fri Jan 14 02:54:33 2005 From: python at
rcn.com (Raymond Hettinger) Date: Fri Jan 14 02:58:29 2005 Subject: [Python-Dev] PEP 246: lossless
and stateless In-Reply-To: <20050114010307.GA51446@prometheusresearch.com> Message-ID:
<000d01c4f9dc$138633a0$e841fea9@oemcomputer> > Ok.
I think we have identified two sorts of restrictions on the > sorts of adaptations one may want to have: > > `stateless' the adaptation may only provide a result which > does not maintain its own state > > `lossless' the adaptation preserves all information available > in the original object, it may not discard state +1 on having a provision for adapters to provide some meta-information about themselves. With these two key properties identified at the outset, adapt calls can be made a bit more intelligent (or at least less prone to weirdness). There is some merit to establishing these properties right away rather than trying to retrofit adapters after they've been in the wild for a while. > Since __conform__ and __adapt__ > would sprout two new arguments, it would make those writing adapters > think a bit more about the kind of adapter that they are providing. Using optional arguments may not be the most elegant or extensible approach. Perhaps a registry table or adapter attributes would fare better. Raymond Hettinger From cce at clarkevans.com Fri Jan 14 03:02:54 2005 From: cce at clarkevans.com (Clark C. Evans) Date: Fri Jan 14 03:02:56 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: References: <41E5EFF6.9090408@colorstudy.com> <79990c6b05011302352cbd41de@mail.gmail.com> <20050113143421.GA39649@prometheusresearch.com> <5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it> <20050113182142.GC35655@prometheusresearch.com> <0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it> <7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it> <20050114010307.GA51446@prometheusresearch.com> Message-ID: <20050114020254.GA87169@prometheusresearch.com> On Thu, Jan 13, 2005 at 08:08:43PM -0500, Bob Ippolito wrote: | >Ok. 
I think we have identified two sorts of restrictions on the | >sorts of adaptations one may want to have: | > | > `stateless' the adaptation may only provide a result which | > does not maintain its own state | > | > `lossless' the adaptation preserves all information available | > in the original object, it may not discard state | > | >If we determined that these were the 'big-ones', we could possibly | >allow for the signature of the adapt request to be parameterized with | >these two designations, with the default to accept any sort of adapter: | > | > adapt(object, protocol, alternative = None, | > stateless = False, lossless = False) | > | >Then, Guido's 'Optional Static Typing', | > | > def f(X: Y): | > pass | > | > would be equivalent to | > | > def f(X): | > X = adapt(X,Y, stateless = True, lossless = True) .. | | In some cases, such as when you plan to consume the whole thing in one | function call, you wouldn't care so much if it's stateless. etrepum,

                 True                            False

   stateless     adapter may not add state      adapter may have its own
                 beyond that already            state, if it wishes, but
                 provided by the object         additional state is not
                                                required

   lossless      adapter must preserve and      adapter may discard
                 give all information which     information if it wishes
                 the underlying object has

So, in this case, if your consumer doesn't care if the adapter is stateless or not, just call adapt(), which defaults to the case that you wish. Is this a better explanation? Or is this whole idea too convoluted?
Best, Clark From foom at fuhm.net Fri Jan 14 03:41:13 2005 From: foom at fuhm.net (James Y Knight) Date: Fri Jan 14 03:41:13 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <20050113174643.GB35655@prometheusresearch.com> References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <20050112195711.GA1813@prometheusresearch.com> <20050113174643.GB35655@prometheusresearch.com> Message-ID: On Jan 13, 2005, at 12:46 PM, Clark C. Evans wrote: > My current suggestion to make 'transitive adaptation' easy for an > application builder (one putting together components) has a few > small parts: > > - If an adaptation is not found, raise an error, but list in > the error message two additional things: (a) what possible > adaptation paths exist, and (b) how to register one of > these paths in their module. > > - A simple method to register an adaptation path, the error message > above can even give the exact line needed, > > adapt.registerPath(from=A,to=C,via=B) I'd just like to note that this won't solve my use case for transitive adaptation. To keep backwards compatibility, I can't depend on the application developer to register an adapter path from A through IResource to INewResource. Trying to adapt A to INewResource needs to just work. I can't register the path either, because I (the framework author) don't know anything about A. A solution that would work is if I have to explicitly declare the adapter from IResource to INewResource as 'safe', as long as I don't also have to declare the adapter from A to IResource as 'safe'. (That is, I suppose -- in a transitive adapter chain, all except one adapter in the chain would have to be declared 'safe').
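[Editor's note: the registerPath() suggestion quoted above might be sketched as below. All names are illustrative assumptions (`from` is a reserved word in Python, so the argument is spelled `src` here), and the PEP 246 machinery is reduced to a plain dictionary of adapter factories.]

```python
# Sketch of an explicit adapter registry with opt-in transitive paths,
# modeled on the adapt.registerPath(from=A, to=C, via=B) strawman above.
_adapters = {}  # (source_type, target_protocol) -> adapter factory

def register(src, dst, factory):
    _adapters[(src, dst)] = factory

def register_path(src, to, via):
    # Compose two already-registered one-step adapters; nothing is ever
    # composed implicitly -- the application author must opt in.
    step1 = _adapters[(src, via)]
    step2 = _adapters[(via, to)]
    register(src, to, lambda obj: step2(step1(obj)))

def adapt(obj, protocol):
    if isinstance(obj, protocol):
        return obj
    factory = _adapters.get((type(obj), protocol))
    if factory is None:
        # Per the proposal, a real implementation would list candidate
        # paths here and show the exact register_path() line to add.
        raise TypeError("no adapter from %s to %s; register_path(...)?"
                        % (type(obj).__name__, protocol.__name__))
    return factory(obj)

class A: pass
class IResource:
    def __init__(self, a): self.a = a
class INewResource:
    def __init__(self, r): self.r = r

register(A, IResource, IResource)
register(IResource, INewResource, INewResource)
try:
    adapt(A(), INewResource)            # two hops: not automatic
except TypeError:
    register_path(A, INewResource, via=IResource)
new = adapt(A(), INewResource)
assert isinstance(new, INewResource) and isinstance(new.r, IResource)
```

As James notes, this still leaves his backward-compatibility case unsolved: the framework cannot know about A in advance, so the A-to-INewResource path would have to "just work" without the explicit registration.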
I don't know whether or not it's worthwhile to have this encoded in the framework, as it is clearly possible to do it on my own in any case. I'll leave that for others to debate. :) James From DavidA at ActiveState.com Fri Jan 14 04:08:05 2005 From: DavidA at ActiveState.com (David Ascher) Date: Fri Jan 14 04:08:11 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <49A128D8-654E-11D9-ADA4-000A95EFAE9E@aleax.it> References: <5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com> <5.1.1.6.0.20050112151418.0337cd00@mail.telecommunity.com> <49A128D8-654E-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: <41E73795.2070505@ActiveState.com> Alex Martelli wrote: > Yes, there is (lato sensu) "non-determinism" involved, just like in, say: > for k in d: > print k Wow, it took more than the average amount of googling to figure out that lato sensu means "broadly speaking", and occurs as "sensu lato" with a 1:2 ratio. I learned something today! ;-) --david From skip at pobox.com Thu Jan 13 22:56:19 2005 From: skip at pobox.com (Skip Montanaro) Date: Fri Jan 14 04:52:31 2005 Subject: [Python-Dev] redux: fractional seconds in strptime Message-ID: <16870.61059.451494.303971@montanaro.dyndns.org> A couple months ago I proposed (maybe in a SF bug report) that time.strptime() grow some way to parse time strings containing fractional seconds based on my experience with the logging module. I've hit that stumbling block again, this time in parsing files with timestamps that were generated using datetime.time objects. I hacked around it again (in miserable fashion), but I really think this shortcoming should be addressed. A couple possibilities come to mind: 1. 
Extend the %S format token to accept simple decimals that match the re pattern "[0-9]+(?:\.[0-9]+)". 2. Add a new token that accepts decimals as above to avoid overloading the meaning of %S. 3. Add a token that matches integers corresponding to fractional parts. The Perl DateTime module uses %N to match nanoseconds (wanna bet that was added by a physicist?). Arbitrary other units can be specified by sticking a number between the "%" and the "N". I didn't see an example, but I presume "%6N" would match integers that are interpreted as microseconds. The advantage of the third choice is that you can use anything as the "decimal" point. The logging module separates seconds from their fractional part with a comma for some reason. (I live in the USofA where decimal points are usually represented by a period. I would be in favor of replacing the comma with a locale-specific decimal point in a future version of the logging module.) I'm not sure I like the optional exponent thing in Perl's DateTime module but it does make it easy to interpret integers representing fractions of a second when they occur without a decimal point to tell you where it is. I'm open to suggestions and will be happy to implement whatever is agreed to. Skip From bac at OCF.Berkeley.EDU Fri Jan 14 05:16:16 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Fri Jan 14 05:16:26 2005 Subject: [Python-Dev] redux: fractional seconds in strptime In-Reply-To: <16870.61059.451494.303971@montanaro.dyndns.org> References: <16870.61059.451494.303971@montanaro.dyndns.org> Message-ID: <41E74790.60108@ocf.berkeley.edu> Skip Montanaro wrote: > A couple months ago I proposed (maybe in a SF bug report) http://www.python.org/sf/1006786 that > time.strptime() grow some way to parse time strings containing fractional > seconds based on my experience with the logging module. I've hit that > stumbling block again, this time in parsing files with timestamps that were > generated using datetime.time objects. 
I hacked around it again (in > miserable fashion), but I really think this shortcoming should be addressed. > > A couple possibilities come to mind: > > 1. Extend the %S format token to accept simple decimals that match > the re pattern "[0-9]+(?:\.[0-9]+)". > > 2. Add a new token that accepts decimals as above to avoid overloading > the meaning of %S. > > 3. Add a token that matches integers corresponding to fractional parts. > The Perl DateTime module uses %N to match nanoseconds (wanna bet that > was added by a physicist?). Arbitrary other units can be specified > by sticking a number between the "%" and the "N". I didn't see an > example, but I presume "%6N" would match integers that are > interpreted as microseconds. > The problem I have always had with this proposal is that the value is worthless: time tuples do not have a slot for fractional seconds. Yes, it could possibly be changed to return a float for seconds, but that could possibly break things. My vote is that if something is added it be like %N but without the optional digit count. This allows any separator to be used while still consuming the digits. It also doesn't suddenly add optional args which are not supported for any other directive. -Brett From pje at telecommunity.com Fri Jan 14 05:50:37 2005 From: pje at telecommunity.com (Phillip J.
Eby) Date: Fri Jan 14 05:49:01 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <20050114010307.GA51446@prometheusresearch.com> References: <7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com> <41E5EFF6.9090408@colorstudy.com> <79990c6b05011302352cbd41de@mail.gmail.com> <20050113143421.GA39649@prometheusresearch.com> <5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it> <20050113182142.GC35655@prometheusresearch.com> <0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it> <7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: <5.1.1.6.0.20050113232719.033223a0@mail.telecommunity.com> At 08:03 PM 1/13/05 -0500, Clark C. Evans wrote: >Ok. I think we have identified two sorts of restrictions on the >sorts of adaptations one may want to have: > > `stateless' the adaptation may only provide a result which > does not maintain its own state > > `lossless' the adaptation preserves all information available > in the original object, it may not discard state 'lossless' isn't really a good term for non-noisy. The key is that a "noisy" adapter is one that alters the precision of the information it provides, by either claiming greater precision than is actually present, or by losing precision that was present in the meaning of the data. (I.e., truncating 12.3 to 12 loses precision, but dropping the middle name field of a name doesn't because the first and last name are independent from the middle name). Anyway, being non-noisy is only a prerequisite for interface-to-interface adapters, because they're claiming to be suitable for all possible implementations of the source interface. 'statelessness', on the other hand, is primarily useful as a guide to whether what you're building is really an "as-a" adapter. If an adapter has per-adapter state, it's an extremely good indication that it's actually a *decorator* (in GoF pattern terminology). 
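[Editor's note: the adapter-versus-decorator distinction drawn above can be shown with a small invented example: a pure interface adapter whose only state is the reference to the adapted object, next to a GoF-style decorator that accumulates per-instance state.]

```python
# A small invented example of the distinction drawn above: TextAdapter
# only converts an interface (its sole state is the adapted object),
# while CountingDecorator adds per-instance state and responsibility.
class Wire:
    def send_bytes(self, data):
        return len(data)              # pretend to transmit

class TextAdapter:
    # GoF Adapter: interface conversion only; any two TextAdapters for
    # the same Wire behave identically.
    def __init__(self, wire):
        self._wire = wire
    def send_text(self, text):
        return self._wire.send_bytes(text.encode("utf-8"))

class CountingDecorator:
    # GoF Decorator: keeps its own running total, so *which* instance
    # you hold matters -- exactly the "per-adapter state" signal above.
    def __init__(self, wire):
        self._wire = wire
        self.sent = 0
    def send_bytes(self, data):
        self.sent += len(data)
        return self._wire.send_bytes(data)

w = Wire()
assert TextAdapter(w).send_text("hi") == TextAdapter(w).send_text("hi") == 2
d = CountingDecorator(w)
d.send_bytes(b"abc")
d.send_bytes(b"de")
assert d.sent == 5                    # state lives in the decorator, not w
```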
In GoF, an Adapter simply converts one interface to another, it doesn't implement new functionality. A decorator, on the other hand, is used to "add responsibilities to individual objects dynamically and transparently, that is, without affecting other objects." In fact, as far as I can tell from the GoF book, you can't *have* multiple adapter instances for a given object in their definition of the "adapter pattern". IOW, there's no per-adapter state, and their examples never suggest the idea that the adapter pattern is intended to add any per-adaptee state, either. So, by their terminology, PEP 246 is a mechanism for dynamically selecting and obtaining *decorators*, not adapters. As if people weren't already confused enough about decorators. :) Anyway, for type declaration, IMO statelessness is the key criterion. Type declaration "wants" to have true adapters (which can maintain object identity), not decorators (which are distinct objects from the things they add functionality to). >In other words, while calling adapt directly would allow for any adapter; >using the 'Static Typing' short-cut one would be asking for adapters >which are both stateless and lossless. Since __conform__ and __adapt__ >would sprout two new arguments, it would make those writing adapters >think a bit more about the kind of adapter that they are providing. Unfortunately, in practice this will just lead to people ignoring the arguments, because 1) it's easier and 2) it will make their code work with type declarations! So, it won't actually produce any useful effect. From cce at clarkevans.com Fri Jan 14 06:00:52 2005 From: cce at clarkevans.com (Clark C. 
Evans) Date: Fri Jan 14 06:00:55 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <5.1.1.6.0.20050113232719.033223a0@mail.telecommunity.com> References: <5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com> <41E5EFF6.9090408@colorstudy.com> <79990c6b05011302352cbd41de@mail.gmail.com> <20050113143421.GA39649@prometheusresearch.com> <5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it> <20050113182142.GC35655@prometheusresearch.com> <0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it> <7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050113232719.033223a0@mail.telecommunity.com> Message-ID: <20050114050052.GB93742@prometheusresearch.com> On Thu, Jan 13, 2005 at 11:50:37PM -0500, Phillip J. Eby wrote: | 'lossless' isn't really a good term for non-noisy. The key is that a | "noisy" adapter is one that alters the precision of the information it | provides, by either claiming greater precision than is actually present, | or by losing precision that was present in the meaning of the data. Noisy doesn't cut it -- my PC fan is noisy. In computer science, noisy usually refers to a flag on an object that tells it to spew debug output... | 'statelessness', on the other hand, is primarily useful as a guide to | whether what you're building is really an "as-a" adapter. If an adapter | has per-adapter state, it's an extremely good indication that it's | actually a *decorator* (in GoF pattern terminology). GoF is very nice, but I'm using a much broader definition of 'adapt': To make suitable to or fit for a specific use or situation By this definition, decorators and facades are both kinds of adapters. | Anyway, for type declaration, IMO statelessness is the key criterion. | Type declaration "wants" to have true adapters (which can maintain object | identity), not decorators (which are distinct objects from the things | they add functionality to). Stateful adapters are very useful, and the value of PEP 246 is significantly reduced without allowing them.
| Unfortunately, in practice this will just lead to people ignoring the | arguments, because 1) it's easier and 2) it will make their code work | with type declarations! So, it won't actually produce any useful effect. Hmm. Best, Clark From pje at telecommunity.com Fri Jan 14 06:11:10 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Jan 14 06:09:35 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: References: <20050114010307.GA51446@prometheusresearch.com> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <41E5EFF6.9090408@colorstudy.com> <79990c6b05011302352cbd41de@mail.gmail.com> <20050113143421.GA39649@prometheusresearch.com> <5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it> <20050113182142.GC35655@prometheusresearch.com> <0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it> <7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it> <20050114010307.GA51446@prometheusresearch.com> Message-ID: <5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com> At 05:52 PM 1/13/05 -0800, Guido van Rossum wrote: >This may solve the current raging argument, but IMO it would make the >optional signature declaration less useful, because there's no way to >accept other kinds of adapters. I'd be happier if def f(X: Y) implied >X = adapt(X, Y). The problem is that type declarations really want more guarantees about object identity and state than an unrestricted adapt() can provide, including sane behavior when objects are passed into the same or different functions repeatedly. See this short post by Paul Moore: http://mail.python.org/pipermail/python-dev/2005-January/051020.html It presents some simple examples that show how non-deterministic adaptation can be in the presence of stateful adapters created "implicitly" by type declaration. It suggests that just avoiding transitive interface adapters may not be sufficient to escape C++ish pitfalls.
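[Editor's note: the pitfall referenced above can be reproduced with nothing more exotic than `StringIO` standing in for a stateful string-to-file adapter; the function name `read_three` is invented for the demonstration.]

```python
# StringIO stands in here for a stateful string->file adapter; whether
# adaptation happens per call or once up front changes the behavior.
from io import StringIO

def read_three(f):
    # Imagine the signature were 'def read_three(f: IStream)', with the
    # adapter created implicitly on every call.
    return f.read(3)

s = "abcdef"
# Fresh adapter per call: every call starts at position 0.
assert read_three(StringIO(s)) == "abc"
assert read_three(StringIO(s)) == "abc"
# One adapter created up front by the caller: the calls now interact.
f = StringIO(s)
assert read_three(f) == "abc"
assert read_three(f) == "def"
```

Neither behavior is wrong in itself; the trouble is that with implicit adaptation in signatures, the reader of `read_three` cannot tell which one they will get.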
Even if you're *very* careful, your seemingly safe setup can be blown just by one routine passing its argument to another routine, possibly causing an adapter to be adapted. This is a serious pitfall because today when you 'adapt' you can also access the "original" object -- you have to first *have* it, in order to *adapt* it. But type declarations using adapt() prevents you from ever *seeing* the original object within a function. So, it's *really* unsafe in a way that explicitly calling 'adapt()' is not. You might be passing an adapter to another function, and then that function's signature might adapt it again, or perhaps just fail because you have to adapt from the original object. Clark's proposal isn't going to solve this issue for PEP 246, alas. In order to guarantee safety of adaptive type declarations, the implementation strategy *must* be able to guarantee that 1) adapters do not have state of their own, and 2) adapting an already-adapted object re-adapts the original rather than creating a new adapter. This is what the monkey-typing PEP and prototype implementation are intended to address. (This doesn't mean that explicit adapt() still isn't a useful thing, it just means that using it for type declarations is a bad idea in ways that we didn't realize until after the "great debate".) From pje at telecommunity.com Fri Jan 14 06:28:07 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Fri Jan 14 06:26:33 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <20050114050052.GB93742@prometheusresearch.com> References: <5.1.1.6.0.20050113232719.033223a0@mail.telecommunity.com> <5.1.1.6.0.20050112173908.0212fd90@mail.telecommunity.com> <41E5EFF6.9090408@colorstudy.com> <79990c6b05011302352cbd41de@mail.gmail.com> <20050113143421.GA39649@prometheusresearch.com> <5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it> <20050113182142.GC35655@prometheusresearch.com> <0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it> <7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050113232719.033223a0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050114001345.02b251e0@mail.telecommunity.com> At 12:00 AM 1/14/05 -0500, Clark C. Evans wrote: >On Thu, Jan 13, 2005 at 11:50:37PM -0500, Phillip J. Eby wrote: >| 'lossless' isn't really a good term for non-noisy. The key is that a >| "noisy" adapter is one that alters the precision of the information it >| provides, by either claiming greater precision than is actually present, >| or by losing precision that was present in the meaning of the data. > >Noisy doesn't cut it -- my PC fan is noisy. In computer science, >noisy usually refers to a flag on an object that tells it to spew >debug output... Come up with a better name, then. Precision-munging? :) >| Anyway, for type declaration, IMO statelessness is the key criterion. >| Type declaration "wants" to have true adapters (which can maintain object >| identity), not decorators (which are distinct objects from the things >| they add functionality to). > >Stateful adapters are very useful, and the value of PEP 246 is >significantly reduced without alowing them. Absolutely. But that doesn't mean type declarations are the right choice for PEP 246. Look at this code: def foo(self, bar:Baz): bar.whack(self) self.spam.fling(bar) Does this code produce transitive adaptation (i.e. adapt an adapter)? Can't tell? Me neither. 
:) It depends on what type spam.fling() declares its parameter to be, and whether the caller of foo() passed in an object that needed an adapter to Baz. The problem here is that *all* of the arguments you and Alex and others raised in the last few days against unconstrained transitive adaptation apply in spades to type declarations. My argument was that if well-designed and properly used, transitivity could be quite safe, but even I agreed that uncontrolled semi-random adapter composition was madness. Unfortunately, having type declarations do adapt() introduces the potential for precisely this sort of uncontrolled semi-random adapter madness in seemingly harmless code. Now compare to *this* code: def foo(self, bar): adapt(bar,Baz).whack(self) self.spam.fling(bar) It's obvious that the above does not introduce a transitive adaptation; at least if it was passed an "original" object, then it will pass on that original object to spam.fling(). So, explicit use of PEP 246 doesn't introduce this problem, but type declarations do. With type declarations you can never even *know* if you have the "original object" or not, let alone get it if you don't have it. 
From gvanrossum at gmail.com Fri Jan 14 07:20:40 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Jan 14 07:20:43 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com> References: <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <20050113143421.GA39649@prometheusresearch.com> <5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it> <20050113182142.GC35655@prometheusresearch.com> <0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it> <7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it> <20050114010307.GA51446@prometheusresearch.com> <5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com> Message-ID: [Guido] > >This may solve the current raging argument, but IMO it would make the > >optional signature declaration less useful, because there's no way to > >accept other kinds of adapters. I'd be happier if def f(X: Y) implied > >X = adapt(X, Y). [Phillip] > The problem is that type declarations really want more guarantees about > object identity and state than an unrestricted adapt() can provide, I'm not so sure. When I hear "guarantee" I think of compile-time checking, and I thought that was a no-no. > including sane behavior when objects are passed into the same or different > functions repeatedly. See this short post by Paul Moore: > > http://mail.python.org/pipermail/python-dev/2005-January/051020.html Hm. Maybe that post points out that adapters that add state are bad, period. I have to say that the example of adapting a string to a file using StringIO() is questionable. Another possible adaptation from a string to a file would be open(), and in fact I know a couple of existing APIs in the Python core (and elsewhere) that take either a string or a file, and interpret the string as a filename. Operations that are customarily done with string data or a file typically use two different function/method names, for example pickle.load and pickle.loads.
But I'd be just as happy if an API taking either a string or a file (stream) should be declared as taking the union of IString and IStream; adapting to a union isn't that hard to define (I think someone gave an example somewhere already). OK, so what am I saying here (rambling really): my gut tells me that I still like argument declarations to imply adapt(), but that adapters should be written to be stateless. (I'm not so sure I care about lossless.) Are there real-life uses of stateful adapters that would be thrown out by this requirement? > Even if you're *very* careful, your seemingly safe setup can be blown just > by one routine passing its argument to another routine, possibly causing an > adapter to be adapted. This is a serious pitfall because today when you > 'adapt' you can also access the "original" object -- you have to first > *have* it, in order to *adapt* it. How often is this used, though? I can imagine all sorts of problems if you mix access to the original object and to the adapter. > But type declarations using adapt() > prevents you from ever *seeing* the original object within a function. So, > it's *really* unsafe in a way that explicitly calling 'adapt()' is > not. You might be passing an adapter to another function, and then that > function's signature might adapt it again, or perhaps just fail because you > have to adapt from the original object. Real-life example, please? I can see plenty of cases where this could happen with explicit adaptation too, for example f1 takes an argument and adapts it, then calls f2 with the adapted value, which calls f3, which adapts it to something else. Where is f3 going to get the original object? 
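[Editor's note: the aside above that "adapting to a union isn't that hard to define" might look like the following sketch; `Union` and the isinstance-only `adapt()` are illustrative assumptions, not an actual proposal from the thread.]

```python
# Hypothetical union adaptation: try each member protocol in order and
# return the first success.
import io

class Union:
    def __init__(self, *protocols):
        self.protocols = protocols

def adapt(obj, protocol):
    if isinstance(protocol, Union):
        for p in protocol.protocols:
            try:
                return adapt(obj, p)
            except TypeError:
                continue
        raise TypeError("cannot adapt %r to any of %r"
                        % (obj, protocol.protocols))
    if isinstance(obj, protocol):     # a real adapt() would consult hooks
        return obj
    raise TypeError("cannot adapt %r to %r" % (obj, protocol))

# An API taking "a string (filename) or a stream", as discussed above:
StringOrStream = Union(str, io.IOBase)

assert adapt("spam.pkl", StringOrStream) == "spam.pkl"   # kept as filename
buf = io.StringIO("data")
assert adapt(buf, StringOrStream) is buf                 # already a stream
```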
I wonder if a number of these cases are isomorphic to the hypothetical adaptation from a float to an int using the int() constructor -- no matter how we end up defining adaptation, that should *not* happen, and neither should adaptation from numbers to strings using str(), or from strings to numbers using int() or float(). But the solution IMO is not to weigh down adapt(), but to agree, as a user community, not to create such "bad" adapters, period. OTOH there may be specific cases where the conventions of a particular application or domain make stateful or otherwise naughty adapters useful, and everybody understands the consequences and limitations. Sort of the way that NumPy defines slices as views on the original data, even though lists define slices as copies of the original data; you have to know what you are doing with the NumPy slices but the NumPy users don't seem to have a problem with that. (I think.) > Clark's proposal isn't going to solve this issue for PEP 246, alas. In > order to guarantee safety of adaptive type declarations, the implementation > strategy *must* be able to guarantee that 1) adapters do not have state of > their own, and 2) adapting an already-adapted object re-adapts the original > rather than creating a new adapter. This is what the monkey-typing PEP and > prototype implementation are intended to address. Guarantees again. I think it's hard to provide these, and it feels unpythonic. (2) feels weird too -- almost as if it were to require that float(int(3.14)) should return 3.14. That ain't gonna happen. > (This doesn't mean that explicit adapt() still isn't a useful thing, it > just means that using it for type declarations is a bad idea in ways that > we didn't realize until after the "great debate".) 
Or maybe we shouldn't try to guarantee so much and instead define simple, "Pythonic" semantics and live with the warts, just as we do with mutable defaults and a whole slew of other cases where Python makes a choice rooted in what is easy to explain and implement (for example allowing non-Liskovian subclasses). Adherence to a particular theory about programming is not very Pythonic; doing something that superficially resembles what other languages are doing but actually uses a much more dynamic mechanism is (for example storing instance variables in a dict, or defining assignment as name binding rather than value copying). My 0.02 MB, -- --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Fri Jan 14 08:38:05 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Jan 14 08:36:33 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: References: <5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <20050113143421.GA39649@prometheusresearch.com> <5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it> <20050113182142.GC35655@prometheusresearch.com> <0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it> <7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it> <20050114010307.GA51446@prometheusresearch.com> <5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050114014238.0308e850@mail.telecommunity.com> At 10:20 PM 1/13/05 -0800, Guido van Rossum wrote: >[Guido] > > >This may solve the current raging argument, but IMO it would make the > > >optional signature declaration less useful, because there's no way to > > >accept other kinds of adapters. I'd be happier if def f(X: Y) implied > > >X = adapt(X, Y). > >[Phillip] > > The problem is that type declarations really want more guarantees about > > object identity and state than an unrestricted adapt() can provide, > >I'm not so sure.
>When I hear "guarantee" I think of compile-time >checking, and I thought that was a no-no. No, it's not compile-time based, it's totally at runtime. I mean that if the implementation of 'adapt()' *generates* the adapter (cached of course for source/target type pairs), it can trivially guarantee that adapter's stateless. Quick demo (strawman syntax) of declaring adapters... First, a type declaring that its 'read' method has the semantics of 'file.read': class SomeKindOfStream: def read(self, byteCount) like file.read: ... Second, third-party code adapting a string iterator to a readable file: def read(self, byteCount) like file.read for type(iter("")): # self is a string iterator here, implement read() # in terms of its .next() And third, some standalone code implementing an "abstract" dict.update method for any source object that supports a method that's "like" dict.__setitem__: def update_anything(self:dict, other:dict) like dict.update for object: for k,v in other.items(): self[k] = v Each of these examples registers the function as an implementation of the "file.read" operation for the appropriate type. When you want to build an adapter from SomeKindOfStream or from a string iterator to the "file" type, you just access the 'file' type's descriptors, and look up the implementation registered for that descriptor for the source type (SomeKindOfStream or string-iter). If there is no implementation registered for a particular descriptor of 'file', you leave the corresponding attribute off of the adapter class, resulting in a class representing the subset of 'file' that can be obtained for the source class. The result is that you generate a simple adapter class whose only state is a read-only slot pointing to the adapted object, and descriptors that bind the registered implementations to that object. That is, the descriptor returns a bound instancemethod with an im_self of the original object, not the adapter.
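[Editor's note: a toy version of the adapter generation described above can make the "only state is a slot" point concrete. Everything here (`register`, `make_adapter`, `FileLike`, `__subject__`) is an invented sketch, not the actual monkey-typing prototype.]

```python
# Registered implementations are bound to the *original* object; the
# generated adapter's only state is one slot referring to that object.
_impls = {}   # (target_type, op_name, source_type) -> implementation

def register(target, op, source, func):
    _impls[(target, op, source)] = func

class _BoundOp:
    """Descriptor binding a registered implementation to the adapted
    object itself, never to the adapter."""
    def __init__(self, func):
        self.func = func
    def __get__(self, adapter, owner=None):
        subject = adapter.__subject__
        return lambda *args, **kw: self.func(subject, *args, **kw)

def make_adapter(target, source):
    ns = {'__slots__': ('__subject__',)}
    for (t, op, s), func in _impls.items():
        if t is target and s is source:
            # Only registered operations appear, so a *partial*
            # implementation of the target type is acceptable.
            ns[op] = _BoundOp(func)
    cls = type(source.__name__ + '_as_' + target.__name__, (object,), ns)
    def build(obj):
        adapter = cls()
        adapter.__subject__ = obj      # the adapter's only state
        return adapter
    return build

class FileLike:
    """Stand-in 'target' type naming the operation of interest."""

def striter_read(it, bytecount=None):
    # Crude: drain the underlying string iterator (cf. the string
    # iterator example above).
    return ''.join(it)

register(FileLike, 'read', type(iter("")), striter_read)
as_file = make_adapter(FileLike, type(iter("")))
orig_it = iter("abc")
f = as_file(orig_it)
assert f.read() == "abc"
assert f.__subject__ is orig_it        # original object stays reachable
```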
(Thus the implementation never even gets a reference to the adapter, unless 'self' in the method is declared of the same type as the adapter, which would be the case for an abstract method like 'readline()' being implemented in terms of 'read'.) Anyway, it's therefore trivially "guaranteed" to be stateless (in the same way that an 'int' is "guaranteed" to be immutable), and the implementation is also "guaranteed" to be able to always get back the "original" object. Defining adaptation in terms of adapting operations also solves another common problem with interface mechanisms for Python: the dreaded "mapping interface" and "file-like object" problem. Really, being able to *incompletely* implement an interface is often quite useful in practice, so this "monkey see, monkey do" typing ditches the whole concept of a complete interface in favor of "explicit duck typing". You're just declaring "how can X act 'like' a duck" -- emulating behaviors of another type rather than converting structure. >Are there real-life uses of stateful adapters that would be thrown out >by this requirement? Think about this: if an adapter has independent state, that means it has a particular scope of applicability. You're going to keep the adapter and then throw it away at some point, like you do with an iterator. If it has no state, or only state that lives in the original object (by tacking annotations onto it), then it has a common lifetime with the original object. If it has state, then, you have to explicitly manage that state; you can't do that if the only way to create an adapter is to pass it into some other function that does the adapting, unless all it's going to do is return the adapter back to you! Thus, stateful adapters *must* be explicitly adapted by the code that needs to manage the state. This is why I say that PEP 246 is fine, but type declarations need a more restrictive version. 
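[Editor's note: the "always get back the original object" guarantee discussed above can be sketched with a hypothetical `__subject__` convention for finding the wrapped object, so that adapting an already-adapted object re-adapts the original.]

```python
# Hypothetical convention: adapters expose the wrapped object as
# __subject__, and adaptation always unwraps back to the original.
def original(obj):
    while hasattr(obj, '__subject__'):
        obj = obj.__subject__
    return obj

class Box:
    def __init__(self, value):
        self.value = value

class UpperView:
    """Stateless view adapter over a value-holding object."""
    def __init__(self, obj):
        self.__subject__ = original(obj)   # never wrap another adapter
    def text(self):
        return self.__subject__.value.upper()

b = Box("hello")
once = UpperView(b)
twice = UpperView(once)        # "adapting an adapter"...
assert twice.__subject__ is b  # ...re-adapts the original instead
assert twice.text() == "HELLO"
```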
PEP 246 provides a nice way to *find* stateful adapters; it just shouldn't do it for function arguments.

> > Even if you're *very* careful, your seemingly safe setup can be blown just
> > by one routine passing its argument to another routine, possibly causing an
> > adapter to be adapted. This is a serious pitfall because today when you
> > 'adapt' you can also access the "original" object -- you have to first
> > *have* it, in order to *adapt* it.
>
>How often is this used, though? I can imagine all sorts of problems if
>you mix access to the original object and to the adapter.

Right - and early adopters of PEP 246 are warned about this, either by the PEP or the PyProtocols docs. The PyProtocols docs have dire warnings early on about not forwarding adapted objects to other functions unless you already know the other method needs only the interface you adapted to. However, with type declarations, you may never receive the original object.

> > But type declarations using adapt()
> > prevent you from ever *seeing* the original object within a function. So,
> > it's *really* unsafe in a way that explicitly calling 'adapt()' is
> > not. You might be passing an adapter to another function, and then that
> > function's signature might adapt it again, or perhaps just fail because you
> > have to adapt from the original object.
>
>Real-life example, please?

If you mean an example of code that's currently using adapt() that I'd have changed to use type declarations instead and thereby broken something, I'll have to look for one and get back to you. I have a gut feel/vague recollection that there are some, but I don't know how many. The problem is that the effect is inherently non-local; you can't look at a piece of code using type declarations and have a clue as to whether there's even *potentially* a problem there.
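The hazard described here -- an adapter being adapted again -- is easy to demonstrate with a toy `adapt()`. Everything below is invented purely for illustration; it is not the PEP 246 implementation:

```python
# Toy illustration (invented classes, not PEP 246 itself) of the hazard
# above: with implicit adaptation at function boundaries, a function may
# pass its *adapter* onward, and the next declaration wraps the adapter.

class Original: pass

class AdapterA:
    def __init__(self, obj): self.obj = obj

class AdapterB:
    def __init__(self, obj): self.obj = obj

def naive_adapt(obj, cls):
    # What a naive adapt() does: wrap whatever it is handed.
    return obj if isinstance(obj, cls) else cls(obj)

o = Original()
a = naive_adapt(o, AdapterA)   # f1's declared argument type kicks in
b = naive_adapt(a, AdapterB)   # f2's declaration then adapts the adapter
assert b.obj is a and b.obj is not o   # adapter-of-an-adapter

def readapting_adapt(obj, cls):
    # The restriction argued for above: re-adapt the original, not the adapter.
    while isinstance(obj, (AdapterA, AdapterB)):
        obj = obj.obj
    return obj if isinstance(obj, cls) else cls(obj)

assert readapting_adapt(a, AdapterB).obj is o
```

The second version only works because the adapter exposes a way back to the original object, which is exactly the "lossless" property being argued for.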
>I can see plenty of cases where this could happen with explicit >adaptation too, for example f1 takes an argument and adapts it, then >calls f2 with the adapted value, which calls f3, which adapts it to >something else. Where is f3 going to get the original object? PyProtocols warns people not to do this in the docs, but it can't do anything about enforcing it. >But the solution IMO is not to weigh down adapt(), but to agree, as a >user community, not to create such "bad" adapters, period. Maybe. The thing that inspired me to come up with a new approach is that "bad" adapters are just *sooo* tempting; many of the adapters that we're just beginning to realize are "bad", were ones that Alex and I both initially thought were okay. Making the system such that you get "safe" adapters by default removes the temptation, and provides a learning opportunity to explain why the caller needs to manage the state when creating a stateful adapter. PEP 246 still allows you to leave it implicit how you get the adapter, but it still should be created explicitly by the code that needs to manage its lifetime. > OTOH there >may be specific cases where the conventions of a particular >application or domain make stateful or otherwise naughty adapters >useful, and everybody understands the consequences and limitations. Right; and I think that in those cases, it's the *caller* that needs to (explicitly) adapt, not the callee, because it's the caller that knows the lifetime for which the adapter needs to exist. > > Clark's proposal isn't going to solve this issue for PEP 246, alas. In > > order to guarantee safety of adaptive type declarations, the implementation > > strategy *must* be able to guarantee that 1) adapters do not have state of > > their own, and 2) adapting an already-adapted object re-adapts the original > > rather than creating a new adapter. This is what the monkey-typing PEP and > > prototype implementation are intended to address. > >Guarantees again. 
I think it's hard to provide these, and it feels >unpythonic. Well, right now Python provides lots of guarantees, like that numbers are immutable. It would be no big deal to guarantee immutable adapters, if Python supplies the adapter type for you. >(2) feels weird too -- almost as if it were to require >that float(int(3.14)) should return 3.14. That ain't gonna happen. No, but 'int_wrapper(3.14).original_object' is trivial. The point is that adaptation should just always return a wrapper of a type that's immutable and has a pointer to the original object. If you prefer, call these characteristics "implementation requirements" rather than guarantees. :) >Or maybe we shouldn't try to guarantee so much and instead define >simple, "Pythonic" semantics and live with the warts, just as we do >with mutable defaults and a whole slew of other cases where Python >makes a choice rooted in what is easy to explain and implement (for >example allowing non-Liskovian subclasses). Adherence to a particular >theory about programming is not very Pythonic; doing something that >superficially resembles what other languages are doing but actually >uses a much more dynamic mechanism is (for example storing instance >variables in a dict, or defining assignment as name binding rather >than value copying). Obviously the word "guarantee" hit a hot button; please don't let it obscure the actual merit of the approach, which does not involve any sort of compile-time checking. Heck, it doesn't even have interfaces! 
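The 'int_wrapper' strawman above can be made concrete in a few lines. This is just a sketch of the two stated requirements -- immutable, with a pointer back to the original object -- not a proposed implementation:

```python
# Sketch of the strawman above: an adapter type that is immutable and
# always carries a read-only pointer back to the original object.

class int_wrapper(object):
    __slots__ = ('original_object',)

    def __init__(self, obj):
        object.__setattr__(self, 'original_object', obj)

    def __setattr__(self, name, value):
        # "Guaranteed" immutable in the same way an int is: no rebinding.
        raise AttributeError('adapter instances are immutable')

    def __int__(self):
        return int(self.original_object)

w = int_wrapper(3.14)
```

Here `w.original_object` is exactly `3.14`, `int(w)` is `3`, and any attempt to rebind an attribute on `w` raises AttributeError -- the "implementation requirements" in miniature.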
From aleax at aleax.it Fri Jan 14 08:50:27 2005 From: aleax at aleax.it (Alex Martelli) Date: Fri Jan 14 08:50:34 2005 Subject: [Python-Dev] PEP 246, redux In-Reply-To: <41E73795.2070505@ActiveState.com> References: <5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com> <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <5.1.1.6.0.20050111155718.020f5750@mail.telecommunity.com> <2EEE19F5-6429-11D9-ADA4-000A95EFAE9E@aleax.it> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com> <5.1.1.6.0.20050112151418.0337cd00@mail.telecommunity.com> <49A128D8-654E-11D9-ADA4-000A95EFAE9E@aleax.it> <41E73795.2070505@ActiveState.com> Message-ID: On 2005 Jan 14, at 04:08, David Ascher wrote: > Alex Martelli wrote: > >> Yes, there is (lato sensu) "non-determinism" involved, just like in, >> say: >> for k in d: >> print k > > Wow, it took more than the average amount of googling to figure out > that lato sensu means "broadly speaking", Ooops -- sorry; I wouldn't have imagined Brazilian hits would swamp the google hits to that extent, mostly qualifying post-grad courses and the like... seems to be an idiom there for that. > and occurs as "sensu lato" with a 1:2 ratio. In Latin as she was spoken word order is very free, but the issue here is that _in taxonomy specifically_ (which was the way I intended the form!) the "sensu lato" order vastly predominates. Very exhaustive discussion of this word order choice in taxonomy at , btw (mostly about "sensu scricto", the antonym). > I learned something today! ;-) Me too: about Brazilian idiom, and about preferred word-order use in Aquinas and Bonaventura. Also, a reflection: taxonomy, the classification of things (living beings, rocks, legal precedents, ...) into categories, is a discipline with many, many centuries of experience behind it. 
I think it is telling that taxonomists found out they require _two_ kinds of ``inheritance'' to do their job (no doubt there are all kinds of _nuances_, but specialized technical wording exists for two kinds: "strict-sense" and "broad-sense")... they need to be able to assert that "A is a B _broadly speaking_" (or specifically "_strictly speaking_") so often that they evolved specific terminology. Let's hope it doesn't take OOP many centuries to accept that both "stricto sensu inheritance" (Liskovianly-correct) AND "lato sensu inheritance" are needed to do _our_ jobs!-)

Alex

From just at letterror.com Fri Jan 14 10:09:26 2005
From: just at letterror.com (Just van Rossum)
Date: Fri Jan 14 10:09:33 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To:
Message-ID:

Guido van Rossum wrote:

> Are there real-life uses of stateful adapters that would be thrown out
> by this requirement?

Here are two interfaces we're using in a project:

    http://just.letterror.com/ltrwiki/PenProtocol (aka "SegmentPen")
    http://just.letterror.com/ltrwiki/PointPen

They're both abstractions for drawing glyphs (characters from a font). Sometimes the former is more practical and sometimes the latter. We really need both interfaces. Yet they can't be adapted without keeping some state in the adapter.

Implicit adaptations may be dangerous here, but I'm not so sure I care. In my particular use case, it will be very rare that people want to do

    funcTakingPointPen(segmentPen)
    otherFuncTakingPointPen(segmentPen)

I don't think it will be a problem in general that my adapter carries a bit of state, and if it _does_ become a problem, it's easy to work around. It's not dissimilar to file.readline() vs. file.next(): sure, it's not pretty that file iteration doesn't work nicely with readline(), but all bug reports about that get closed as "won't fix" ;-). It's something you can easily learn to live with. That said, I don't think implicit adaptation from str to file is a good idea...
Python (the std lib, really) shouldn't use "dangerous" adapters for implicit adaptation, but that doesn't mean it should be impossible to do so anyway. [ ... ] > But the solution IMO is not to weigh down adapt(), but to agree, as a > user community, not to create such "bad" adapters, period. OTOH there > may be specific cases where the conventions of a particular > application or domain make stateful or otherwise naughty adapters > useful, and everybody understands the consequences and limitations. > Sort of the way that NumPy defines slices as views on the original > data, even though lists define slices as copies of the original data; > you have to know what you are doing with the NumPy slices but the > NumPy users don't seem to have a problem with that. (I think.) [ ... ] > Guarantees again. I think it's hard to provide these, and it feels > unpythonic. [ ... ] > Or maybe we shouldn't try to guarantee so much and instead define > simple, "Pythonic" semantics and live with the warts, just as we do > with mutable defaults and a whole slew of other cases where Python > makes a choice rooted in what is easy to explain and implement (for > example allowing non-Liskovian subclasses). Adherence to a particular > theory about programming is not very Pythonic; doing something that > superficially resembles what other languages are doing but actually > uses a much more dynamic mechanism is (for example storing instance > variables in a dict, or defining assignment as name binding rather > than value copying). Yes, yes and yes! 
Just

From arigo at tunes.org Fri Jan 14 10:47:15 2005
From: arigo at tunes.org (Armin Rigo)
Date: Fri Jan 14 10:58:46 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To:
References: <20050113143421.GA39649@prometheusresearch.com> <5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it> <20050113182142.GC35655@prometheusresearch.com> <0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it> <7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it> <20050114010307.GA51446@prometheusresearch.com> <5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com>
Message-ID: <20050114094715.GA21852@vicky.ecs.soton.ac.uk>

Hi Guido,

On Thu, Jan 13, 2005 at 10:20:40PM -0800, Guido van Rossum wrote:
> Hm. Maybe that post points out that adapters that add state are bad,
> period. I have to say that the example of adapting a string to a file
> using StringIO() is questionable. Another possible adaptation from a
> string to a file would be open()

I have some theory about why adapting a string to a file in any way is questionable, but why adapting some more specific class to some other class usually "feels right", and about what "lossy" means.

In my opinion a user-defined class or interface mixes two notions: a "concept" meaningful for the programmer that the instances represent, and the "interface" provided to manipulate it. Adaptation works well at the "concept" level without all the hassles of information loss and surprises of transitive adaptation. The problems show up in the cases where a single concrete interface doesn't obviously match a single "concept". For example, strings "mean" very different concepts in various contexts, e.g. a file name, a URL, the byte content of a document, or the pickled representation of something. Containers have a similar problem. This suggests that only concrete objects which are expected to encode a *single* concept should be used for adaptation.
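Armin's distinction can be sketched with a toy example. The `FileName` and `Document` classes below are invented here purely to make the "single concept" point; they come from no real library:

```python
# Toy sketch (invented classes) of the point above: a bare string is
# ambiguous, but a tiny wrapper pinning down *which* concept the string
# encodes makes adaptation to that concept unambiguous.

class FileName:
    """A string meaning specifically 'the name of a file'."""
    def __init__(self, name):
        self.name = name

class Document:
    """A string meaning specifically 'the byte content of a document'."""
    def __init__(self, content):
        self.content = content

def as_content(obj):
    if isinstance(obj, Document):
        return obj.content          # same concept: safe, lossless
    if isinstance(obj, FileName):
        # A different concept: turning a name into content means reading
        # a file, which should be an explicit step, never implicit.
        raise TypeError('FileName -> content needs an explicit read')
    raise TypeError('bare strings are ambiguous; wrap them first')

assert as_content(Document('hello')) == 'hello'
```

Adapting `Document` to content is uncontroversial; adapting `FileName` or a raw string would silently cross a concept boundary, which is exactly the kind of adaptation being flagged as problematic.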
Note that the theory -- for which I have an old draft at http://arigo.tunes.org/semantic_models.html -- suggests that it is possible to be more precise about various levels of concepts encoding each other, like a string standing for the name of a file itself encoding an image; but I'm not proposing anything similar here, just suggesting a way to realize what kind of adaptation is problematic.

> may be specific cases where the conventions of a particular
> application or domain make stateful or otherwise naughty adapters
> useful, and everybody understands the consequences and limitations.

Note that it may be useful to be able to register some adapters in "local" registries instead of the single global one, to avoid all kinds of unexpected global effects. For example something along the lines of (but nicer than):

    my_registry = AdapterRegister()
    my_registry.register(...)
    my_registry.adapt(x, y)    # direct use

    __adaptregistry__ = my_registry

    def f(x as y):
        # implicit use of the module-local registry
        stuff

This would allow a module to provide the str->StringIO or str->file conversion locally.

A bientot,

Armin.

From carribeiro at gmail.com Fri Jan 14 11:40:52 2005
From: carribeiro at gmail.com (Carlos Ribeiro)
Date: Fri Jan 14 11:40:54 2005
Subject: [Python-Dev] PEP 246, redux
In-Reply-To:
References: <5.1.1.6.0.20050111142120.029483d0@mail.telecommunity.com> <79990c6b05011205445ea4af76@mail.gmail.com> <338C68EC-64A3-11D9-ADA4-000A95EFAE9E@aleax.it> <5.1.1.6.0.20050112134708.02f8cb50@mail.telecommunity.com> <5.1.1.6.0.20050112151418.0337cd00@mail.telecommunity.com> <49A128D8-654E-11D9-ADA4-000A95EFAE9E@aleax.it> <41E73795.2070505@ActiveState.com>
Message-ID: <864d37090501140240628ec1f2@mail.gmail.com>

On Fri, 14 Jan 2005 08:50:27 +0100, Alex Martelli wrote:
> Ooops -- sorry; I wouldn't have imagined Brazilian hits would swamp the
> google hits to that extent, mostly qualifying post-grad courses and the
> like... seems to be an idiom there for that.
'Lato sensu' is used to indicate short post-graduate-level courses that don't confer any recognized degree such as 'MSc' or 'master'. It's pretty much like a specialization course in some specific area, usually offered by small private universities. It's like a fever around here - everyone does it just to add something to the resume - and it has spawned an entire branch of the educational industry (and yeah, 'industry' is the best word for it). Some schools refer to traditional post-graduate courses as 'stricto sensu'. I don't have the slightest idea where they got this naming from.

It's also amazing how many hits you'll get for the wrong spellings 'latu sensu' & 'strictu sensu', mostly from Brazil, and also from some Spanish-speaking countries.

> Also, a reflection: taxonomy, the classification of things (living
> beings, rocks, legal precedents, ...) into categories, is a discipline
> with many, many centuries of experience behind it. I think it is
> telling that taxonomists found out they require _two_ kinds of
> ``inheritance'' to do their job (no doubt there are all kind of
> _nuances_, but specialized technical wording exists for two kinds:
> "strict-sense" and "broad-sense")... they need to be able to assert
> that "A is a B _broadly speaking_" (or specifically "_strictly
> speaking_") so often that they evolved specific terminology. Let's
> hope it doesn't take OOP many centuries to accept that both "stricto
> sensu inheritance" (Liskovianly-correct) AND "lato sensu inheritance"
> are needed to do _our_ jobs!-)

Good point!
-- Carlos Ribeiro Consultoria em Projetos blog: http://rascunhosrotos.blogspot.com blog: http://pythonnotes.blogspot.com mail: carribeiro@gmail.com mail: carribeiro@yahoo.com From skip at pobox.com Fri Jan 14 10:36:21 2005 From: skip at pobox.com (Skip Montanaro) Date: Fri Jan 14 12:01:38 2005 Subject: [Python-Dev] redux: fractional seconds in strptime In-Reply-To: <41E74790.60108@ocf.berkeley.edu> References: <16870.61059.451494.303971@montanaro.dyndns.org> <41E74790.60108@ocf.berkeley.edu> Message-ID: <16871.37525.981821.580939@montanaro.dyndns.org> Brett> The problem I have always had with this proposal is that the Brett> value is worthless, time tuples do not have a slot for fractional Brett> seconds. Yes, it could possibly be changed to return a float for Brett> seconds, but that could possibly break things. Actually, time.strptime() returns a struct_time object. Would it be possible to extend %S to parse floats then add a microseconds (or whatever) field to struct_time objects that is available by attribute only? In Py3k it could worm its way into the tuple representation somehow (either as a new field or by returning seconds as a float). Brett> My vote is that if something is added it be like %N but without Brett> the optional optional digit count. This allows any separator to Brett> be used while still consuming the digits. It also doesn't Brett> suddenly add optional args which are not supported for any other Brett> directive. I realize the %4N notation is distasteful, but without it I think you will have trouble parsing something like 13:02:00.704 What would be the format string? %H:%M:%S.%N would be incorrect. It works if you allow the digit notation: %H:%M:%S.%3N I think that except for the logging module presentation of fractions of a second would almost always use the locale-specific decimal point, so if that problem is fixed, extending %S to understand floating point seconds would be reasonable. 
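For comparison, the digit-count problem discussed above is what the datetime module's %f directive eventually addressed in later Python versions (2.6+): in datetime.strptime, %f accepts one to six digits and right-pads them to microseconds, so no explicit count is needed (time.strptime itself never grew fractional-second support):

```python
# How later Pythons resolved this for datetime (time.strptime never grew
# fractional seconds): the %f directive accepts one to six digits and
# right-pads them to microseconds, so no digit count is needed.

from datetime import datetime

dt = datetime.strptime("13:02:00.704", "%H:%M:%S.%f")
assert (dt.hour, dt.minute, dt.second) == (13, 2, 0)
assert dt.microsecond == 704000   # ".704" seconds, right-padded
```

Note that %f still requires an explicit separator in the format string (here the '.'), which sidesteps the locale-specific decimal point question rather than answering it.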
Skip From aleax at aleax.it Fri Jan 14 12:40:41 2005 From: aleax at aleax.it (Alex Martelli) Date: Fri Jan 14 12:40:46 2005 Subject: [Python-Dev] redux: fractional seconds in strptime In-Reply-To: <16871.37525.981821.580939@montanaro.dyndns.org> References: <16870.61059.451494.303971@montanaro.dyndns.org> <41E74790.60108@ocf.berkeley.edu> <16871.37525.981821.580939@montanaro.dyndns.org> Message-ID: <1E3B87D3-6621-11D9-ADA4-000A95EFAE9E@aleax.it> On 2005 Jan 14, at 10:36, Skip Montanaro wrote: > > Brett> The problem I have always had with this proposal is that the > Brett> value is worthless, time tuples do not have a slot for > fractional > Brett> seconds. Yes, it could possibly be changed to return a > float for > Brett> seconds, but that could possibly break things. > > Actually, time.strptime() returns a struct_time object. Would it be > possible to extend %S to parse floats then add a microseconds (or > whatever) > field to struct_time objects that is available by attribute only? In > Py3k > it could worm its way into the tuple representation somehow (either as > a new > field or by returning seconds as a float). +1 -- I never liked the idea that 'time tuples' lost fractions of a second. On platforms where that's sensible and not too hard, time.time() could also -- unobtrusively and backwards compatibly -- set that same attribute. I wonder if, where the attribute's real value is unknown, it should be None (a correct indication of "I dunno") or 0.0 (maybe handier); instinctively, I would prefer None. "Available by attribute only" is probably sensible, overall, but maybe strftime should make available whatever formatting item[s] strptime may grow to support fractions of a second; and one such item (distinct from %S for guaranteed backwards compatibility) should be "seconds and fraction, with [[presumably, locale-specific]] decimal point inside". 
Alex From barry at python.org Fri Jan 14 12:46:09 2005 From: barry at python.org (Barry Warsaw) Date: Fri Jan 14 12:46:12 2005 Subject: [Python-Dev] redux: fractional seconds in strptime In-Reply-To: <1E3B87D3-6621-11D9-ADA4-000A95EFAE9E@aleax.it> References: <16870.61059.451494.303971@montanaro.dyndns.org> <41E74790.60108@ocf.berkeley.edu> <16871.37525.981821.580939@montanaro.dyndns.org> <1E3B87D3-6621-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: <1105703168.12628.21.camel@presto.wooz.org> On Fri, 2005-01-14 at 06:40, Alex Martelli wrote: > +1 -- I never liked the idea that 'time tuples' lost fractions of a > second. On platforms where that's sensible and not too hard, > time.time() could also -- unobtrusively and backwards compatibly -- set > that same attribute. I wonder if, where the attribute's real value is > unknown, it should be None (a correct indication of "I dunno") or 0.0 > (maybe handier); instinctively, I would prefer None. None feels better. I've always thought it was kind of icky for datetimes to use microseconds=0 to decide whether to print the fractional second part or not for isoformat(), e.g.: >>> import datetime >>> now = datetime.datetime.now() >>> now.isoformat() '2005-01-14T06:44:18.013832' >>> now.replace(microsecond=0).isoformat() '2005-01-14T06:44:18' -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050114/b1f9bc15/attachment.pgp From marktrussell at btopenworld.com Fri Jan 14 13:24:24 2005 From: marktrussell at btopenworld.com (Mark Russell) Date: Fri Jan 14 13:24:27 2005 Subject: [Python-Dev] redux: fractional seconds in strptime In-Reply-To: <16871.37525.981821.580939@montanaro.dyndns.org> References: <16870.61059.451494.303971@montanaro.dyndns.org> <41E74790.60108@ocf.berkeley.edu> <16871.37525.981821.580939@montanaro.dyndns.org> Message-ID: <1105705464.5494.9.camel@localhost> On Fri, 2005-01-14 at 09:36, Skip Montanaro wrote: > Actually, time.strptime() returns a struct_time object. Would it be > possible to extend %S to parse floats then add a microseconds (or whatever) > field to struct_time objects that is available by attribute only? +1 for adding a microseconds field to struct_time, but I'd also like to see an integer-only way of parsing fractional seconds in time.strptime. Using floating point makes it harder to support exact comparison of timestamps (an issue I recently ran into when writing unit tests for code storing timestamps in a database). My vote is for %N producing a microseconds field. Mark Russell From cce at clarkevans.com Fri Jan 14 14:30:44 2005 From: cce at clarkevans.com (Clark C. Evans) Date: Fri Jan 14 14:30:46 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <000d01c4f9dc$138633a0$e841fea9@oemcomputer> References: <20050114010307.GA51446@prometheusresearch.com> <000d01c4f9dc$138633a0$e841fea9@oemcomputer> Message-ID: <20050114133044.GA37099@prometheusresearch.com> On Thu, Jan 13, 2005 at 08:54:33PM -0500, Raymond Hettinger wrote: | > Since __conform__ and __adapt__ | > would sprout two new arguments, it would make those writing adapters | > think a bit more about the kind of adapter that they are providing. 
|
| Using optional arguments may not be the most elegant or extensible
| approach. Perhaps a registry table or adapter attributes would fare
| better.

I'm not sure how either of these would work, since the adapt() function could return `self`. Adapter attributes wouldn't work in that case (or would they?), and since adapters could be given dynamically by __adapt__ or __conform__, a registry isn't all that appropriate. Perhaps we could just pass around a single **kwargs?

Best,

Clark

From cce at clarkevans.com Fri Jan 14 15:19:39 2005
From: cce at clarkevans.com (Clark C. Evans)
Date: Fri Jan 14 15:19:41 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com>
References: <41E5EFF6.9090408@colorstudy.com> <79990c6b05011302352cbd41de@mail.gmail.com> <20050113143421.GA39649@prometheusresearch.com> <5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it> <20050113182142.GC35655@prometheusresearch.com> <0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it> <7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it> <20050114010307.GA51446@prometheusresearch.com> <5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com>
Message-ID: <20050114141939.GB37099@prometheusresearch.com>

On Fri, Jan 14, 2005 at 12:11:10AM -0500, Phillip J. Eby wrote:
| Clark's proposal isn't going to solve this issue for PEP 246, alas. In
| order to guarantee safety of adaptive type declarations, the
| implementation strategy *must* be able to guarantee that 1) adapters do
| not have state of their own, and 2) adapting an already-adapted object
| re-adapts the original rather than creating a new adapter.

1. Following Raymond's idea for allowing adaptation to reflect more arbitrary properties (which can be used to provide restrictions on the kinds of adapters expected):

    adapt(object, protocol, default = False, **properties)

        Request adaptation, where the result matches a set of
        properties, such as 'lossless' or 'stateless'.
    __conform__(self, protocol, **properties)
    __adapt__(self, object, **properties)

        Conform/adapt, but optionally parameterized by a set of
        restrictions. The **properties can be used to inform the
        adaptation.

    register(from, to, adapter = None, predicate = None)

        Register an adaptation path from one protocol to another,
        optionally providing an adapter. If no adapter is provided,
        then adapt(from, to, **properties) is used when adapting. If a
        predicate is provided, then the adaptation path is available
        only if predicate(**properties) returns True.

2. Perhaps if we just provide a mechanism for an adapter to specify that it's OK to be used "implicitly" via the declaration syntax?

    def fun(x: Y):
        ...

is equivalent to,

    def fun(x):
        x = adapt(x, Y, declaration = True)

On Thu, Jan 13, 2005 at 05:52:10PM -0800, Guido van Rossum wrote:
| This may solve the current raging argument, but IMO it would
| make the optional signature declaration less useful, because
| there's no way to accept other kinds of adapters. I'd be happier
| if def f(X: Y) implied X = adapt(X, Y).

Ideally, yes. However, some adapters may want to explicitly disable their usage in this context -- so some differentiation is warranted. This 'revised' proposal puts the burden on the adapter (or its registration) to specify that it shouldn't be used in this context. I'm carefully using 'declaration' as the restriction, not 'stateless'. One may have a stateful adapter which is most appropriate to be used in declarations (see Armin's insightful post).

Furthermore, the 'full' version of adapt(), where argument 'restrictions' can be specified, could be done via a decorator syntax:

    @adapt(x, Y, **properties)

I hope this helps.

P.S. Clearly there is much information to be captured in this thread and put into the PEP (mostly as appendix material); keep posting good ideas, problems, opinions, whatever -- I will summarize over this weekend.

--
Clark C. Evans Prometheus Research, LLC.
http://www.prometheusresearch.com/ o office: +1.203.777.2550 ~/ , mobile: +1.203.444.0557 // (( Prometheus Research: Transforming Data Into Knowledge \\ , \/ - Research Exchange Database /\ - Survey & Assessment Technologies ` \ - Software Tools for Researchers ~ *

From ncoghlan at iinet.net.au Fri Jan 14 15:24:15 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Fri Jan 14 15:24:19 2005
Subject: [Python-Dev] frame.f_locals is writable
In-Reply-To: <41E6DD52.2080109@ieee.org>
References: <41E6DD52.2080109@ieee.org>
Message-ID: <41E7D60F.9000208@iinet.net.au>

Shane Holloway (IEEE) wrote:
> For a little background, I'm working on making edit-and-continue
> support in python a little more robust. So, in replacing references to
> unmodifiable types like tuples and bound-methods (instance or class), I
> iterate over gc.get_referrers.
>
> So, I'm working on frame types, and wrote this code::
>
>     def replaceFrame(self, ref, oldValue, newValue):
>         for name, value in ref.f_locals.items():
>             if value is oldValue:
>                 ref.f_locals[name] = newValue
>                 assert ref.f_locals[name] is newValue

FWIW, this should work:

    def replaceFrame(self, ref, oldValue, newValue):
        for name, value in ref.f_locals.items():
            if value is oldValue:
                exec "ref.f_locals[name] = newValue"
                assert ref.f_locals[name] is newValue

And, no, you don't have to tell me that this is an evil hack. I already know that, since I discovered it earlier this evening by poking around in the C source code for PyFrame_LocalsToFast and then looking to see what code calls that function :)

Cheers,
Nick

--
Nick Coghlan | ncoghlan@email.com | Brisbane, Australia
---------------------------------------------------------------
http://boredomandlaziness.skystorm.net

From cce at clarkevans.com Fri Jan 14 15:39:33 2005
From: cce at clarkevans.com (Clark C.
Evans)
Date: Fri Jan 14 15:39:36 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <20050114094715.GA21852@vicky.ecs.soton.ac.uk>
References: <5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it> <20050113182142.GC35655@prometheusresearch.com> <0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it> <7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it> <20050114010307.GA51446@prometheusresearch.com> <5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com> <20050114094715.GA21852@vicky.ecs.soton.ac.uk>
Message-ID: <20050114143933.GC37099@prometheusresearch.com>

On Fri, Jan 14, 2005 at 09:47:15AM +0000, Armin Rigo wrote:
| In my opinion a user-defined class or interface mixes two notions: a
| "concept" meaningful for the programmer that the instances
| represent, and the "interface" provided to manipulate it.
...
| This suggests that only concrete objects which are expected to
| encode a *single* concept should be used for adaptation.

So, in this view of the world, the adapter from FileName to File _is_ appropriate, but the adapter from String to FileName isn't?

    def checkSecurity(filename: FileName):
        ...

Hmm. I'd like to be able to pass in a String here, and use that String->FileName adapter. So, there isn't a problem yet; although String is vague in a sense, it doesn't hurt to specialize it in the context that I have in mind.

    def checkContent(file: File):
        ... look for well known viruses ...

    def checkSecurity(filename: FileName):
        ... look for nasty path information ...
        return checkContent(filename)

Even this is _ok_ since the conceptual jump is specified by the programmer between the two stages. The problem happens when one does...

    checkContent("is-this-a-filename-or-is-this-content")

This is where we run into issues: when an adapter which 'specializes' the content is used implicitly in a transitive adaptation chain.
| Note that it may be useful to be able to register some adapaters | in "local" registeries instead of the single global one, to avoid | all kinds of unexpected global effects. Nice... Best, Clark -- Clark C. Evans Prometheus Research, LLC. http://www.prometheusresearch.com/ o office: +1.203.777.2550 ~/ , mobile: +1.203.444.0557 // (( Prometheus Research: Transforming Data Into Knowledge \\ , \/ - Research Exchange Database /\ - Survey & Assessment Technologies ` \ - Software Tools for Researchers ~ * From pje at telecommunity.com Fri Jan 14 16:07:00 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Jan 14 16:05:25 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: References: Message-ID: <5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com> At 10:09 AM 1/14/05 +0100, Just van Rossum wrote: >Guido van Rossum wrote: > > > Are there real-life uses of stateful adapters that would be thrown out > > by this requirement? > >Here are two interfaces we're using in a project: > > http://just.letterror.com/ltrwiki/PenProtocol (aka "SegmentPen") > http://just.letterror.com/ltrwiki/PointPen > >They're both abstractions for drawing glyphs (characters from a font). >Sometimes the former is more practical and sometimes the latter. We >really need both interfaces. Yet they can't be adapted without keeping >some state in the adapter. Maybe I'm missing something, but for those interfaces, isn't it okay to keep the state in the *adapted* object here? In other words, if PointPen just added some private attributes to store the extra data? >Implicit adaptations may be dangerous here, but I'm not so sure I care. >In my particular use case, it will be very rare that people want to do > > funcTakingPointPen(segmentPen) > otherFuncTakingPointPen(segmentPen) But if the extra state were stored on the segmentPen rather than the adapter, this would work correctly, wouldn't it? Whereas with it stored in an adapter, it wouldn't. 
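Phillip's suggestion can be sketched concretely. The names below are illustrative only -- these are not the real SegmentPen/PointPen APIs -- but they show the idea: the adapter itself stays stateless, parking its working data in a private attribute on the adapted pen, so two separate adaptations of the same pen share state instead of each starting fresh.

```python
# Sketch (invented names, not the real pen APIs) of storing adapter state
# on the *adapted* object: the adapter keeps no state of its own, so
# adapting the same pen twice yields adapters that share their progress.

class SegmentPen:
    def __init__(self):
        self.segments = []

class PointToSegmentAdapter:
    def __init__(self, pen):
        self.pen = pen
        # All mutable state lives on the adapted pen, not on the adapter.
        if not hasattr(pen, '_pending_points'):
            pen._pending_points = []

    def add_point(self, pt):
        self.pen._pending_points.append(pt)

    def end_path(self):
        self.pen.segments.append(tuple(self.pen._pending_points))
        self.pen._pending_points = []

pen = SegmentPen()
a1 = PointToSegmentAdapter(pen)
a1.add_point((0, 0))
a2 = PointToSegmentAdapter(pen)   # a second, independent adaptation
a2.add_point((1, 1))
a2.end_path()
assert pen.segments == [((0, 0), (1, 1))]   # both points survived
```

Had the pending-point list lived on the adapter instead, the second adaptation would have silently dropped the first point -- which is exactly the funcTakingPointPen(segmentPen) scenario discussed above.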
From pje at telecommunity.com Fri Jan 14 16:22:36 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Jan 14 16:21:02 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <20050114094715.GA21852@vicky.ecs.soton.ac.uk> References: <20050113143421.GA39649@prometheusresearch.com> <5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it> <20050113182142.GC35655@prometheusresearch.com> <0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it> <7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it> <20050114010307.GA51446@prometheusresearch.com> <5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050114101514.0384ddc0@mail.telecommunity.com> At 09:47 AM 1/14/05 +0000, Armin Rigo wrote: >For example, strings "mean" very >different concepts in various contexts, e.g. a file name, an url, the byte >content a document, or the pickled representation of something. Note that this is solvable in practice by the author of a method or framework choosing to define an interface that they accept, and then pre-defining the adaptation from string to that interface. So, what a string "means" in that context is pre-defined. The interpretation problem for strings comes only when a third party attempts to define adaptation from a string to a context that takes some more generic interface. >This would allow a module to provide the str->StringIO or str->file conversion >locally. It also works for the module to define a target interface and register an adapter to that, and introduces less complexity into the adaptation system. From just at letterror.com Fri Jan 14 17:27:53 2005 From: just at letterror.com (Just van Rossum) Date: Fri Jan 14 17:27:59 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com> Message-ID: Phillip J. 
Eby wrote: > At 10:09 AM 1/14/05 +0100, Just van Rossum wrote: > >Guido van Rossum wrote: > > > > > Are there real-life uses of stateful adapters that would be > > > thrown out by this requirement? > > > >Here are two interfaces we're using in a project: > > > > http://just.letterror.com/ltrwiki/PenProtocol (aka "SegmentPen") > > http://just.letterror.com/ltrwiki/PointPen > > > >They're both abstractions for drawing glyphs (characters from a > >font). Sometimes the former is more practical and sometimes the > >latter. We really need both interfaces. Yet they can't be adapted > >without keeping some state in the adapter. > > Maybe I'm missing something, but for those interfaces, isn't it okay > to keep the state in the *adapted* object here? In other words, if > PointPen just added some private attributes to store the extra data? > > > >Implicit adaptations may be dangerous here, but I'm not so sure I > >care. In my particular use case, it will be very rare that people > >want to do > > > > funcTakingPointPen(segmentPen) > > otherFuncTakingPointPen(segmentPen) > > But if the extra state were stored on the segmentPen rather than the > adapter, this would work correctly, wouldn't it? Whereas with it > stored in an adapter, it wouldn't. Are you saying the adapter could just hijack some attrs on the adapted object? Or are you saying the adapted object should be aware of the adapter? Both don't sound right, so I hope I'm misunderstanding...
Just From gvanrossum at gmail.com Fri Jan 14 17:32:42 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Jan 14 17:32:45 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <5.1.1.6.0.20050114014238.0308e850@mail.telecommunity.com> References: <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <20050113182142.GC35655@prometheusresearch.com> <0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it> <7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it> <20050114010307.GA51446@prometheusresearch.com> <5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com> <5.1.1.6.0.20050114014238.0308e850@mail.telecommunity.com> Message-ID: [Phillip] > Quick demo (strawman syntax) of declaring adapters... > > First, a type declaring that its 'read' method has the semantics of > 'file.read': > > class SomeKindOfStream: > def read(self, byteCount) like file.read: > ... [and more like this] Sorry, this is dead in the water. I have no desire to add syntax complexities like this to satisfy some kind of theoretically nice property. > Second, third-party code adapting a string iterator to a readable file: We need to pick a better example; I like Armin's hypothesis that adapting strings to more specific things is abuse of the adapter concept (paraphrased). > >Are there real-life uses of stateful adapters that would be thrown out > >by this requirement? > > Think about this: [...] No, I asked for a real-life example. Just provided one, and I'm satisfied that stateful adapters can be useful. ["proof" omitted] > Thus, stateful adapters *must* be explicitly adapted by the code that needs > to manage the state. This doesn't prove it at all to me. > This is why I say that PEP 246 is fine, but type declarations need a more > restrictive version. PEP 246 provides a nice way to *find* stateful > adapters, it just shouldn't do it for function arguments. You haven't proven that for me. 
The example quoted earlier involving print_next_line() does nothing to prove it, since it's a bad use of adaptation for a different reason: string -> file adaptation is abuse. > >But the solution IMO is not to weigh down adapt(), but to agree, as a > >user community, not to create such "bad" adapters, period. > > Maybe. The thing that inspired me to come up with a new approach is that > "bad" adapters are just *sooo* tempting; many of the adapters that we're > just beginning to realize are "bad", were ones that Alex and I both > initially thought were okay. One of my hesitations about adding adapt() and interfaces to the core language has always been that it would change the "flavor" of much of the Python programming we do and that we'd have to relearn how to write good code. There are other places in Python where it can be tempting to use its features in a way that can easily cause trouble (the extremely dynamic nature of the language is always tempting); we tend not to invent new syntax to fix this but instead develop idioms that avoid the issues. IOW, I don't want to make it syntactically impossible to write bad adapters, but we'll have to develop a set of guidelines for writing good adapters. I don't believe for a second that all stateful adapters are bad, even though I expect that stateless lossless adapters are always good. I like Armin's hypothesis better. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/)

From arigo at tunes.org Fri Jan 14 17:39:00 2005 From: arigo at tunes.org (Armin Rigo) Date: Fri Jan 14 17:50:29 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <5.1.1.6.0.20050114101514.0384ddc0@mail.telecommunity.com> References: <5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it> <20050113182142.GC35655@prometheusresearch.com> <0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it> <7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it> <20050114010307.GA51446@prometheusresearch.com> <5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com> <5.1.1.6.0.20050114101514.0384ddc0@mail.telecommunity.com> Message-ID: <20050114163900.GA21005@vicky.ecs.soton.ac.uk> Hi Phillip, On Fri, Jan 14, 2005 at 10:22:36AM -0500, Phillip J. Eby wrote: > Note that this is solvable in practice by the author of a method or > framework choosing to define an interface that they accept, and then > pre-defining the adaptation from string to that interface. So, what a > string "means" in that context is pre-defined. I'm trying to reserve the usage of "interface" to something more concrete: the concrete ways we have to manipulate a given object (typically a set of methods including some unwritten expectations). We might be talking about the same thing, then, but just to check: I'm making sense of the above paragraph in two steps. First, I read it with "interface" replaced by "concept": the author of the method chooses what concepts the input arguments carry: a file or a file name, for example. Then he chooses which particular interface he'd like to access the input arguments through: if it's a file, then it's probably via the standard file-like methods; if it's a file name, then it's probably as a string. It's important to do it in two steps, even if in practice a lot of concepts typically come with a single associated interface (both together, they are a "duck type").
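[Armin's two-step separation — concept first, concrete interface second — can be made tangible by wrapping plain strings in explicit concept types, so the callee never has to guess. The FileName/FileContent/load names below are invented for illustration, not anything from the thread's code.]

```python
class FileName(str):
    """A string that *names* a file (concept made explicit)."""

class FileContent(bytes):
    """Bytes that *are* the data itself."""

def load(source):
    # The callee no longer guesses what a bare string means.
    if isinstance(source, FileName):
        with open(source, 'rb') as f:
            return f.read()
    if isinstance(source, FileContent):
        return bytes(source)
    raise TypeError("ambiguous bare string: wrap it in FileName or FileContent")
```

A bare string is rejected outright, which is exactly the "should fail, in favor of something more explicit" behavior Armin argues for below.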
The programmer using existing methods also goes through two steps: first, he considers the "intuitive" signature of the method, which includes a reasonable name and conceptual arguments. Then he traditionally has to care about the precise interface that the callee expects. For example, he knows that in some specific situation he wants to use marshal.load[s](something), but he has to check precisely which interface the function expects for 'something': a file name string, a file-like object, a real file, a content string? Adaptation should make the latter part more automatic, and nothing more. Ideally, both the caller and the callee know (and write down) that the function's argument is a "reference to some kind of file stuff", a very general concept; then they can independently specify which concrete object they expect and provide, e.g. "a string naming a file", "a file-like object", "a string containing the data". What I see in most arguments about adaptation/conversion/cast is some kind of confusion that would make us believe that the concrete interface (or even worse the formal one) fully defines what underlying concepts they represent. It is true only for end-user application-specific classes. > The interpretation problem for strings comes only when a third party > attempts to define adaptation from a string to a context that takes some > more generic interface. In the above example, there is nothing in the general concept that helps the caller to guess how a plain string will be interpreted, or symmetrically that helps the callee to guess what an incoming plain string means. In my opinion this should fail, in favor of something more explicit. It's already a problem without any third party. > >(...) conversion locally. > > It also works for the module to define a target interface and register an > adapter to that, and introduces less complexity into the adaptation system. Makes sense, but my fear is that people will soon register generic adapters all around... 
debugging nightmares! Armin

From shane.holloway at ieee.org Fri Jan 14 18:17:23 2005 From: shane.holloway at ieee.org (Shane Holloway (IEEE)) Date: Fri Jan 14 18:17:54 2005 Subject: [Python-Dev] frame.f_locals is writable In-Reply-To: <41E6ED20.50103@ocf.berkeley.edu> References: <41E6DD52.2080109@ieee.org> <41E6ED20.50103@ocf.berkeley.edu> Message-ID: <41E7FEA3.10101@ieee.org> Brett C. wrote: > Other option would be to add a function that either directly modified > single values in f_localsplus, a function that takes a dict and > propagates the values, or a function that just calls > PyFrame_LocalsToFast(). Brett!! Thanks for looking this up! With a little help from ctypes, I was able to call PyFrame_LocalsToFast, and it works wonderfully! Maybe this method could be added to the frame type itself? > Personally I am against this, but that is because you would > single-handedly ruin my master's thesis and invalidate any possible > type inferencing one can do in Python without some semantic change. > But then again my thesis shows that amount of type inferencing is not > worth the code complexity so it isn't totally devastating. =) Well, at least in theory this only allows the developer to replace a variable with a better (hopefully) version of a class that is very similar... > And you are right, "don't do that". =) I'm going to only remember this trick in the light of development tools. Really! This magic is WAY too deep for a library. The only use for it that I could really see is a smalltalk-like swap method. Thanks again for your help!
-Shane From shane.holloway at ieee.org Fri Jan 14 18:19:46 2005 From: shane.holloway at ieee.org (Shane Holloway (IEEE)) Date: Fri Jan 14 18:20:20 2005 Subject: [Python-Dev] frame.f_locals is writable In-Reply-To: <41E7D60F.9000208@iinet.net.au> References: <41E6DD52.2080109@ieee.org> <41E7D60F.9000208@iinet.net.au> Message-ID: <41E7FF32.2030502@ieee.org> > FWIW, this should work: > > def replaceFrame(self, ref, oldValue, newValue): > for name, value in ref.f_locals.items(): > if value is oldValue: > exec "ref.f_locals[name] = newValue" > assert ref.f_locals[name] is newValue > > And, no, you don't have to tell me that this is an evil hack. I already > know that, since I discovered it earlier this evening by poking around > in the C source code for PyFrame_LocalsToFast and then looking to see > what code calls that function :) Yes. After poking around in Google with PyFrame_LocalsToFast, I found some other links to people doing that. I implemented a direct call using ctypes to make the code explicit about what's happening. I'm just glad it is possible now. Works fine in both 2.3 and 2.4. Thanks, -Shane From pje at telecommunity.com Fri Jan 14 18:28:16 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Jan 14 18:26:41 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: References: <5.1.1.6.0.20050114014238.0308e850@mail.telecommunity.com> <5.1.1.6.0.20050112095119.02995600@mail.telecommunity.com> <20050113182142.GC35655@prometheusresearch.com> <0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it> <7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it> <20050114010307.GA51446@prometheusresearch.com> <5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com> <5.1.1.6.0.20050114014238.0308e850@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050114114219.03c44bc0@mail.telecommunity.com> At 08:32 AM 1/14/05 -0800, Guido van Rossum wrote: >I have no desire to add syntax >complexities like this to satisfy some kind of theoretically nice >property. 
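[Shane's ctypes call, as best it can be reconstructed from the thread, looks something like the sketch below. It relies on PyFrame_LocalsToFast, a private CPython entry point, so treat it as a debugging-tool sketch rather than library code; later Pythons (PEP 667, 3.13) make frame.f_locals write-through, which is why the call is guarded.]

```python
import ctypes
import sys

def set_local(frame, name, value):
    """Rebind a local variable in a live frame (CPython-specific hack)."""
    frame.f_locals[name] = value
    try:
        # Push the mutated f_locals dict back into the fast-locals array.
        ctypes.pythonapi.PyFrame_LocalsToFast(
            ctypes.py_object(frame), ctypes.c_int(0))
    except AttributeError:
        # Symbol not exported on this build; on Python 3.13+ (PEP 667)
        # the f_locals assignment above is already write-through.
        pass

def demo():
    x = 1
    set_local(sys._getframe(), 'x', 42)
    return x
```

As the thread agrees: "don't do that" outside of development tools.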
Whether it's syntax or a decorator, it allows you to create stateless adapters without needing to write individual adapter *classes*, or even having an explicit notion of an "interface" to adapt to. That is, it makes it very easy to write a "good" adapter; you can do it without even trying. The point isn't to make it impossible to write a "bad" adapter, it's to make it more attractive to write a good one. Also, btw, it's not a "theoretically nice" property. I just tried making PyProtocols 'Adapter' class immutable, and reran PEAK's unit tests, exercising over 100 adapter classes. *Three* had state. All were trivial caching, not per-adapter state. However, even had they *not* been trivial caching, this suggests that there are a *lot* of use cases for stateless adapters, and that means that a trivial 'like' decorator can make it very easy to write stateless adapters. What I'm suggesting is effectively replacing PEP 246's global registry with one that can generate a stateless adapter from individual operation declarations. But it can still fall back on __conform__ and __adapt__ if there aren't any declarations, and we could also require adapt() to return the same adapter instance if an adapter is stateful. >No, I asked for a real-life example. Just provided one, and I'm >satisfied that stateful adapters can be useful. But his example doesn't require *per-adapter* state, just per-original-object state. As long as there's a clean way to support that, his example still works -- and in fact it works *better*, because then that "rare" case he spoke of will work just fine without even thinking about it. Therefore, I think we should make it easy for stateful adapters to link their state to the adapted object, not the adapter instance. This better matches most people's intuitive mental model of adaptation, as judged by the comments of people in this discussion who were new to adaptation. 
If adapt() promised to return the same (stateful) adapter instance each time, then Just's "rare" example would work nicely, without a bug. >One of my hesitations about adding adapt() and interfaces to the core >language has always been that it would change the "flavor" of much of >the Python programming we do and that we'd have to relearn how to >write good code. Exactly! I came up with the monkey typing idea specifically to address this very issue, because the PEP discussion has shown that it is hard to learn to write good adapters, and very easy to be tempted to write bad ones. If there is a very easy way to write good adapters, then it will be more attractive to learn about it. If you have to do a little bit more to get per-object state, and then it's hardest of all to get per-adapter state, the model is a good match to the frequency of those use cases. Even better, it avoids creating the concept of an interface, except that you want something "like" a file or a dictionary. It's the first Python "interface" proposal I know of that can actually spell the loose notion of "file-like" in a concretely useful way! I think the concept can be extended slightly to work with stateful (per-object) adapters, though I'll have to give it some thought and prototyping. >I don't believe for a second that all stateful adapters >are bad, Neither do I. It's *per-adapter-instance* state that's bad, or at least that no good use cases have yet been shown for. If we can make it easy to have *per-adapted-object* state, or guarantee "same-adapter return", then that's even better. For example, if there were a weak reference dictionary mapping objects to their (stateful) adapters, then adapt() could always return the same adapter instance for a given source object, thus guaranteeing a single state. Of course, this would also imply that adapt() needs to know that an adapter is stateful, so that it doesn't keep around lots of trivial stateless adapters. 
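[The weak-reference mapping Phillip describes here might look roughly like the following sketch. The registry API and the Doc/Cursor demo types are invented; the point is only the same-adapter guarantee for stateful adapters.]

```python
import weakref

class StickyRegistry:
    """adapt() returns the *same* stateful adapter per source object."""
    def __init__(self):
        self._factories = {}                        # (type, proto) -> (factory, stateful)
        self._sticky = weakref.WeakKeyDictionary()  # obj -> {proto: adapter}

    def register(self, typ, protocol, factory, stateful=False):
        self._factories[(typ, protocol)] = (factory, stateful)

    def adapt(self, obj, protocol):
        factory, stateful = self._factories[(type(obj), protocol)]
        if not stateful:
            return factory(obj)   # stateless: a fresh trivial adapter is fine
        cache = self._sticky.setdefault(obj, {})
        if protocol not in cache:
            cache[protocol] = factory(obj)
        return cache[protocol]

class Doc:
    pass

class Cursor:
    """A stateful adapter: remembers a position into the document."""
    def __init__(self, doc):
        self.doc, self.pos = doc, 0

reg = StickyRegistry()
reg.register(Doc, 'cursor', Cursor, stateful=True)
d = Doc()
# Same source object => same adapter instance => one shared state:
assert reg.adapt(d, 'cursor') is reg.adapt(d, 'cursor')
```

The WeakKeyDictionary means the cached adapter dies with the adapted object, so stateful adapters don't accumulate.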
Thus, there should be a little more effort required to create this kind of adapter (i.e., you need to say that it's stateful). By the way, I've encountered the need for *this* kind of stateful adapter more than once; PyProtocols has a notion of a StickyAdapter, that keeps per-adapted-object state, which is sometimes needed because you can't hold on to the "same adapter" for some reason. The StickyAdapter attaches itself to the original object, such that when you adapt that object again, you always get the same StickyAdapter instance. In basically all the use cases I found where there's a *really* stateful adapter, I'm using a StickyAdapter, not trying to have per-adapter-instance state. So, what I'm suggesting is that we make it ridiculously easy for somebody to create adapters that either have no state, or that have "sticky" state, and make it obscure at best to create one that has per-adapter-instance state, because nobody has yet presented an example of per-adapter-instance state that wasn't either 1) clearly abuse or 2) would be problematic if adapt() always returned the same adapter instance. From pje at telecommunity.com Fri Jan 14 18:36:12 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Fri Jan 14 18:34:40 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <20050114163900.GA21005@vicky.ecs.soton.ac.uk> References: <5.1.1.6.0.20050114101514.0384ddc0@mail.telecommunity.com> <5A04A886-6588-11D9-ADA4-000A95EFAE9E@aleax.it> <20050113182142.GC35655@prometheusresearch.com> <0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it> <7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it> <20050114010307.GA51446@prometheusresearch.com> <5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com> <5.1.1.6.0.20050114101514.0384ddc0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050114122923.03b55cd0@mail.telecommunity.com> At 04:39 PM 1/14/05 +0000, Armin Rigo wrote: >Ideally, both the caller and the callee know (and write down) that the >function's argument is a "reference to some kind of file stuff", a very >general concept; then they can independently specify which concrete object >they expect and provide, e.g. "a string naming a file", "a file-like object", >"a string containing the data". Yes, exactly! That's what I mean by "one use case, one interface". But as you say, that's because we don't currently have a way to separate these ideas. So, in developing with PyProtocols, I create a new interface for each concept, possibly allowing adapters for some other interface to supply default implementations for that concept. But, for things like strings and such, I define direct adapters to the new concept, so that they override any "generic" adapters as you call them. So, I have a path that looks like: concreteType -> functionalInterface -> conceptInterface Except that there's also a shorter concreteType -> conceptInterface path for various types like string, thus providing context-sensitivity. (Interestingly, strings are the *most* common instance of this situation, as they're one of the most "open to interpretation" objects you can have!) 
> > It also works for the module to define a target interface and register an > adapter to that, and introduces less complexity into the adaptation system. > >Makes sense, but my fear is that people will soon register generic adapters >all around... debugging nightmares! Well, if you have "interface per concept", you *have* a context; the context is the concept itself.

From cce at clarkevans.com Fri Jan 14 18:41:32 2005 From: cce at clarkevans.com (Clark C. Evans) Date: Fri Jan 14 18:41:39 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <20050114163900.GA21005@vicky.ecs.soton.ac.uk> References: <20050113182142.GC35655@prometheusresearch.com> <0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it> <7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it> <20050114010307.GA51446@prometheusresearch.com> <5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com> <5.1.1.6.0.20050114101514.0384ddc0@mail.telecommunity.com> <20050114163900.GA21005@vicky.ecs.soton.ac.uk> Message-ID: <20050114174132.GA46344@prometheusresearch.com> On Fri, Jan 14, 2005 at 04:39:00PM +0000, Armin Rigo wrote: | I'm trying to reserve the usage of "interface" to something more | concrete: the concrete ways we have to manipulate a given object | (typically a set of methods including some unwritten expectations). I'd say that a programmer interface intends to encapsulate both the 'concept' and the 'signature'. The concept is indicated by the names of the function declarations and fields, the signature by the position and type of arguments. | Adaptation should make [passing data between conceptually equivalent | interfaces?] more automatic, and nothing more. Ideally, both the caller | and the callee know (and write down) that the function's argument is a | "reference to some kind of file stuff", a very general concept; then they | can independently specify which concrete object they expect and provide, | e.g. "a string naming a file", "a file-like object", "a string containing | the data".
But it is quite difficult to know when two interfaces are conceptually equivalent... | What I see in most arguments about adaptation/conversion/cast is some kind | of confusion that would make us believe that the concrete interface (or | even worse the formal one) fully defines what underlying concepts they | represent. It is true only for end-user application-specific classes. It seems your distinction comes down to defining 'best practice' for when you define an adapter... and when you don't. Perhaps we don't need to qualify the adapters that exist, as much as make them transparent to the programmer. A bad adapter will most likely be detected _after_ a weird bug has happened. Perhaps the adapt() framework can provide meaningful information in these cases. Imagine enhancing the stack-trace with additional information about what adaptations were made:

Traceback (most recent call last):
  File "xxx", line 1, in foo
    Adapting x to File
  File "yyy", line 384, in bar
    Adapting x to FileName

etc. | In the above example, there is nothing in the general concept that helps | the caller to guess how a plain string will be interpreted, or | symmetrically that helps the callee to guess what an incoming plain | string means. In my opinion this should fail, in favor of something | more explicit. It's already a problem without any third party. How can we express your thoughts so that they fit into a narrative describing how adapt() should and should not be used? If you could respond by re-posting your idea with the 'average python programmer' as your audience it would help me quite a bit when summarizing your contribution to the thread. Best, Clark

From cce at clarkevans.com Fri Jan 14 18:41:50 2005 From: cce at clarkevans.com (Clark C.
Evans) Date: Fri Jan 14 18:41:53 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <5.1.1.6.0.20050114114219.03c44bc0@mail.telecommunity.com> References: <20050113182142.GC35655@prometheusresearch.com> <0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it> <7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it> <20050114010307.GA51446@prometheusresearch.com> <5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com> <5.1.1.6.0.20050114014238.0308e850@mail.telecommunity.com> <5.1.1.6.0.20050114114219.03c44bc0@mail.telecommunity.com> Message-ID: <20050114174149.GB21254@prometheusresearch.com> On Fri, Jan 14, 2005 at 12:28:16PM -0500, Phillip J. Eby wrote: | At 08:32 AM 1/14/05 -0800, Guido van Rossum wrote: | >I have no desire to add syntax | >complexities like this to satisfy some kind of theoretically nice | >property. | | Whether it's syntax or a decorator, it allows you to create stateless | adapters without needing to write individual adapter *classes*, or even | having an explicit notion of an "interface" to adapt to. That is, it | makes it very easy to write a "good" adapter; you can do it without even | trying. The point isn't to make it impossible to write a "bad" adapter, | it's to make it more attractive to write a good one. Phillip, May I suggest that you write this up as a PEP? Being dead in the water isn't always fatal. Right now your ideas are still very fuzzy, and by forcing yourself to come up with a narrative, semantics section, minimal implementation, and examples, you will go a long way to both refining your idea and also allowing others to better understand what you're proposing. Cheers, Clark

From just at letterror.com Fri Jan 14 18:56:52 2005 From: just at letterror.com (Just van Rossum) Date: Fri Jan 14 18:56:57 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <5.1.1.6.0.20050114114219.03c44bc0@mail.telecommunity.com> Message-ID: Phillip J.
Eby wrote: > For example, if there were a weak reference dictionary mapping > objects to their (stateful) adapters, then adapt() could always > return the same adapter instance for a given source object, thus > guaranteeing a single state. Wouldn't that tie the lifetime of the adapter object to that of the source object? Possibly naive question: is using adaptation to go from iterable to iterator abuse? That would be a clear example of per-adapter state. Just From bac at OCF.Berkeley.EDU Fri Jan 14 19:04:05 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Fri Jan 14 19:04:18 2005 Subject: [Python-Dev] redux: fractional seconds in strptime In-Reply-To: <16871.37525.981821.580939@montanaro.dyndns.org> References: <16870.61059.451494.303971@montanaro.dyndns.org> <41E74790.60108@ocf.berkeley.edu> <16871.37525.981821.580939@montanaro.dyndns.org> Message-ID: <41E80995.5030901@ocf.berkeley.edu> Skip Montanaro wrote: > Brett> The problem I have always had with this proposal is that the > Brett> value is worthless, time tuples do not have a slot for fractional > Brett> seconds. Yes, it could possibly be changed to return a float for > Brett> seconds, but that could possibly break things. > > Actually, time.strptime() returns a struct_time object. Would it be > possible to extend %S to parse floats then add a microseconds (or whatever) > field to struct_time objects that is available by attribute only? In Py3k > it could worm its way into the tuple representation somehow (either as a new > field or by returning seconds as a float). > Right, it's a struct_time object; just force of habit to call it a time tuple. And I technically don't see why a fractional second attribute could not be added that is not represented in the tuple. But I personally would like to see struct_tm eliminated in Py3k and replaced with datetime usage. 
My wish is to have the 'time' module stripped down to only the bare essentials that just don't fit in datetime and push everyone to use datetime for most things. > Brett> My vote is that if something is added it be like %N but without > Brett> the optional optional digit count. This allows any separator to > Brett> be used while still consuming the digits. It also doesn't > Brett> suddenly add optional args which are not supported for any other > Brett> directive. > > I realize the %4N notation is distasteful, but without it I think you will > have trouble parsing something like > > 13:02:00.704 > > What would be the format string? %H:%M:%S.%N would be incorrect. Why is that incorrect? -Brett From aahz at pythoncraft.com Fri Jan 14 19:11:01 2005 From: aahz at pythoncraft.com (Aahz) Date: Fri Jan 14 19:11:03 2005 Subject: [Python-Dev] redux: fractional seconds in strptime In-Reply-To: <41E80995.5030901@ocf.berkeley.edu> References: <16870.61059.451494.303971@montanaro.dyndns.org> <41E74790.60108@ocf.berkeley.edu> <16871.37525.981821.580939@montanaro.dyndns.org> <41E80995.5030901@ocf.berkeley.edu> Message-ID: <20050114181101.GB21486@panix.com> On Fri, Jan 14, 2005, Brett C. wrote: > > Right, it's a struct_time object; just force of habit to call it a time > tuple. > > And I technically don't see why a fractional second attribute could not be > added that is not represented in the tuple. But I personally would like to > see struct_tm eliminated in Py3k and replaced with datetime usage. My wish > is to have the 'time' module stripped down to only the bare essentials that > just don't fit in datetime and push everyone to use datetime for most > things. Because of people doing things like year, month, day, hour, min, sec, junk, junk, junk = time.localtime() -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "19. A language that doesn't affect the way you think about programming, is not worth knowing." 
--Alan Perlis

From michel at dialnetwork.com Fri Jan 14 19:02:39 2005 From: michel at dialnetwork.com (Michel Pelletier) Date: Fri Jan 14 19:18:50 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <20050114090955.0BCD41E400D@bag.python.org> References: <20050114090955.0BCD41E400D@bag.python.org> Message-ID: <200501141002.39525.michel@dialnetwork.com> > Date: Fri, 14 Jan 2005 02:38:05 -0500 > From: "Phillip J. Eby" > Subject: Re: [Python-Dev] PEP 246: lossless and stateless > To: guido@python.org > Cc: "Clark C. Evans" , python-dev@python.org > Message-ID: <5.1.1.6.0.20050114014238.0308e850@mail.telecommunity.com> > Content-Type: text/plain; charset="us-ascii"; format=flowed > > Each of these examples registers the function as an implementation of the > "file.read" operation for the appropriate type. When you want to build an > adapter from SomeKindOfStream or from a string iterator to the "file" type, > you just access the 'file' type's descriptors, and look up the > implementation registered for that descriptor for the source type > (SomeKindOfStream or string-iter). If there is no implementation > registered for a particular descriptor of 'file', you leave the > corresponding attribute off of the adapter class, resulting in a class > representing the subset of 'file' that can be obtained for the source class. > > The result is that you generate a simple adapter class whose only state is > a read-only slot pointing to the adapted object, and descriptors that bind > the registered implementations to that object. That is, the descriptor > returns a bound instancemethod with an im_self of the original object, not > the adapter. (Thus the implementation never even gets a reference to the > adapter, unless 'self' in the method is declared of the same type as the > adapter, which would be the case for an abstract method like 'readline()' > being implemented in terms of 'read'.)
> > Anyway, it's therefore trivially "guaranteed" to be stateless (in the same > way that an 'int' is "guaranteed" to be immutable), and the implementation > is also "guaranteed" to be able to always get back the "original" object. > > Defining adaptation in terms of adapting operations also solves another > common problem with interface mechanisms for Python: the dreaded "mapping > interface" and "file-like object" problem. Really, being able to > *incompletely* implement an interface is often quite useful in practice, so > this "monkey see, monkey do" typing ditches the whole concept of a complete > interface in favor of "explicit duck typing". You're just declaring "how > can X act 'like' a duck" -- emulating behaviors of another type rather than > converting structure. I get it! Your last description didn't quite sink in but this one does and I've been thinking about this quite a bit, and I like it. I'm starting to see how it nicely sidesteps the problems discussed in the thread so far. Partial implementation of interfaces (read, implementing only the operations you care about on a method by method basis instead of an entire interface) really is very useful and feels quite pythonic to me. After all, in most cases of substitutability in Python (in my experience), it's not the *type* you do anything with, but that type's operations. Does anyone know of any other languages that take this "operational" approach to solving the substitutability problem? There seem to be some downsides vs. interfaces (I think): the lack of the "it's documentation too" aspect. I find zope 3 interfaces.py modules the best way to learn about it, but again the upside is, no complex interface relationships just to define the subtle variations of "mapping", and users can always just say help(file.read). Another thing I see used fairly commonly are marker interfaces. While I'm not sure of their overall usefulness I don't see how they can be done using your operational scheme. 
Maybe that means they were a bad idea in the first place. I also think this is easier for beginners to understand, instead of "you have to implement this interface, look at it over here, that's the "file" interface, now you implement that in your object and you better do it all right" you just tell them "call your method 'read' and say it's 'like file.read' and your thing will work where any file can be read." -Michel From skip at pobox.com Fri Jan 14 19:26:02 2005 From: skip at pobox.com (Skip Montanaro) Date: Fri Jan 14 19:26:15 2005 Subject: [Python-Dev] redux: fractional seconds in strptime In-Reply-To: <41E80995.5030901@ocf.berkeley.edu> References: <16870.61059.451494.303971@montanaro.dyndns.org> <41E74790.60108@ocf.berkeley.edu> <16871.37525.981821.580939@montanaro.dyndns.org> <41E80995.5030901@ocf.berkeley.edu> Message-ID: <16872.3770.25143.582154@montanaro.dyndns.org> >> I realize the %4N notation is distasteful, but without it I think you >> will have trouble parsing something like >> >> 13:02:00.704 >> >> What would be the format string? %H:%M:%S.%N would be incorrect. Brett> Why is that incorrect? Because "704" represents the number of milliseconds, not the number of nanoseconds. I'm sure that in some applications people are interested in extremely short time scales. Writing out hours, minutes and seconds when all you are concerned with are small fractions of seconds (think high energy physics) would be a waste. In those situations log entries like 704 saw proton 705 proton hit neutron 706 saw electron headed toward Saturn might make perfect sense. Parsing the time field entirely within time.strptime would be at least clumsy if you couldn't tell it the scale of the numbers you're dealing with. Parsing with %N, %3N or %6N would give different values (nanoseconds, milliseconds or microseconds). 
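[Summary note: Brett's alternative -- consume however many fractional digits are present and let the digit count itself fix the scale -- can be sketched with a small helper. `parse_with_fraction` is hypothetical, not an existing time-module function:]

```python
import time

def parse_with_fraction(timestr, fmt="%H:%M:%S"):
    # Hypothetical helper: split off the digits after the last '.',
    # then scale by how many digits there were -- "704" means 704
    # milliseconds, while "000704" would mean 704 microseconds.
    base, _, frac = timestr.partition(".")
    parsed = time.strptime(base, fmt)
    if frac:
        fraction = int(frac) / 10.0 ** len(frac)
    else:
        fraction = 0.0
    return parsed, fraction

parsed, frac = parse_with_fraction("13:02:00.704")
# parsed.tm_sec == 0, frac == 0.704
```

Under this reading, Skip's "704" example is unambiguous without a %4N-style width, at the cost of not supporting fixed-width fields with trailing digits.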
Skip From aleax at aleax.it Fri Jan 14 19:29:53 2005 From: aleax at aleax.it (Alex Martelli) Date: Fri Jan 14 19:29:58 2005 Subject: [Python-Dev] redux: fractional seconds in strptime In-Reply-To: <20050114181101.GB21486@panix.com> References: <16870.61059.451494.303971@montanaro.dyndns.org> <41E74790.60108@ocf.berkeley.edu> <16871.37525.981821.580939@montanaro.dyndns.org> <41E80995.5030901@ocf.berkeley.edu> <20050114181101.GB21486@panix.com> Message-ID: <48A86BD2-665A-11D9-ADA4-000A95EFAE9E@aleax.it> On 2005 Jan 14, at 19:11, Aahz wrote: > On Fri, Jan 14, 2005, Brett C. wrote: >> >> Right, it's a struct_time object; just force of habit to call it a >> time >> tuple. >> >> And I technically don't see why a fractional second attribute could >> not be >> added that is not represented in the tuple. But I personally would >> like to >> see struct_tm eliminated in Py3k and replaced with datetime usage. >> My wish >> is to have the 'time' module stripped down to only the bare >> essentials that >> just don't fit in datetime and push everyone to use datetime for most >> things. > > Because of people doing things like > > year, month, day, hour, min, sec, junk, junk, junk = time.localtime() And why would that be a problem? It would keep working just like today, assuming you're answering the "don't see why" part. From the start, we discussed fractional seconds being available only as an ATTRIBUTE of a struct_time, not an ITEM (==iteration on a struct_time will keep working just like now). 
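[Summary note: the attribute-vs-item distinction Alex draws is easy to check against today's struct_time, which already supports the two protocols independently:]

```python
import time

st = time.localtime()

# Tuple-style unpacking sees exactly nine items, so code like
# Aahz's example keeps working unchanged...
year, month, day, hour, minute, sec, wday, yday, isdst = st
assert len(tuple(st)) == 9

# ...while attribute access is independent of the tuple protocol.
# A fractional-seconds field exposed only as an attribute (say, a
# hypothetical st.tm_frac) would therefore not disturb unpacking.
assert st.tm_year == year and st.tm_sec == sec
```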
Alex From aahz at pythoncraft.com Fri Jan 14 19:39:43 2005 From: aahz at pythoncraft.com (Aahz) Date: Fri Jan 14 19:39:46 2005 Subject: [Python-Dev] redux: fractional seconds in strptime In-Reply-To: <48A86BD2-665A-11D9-ADA4-000A95EFAE9E@aleax.it> References: <16870.61059.451494.303971@montanaro.dyndns.org> <41E74790.60108@ocf.berkeley.edu> <16871.37525.981821.580939@montanaro.dyndns.org> <41E80995.5030901@ocf.berkeley.edu> <20050114181101.GB21486@panix.com> <48A86BD2-665A-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: <20050114183943.GA10564@panix.com> On Fri, Jan 14, 2005, Alex Martelli wrote: > On 2005 Jan 14, at 19:11, Aahz wrote: >>On Fri, Jan 14, 2005, Brett C. wrote: >>> >>>Right, it's a struct_time object; just force of habit to call it a >>>time tuple. >>> >>>And I technically don't see why a fractional second attribute could >>>not be added that is not represented in the tuple. But I personally >>>would like to see struct_tm eliminated in Py3k and replaced with >>>datetime usage. My wish is to have the 'time' module stripped down >>>to only the bare essentials that just don't fit in datetime and push >>>everyone to use datetime for most things. >> >>Because of people doing things like >> >>year, month, day, hour, min, sec, junk, junk, junk = time.localtime() > > And why would that be a problem? It would keep working just like > today, assuming you're answering the "don't see why" part. From the > start, we discussed fractional seconds being available only as an > ATTRIBUTE of a struct_time, not an ITEM (==iteration on a struct_time > will keep working just line now). Uh, I missed the second "not" in Brett's first sentence of second paragraph. Never mind! -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "19. A language that doesn't affect the way you think about programming, is not worth knowing." --Alan Perlis From cce at clarkevans.com Fri Jan 14 20:25:11 2005 From: cce at clarkevans.com (Clark C. 
Evans) Date: Fri Jan 14 20:25:14 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <200501141002.39525.michel@dialnetwork.com> References: <20050114090955.0BCD41E400D@bag.python.org> <200501141002.39525.michel@dialnetwork.com> Message-ID: <20050114192511.GD21254@prometheusresearch.com> On Fri, Jan 14, 2005 at 10:02:39AM -0800, Michel Pelletier wrote: | Phillip J. Eby wrote: | > The result is that you generate a simple adapter class whose | > only state is a read-only slot pointing to the adapted object, | > and descriptors that bind the registered implementations to that object. it has only the functions in the interface, plus the adaptee; all requests through the functions are forwarded on to their equivalent in the adaptee; sounds a lot like the adapter pattern ;) | I get it! Your last description didn't quite sink in but this one does | and I've been thinking about this quite a bit, and I like it. I'm | starting to see how it nicely sidesteps the problems discussed in | the thread so far. I'm not sure what else this mechanism provides, besides limiting adapters so that they cannot maintain their own state. | Does anyone know of any other languages that take this "operational" | approach to solving the substitutability problem? Microsoft's COM? | I also think this is easier for beginners to understand, instead of | "you have to implement this interface, look at it over here, | that's the "file" interface, now you implement that in your object | and you better do it all right" you just tell them "call your | method 'read' and say it's 'like file.read' and your thing will work | where any file can be read. A tangible example would perhaps better explain... 
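[Summary note: a toy sketch of the scheme under discussion -- a generated adapter whose only state is a slot for the adaptee, with per-operation registration so partial adapters fall out naturally. All the names here (`declare_like`, `build_adapter`, `FileLike`, `StringChunks`) are hypothetical, not a proposed API:]

```python
_registry = {}  # maps (target operation, source type) -> implementation

def declare_like(target_op, source_type, func):
    # "func is 'like' target_op when the adaptee is a source_type"
    _registry[(target_op, source_type)] = func

def build_adapter(target_type, source_type, op_names):
    class Adapter(object):
        __slots__ = ("_adaptee",)  # the adapter's only state
        def __init__(self, adaptee):
            self._adaptee = adaptee
    for name in op_names:
        impl = _registry.get((getattr(target_type, name), source_type))
        if impl is None:
            continue  # unregistered: leave it off (a partial adapter)
        def method(self, *args, _impl=impl, **kw):
            # forward to the registered implementation, applied to the
            # *adaptee* -- the implementation never sees the adapter
            return _impl(self._adaptee, *args, **kw)
        setattr(Adapter, name, method)
    return Adapter

class FileLike(object):
    """Stands in for the 'file' type; its methods serve only as keys."""
    def read(self, size=-1): raise NotImplementedError
    def readline(self): raise NotImplementedError

class StringChunks(object):
    def __init__(self, chunks):
        self.chunks = list(chunks)

def chunks_read(obj, size=-1):
    data = "".join(obj.chunks)
    obj.chunks = []
    return data

declare_like(FileLike.read, StringChunks, chunks_read)

Adapter = build_adapter(FileLike, StringChunks, ["read", "readline"])
f = Adapter(StringChunks(["ab", "cd"]))
# f.read() forwards to chunks_read; f has no readline(), since none
# was declared -- the adapter exposes only the declared subset.
```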
Looking forward to the PEP, Clark From aleax at aleax.it Fri Jan 14 21:36:36 2005 From: aleax at aleax.it (Alex Martelli) Date: Fri Jan 14 21:36:39 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <20050114192511.GD21254@prometheusresearch.com> References: <20050114090955.0BCD41E400D@bag.python.org> <200501141002.39525.michel@dialnetwork.com> <20050114192511.GD21254@prometheusresearch.com> Message-ID: On 2005 Jan 14, at 20:25, Clark C. Evans wrote: > | Does anyone know of any other languages that take this "operational" > | aproach to solving the substitutability problem? > > Microsoft's COM? I don't see the parallel: COM (QueryInterface) is strictly by-interface, not by-method, and has many other differences. Alex From pje at telecommunity.com Fri Jan 14 21:48:01 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Jan 14 21:46:28 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: References: <5.1.1.6.0.20050114114219.03c44bc0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050114153236.0335fc50@mail.telecommunity.com> At 06:56 PM 1/14/05 +0100, Just van Rossum wrote: >Phillip J. Eby wrote: > > > For example, if there were a weak reference dictionary mapping > > objects to their (stateful) adapters, then adapt() could always > > return the same adapter instance for a given source object, thus > > guaranteeing a single state. > >Wouldn't that tie the lifetime of the adapter object to that of the >source object? Well, you also need to keep the object alive if the adapter is still hanging around. I'll get to implementation details and alternatives in the PEP. >Possibly naive question: is using adaptation to go from iterable to >iterator abuse? That would be a clear example of per-adapter state. 
I don't know if it's abuse per se, but I do know that specifying whether a routine takes an iterable or can accept an iterator is often something important to point out, and it's a requirement that back-propagates through code, forcing explicit management of the iterator's state. So, if you were going to do some kind of adaptation with iterators, it would be much more useful IMO to adapt the *other* way, to turn an iterator into a reiterable. Coincidentally, a reiterable would create per-object state. :) In other words, if you *did* consider iterators to be adaptation, it seems to me an example of wanting to be explicit about when the adapter gets created, if its state is per-adapter. And the reverse scenario (iterator->reiterable) is an example of adaptation where shared state could solve a problem for you if it's done implicitly. (E.g. by declaring that you take a reiterable, but allowing people to pass in iterators.) From pje at telecommunity.com Fri Jan 14 21:50:03 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Jan 14 21:48:29 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <20050114174149.GB21254@prometheusresearch.com> References: <5.1.1.6.0.20050114114219.03c44bc0@mail.telecommunity.com> <20050113182142.GC35655@prometheusresearch.com> <0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it> <7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it> <20050114010307.GA51446@prometheusresearch.com> <5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com> <5.1.1.6.0.20050114014238.0308e850@mail.telecommunity.com> <5.1.1.6.0.20050114114219.03c44bc0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050114154836.03479c90@mail.telecommunity.com> At 12:41 PM 1/14/05 -0500, Clark C. Evans wrote: >May I suggest that you write this up as a PEP? Already committed to it for this weekend, but my statement was buried in a deep thread between Alex and me, so you might've missed it. 
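[Summary note: the iterator/reiterable distinction discussed above can be sketched concretely. `Reiterable` is a hypothetical wrapper, not an existing class:]

```python
# iterable -> iterator "adaptation" creates fresh per-adapter state:
data = [1, 2, 3]
it1, it2 = iter(data), iter(data)
next(it1)
assert next(it1) == 2 and next(it2) == 1  # independent positions

# iterator -> reiterable needs shared per-object state: a cache that
# later iterations replay.  (This sketch is not safe for two
# interleaved iterations over a partly consumed source.)
class Reiterable(object):
    def __init__(self, iterator):
        self._iterator = iterator
        self._cache = []
    def __iter__(self):
        for item in self._cache:
            yield item
        for item in self._iterator:
            self._cache.append(item)
            yield item

r = Reiterable(iter("abc"))
assert list(r) == ["a", "b", "c"]
assert list(r) == ["a", "b", "c"]  # the source was consumed only once
```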
From pje at telecommunity.com Fri Jan 14 22:12:48 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Jan 14 22:11:31 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <200501141002.39525.michel@dialnetwork.com> References: <20050114090955.0BCD41E400D@bag.python.org> <20050114090955.0BCD41E400D@bag.python.org> Message-ID: <5.1.1.6.0.20050114155150.0347e140@mail.telecommunity.com> At 10:02 AM 1/14/05 -0800, Michel Pelletier wrote: >I get it! Thanks for the positive feedback, I was getting worried that I had perhaps gone quite insane during the great debate. :) > Your last description didn't quite sink in but this one does and >I've been thinking about this quite a bit, and I like it. I'm starting to >see how it nicely sidesteps the problems discussed in the thread so far. Good, I'll try to cherry-pick from that post when writing the PEP. >Does anyone know of any >other languages that take this "operational" aproach to solving the >substitutability problem? This can be viewed as a straightforward extension of the COM or Java type models in two specific ways: 1) You can implement an interface incompletely, but still receive a partial adapter 2) Third parties can supply implementations of individual operations Everything else is pretty much like COM pointers to interfaces, or Java casting. Of course, as a consequence of #1, you also have to declare conformance per-operation rather than per-interface, but some syntactic sugar for declaring a block of methods would be helpful. But that's a detail; declaring support for an interface in COM or Java is just "like" automatically adding all the individual "like" declarations. Alternatively, you can look at this as a dumbed-down version of protocols or typeclasses in functional languages that use generic or polymorphic operations as the basis of their type system. E.g. in Haskell a "typeclass" categorizes types by common operations that are available to them. 
For example the 'Ord' typeclass represents types that have ordering via operations like <, >, and so forth. However, you don't go and declare that a type is in the 'Ord' typeclass, what you do is *implement* those operations (which may be by defining how to call some other operation the type has) and the type is automatically then considered to be in the typeclass. (At least, that's my understanding as a non-Haskell developer who's skimmed exactly one tutorial on the language! I could be totally misinterpreting what I read.) Anyway, all of these systems were inspirations, if that's what you're asking. >There seem to be some downsides vs. interfaces (I think) the lack of "it's >documentation too" aspect, I find zope 3 interfaces.py modules the best way >to learn about it, but again the upside is, no complex interface >relationships just to define the subtle variations of "mapping" and users can >always just say help(file.read). It doesn't *stop* you from using interfaces of whatever stripe for documentation, though. The target type can be abstract. All that's required is that it *be* a type (and that restriction might be loosen-able via an adapter!) and that it have descriptors that will indicate the callable operations. So Zope interfaces still work; there's no requirement that the descriptor something is "like" can't be an empty function with a docstring, like it is in a Zope or PyProtocols interface. >Another thing I see used fairly commonly are marker interfaces. While I'm >not >sure of their overall usefullness I don't see how they can be done using your >operational scheme. Add an operation to them, or an attribute like 'isFoo'. Then declare an implementation that returns true, if the appropriate object state matches. (I presume you're talking about Zope's per-instance marker interfaces that come and go based on object state.) > Maybe that means they were a bad idea in the first >place. Probably so! But they can still be done, if you really need one. 
You just have to recast it in terms of some kind of operation or attribute. >I also think this is easier for beginners to understand, instead of "you have >to implement this interface, look at it over here, that's the "file" >interface, now you implement that in your object and you better do it all >right" you just tell them "call your method 'read' and say its 'like >file.read' and your thing will work where any file can be read. You don't even need to call it read; you could use the word "read" in the non-English language of your choice; any code that wants a "file" will still be able to invoke it using "read". (And English speakers will at least know they're looking at code that's "like" file.read.) From pje at telecommunity.com Fri Jan 14 22:29:28 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Jan 14 22:27:54 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <20050114192511.GD21254@prometheusresearch.com> References: <200501141002.39525.michel@dialnetwork.com> <20050114090955.0BCD41E400D@bag.python.org> <200501141002.39525.michel@dialnetwork.com> Message-ID: <5.1.1.6.0.20050114161325.0347ac20@mail.telecommunity.com> At 02:25 PM 1/14/05 -0500, Clark C. Evans wrote: >I'm not sure what else this mechanism provides; besides limiting >adapters so that they cannot maintain their own state. * No need to write adapter classes for stateless adapters; just declare methods * Allows partial adapters to be written for e.g. "file-like" objects without creating lots of mini-interfaces and somehow relating them all * No need to explain the concept of "interface" to somebody who just knows that the routine they're calling needs a "file" and they need to make their object "work like" a file in some way. 
(That is, more supportive of "programming for everybody") * Supports using either concrete or abstract types as effective interfaces * Doesn't require us to create explicit interfaces for the entire stdlib, if saying something's "like" an existing abstract or concrete type suffices! * Supports abstract operations like "dict.update" that can automatically flesh out partial adapters (i.e, if you have an object with an operation "like" dict.__setitem__, then a generic dict.update can be used to complete your adaptation) * Doesn't require anybody to write __conform__ or __adapt__ methods in order to get started with adaptation. This is really more of a replacement for PEP 245 than 246 in some ways, but of course it relates to 246 also, since the idea would basically be to integrate it with the "global registry" described in 246. In other words, "like" declarations should populate the global registry, and in such a way that state is unified for (per-object) stateful adapters. From bac at OCF.Berkeley.EDU Fri Jan 14 22:50:48 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Fri Jan 14 22:51:02 2005 Subject: [Python-Dev] redux: fractional seconds in strptime In-Reply-To: <16872.3770.25143.582154@montanaro.dyndns.org> References: <16870.61059.451494.303971@montanaro.dyndns.org> <41E74790.60108@ocf.berkeley.edu> <16871.37525.981821.580939@montanaro.dyndns.org> <41E80995.5030901@ocf.berkeley.edu> <16872.3770.25143.582154@montanaro.dyndns.org> Message-ID: <41E83EB8.8060405@ocf.berkeley.edu> Skip Montanaro wrote: > >> I realize the %4N notation is distasteful, but without it I think you > >> will have trouble parsing something like > >> > >> 13:02:00.704 > >> > >> What would be the format string? %H:%M:%S.%N would be incorrect. > > Brett> Why is that incorrect? > > Because "704" represents the number of milliseconds, not the number of > nanoseconds. > > I'm sure that in some applications people are interested in extremely short > time scales. 
Writing out hours, minutes and seconds when all you are > concerned with are small fractions of seconds (think high energy physics) > would be a waste. In those situations log entries like > > 704 saw proton > 705 proton hit neutron > 706 saw electron headed toward Saturn > > might make perfect sense. Parsing the time field entirely within > time.strptime would be at least clumsy if you couldn't tell it the scale of > the numbers you're dealing with. Parsing with %N, %3N or %6N would give > different values (nanoseconds, milliseconds or microseconds). > Fine, but couldn't you also do a pass over the data after extraction to get to the actual result you want (so parse, and take the millisecond value and multiply by the proper scale)? This feels like it is YAGNI, or at least KISS. If you want to handle milliseconds because of the logging module, fine. But trying to deal with all possible time parsing possibilities is painful and usually not needed. Personally I am more inclined to add a new directive that acts as %S but allows for an optional decimal point, comma or the current locale's separator if it isn't one of those two which will handle the logging package's optional decimal output (r"\d+([,.%s]\d+)?" % locale.localeconv()['decimal_point']). Also doesn't break any existing code. And an issue I forgot to mention for all of this is it will break symmetry with time.strftime(). If symmetry is kept then an extra step in strftime will need to be handled since whatever solution we do will not match the C spec anymore. -Brett From glyph at divmod.com Sat Jan 15 01:02:52 2005 From: glyph at divmod.com (Glyph Lefkowitz) Date: Sat Jan 15 00:58:47 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com> References: <5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com> Message-ID: <1105747372.13655.93.camel@localhost> On Fri, 2005-01-14 at 10:07 -0500, Phillip J. 
Eby wrote: > Maybe I'm missing something, but for those interfaces, isn't it okay to > keep the state in the *adapted* object here? In other words, if PointPen > just added some private attributes to store the extra data? I have been following this discussion with quite a lot of interest, and I have to confess that a lot of what's being discussed is confusing me. I use stateful adapters quite a bit - Twisted has long had a concept of "sticky" adapters (they are called "persistent" in Twisted, but I think I prefer "sticky"). Sometimes my persistent adapters are sticky, sometimes not. Just's example of iter() as an adaptation is a good example of a non-sticky stateful adaptation, but this example I found interesting, because it seems that the value-judgement of stateless adapters as "good" is distorting design practices to make other mistakes, just to remove state from adapters. I can't understand why PJE thinks - and why there seems to be a general consensus emerging - that stateless adapters are intrinsically better. For the sake of argument, let's say that SegmentPen is a C type, which does not have a __dict__, and that PointPen is a Python adapter for it, in a different project. Now, we have nowhere to hide PointPen's state on SegmentPen - and why were we trying to in the first place? It's a horrible breach of encapsulation. The whole *point* of adapters is to convert between *different* interfaces, not merely to rename methods on the same interface, or to add extra methods that work on the same data. To me, "different interfaces" means that the actual meaning of the operations is different - sometimes subtly, sometimes dramatically. There has to be enough information in one interface to get started on the implementation of another, but the fact that such information is necessary doesn't mean it is sufficient. It doesn't mean that there is enough information in the original object to provide a complete implementation of a different interface. 
If there were enough information, why not just implement all of your interfaces on the original class? In the case of our hypothetical cSegmentPen, we *already* have to modify the implementation of the original class to satisfy the needs of a "stateless" adapter. When you're modifying cSegmentPen, why not just add the methods that you wanted in the first place? Here's another example: I have a business logic class which lives in an object database, typically used for a web application. I convert this into a desktop application. Now, I want to adapt IBusinessThunk to IGtkUIPlug. In the process of doing so, I have to create a GTK widget, loaded out of some sort of resource file, and put it on the screen. I have to register event handlers which are associated with that adapter. The IBusinessThunk interface doesn't specify a __dict__ attribute as part of the interface, or the ability to set arbitrary attributes. And nor should it! It is stored in an indexed database where every attribute has to be declared, maybe, or perhaps it uses Pickle and sticking a GTK widget into its representation would make it un-pickleable. Maybe it's using an O/R mapper which loses state that is not explicitly declared or explicitly touch()ed. There are a variety of problems which using it in this unsupported way might create, but as the implementor of an IGtkUIPlug, I should be concerned *only* with what IBusinessThunk provides, which is .embezzle() and .checkFundsAvailable(). I am not writing an adapter from DBBusinessThunkImpl, after all, and perhaps I am receiving a test implementation that works entirely differently. 
Most of the other use-cases I can think of are like the one James mentions, where we really are using adaptation to shuffle around some method names and provide simple glossing over totally isomorphic functionality to provide backwards (or sideways, in the case of almost-identical libraries provided on different platforms or environments) compatibility. For these reasons I would vastly prefer it if transitivity were declared as a property of the *adaptation*, not of the adapter or the registry or to be inferred from various vaguely-defined properties like "losslessness" or "statelessness". I am also concerned about any proposal which introduces transitivity-based errors at adaptation time rather than at registration time, because by then it is definitely too late to do anything about it. I wish I had a better suggestion, but I'm still struggling through the rest of the thread :). From bob at redivi.com Sat Jan 15 01:14:37 2005 From: bob at redivi.com (Bob Ippolito) Date: Sat Jan 15 01:14:44 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <1105747372.13655.93.camel@localhost> References: <5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com> <1105747372.13655.93.camel@localhost> Message-ID: <7163A634-668A-11D9-A02F-000A95BA5446@redivi.com> On Jan 14, 2005, at 19:02, Glyph Lefkowitz wrote: > On Fri, 2005-01-14 at 10:07 -0500, Phillip J. Eby wrote: > >> Maybe I'm missing something, but for those interfaces, isn't it okay >> to >> keep the state in the *adapted* object here? In other words, if >> PointPen >> just added some private attributes to store the extra data? > > Here's another example: I have a business logic class which lives in an > object database, typically used for a web application. I convert this > into a desktop application. Now, I want to adapt IBusinessThunk to > IGtkUIPlug. In the process of doing so, I have to create a GTK widget, > loaded out of some sort of resource file, and put it on the screen. 
I > have to register event handlers which are associated with that adapter. > > The IBusinessThunk interface doesn't specify a __dict__ attribute as > part of the interface, or the ability to set arbitrary attributes. And > nor should it! It is stored in an indexed database where every > attribute has to be declared, maybe, or perhaps it uses Pickle and > sticking a GTK widget into its representation would make it > un-pickleable. Maybe it's using an O/R mapper which loses state that > is > not explicitly declared or explicitly touch()ed. There are a variety > of > problems which using it in this unsupported way might create, but as > the > implementor of a IGtkUIPlug, I should be concerned *only* with what > IBusinessThunk provides, which is .embezzle() > and .checkFundsAvailable(). I am not writing an adapter from > DBBusinessThunkImpl, after all, and perhaps I am receiving a test > implementation that works entirely differently. > > This example gets to the heart of what makes interfaces useful to me - > model/view separation. Although one might be hard pressed to call some > of the things I use adaptation for "views", the idea of mediated access > from a user, or from network protocol, or from some internal code > acting > on behalf of a user is the overwhelming majority of my use-cases. I think the idea is that it's "better" to have an adapter from IBusinessThunk -> IGtkUIPlugFactory, which you can use to *create* a stateful object that complies with the IGtkUIPlug interface. This way, you are explicitly creating something entirely new (derived from something else) with its own lifecycle and state and it should be managed accordingly. This is clearly not simply putting a shell around an IBusinessThunk that says "act like this right now". 
-bob From steven.bethard at gmail.com Sat Jan 15 01:37:13 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Sat Jan 15 01:37:16 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <1105747372.13655.93.camel@localhost> References: <5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com> <1105747372.13655.93.camel@localhost> Message-ID: On Fri, 14 Jan 2005 19:02:52 -0500, Glyph Lefkowitz wrote: > On Fri, 2005-01-14 at 10:07 -0500, Phillip J. Eby wrote: > > > Maybe I'm missing something, but for those interfaces, isn't it okay to > > keep the state in the *adapted* object here? In other words, if PointPen > > just added some private attributes to store the extra data? > > I have been following this discussion with quite a lot of interest, and > I have to confess that a lot of what's being discussed is confusing me. > I use stateful adapters quite a bit - Twisted has long had a concept of > "sticky" adapters (they are called "persistent" in Twisted, but I think > I prefer "sticky"). Sometimes my persistent adapters are sticky, > sometimes not. Just's example of iter() as an adaptation is a good > example of a non-sticky stateful adaptation, but this example I found > interesting, because it seems that the value-judgement of stateless > adapters as "good" is distorting design practices to make other > mistakes, just to remove state from adapters. I can't understand why > PJE thinks - and why there seems to be a general consensus emerging - > that stateless adapters are intrinsically better. My feeling here was not that people thought that stateless adapters were in general intrinsically better -- just when the adaptation was going to be done implicitly (e.g. by type declarations). When no state is involved, adapting an object multiple times can be guaranteed to produce the same adapted object, so if this happens implicitly, it's not a big deal. 
When state is involved, _some_ decisions have to be made, and it seems like those decisions should be made explicitly... Steve -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy From pje at telecommunity.com Sat Jan 15 02:06:22 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Sat Jan 15 02:04:48 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: References: <1105747372.13655.93.camel@localhost> <5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com> <1105747372.13655.93.camel@localhost> Message-ID: <5.1.1.6.0.20050114195846.030bd6d0@mail.telecommunity.com> At 05:37 PM 1/14/05 -0700, Steven Bethard wrote: >On Fri, 14 Jan 2005 19:02:52 -0500, Glyph Lefkowitz wrote: > > On Fri, 2005-01-14 at 10:07 -0500, Phillip J. Eby wrote: > > > > > Maybe I'm missing something, but for those interfaces, isn't it okay to > > > keep the state in the *adapted* object here? In other words, if PointPen > > > just added some private attributes to store the extra data? > > > > I have been following this discussion with quite a lot of interest, and > > I have to confess that a lot of what's being discussed is confusing me. > > I use stateful adapters quite a bit - Twisted has long had a concept of > > "sticky" adapters (they are called "persistent" in Twisted, but I think > > I prefer "sticky"). Sometimes my persistent adapters are sticky, > > sometimes not. Just's example of iter() as an adaptation is a good > > example of a non-sticky stateful adaptation, but this example I found > > interesting, because it seems that the value-judgement of stateless > > adapters as "good" is distorting design practices to make other > > mistakes, just to remove state from adapters. I can't understand why > > PJE thinks - and why there seems to be a general consensus emerging - > > that stateless adapters are intrinsically better. 
> >My feeling here was not that people thought that stateless adapters >were in general intrinsically better -- just when the adaptation was >going to be done implicitly (e.g. by type declarations). Yes, exactly. :) >When no state is involved, adapting an object multiple times can be >guaranteed to produce the same adapted object, so if this happens >implicitly, it's not a big deal. When state is involved, _some_ >decisions have to be made, and it seems like those decisions should be >made explicitly... At last someone has been able to produce a concise summary of my insane ramblings. :) Yes, this is precisely the key: implicit adaptation should always return an adapter with the "same" state (for some sensible meaning of "same"), because otherwise control of an important aspect of the system's behavior is too widely distributed to be able to easily tell for sure what's going on. It also produces the side-effect issue of possibly introducing transitive adaptation, and again, that property is widely distributed and hard to "see". Explicit adaptation to add per-adapter state is just fine; it's only *implicit* "non-sticky stateful" adaptation that creates issues. Thus, the PEP I'm working on focuses on making it super-easy to make stateless and sticky stateful adapters with a bare minimum of declarations and interfaces and such. From pje at telecommunity.com Sat Jan 15 02:30:04 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Sat Jan 15 02:28:32 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <1105747372.13655.93.camel@localhost> References: <5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com> <5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050114200639.030c0cd0@mail.telecommunity.com> At 07:02 PM 1/14/05 -0500, Glyph Lefkowitz wrote: >For the sake of argument, let's say that SegmentPen is a C type, which >does not have a __dict__, and that PointPen is a Python adapter for it, >in a different project. There are multiple implementation alternatives possible here; it isn't necessary that the state be hidden there. The point is that, given the same SegmentPen, we want to get the same PointPen each time we *implicitly* adapt, in order to avoid violating the "naive" developer's mental model of what adaptation is -- i.e. an extension of the object's state, not a new object with independent state. One possible alternative implementation is to use a dictionary from object id to a 'weakref(ob),state' tuple, with the weakref set up to remove the entry when 'ob' goes away. Adapters would then have a pointer to their state object and a pointer to the adaptee. As long as an adapter lives, the adaptee lives, so the state remains valid. Or, if no adapters remain, but the adaptee still lives, then so does the state which can be resurrected when a new adapter is requested. It's too bad Python doesn't have some sort of deallocation hook you could use to get notified when an object goes away. Oh well. Anyway, as you and I have both pointed out, sticky adaptation is an important use case; when you need it, you really need it. >This example gets to the heart of what makes interfaces useful to me - >model/view separation. 
Although one might be hard pressed to call some >of the things I use adaptation for "views", the idea of mediated access >from a user, or from network protocol, or from some internal code acting >on behalf of a user is the overwhelming majority of my use-cases. If every time you pass a "model" to something that expects a "view", you get a new "view" instance being created, things are going to get mighty confusing, mighty fast. In contrast, explicit adaptation with 'adapt(model,IView)' or 'IView(model)' allows you to explicitly control the lifecycle of the view (or views!) you want to create. Guido currently thinks that type declaration should be implemented as 'adapt(model,IView)'; I think that maybe it should be restricted (if only by considerations of "good style") to adapters that are sticky or stateless, reserving per-state adaptation for explicit creation via today's 'adapt()' or 'IFoo(ob)' APIs. >I wish I had a better suggestion, but I'm still struggling through the >rest of the thread :). I'll be starting work on the PEP soon, maybe I'll have a rough draft of at least the first few sections ready to post tonight so everybody can get started on ripping them to pieces. The sooner I know about the holes, the sooner I can fix 'em. Or alternatively, the sooner Guido shoots it down, the less work I have to do on the PEP. :) From ncoghlan at iinet.net.au Sat Jan 15 04:18:25 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Sat Jan 15 04:18:30 2005 Subject: [Python-Dev] frame.f_locals is writable In-Reply-To: <41E7FF32.2030502@ieee.org> References: <41E6DD52.2080109@ieee.org> <41E7D60F.9000208@iinet.net.au> <41E7FF32.2030502@ieee.org> Message-ID: <41E88B81.3040303@iinet.net.au> Shane Holloway (IEEE) wrote: > Yes. After poking around in Google with PyFrame_LocalsToFast, I found > some other links to people doing that. I implemented a direct call > using ctypes to make the code explicit about what's happening. I'm just > glad it is possible now. 
Works fine in both 2.3 and 2.4. I realised after posting that the exec-based hack only works for poking values into the _current_ frame's locals, so my trick wouldn't have done what you needed, anyway. Cheers, Nick. -- Nick Coghlan | ncoghlan@email.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From glyph at divmod.com Sat Jan 15 07:45:05 2005 From: glyph at divmod.com (Glyph Lefkowitz) Date: Sat Jan 15 07:40:59 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <7163A634-668A-11D9-A02F-000A95BA5446@redivi.com> References: <5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com> <1105747372.13655.93.camel@localhost> <7163A634-668A-11D9-A02F-000A95BA5446@redivi.com> Message-ID: <1105771505.13655.96.camel@localhost> On Fri, 2005-01-14 at 19:14 -0500, Bob Ippolito wrote: > I think the idea is that it's "better" to have an adapter from > IBusinessThunk -> IGtkUIPlugFactory, which you can use to *create* a > stateful object that complies with the IGtkUIPlug interface. > > This way, you are explicitly creating something entirely new (derived > from something else) with its own lifecycle and state and it should be > managed accordingly. This is clearly not simply putting a shell around > an IBusinessThunk that says "act like this right now". Yes. This is exactly what I meant to say. Maybe there are 2 entirely different use-cases for adaptation, and we shouldn't be trying to confuse the two, or conflate them into one system? I am going to go have a look at PEAK next, to see why there are so many stateless adapters there. 
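The adapter-as-factory idea Bob describes can be sketched in a few lines. This is a rough illustration only, with invented minimal names standing in for IBusinessThunk and IGtkUIPlug; it is not code from PEAK or Twisted:

```python
class BusinessThunk:
    """A hypothetical business object (stands in for IBusinessThunk)."""
    def __init__(self, title):
        self.title = title

class UIPlug:
    """A stateful UI object with its own lifecycle (stands in for IGtkUIPlug)."""
    def __init__(self, thunk):
        self.thunk = thunk
        self.visible = False   # per-plug state, owned by this object alone
    def show(self):
        self.visible = True

class UIPlugFactory:
    """A stateless adapter: adapting a thunk yields a *factory*, and the
    caller explicitly creates (and manages) each stateful plug."""
    def __init__(self, thunk):
        self.thunk = thunk
    def create(self):
        return UIPlug(self.thunk)

thunk = BusinessThunk("invoice")
plug1 = UIPlugFactory(thunk).create()
plug2 = UIPlugFactory(thunk).create()   # explicitly a second, independent plug
```

The point of the extra `create()` step is that the new object's lifecycle and state are visibly the caller's responsibility, rather than appearing as a side effect of adaptation.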
From aleax at aleax.it Sat Jan 15 10:30:25 2005 From: aleax at aleax.it (Alex Martelli) Date: Sat Jan 15 10:30:32 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <1105747372.13655.93.camel@localhost> References: <5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com> <1105747372.13655.93.camel@localhost> Message-ID: <16679EEB-66D8-11D9-ADA4-000A95EFAE9E@aleax.it> On 2005 Jan 15, at 01:02, Glyph Lefkowitz wrote: ... > Now, we have nowhere to hide PointPen's state on SegmentPen - and why > were we trying to in the first place? It's a horrible breach of > encapsulation. The whole *point* of adapters is to convert between > *different* interfaces, not merely to rename methods on the same > interface, or to add extra methods that work on the same data. To me, A common implementation technique, when you'd love to associate some extra data to an object, but can't rely on the object having a __dict__ to let you do that conveniently, is to have an auxiliary dict of bunches of extra data, keyed by object's id(). It's a bit messier, in that you have to deal with cleanup issues when the object goes away, as well as suffer an extra indirectness; but in many use cases it's quite workable. I don't see doing something like myauxdict[id(obj)] = {'foo': 'bar'} as "terribly invasive", and therefore neither do I see obj.myauxfoo = 'bar' as any more invasive -- just two implementation techniques for the same task with somewhat different tradeoffs. The task, associating extra data with obj without changing obj type's source, won't just go away. Incidentally, the realization of this equivalence was a key step in my very early acceptance of Python. In the first few days, the concept "some external code might add an attribute to obj -- encapsulation breach!" 
made me wary; then CLICK, the first time I had to associate extra data to an object and realized the alleged ``breach'' was just a handy implementation help for the task I needed anyway, I started feeling much better about it. Adapter use cases exist for all three structures:

1. the adapter just needs to change method names and signatures or combine existing methods of the object, no state additions;
2. the adapter needs to add some per-object state, which must be shared among different adapters which may simultaneously exist on the same object;
3. the adapter needs to add some per-adapter state, which must be distinct among different adapters which may simultaneously exist on the same object.

Case [1] is simplest because you don't have to wonder whether [2] or [3] are better, which may be why it's being thought of as "best". Case [3] may be dubious when we talk about AUTOMATIC adaptation, because in [3] making and using two separate adapters has very different semantics from making just one adapter and using it twice. When you build the adapter explicitly of course you have full control and hopefully awareness of that. For example, in Model/View, clearly you want multiple views on the same model and each view may well need a few presentation data of its own; if you think of it as adaptation, it's definitely a [3]. But do we really want _automatic_ adaptation -- passing a Model to a function which expects a View, and having some kind of default presentation data be used to make a default view on it? That, I guess, is the dubious part. > "different interfaces" means that the actual meaning of the operations > is different - sometimes subtly, sometimes dramatically. There has to > be enough information in one interface to get started on the > implementation of another, but the fact that such information is > necessary doesn't mean it is sufficient.
It doesn't mean that there is > enough information in the original object to provide a complete > implementation of a different interface. > > If there were enough information, why not just implement all of your > interfaces on the original class? In the case of our hypothetical > cSegmentPen, we *already* have to modify the implementation of the > original class to satisfy the needs of a "stateless" adapter. When > you're modifying cSegmentPen, why not just add the methods that you > wanted in the first place? Reason #1: because the author of the cSegmentPen code cannot assume he or she knows about all the interfaces to which a cSegmentPen might be adapted a la [3]. If he or she provides a __dict__, or makes cSegmentPen weakly referenceable, all [3]-like adaptation needs are covered at one (heh heh) stroke. > Here's another example: I have a business logic class which lives in an > object database, typically used for a web application. I convert this > into a desktop application. Now, I want to adapt IBusinessThunk to > IGtkUIPlug. In the process of doing so, I have to create a GTK widget, > loaded out of some sort of resource file, and put it on the screen. I > have to register event handlers which are associated with that adapter. OK, a typical case of model/view and thus a [3]. The issue is whether you want adaptation to be automatic or explicit, in such cases. > Most of the other use-cases I can think of are like the one James > mentions, where we really are using adaptation to shuffle around some > method names and provide simple glossing over totally isomorphic > functionality to provide backwards (or sideways, in the case of > almost-identical libraries provided on different platforms or > environments) compatibility. And what's wrong with that? Those are the "case [1]" adapters, and they're very useful. 
I guess this boils down to the issue that you don't think there are use cases for [2], where the extra state is needed but it had better be per-object, shared among adapters, and not per-adapter, distinct for each adapter. Well, one example in the model/view area comes from 3d modeling for mechanical engineering: the model is a complex collection of solids which only deal with geometrical properties, the views are "cameras" rendering scenes onto windows on the screen. Each view has some modest state of its own (camera distance, angles, screen coordinates), but also there are some presentation data -- alien to the model itself, which only has geometry -- which are required to be shared among views, such as lighting information and surface texturing. One approach would be to first wrap the bare-model into an enriched-model, once only; and adapt only the enriched model to the views. If it's important to have different sets of views of the same geometry with different lighting &c, it's the only way to go; but sometimes the functional requirement is exactly the reverse -- ensure there is never any discrepancy among the lighting, texturing etc of the views over the same (geometrical) model. Nothing particularly wrong, then, in having the bunch of information that is the "enriched model" (lighting &c) be known only to the views but directly associated with the geometry-model. > For these reasons I would vastly prefer it if transitivity were > declared > as a property of the *adaptation*, not of the adapter or the registry > or > to be inferred from various vaguely-defined properties like > "losslessness" or "statelessness". I am also concerned about any > proposal which introduces transitivity-based errors at adaptation time > rather than at registration time, because by then it is definitely too > late to do anything about it. Fair enough, but for Guido's suggested syntax of "def f(X:Y):..." 
meaning X=adapt(X,Y) at function entry, the issue is how that particular "default/implicit" adaptation should behave -- is it always allowed to be transitive, never, only when Y is an interface and not a class, or under what specific set of constraints? Alex From aleax at aleax.it Sat Jan 15 10:35:30 2005 From: aleax at aleax.it (Alex Martelli) Date: Sat Jan 15 10:35:33 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <5.1.1.6.0.20050114200639.030c0cd0@mail.telecommunity.com> References: <5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com> <5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com> <5.1.1.6.0.20050114200639.030c0cd0@mail.telecommunity.com> Message-ID: On 2005 Jan 15, at 02:30, Phillip J. Eby wrote: > is requested. It's too bad Python doesn't have some sort of > deallocation hook you could use to get notified when an object goes > away. Oh well. For weakly referenceable objects, it does. Giving one to other objects would be almost isomorphic to making every object weakly referenceable, wouldn't it? Or am I missing something...? Alex From just at letterror.com Sat Jan 15 10:39:03 2005 From: just at letterror.com (Just van Rossum) Date: Sat Jan 15 10:39:11 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <5.1.1.6.0.20050114200639.030c0cd0@mail.telecommunity.com> Message-ID: Phillip J. Eby wrote: > At 07:02 PM 1/14/05 -0500, Glyph Lefkowitz wrote: > >For the sake of argument, let's say that SegmentPen is a C type, > >which does not have a __dict__, and that PointPen is a Python > >adapter for it, in a different project. > > There are multiple implementation alternatives possible here; it > isn't necessary that the state be hidden there. The point is that, > given the same SegmentPen, we want to get the same PointPen each time > we *implicitly* adapt, in order to avoid violating the "naive" > developer's mental model of what adaptation is -- i.e. 
an extension > of the object's state, not a new object with independent state. > > One possible alternative implementation is to use a dictionary from > object id to a 'weakref(ob),state' tuple, with the weakref set up to > remove the entry when 'ob' goes away. Adapters would then have a > pointer to their state object and a pointer to the adaptee. As long > as an adapter lives, the adaptee lives, so the state remains valid. > Or, if no adapters remain, but the adaptee still lives, then so does > the state which can be resurrected when a new adapter is requested. > It's too bad Python doesn't have some sort of deallocation hook you > could use to get notified when an object goes away. Oh well. That sounds extremely complicated as opposed to just storing the state where it most logically belongs: on the adapter. And all that to work around a problem that I'm not convinced needs solving or even exists. At the very least *I* don't care about it in my use case. > Anyway, as you and I have both pointed out, sticky adaptation is an > important use case; when you need it, you really need it. Maybe I missed it, but was there an example posted of when "sticky adaptation" is needed? It's not at all clear to me that "sticky" behavior is the best default behavior, even with implicit adaptation. Would anyone in their right mind expect the following to return [0, 1, 2, 3, 4, 5] instead of [0, 1, 2, 0, 1, 2]?
>>> from itertools import *
>>> seq = range(10)
>>> list(chain(islice(seq, 3), islice(seq, 3)))
[0, 1, 2, 0, 1, 2]
>>>

Just From p.f.moore at gmail.com Sat Jan 15 14:20:37 2005 From: p.f.moore at gmail.com (Paul Moore) Date: Sat Jan 15 14:20:41 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <5.1.1.6.0.20050114195846.030bd6d0@mail.telecommunity.com> References: <5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com> <1105747372.13655.93.camel@localhost> <5.1.1.6.0.20050114195846.030bd6d0@mail.telecommunity.com> Message-ID: <79990c6b050115052024b2208a@mail.gmail.com> On Fri, 14 Jan 2005 20:06:22 -0500, Phillip J. Eby wrote: > >My feeling here was not that people thought that stateless adapters > >were in general intrinsically better -- just when the adaptation was > >going to be done implicitly (e.g. by type declarations). > > Yes, exactly. :) In which case, given that there is no concept in PEP 246 of implicit adaptation, can we please make a clear separation of this discussion from PEP 246? (The current version of the PEP makes no mention of transitive adaptation, as optional or required behaviour, which is the only other example of implicit adaptation I can think of). I think there are the following distinct threads of discussion going on at the moment:

* Details of what should be in PEP 246
* Discussions spinning off from Guido's type-declaration-as-adaptation proposal
* Discussion of what counts as a "good" adapter
* Phillip's new generic function / duck typing proposals

Is that even close to others' understanding? Just trying to keep my brain from exploding :-) Paul. From pje at telecommunity.com Sat Jan 15 16:49:52 2005 From: pje at telecommunity.com (Phillip J.
Eby) Date: Sat Jan 15 16:48:20 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: References: <5.1.1.6.0.20050114200639.030c0cd0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050115104405.04052d70@mail.telecommunity.com> At 10:39 AM 1/15/05 +0100, Just van Rossum wrote: >That sounds extremely complicated as apposed to just storing the sate >where it most logically belongs: on the adapter. Oh, the state will be on the adapter all right. It's just that for type declarations, I'm saying the system should return the *same* adapter each time. > And all that to work >around a problem that I'm not convinced needs solving or even exists. At >the very least *I* don't care about it in my use case. > > > Anyway, as you and I have both pointed out, sticky adaptation is an > > important use case; when you need it, you really need it. > >Maybe I missed it, but was there an example posted of when "sticky >adaptation" is needed? No; but Glyph and I have independent use cases for them. Here's one of mine: code generation from a UML or MOF model. The model classes can't contain methods or data for doing code generation, unless you want to cram every possible kind of code generation into them. The simple thing to do is to adapt them to a PythonCodeGenerator or an SQLCodeGenerator or what-have-you, and to do so stickily. (Because a code generator may need to walk over quite a bit of the structure while keeping state for different things being generated.) You *could* keep state in an external dictionary, of course, but it's much easier to use sticky adapters. >It's not at all clear to me that "sticky" behavior is the best default >behavior, even with implicit adoptation. Would anyone in their right >mind expect the following to return [0, 1, 2, 3, 4, 5] instead of [0, 1, >2, 0, 1, 2]? > > >>> from itertools import * > >>> seq = range(10) > >>> list(chain(islice(seq, 3), islice(seq, 3))) > [0, 1, 2, 0, 1, 2] > >>> I don't understand why you think it would. 
What does islice have to do with adaptation? From pje at telecommunity.com Sat Jan 15 16:57:29 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Sat Jan 15 16:55:55 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <79990c6b050115052024b2208a@mail.gmail.com> References: <5.1.1.6.0.20050114195846.030bd6d0@mail.telecommunity.com> <5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com> <1105747372.13655.93.camel@localhost> <5.1.1.6.0.20050114195846.030bd6d0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050115105020.04055d90@mail.telecommunity.com> At 01:20 PM 1/15/05 +0000, Paul Moore wrote: >I think there are the following distinct threads of discussion going >on at the moment: > >* Details of what should be in PEP 246 >* Discussions spinning off from Guido's type-declaration-as-adaptation >proposal My understanding was that the first needed to be considered in context of the second, since it was the second which gave an implicit blessing to the first. PEP 246 had languished in relative obscurity for a long time until Guido's blessing it for type declarations brought it back into the spotlight. So, I thought it important to frame its discussion in terms of its use for type declaration. >* Discussion of what counts as a "good" adapter Alex was originally trying to add to PEP 246 some recommendations regarding "good" vs. "bad" adaptation, so this is actually part of "what should be in PEP 246" >* Philip's new generic function / ducy typing proposals And of course this one is an attempt to unify everything and replace PEP 245 (not 246) with a hopefully more pythonic way of defining interfaces and adapters. I hope to define a "relatively safe" subset of PEP 246 for type declarations that can be done automatically by Python, in a way that's also conceptually compatible with COM and Java casting (possibly making Jython and IronPython's lives a little easier re: type declarations). 
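The "sticky adapter" idea discussed above -- the same adapter instance coming back each time you adapt a given object, as in the code-generation use case -- can be sketched as follows. The names are hypothetical, and the sketch assumes the adaptee is weakly referenceable:

```python
import weakref

# obj -> {adapter_class: adapter}; entries vanish when obj is collected
_adapters = weakref.WeakKeyDictionary()

def sticky_adapt(obj, adapter_class):
    """Return the *same* adapter instance for (obj, adapter_class) each time."""
    per_obj = _adapters.setdefault(obj, {})
    if adapter_class not in per_obj:
        per_obj[adapter_class] = adapter_class(obj)
    return per_obj[adapter_class]

class Model:
    """A hypothetical UML/MOF model element."""

class PythonCodeGenerator:
    """A hypothetical stateful adapter accumulating output as it walks."""
    def __init__(self, model):
        self.model = model
        self.lines = []

m = Model()
gen = sticky_adapt(m, PythonCodeGenerator)
gen.lines.append("class Foo: pass")
# Later, elsewhere in the walk: same adapter, same accumulated state.
assert sticky_adapt(m, PythonCodeGenerator) is gen
```

Because the registry holds only weak references to the adaptees, the per-object adapters do not keep the model alive once nothing else does.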
From just at letterror.com Sat Jan 15 17:32:42 2005 From: just at letterror.com (Just van Rossum) Date: Sat Jan 15 17:32:47 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <5.1.1.6.0.20050115104405.04052d70@mail.telecommunity.com> Message-ID: Phillip J. Eby wrote: > >It's not at all clear to me that "sticky" behavior is the best > >default behavior, even with implicit adoptation. Would anyone in > >their right mind expect the following to return [0, 1, 2, 3, 4, 5] > >instead of [0, 1, 2, 0, 1, 2]? > > > > >>> from itertools import * > > >>> seq = range(10) > > >>> list(chain(islice(seq, 3), islice(seq, 3))) > > [0, 1, 2, 0, 1, 2] > > >>> > > I don't understand why you think it would. What does islice have to > do with adaptation? islice() takes an iterator, yet I give it a sequence. It calls iter(seq), which I see as a form of adaptation (maybe you don't). Sticky adaptation would not be appropriate here, even though the adaptation is implicit. Just From tjreedy at udel.edu Sat Jan 15 17:59:54 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Sat Jan 15 18:00:02 2005 Subject: [Python-Dev] Re: Re: PEP 246: LiskovViolation as a name References: <1105553300.41e56794d1fc5@mcherm.com><16869.33426.883395.345417@montanaro.dyndns.org><41E63EDB.40008@cs.teiath.gr> <16869.57557.795447.53311@montanaro.dyndns.org> Message-ID: "Skip Montanaro" wrote in message news:16869.57557.795447.53311@montanaro.dyndns.org... > The first example here: > http://www.compulink.co.uk/~querrid/STANDARD/lsp.htm > Looks pretty un-extreme to me. To both summarize and flesh out the square-rectangle example: Q. Is a square 'properly' a rectangle? A. Depends on 'square' and 'rectangle'. * A static, mathematical square is a static, mathematical rectangle just fine, once width and height are aliased (adapted?) to edge. The only 'behaviors' are to report size and possibly derived quantities like diagonal and area. * Similarly, a dynamic, zoomable square is a zoomable rectangle. 
* But a square cannot 'properly' be a fully dynamic rectangle that can mutate to a dissimilar shape, and must when just one dimension is changed -- unless shape mutation is allowed to fail or unless the square is allowed to mutate itself into a rectangle. So it seems easily possible to introduce Liskov violations when adding behavior to a general superclass. Terry J. Reedy From pje at telecommunity.com Sat Jan 15 18:06:36 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Sat Jan 15 18:05:03 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: References: <5.1.1.6.0.20050115104405.04052d70@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050115120034.03727d90@mail.telecommunity.com> At 05:32 PM 1/15/05 +0100, Just van Rossum wrote: >Phillip J. Eby wrote: > > > >It's not at all clear to me that "sticky" behavior is the best > > >default behavior, even with implicit adoptation. Would anyone in > > >their right mind expect the following to return [0, 1, 2, 3, 4, 5] > > >instead of [0, 1, 2, 0, 1, 2]? > > > > > > >>> from itertools import * > > > >>> seq = range(10) > > > >>> list(chain(islice(seq, 3), islice(seq, 3))) > > > [0, 1, 2, 0, 1, 2] > > > >>> > > > > I don't understand why you think it would. What does islice have to > > do with adaptation? > >islice() takes an iterator, yet I give it a sequence. No, it takes an *iterable*, both practically and according to its documentation: >>> help(itertools.islice) Help on class islice in module itertools: class islice(__builtin__.object) | islice(iterable, [start,] stop [, step]) --> islice object | | ... [snip rest] If you think about the iterator and iterable protocols a bit, you'll see that normally the adaptation goes the *other* way: you can pass an iterator to something that expects an iterable, as long as it doesn't need reiterability. From pje at telecommunity.com Sat Jan 15 18:25:15 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Sat Jan 15 18:23:42 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: References: <5.1.1.6.0.20050114200639.030c0cd0@mail.telecommunity.com> <5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com> <5.1.1.6.0.20050114100001.03846260@mail.telecommunity.com> <5.1.1.6.0.20050114200639.030c0cd0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050115105740.040511e0@mail.telecommunity.com> At 10:35 AM 1/15/05 +0100, Alex Martelli wrote: >On 2005 Jan 15, at 02:30, Phillip J. Eby wrote: > >>is requested. It's too bad Python doesn't have some sort of deallocation >>hook you could use to get notified when an object goes away. Oh well. > >For weakly referenceable objects, it does. Giving one to other objects >would be almost isomorphic to making every object weakly referenceable, >wouldn't it? Or am I missing something...? I meant if there was some way to listen for a particular object's allocation, like sticking all the pointers you were interested in into a big dictionary with callbacks and having a callback run whenever an object's refcount reaches zero. It's doubtless completely impractical, however. I think we can probably live with only weak-referenceable objects being seamlessly sticky, if that's a word. :) Actually, I've just gotten to the part of the PEP where I have to deal with stateful adapters and state retention, and I think I'm going to use this terminology for the three kinds of adapters: * operations (no adapter class needed) * extenders (operations + a consistent state that conceptually adds state to the base object rather than creating an object w/separate lifetime) * "volatile", "inconsistent", or "disposable" adapters (state may be lost or multiplied if passed to different routines) The idea is to make it really easy to make any of these, but for the last category you should have to explicitly declare that you *want* volatility (or at least that you are willing to accept it, if the target type is not weak-referenceable). 
In this way, all three kinds of adaptation may be allowed, but it takes one extra step to create a potentially "bad" adapter. Right now, people often create volatile adapters even if what they want is an extender ("sticky adapter"), because it's more work to make a functioning extender, not because they actually want volatility. So, let's reverse that and make it easier to create extenders than it is to create volatile adapters. And, since in some cases an extender won't be possible even when it's what you want, we could go ahead and allow type declarations to make them, as long as the creator has specified that they're volatile. Meanwhile, all three kinds of adapters should avoid accidental implicit transitivity by only adapting the "original object". (Unless, again, there is some explicit choice to do otherwise.) This makes the type declaration system a straightforward extension of the COM QueryInterface and Java casting models, where an object's "true identity" is always preserved regardless of which interface you access its operations through. From s.percivall at chello.se Sat Jan 15 22:48:17 2005 From: s.percivall at chello.se (Simon Percivall) Date: Sat Jan 15 22:48:21 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <5.1.1.6.0.20050115120034.03727d90@mail.telecommunity.com> References: <5.1.1.6.0.20050115104405.04052d70@mail.telecommunity.com> <5.1.1.6.0.20050115120034.03727d90@mail.telecommunity.com> Message-ID: <2A7F1150-673F-11D9-B46A-0003934AD54A@chello.se> On 2005-01-15, at 18.06, Phillip J. Eby wrote: > At 05:32 PM 1/15/05 +0100, Just van Rossum wrote: >> Phillip J. Eby wrote: >> >> > >It's not at all clear to me that "sticky" behavior is the best >> > >default behavior, even with implicit adoptation. Would anyone in >> > >their right mind expect the following to return [0, 1, 2, 3, 4, 5] >> > >instead of [0, 1, 2, 0, 1, 2]? 
>> > > >> > > >>> from itertools import * >> > > >>> seq = range(10) >> > > >>> list(chain(islice(seq, 3), islice(seq, 3))) >> > > [0, 1, 2, 0, 1, 2] >> > > >>> >> > >> > I don't understand why you think it would. What does islice have to >> > do with adaptation? >> >> islice() takes an iterator, yet I give it a sequence. > > No, it takes an *iterable*, both practically and according to its > documentation: But it _does_ perform an implicit adaptation, via PyObject_GetIter. A list has no next()-method, but iter(list()) does. //Simon From pje at telecommunity.com Sat Jan 15 23:23:56 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Sat Jan 15 23:22:24 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <2A7F1150-673F-11D9-B46A-0003934AD54A@chello.se> References: <5.1.1.6.0.20050115120034.03727d90@mail.telecommunity.com> <5.1.1.6.0.20050115104405.04052d70@mail.telecommunity.com> <5.1.1.6.0.20050115120034.03727d90@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050115171358.02f98510@mail.telecommunity.com> At 10:48 PM 1/15/05 +0100, Simon Percivall wrote: >On 2005-01-15, at 18.06, Phillip J. Eby wrote: >>At 05:32 PM 1/15/05 +0100, Just van Rossum wrote: >>>Phillip J. Eby wrote: >>> >>> > >It's not at all clear to me that "sticky" behavior is the best >>> > >default behavior, even with implicit adoptation. Would anyone in >>> > >their right mind expect the following to return [0, 1, 2, 3, 4, 5] >>> > >instead of [0, 1, 2, 0, 1, 2]? >>> > > >>> > > >>> from itertools import * >>> > > >>> seq = range(10) >>> > > >>> list(chain(islice(seq, 3), islice(seq, 3))) >>> > > [0, 1, 2, 0, 1, 2] >>> > > >>> >>> > >>> > I don't understand why you think it would. What does islice have to >>> > do with adaptation? >>> >>>islice() takes an iterator, yet I give it a sequence. >> >>No, it takes an *iterable*, both practically and according to its >>documentation: > >But it _does_ perform an implicit adaptation, via PyObject_GetIter. 
First, that's not implicit. Second, it's not adaptation, either. PyObject_GetIter invokes the '__iter__' method of its target -- a method that is part of the *iterable* interface. It has to have something that's *already* iterable; it can't "adapt" a non-iterable into an iterable. Further, if calling a method of an interface that you already have in order to get another object that you don't is adaptation, then what *isn't* adaptation? Is it adaptation when you call 'next()' on an iterator? Are you then "adapting" the iterator to its next yielded value? No? Why not? It's a special method of the "iterator" interface, just like __iter__ is a special method of the "iterable" interface. So, I can't see how you can call one adaptation, but not the other. My conclusion: neither one is adaptation. > A list has no next()-method, but iter(list()) does. But a list has an __iter__ method, so therefore it's an iterable. That's what defines an iterable: it has an __iter__ method. It would only be adaptation if lists *didn't* have an __iter__ method. From just at letterror.com Sat Jan 15 23:50:39 2005 From: just at letterror.com (Just van Rossum) Date: Sat Jan 15 23:50:43 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <5.1.1.6.0.20050115171358.02f98510@mail.telecommunity.com> Message-ID: Phillip J. Eby wrote: > >But it _does_ perform an implicit adaptation, via PyObject_GetIter. > > First, that's not implicit. Second, it's not adaptation, either. > PyObject_GetIter invokes the '__iter__' method of its target -- a > method that is part of the *iterable* interface. It has to have > something that's *already* iterable; it can't "adapt" a non-iterable > into an iterable. > > Further, if calling a method of an interface that you already have in > order to get another object that you don't is adaptation, then what > *isn't* adaptation? Is it adaptation when you call 'next()' on an > iterator? Are you then "adapting" the iterator to its next yielded > value? 
That's one (contrived) way of looking at it. Another is that y = iter(x) adapts the iterable protocol to the iterator protocol. I don't (yet) see why a bit of state disqualifies this from being called adaptation. > No? Why not? It's a special method of the "iterator" interface, > just like __iter__ is a special method of the "iterable" interface. The difference is that the result of .next() doesn't have a specified interface. > So, I can't see how you can call one adaptation, but not the other. > My conclusion: neither one is adaptation. Maybe... Just From s.percivall at chello.se Sun Jan 16 00:02:20 2005 From: s.percivall at chello.se (Simon Percivall) Date: Sun Jan 16 00:02:23 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: References: Message-ID: <82543A8B-6749-11D9-B46A-0003934AD54A@chello.se> On 2005-01-15, at 23.50, Just van Rossum wrote: > Phillip J. Eby wrote: > >>> But it _does_ perform an implicit adaptation, via PyObject_GetIter. >> >> First, that's not implicit. Second, it's not adaptation, either. >> PyObject_GetIter invokes the '__iter__' method of its target -- a >> method that is part of the *iterable* interface. It has to have >> something that's *already* iterable; it can't "adapt" a non-iterable >> into an iterable. >> >> Further, if calling a method of an interface that you already have in >> order to get another object that you don't is adaptation, then what >> *isn't* adaptation? Is it adaptation when you call 'next()' on an >> iterator? Are you then "adapting" the iterator to its next yielded >> value? > > That's one (contrived) way of looking at it. Another is that > > y = iter(x) > > adapts the iterable protocol to the iterator protocol. Especially since an iterable can also be an object without an __iter__ method but with a __getitem__ method. Calling __iter__ might get an iterator, but calling __getitem__ does not. That seems like adaptation. No?
It's still not clear to me, as this shows, exactly what counts as what in this game. //Simon From foom at fuhm.net Sun Jan 16 02:13:26 2005 From: foom at fuhm.net (James Y Knight) Date: Sun Jan 16 02:14:47 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <82543A8B-6749-11D9-B46A-0003934AD54A@chello.se> References: <82543A8B-6749-11D9-B46A-0003934AD54A@chello.se> Message-ID: On Jan 15, 2005, at 6:02 PM, Simon Percivall wrote: > On 2005-01-15, at 23.50, Just van Rossum wrote: >> Phillip J. Eby wrote: >> >>>> But it _does_ perform an implicit adaptation, via PyObject_GetIter. >>> >>> First, that's not implicit. Second, it's not adaptation, either. >>> PyObject_GetIter invokes the '__iter__' method of its target -- a >>> method that is part of the *iterable* interface. It has to have >>> something that's *already* iterable; it can't "adapt" a non-iterable >>> into an iterable. >>> >>> Further, if calling a method of an interface that you already have in >>> order to get another object that you don't is adaptation, then what >>> *isn't* adaptation? Is it adaptation when you call 'next()' on an >>> iterator? Are you then "adapting" the iterator to its next yielded >>> value? >> >> That's one (contrived) way of looking at it. Another is that >> >> y = iter(x) >> >> adapts the iterable protocol to the iterator protocol. > > Especially since an iterable can also be an object without an __iter__ > method but with a __getitem__ method. Calling __iter__ might get an > iterator, but calling __getitem__ does not. That seems like adaptation. > No? It's still not clear to me, as this shows, exactly what counts as > what in this game. I think that's wrong. To spell iter() in an adapter/interface world, I'd spell iter(obj) as: adapt(obj, IIterable).iterator() Then, list, tuple, dict objects would specify that they implement IIterable. 
There is a default adapter from object->IIterable which provides a .iterate() method which creates an iterator that uses __getitem__ on the adaptee. In my opinion, adapters provide a different view of an object. I can see treating list "as a" iterable, but not "as a" iterator. James From jimjjewett at gmail.com Sat Jan 15 01:20:31 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Sun Jan 16 02:32:32 2005 Subject: [Python-Dev] Exceptions *must*? be old-style classes? Message-ID: Phillip J. Eby wrote (in http://mail.python.org/pipermail/python-dev/2005-January/050854.html) > * Classic class support is a must; exceptions are still required to be > classic, and even if they weren't in 2.5, backward compatibility should be > provided for at least one release. The base of the Exception hierarchy happens to be a classic class. But why are they "required" to be classic? More to the point, is this a bug, a missing feature, or just a bug in the documentation for not mentioning the restriction? You can inherit from both Exception and object. (Though it turns out you can't raise the result.) My first try with google failed to produce an explanation -- and I'm still not sure I understand, beyond "it doesn't happen to work at the moment." Neither the documentation nor the tutorial mention this restriction. http://docs.python.org/lib/module-exceptions.html http://docs.python.org/tut/node10.html#SECTION0010500000000000000000 I didn't find any references to this restriction in exception.c. I did find some code implying this in errors.c and ceval.c, but that wouldn't have caught my eye if I weren't specifically looking for it *after* having just read the discussion about (rejected) PEP 317. -jJ From gvanrossum at gmail.com Sun Jan 16 02:57:53 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sun Jan 16 02:57:56 2005 Subject: [Python-Dev] Exceptions *must*? be old-style classes? 
In-Reply-To: References: Message-ID: > The base of the Exception hierarchy happens to be a classic class. > But why are they "required" to be classic? > > More to the point, is this a bug, a missing feature, or just a bug in > the documentation for not mentioning the restriction? It's an unfortunate feature; it should be mentioned in the docs; it should also be fixed, but fixing it isn't easy (believe me, or it would have been fixed in Python 2.2). To be honest, I don't recall the exact reasons why this wasn't fixed in 2.2; I believe it has something to do with the problem of distinguishing between string and class exception, and between the various forms of raise statements. I think the main ambiguity is raise "abc", which could be considered short for raise str, "abc", but that would be incompatible with except "abc". I also think that the right way out of there is to simply hardcode a check that says that raise "abc" raises a string exception and raising any other instance raises a class exception. But there's a lot of code that has to be changed. It's been suggested that all exceptions should inherit from Exception, but this would break tons of existing code, so we shouldn't enforce that until 3.0. (Is there a PEP for this? I think there should be.) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Sun Jan 16 03:06:51 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun Jan 16 03:05:19 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: References: <5.1.1.6.0.20050115171358.02f98510@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050115202610.034c3c10@mail.telecommunity.com> At 11:50 PM 1/15/05 +0100, Just van Rossum wrote: >Phillip J. Eby wrote: > > > >But it _does_ perform an implicit adaptation, via PyObject_GetIter. > > > > First, that's not implicit. Second, it's not adaptation, either. 
> > PyObject_GetIter invokes the '__iter__' method of its target -- a > > method that is part of the *iterable* interface. It has to have > > something that's *already* iterable; it can't "adapt" a non-iterable > > into an iterable. > > > > Further, if calling a method of an interface that you already have in > > order to get another object that you don't is adaptation, then what > > *isn't* adaptation? Is it adaptation when you call 'next()' on an > > iterator? Are you then "adapting" the iterator to its next yielded > > value? > >That's one (contrived) way of looking at it. Another is that > > y = iter(x) > >adapts the iterable protocol to the iterator protocol. I don't (yet) see >why a bit of state disqualifies this from being called adaptation. Well, if you go by the GoF "Design Patterns" book, this is actually what's called an "Abstract Factory": "Abstract Factory: Provide an interface for creating ... related or dependent objects without specifying their concrete classes." So, 'iter()' is an abstract factory that creates an iterator without needing to specify the concrete class of iterator you want. This is a much closer fit for what's happening than the GoF description of "Adapter": "Adapter: Convert the interface of a class into another interface clients expect. Adapter lets classes work together that couldn't otherwise because of incompatible interfaces." IMO, it's quite "contrived" to try and squeeze iteration into this concept, compared to simply saying that 'iter()' is an abstract factory that creates "related or dependent objects". While it has been pointed out that the GoF book is not handed down from heaven or anything, its terminology is certainly widely used to describe certain patterns of programming. If you read their full description of the adapter pattern, nothing in it is about automatically getting an adapter based on an interface. 
It's just about the idea of *using* an adapter that you already have, and it's strongly implied that you only use one adapter for a given source and destination that need adapting, not create lots of instances all over the place. So really, PEP 246 'adapt()' (like 'iter()') is more about the Abstract Factory pattern. It just happens in the case of PEP 246 that it's an Abstract Factory that *can* create adapters, but it's not restricted to handing out *just* adapters. It can also be used to create views, iterators, and whatever else you like. But that's precisely what makes it problematic for use as a type declaration mechanism, because you run the risk of it serving up entirely new objects that aren't just interface transformers. And of course, that's why I think that you should have to declare that you really want to use it for type declarations, if in fact it's allowed at all. Explicit use of 'adapt()', on the other hand, can safely create whatever objects you want. Oh, one other thing -- distinguishing between "adapters" and merely "related" objects allows you to distinguish whether you should adapt the object or what it wraps. A "related" object (like an iterator) is a separate object, so it's safe to adapt it to other things. An actual *adapter* is not a separate object, it's an extension of the object it wraps. So, it should not be re-adapted when adapting again; instead the underlying object should be adapted. So, while I support in principle all the use cases for "adaptation" (so-called) that have been discussed here, I think it's important to refine our terminology to distinguish between GoF "adapters" and "things you might want to create with an abstract factory", because they have different requirements and support different use cases. 
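[The "one model, many views" distinction Phillip draws here is exactly what the islice example from earlier in the thread demonstrates; rerun both ways, in present-day spelling:]

```python
from itertools import chain, islice

seq = list(range(10))

# Each islice() call asks for its *own* iterator over seq (via iter()),
# so the two slices are independent views that both start at the front:
result = list(chain(islice(seq, 3), islice(seq, 3)))
assert result == [0, 1, 2, 0, 1, 2]

# Passing one shared iterator instead makes the slices share state,
# so the second slice continues where the first stopped:
it = iter(seq)
result2 = list(chain(islice(it, 3), islice(it, 3)))
assert result2 == [0, 1, 2, 3, 4, 5]
```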
We have gotten a little bogged down by our comparisons of "good" and "bad" adapters; perhaps to move forward we should distinguish between "adapters" and "views", and say that an iterator is an example of a view: you may have more than one view on the same thing, and although a view depends on the thing it "views", it doesn't really "convert an interface"; it provides distinct functionality on a per-view basis. Currently, PEP 246 'adapt()' is used "in the field" to create both adapters and views, because 1) it's convenient, and 2) it can. :) However, for type declarations, I think it's important to distinguish between the two, to avoid implicit creation of additional views. A view needs to be managed within the scope that it applies to. By that, I mean for example that a 'for' loop creates an iterator view and then manages it within the scope of the loop. However, if you need the iterator to remain valid outside the 'for' loop, you may need to first call 'iter()' to get an explicit iterator you can hold on to. Similarly, if you have a file that you are reading things from by calling routines and passing in the file, you don't want to pass each of those routines a filename and have them implicitly open the file; they won't be reading from it sequentially then. So, again, you have to manage the view by opening a file or creating a StringIO or whatever. Granted that there are some scenarios where implicit view creation will do exactly the right thing, introducing it also opens the opportunity for it to go very badly. Today's PEP 246 implementations are as easy to use as 'iter()', so why not use them explicitly when you need a view? From s.percivall at chello.se Sun Jan 16 03:07:23 2005 From: s.percivall at chello.se (Simon Percivall) Date: Sun Jan 16 03:07:26 2005 Subject: [Python-Dev] Exceptions *must*? be old-style classes? 
In-Reply-To: References: Message-ID: <5C6D82A6-6763-11D9-BC9B-0003934AD54A@chello.se> On 2005-01-16, at 02.57, Guido van Rossum wrote: > It's been suggested that all exceptions should inherit from Exception, > but this would break tons of existing code, so we shouldn't enforce > that until 3.0. (Is there a PEP for this? I think there should be.) What would happen if Exception were made a new-style class, enforce inheritance from Exception for all new-style exceptions, and allow all old-style exceptions as before. Am I wrong in assuming that only the most esoteric exceptions inheriting from Exception would break by Exception becoming new-style? //Simon From pje at telecommunity.com Sun Jan 16 03:17:45 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun Jan 16 03:16:15 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: References: <82543A8B-6749-11D9-B46A-0003934AD54A@chello.se> <82543A8B-6749-11D9-B46A-0003934AD54A@chello.se> Message-ID: <5.1.1.6.0.20050115210716.034cb030@mail.telecommunity.com> At 08:13 PM 1/15/05 -0500, James Y Knight wrote: >On Jan 15, 2005, at 6:02 PM, Simon Percivall wrote: > >>On 2005-01-15, at 23.50, Just van Rossum wrote: >>>Phillip J. Eby wrote: >>> >>>>>But it _does_ perform an implicit adaptation, via PyObject_GetIter. >>>> >>>>First, that's not implicit. Second, it's not adaptation, either. >>>>PyObject_GetIter invokes the '__iter__' method of its target -- a >>>>method that is part of the *iterable* interface. It has to have >>>>something that's *already* iterable; it can't "adapt" a non-iterable >>>>into an iterable. >>>> >>>>Further, if calling a method of an interface that you already have in >>>>order to get another object that you don't is adaptation, then what >>>>*isn't* adaptation? Is it adaptation when you call 'next()' on an >>>>iterator? Are you then "adapting" the iterator to its next yielded >>>>value? >>> >>>That's one (contrived) way of looking at it. 
Another is that >>> >>> y = iter(x) >>> >>>adapts the iterable protocol to the iterator protocol. >> >>Especially since an iterable can also be an object without an __iter__ >>method but with a __getitem__ method. Calling __iter__ might get an >>iterator, but calling __getitem__ does not. That seems like adaptation. >>No? It's still not clear to me, as this shows, exactly what counts as >>what in this game. > >I think that's wrong. To spell iter() in an adapter/interface world, I'd >spell iter(obj) as: > adapt(obj, IIterable).iterator() > >Then, list, tuple, dict objects would specify that they implement >IIterable. There is a default adapter from object->IIterable which >provides a .iterate() method which creates an iterator that uses >__getitem__ on the adaptee. > >In my opinion, adapters provide a different view of an object. I can see >treating list "as a" iterable, but not "as a" iterator. Uh oh. I just used "view" to describe an iterator as a view on an iterable, as distinct from an adapter that adapts a sequence so that it's iterable. :) I.e., using "view" in the MVC sense where a given Model might have multiple independent Views. We really need to clean up our terminology somehow, and I may need to rewrite some parts of my PEP-in-progress. I had been using the term "volatile adapter" for what I'd written so far, but by the time I got to the part where I had to explain how to actually *make* volatile adapters, I realized that I was right before: they aren't adapters just because PEP 246 'adapt()' can be used to create them. They're just something *else* that's convenient to create with 'adapt()' besides adapters. Calling them even "volatile adapters" just confuses them with "real" adapters. 
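[To illustrate the distinction being argued here in a few lines of Python (UpperCaseReader is a made-up example for this summary, not code from any PEP 246 implementation): a GoF-style adapter merely converts an interface and carries no state of its own, so making a fresh one is harmless, whereas an iterator is a stateful view, so implicitly making a fresh one silently restarts iteration.]

```python
import io

class UpperCaseReader:
    """A GoF-style adapter: converts the read() interface in place.
    It adds no state of its own, so re-creating it is harmless."""
    def __init__(self, wrapped):
        self._wrapped = wrapped

    def read(self):
        return self._wrapped.read().upper()

assert UpperCaseReader(io.StringIO("hello")).read() == "HELLO"

# An iterator, by contrast, is a stateful view of the sequence:
seq = [1, 2, 3]
view = iter(seq)
next(view)                            # this view has advanced past 1
assert list(view) == [2, 3]
assert list(iter(seq)) == [1, 2, 3]   # a *new* view starts over
```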
On the *other* hand, maybe we should just call GoF adapters "extenders" (since they extend the base object with a new interface or extended functionality, but aren't really separate objects) and these other things like iterators and views should be called "accessories", which implies you have lots of them and although they "accessorize" an object, they are themselves individual objects. (Whereas an extender becomes conceptually "part of" the thing it extends.) It's then also clearer that it makes no sense to have a type declaration ever cause you to end up with a new accessory, as opposed to an extender that's at least figuratively always there. What do y'all think? Is that a better way to distinguish kinds of "adapters"? (I.e. extenders versus accessories) Or does somebody have better words we can use? From pje at telecommunity.com Sun Jan 16 03:22:28 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun Jan 16 03:20:56 2005 Subject: [Python-Dev] Exceptions *must*? be old-style classes? In-Reply-To: References: Message-ID: <5.1.1.6.0.20050115211847.034ce0f0@mail.telecommunity.com> At 05:57 PM 1/15/05 -0800, Guido van Rossum wrote: >It's been suggested that all exceptions should inherit from Exception, >but this would break tons of existing code, so we shouldn't enforce >that until 3.0. (Is there a PEP for this? I think there should be.) Couldn't we require new-style exceptions to inherit from Exception? Since there are no new-style exceptions that work now, this can't break existing code. Then, the code path is just something like: if isinstance(ob,Exception): # it's an exception, use its type else: # all the other tests done now This way, the other tests that would be ambiguous wrt new-style classes can be skipped, but non-Exception classic classes would still be handled by the existing checks. Or am I missing something? From cce at clarkevans.com Sun Jan 16 05:04:24 2005 From: cce at clarkevans.com (Clark C. 
Evans) Date: Sun Jan 16 05:04:27 2005 Subject: [Python-Dev] PEP 246, Feedback Request In-Reply-To: References: Message-ID: <20050116040424.GA76191@prometheusresearch.com> I started to edit the PEP, but found that we really don't have any consensus on a great many items. The following is a bunch of topics, and a proposed handling of those topics. The bulk of this comes from a phone chat I had with Alex this past afternoon, and other items come from my understanding of the mailing list, or prior conversations with Phillip, among others. It's a strawman. I'd really very much like feedback on each topic, preferably only one post per person summarizing your position/suggestions. I'd rather not have a runaway discussion on this post. --- - topic: a glossary overview: It seems that we are having difficulty with words that have shifting definitions. The next PEP edit will need to add a glossary that nails down some meanings of these words. Following are a few proposed terms/meanings. proposal: - protocol means any object, usually a type or class or interface, which guides the construction of an adapter - adaptee is the object which is to be adapted, the original object - adaptee-class refers to the adaptee's class - adapter refers to the result of adapting an adaptee to a protocol - factory refers to a function, f(adaptee) -> adapter, where the resulting adapter complies with a given protocol feedback: Much help is needed here; either respond to this thread with your words and definitions, or email them directly to Clark and he will use your feedback when creating the PEP's glossary. - topic: a registry mechanism overview: It has become very clear from the conversations the past few days that a registry is absolutely needed for the kind of adapt() behavior people are currently using in Zope, Twisted, and Peak. proposal: - The PEP will define a simple and flexible registry mechanism.
- The registry will be a mapping from an (adaptee-class, protocol) pair to a corresponding factory. - Only one active registration per pair (see below) feedback: We welcome/encourage experiences and concrete suggestions from existing registries. Our goal is to be minimal, extensible, and sufficient. See other topics for more specific concerns before you comment on this more general topic. - topic: should 'object' be impacted by PEP 246 overview: The semantics of exceptions depend on whether 'object' is given a default __conform__ method (which would do isinstance), in which case, returning None in a subclass could be used to prevent Liskov violations. However, by requiring a change to 'object', it may hinder adoption or slow down testing. proposal: - We will not ask/require changes to `object'. - Liskov violations will be managed via the registry, see below. - This is probably faster for isinstance cases? feedback: If you really think we should move isinstance() into object.__conform__, then here is your chance to make a final stand. ;) - topic: adaption stages overview: There are several stages for adaptation. It was recommended that the 'registry' be the first stop in the chain. proposal: - First, the registry is checked for a suitable adapter - Second, isinstance() is checked; if the adaptee is an instance of the protocol, adaptation ends and the adaptee is returned. - Third, __conform__ on the adaptee is called with the given protocol being requested. - Fourth, __adapt__ on the protocol is called, with the given adaptee. feedback: This is largely dependent upon the previous topic, but if something isn't obvious (mod exceptions below), please say something. - topic: module vs built-in overview: Since we will be adding a registry, exceptions, and other items, it probably makes sense to use a module for 'adapt'. proposal: - PEP 246 will ask for an `adapt' module, with an `adapt' function.
- The registry will be contained in this module, 'adapt.register' - The `adapt' module can provide commonly-used adapter factories, such as adapt.Identity. - With a standardized signature, frameworks can provide their own 'local' registry/adapt overrides. feedback: Please discuss the merits of a module approach, and if having local registries is important (or worth the added complexity). Additional suggestions on how the module should be structured are welcome. - topic: exception handling overview: How should adaption stages progress and exceptions be handled? There were problems with swallowed TypeError exceptions in the 2001 version of the PEP; in this version, TypeErrors are not swallowed. proposal: - The 'adapt' module will define an adapt.AdaptException(TypeError). - At any stage of adaptation, if None is returned, the adaptation continues to the next stage. - Any exception other than adapt.AdaptException(TypeError) causes the adapt() call to fail, and the exception to be raised to the caller of adapt(); the 'default' value is not returned in this case. - At any stage of adaption, if adapt.AdaptException(TypeError) is raised, then the adaptation process stops, as if None had been returned from each stage. - If all adaption stages return None, there are two cases. If the call to adapt() had a 'default' value, then this is returned; otherwise, an adapt.AdaptException is raised. feedback: I think this is the same as the current PEP, and different from the first PEP. Comments? Anything that was missed? - topic: transitivity overview: A case was made for allowing A->C to work when A->B and B->C are available; an equally compelling case to forbid this was also given. There are numerous reasons for not allowing transitive adapters, mostly that 'lossy' adapters or 'stateful' adapters are usually the problem cases. However, a hard-and-fast rule for knowing when transitivity exists wasn't found.
proposal: - When registering an adapter factory, from A->B, an additional flag 'transitive' will be available. - This flag defaults to False, so specific care is needed when registering adapters which one considers to be transitive. - If there exist two adapter factories, X: A->B, and Y: B->C, the path factory Z: A->C will be considered registered if and only if both X and Y were registered 'transitive'. - It is an error for a registration to cause two path factories from A to C to be constructed; thus the registry will never have a case where two transitive adaptation paths exist at a single time. - An explicit registration always has precedence over a transitive path. - One can also register a None factory from A->B for the purpose of marking it transitive. In this circumstance, the composite adapter is built through __conform__ and __adapt__. The None registration is just a placeholder to signal that a given path exists. feedback: I'm looking for warts in this plan, and verification of whether something like this has been done -- comments on how well it works. Alternative approaches? - topic: substitutability overview: There is a problem with the default isinstance() behavior when someone derives a class from another to re-use implementation, but with a different 'concept'. A mechanism to disable isinstance() is needed for this particular case. proposal: - The 'adapt' module will define a 'LiskovAdaptionError', which has as a text description something like: "Although the given class '%s' derives from '%s', it has been marked as not being substitutable; although it is a subclass, the intent has changed so that one should not assume an 'is-a' relationship."
- If someone is concerned that their subclass should not be adapted to the superclass automatically, they should register the NotSubstitutable adapter to the superclass, recursively. feedback: I'm not sure how this would work for the adaptee-class's grandparent; perhaps a helper function that recursively marks super classes is needed? Other comments? - topic: declaration (aka Guido's syntax) and intrinsic adaption overview: Guido would like his type declaration syntax (see blog entry) to be equivalent to a call to adapt() without any additional arguments. However, not all adapters should be created in the context of a declaration -- some should be created more explicitly. We propose a mechanism where an adapter factory can register itself as not suitable for the declaration syntax. proposal: - The adapt.register method has an optional argument, 'intrinsic', that defaults to True. - The adapt() function has an optional argument, 'intrinsic_only' which defaults to True; and thus is the default for the declaration syntax. - If an adapter factory is registered with intrinsic = False, then it is _not_ used by default calls to adapt(). - adapt( , intrinsic_only = False) will enable both sorts of adapters, intrinsic or not; enabling the use of adapters which should not be used by default in a declaration syntax. - all adapters created through __conform__ and __adapt__ are by default intrinsic since this parameter is not part of the function signature feedback: This is the simplest solution I heard on the list; the word 'intrinsic' was given by Alex. Is there a better word? Should we even worry about this case? Any other ways to view this issue? - topic: adaptee (aka origin) overview: There was discussion as to how to get back to the original object from an adapter. Is this in scope of PEP 246? 
proposal: - we specify an __adaptee__ property, to be optionally implemented by an adapter that provides a reference to its adaptee - the adapt.register method has an optional argument, 'adaptee', that defaults to False; if it is True, adapt() calls will stuff the adapter away into a weak-reference mapping from adapter to adaptee. - an adapt.adaptee(adapter) function which returns the given adaptee for the adapter; this first checks the weak-reference table, and then checks for an __adaptee__ feedback: Is this useful, worth the complexity? - topic: sticky overview: Sticky adapters (that is, ones where there is only one instance per adaptee) are a common use case. Should the registry of PEP 246 provide this feature? proposal: - the adapt.register method has an optional argument, 'sticky', that defaults to False - if the given adapter factory is marked sticky, then a call to adapt() will first check to see if a given adapter (keyed by protocol) has been created for the adaptee; if so, then that adapter is returned, otherwise the factory is asked to produce an adapter and that adapter is cached. feedback: Is this useful, worth the complexity? It seems like an easy operation. The advantage to this approach (over each factory inheriting from a StickyFactory) is that registry queries can be done, to list sticky adapters and other bookkeeping chores. Ok. That's it. Cheers, Clark -- Clark C. Evans Prometheus Research, LLC. http://www.prometheusresearch.com/ o office: +1.203.777.2550 ~/ , mobile: +1.203.444.0557 // (( Prometheus Research: Transforming Data Into Knowledge \\ , \/ - Research Exchange Database /\ - Survey & Assessment Technologies ` \ - Software Tools for Researchers ~ * From pje at telecommunity.com Sun Jan 16 05:36:00 2005 From: pje at telecommunity.com (Phillip J.
Eby) Date: Sun Jan 16 05:34:28 2005 Subject: [Python-Dev] PEP 246, Feedback Request In-Reply-To: <20050116040424.GA76191@prometheusresearch.com> References: Message-ID: <5.1.1.6.0.20050115231609.02fa4230@mail.telecommunity.com> At 11:04 PM 1/15/05 -0500, Clark C. Evans wrote: > topic: a glossary > overview: > It seems that we are having difficulty with words that have shifting > definitions. The next PEP edit will need to add a glossary that > nails down some meanings of these words. Following are a few > proposed terms/meanings. It would also be helpful to distinguish between 1-to-1 "as a" adapters, and 1-to-many "view" adapters. There isn't a really good terminology for this, but it's important at least as it relates to type declarations. > - Any exception other than adapt.AdaptException(TypeError) > causes the adapt() call to fail, and the exception to be > raised to the caller of adapt(); the 'default' value is not > returned in this case. > - At any stage of adaption, if adapt.AdaptException(TypeError) is > raised, then the adaptation process stops, as if None had been > returned from each stage. > - If all adaption stages return None, there are two cases. If the > call to adapt() had a 'default' value, then this is returned; > otherwise, an adapt.AdaptException is raised. -1; This allows unrelated AdaptExceptions to end up being silently caught. These need to be two different exceptions if you want to support stages being able to "veto" adaptation. Perhaps you should have a distinct VetoAdaptation error to support that use case. > topic: transitivity > ... > proposal: > ... > feedback: > I'm looking for warts in this plan, and verification if > something like this has been done -- comments how well > it works. Alternative approaches? I'll try to think some more about this one later, but I didn't see any obvious problems at first glance. 
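[For concreteness, the transitivity rule quoted above might be sketched as follows. This is hypothetical illustration code, not from the PEP or any shipping registry; the names register/lookup are invented, and the "error on two competing paths" rule is omitted for brevity.]

```python
# Sketch of Clark's proposal: factories are registered per
# (adaptee-class, protocol) pair with a 'transitive' flag, and an A->C
# factory is synthesized only when both legs are marked transitive.
_registry = {}  # (adaptee_class, protocol) -> (factory, transitive)

def register(adaptee_class, protocol, factory, transitive=False):
    _registry[(adaptee_class, protocol)] = (factory, transitive)

def lookup(adaptee_class, protocol):
    # An explicit registration always takes precedence.
    if (adaptee_class, protocol) in _registry:
        return _registry[(adaptee_class, protocol)][0]
    # Otherwise, try to compose A->B with B->C, both marked transitive.
    for (a, b), (f_ab, t_ab) in _registry.items():
        if a is adaptee_class and t_ab:
            leg2 = _registry.get((b, protocol))
            if leg2 is not None and leg2[1]:
                f_bc = leg2[0]
                return lambda obj, f1=f_ab, f2=f_bc: f2(f1(obj))
    return None

class A: pass
class B: pass
class C: pass

register(A, B, lambda obj: B(), transitive=True)
register(B, C, lambda obj: C(), transitive=True)

factory = lookup(A, C)          # synthesized from the two transitive legs
assert isinstance(factory(A()), C)
assert lookup(C, A) is None     # no registered path in the other direction
```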
> topic: declaration (aka Guido's syntax) and intrinsic adaption
> overview:
>   Guido would like his type declaration syntax (see blog entry) to
>   be equivalent to a call to adapt() without any additional
>   arguments. However, not all adapters should be created in the
>   context of a declaration -- some should be created more
>   explicitly. We propose a mechanism where an adapter factory can
>   register itself as not suitable for the declaration syntax.

It would be much safer to have the reverse be the default; i.e., it
should take special action to declare an adapter as being *suitable*
for use with type declarations. IOW, sticky intrinsic adapters should
be the default, and volatile accessories should take an extra action to
make them usable with type declarations.

> feedback:
>   This is the simplest solution I heard on the list; the word
>   'intrinsic' was given by Alex. Is there a better word?

Sadly, no. I've been playing with words like "extender", "mask",
"personality" etc. to try and find a name for a thing you only
reasonably have one of, versus things you can have many of like
"accessory", "add-on", etc.

> topic: adaptee (aka origin)
> overview:
>   There was discussion as to how to get back to the original
>   object from an adapter. Is this in scope of PEP 246?
> proposal:
>   - we specify an __adaptee__ property, to be optionally implemented
>     by an adapter that provides a reference to the adaptee
>   - the adapt.register method has an optional argument, 'adaptee',
>     that defaults to False; if it is True, adapt() calls will stuff
>     away into a weak-reference mapping from adapter to adaptee.
>   - an adapt.adaptee(adaptor) function which returns the given
>     adaptee for the adaptor; this first checks the weak-reference
>     table, and then checks for an __adaptee__
> feedback:
>   Is this useful, worth the complexity?

This is tied directly to intrinsicness and stickiness. If you are
intrinsic, you *must* have __adaptee__, so that adapt can re-adapt you
safely.
If you are intrinsic, you *must* be stateless or sticky. (Stateless can
be considered an empty special case of "sticky".) So, you might be able
to combine a lot of these options to make the interface cleaner.

Think of it this way: if the adapter is intrinsic, it's just a
"personality" of the underlying object. So you don't want to re-adapt a
personality, instead you re-adapt the "original object". But for a
non-intrinsic adapter, the adapter is an independent object only
incidentally related to the original adaptee, so it is now an "original
object" of its own.

> topic: sticky
> overview:
>   Sticky adapters, that is, ones where there is only one instance
>   per adaptee, are a common use case. Should the registry of PEP 246
>   provide this feature?

Ideally, yes.

> proposal:
>   - the adapt.register method has an optional argument, 'sticky',
>     that defaults to False

Make it default to whatever the 'intrinsic' setting is, because the
only time you don't care for an intrinsic adapter is if the adapter is
completely stateless. Or, better yet, call it 'volatile' or something
and default to False. (I.e., you have to say you're willing to have it
volatile.)

If you get all of these features, it's going to come mighty close to
the functionality I've written up in my PEP; the primary difference is
that mine also includes a more concrete notion of "interface" and
defines a way to create intrinsic adapter factories automatically,
without having to write adapter classes. For volatile/accessory
adapters, you still have to write the classes, but that's sort of the
point of such adapters.

From pje at telecommunity.com Sun Jan 16 05:57:39 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun Jan 16 05:56:08 2005
Subject: [Python-Dev] "Monkey Typing" pre-PEP, partial draft
Message-ID: <5.1.1.6.0.20050115235444.02f246f0@mail.telecommunity.com>

I just attempted to post the Monkey Typing draft pre-PEP, but it
bounced due to being just barely over the size limit for the list.
:) So, I'm just posting the preamble and abstract here for now, and a
link to a Wiki page with the full text. I hope the moderator will
approve the actual posting soon so that replies can quote from the
text.

=== original message ===

This is only a partial first draft, but the Motivation section
nonetheless attempts to briefly summarize huge portions of the various
discussions regarding adaptation, and to coin a hopefully more useful
terminology than some of our older working adjectives like "sticky" and
"stateless" and such.

And the specification gets as far as defining a simple decorator-based
syntax for creating operational (prev. "stateless") and extension
(prev. "per-object stateful") adapters. I stopped when I got to the API
for declaring volatile (prev. per-adapter stateful) adapters, and for
enabling them to be used with type declarations, because Clark's post
on his revisions-in-progress seems to indicate that this can probably
be handled within the scope of PEP 246 itself.

As such, this PEP should then be viewed more as an attempt to formulate
how "intrinsic" adapters can be defined in Python code, without the
need to manually create adapter classes for the majority of
type-compatibility and "extension" use cases. In other words, the
implementation described herein could probably become part of the
front-end for the PEP 246 adapter registry.

Feedback and corrections (e.g. if I've repeated myself somewhere,
spelling, etc.) would be greatly appreciated. This uses ReST markup
heavily, so if you'd prefer to read an HTML version, please see:

http://peak.telecommunity.com/DevCenter/MonkeyTyping

But I'd prefer that corrections/discussion quote the relevant section
so I know what parts you're talking about. Also, if you find a place
where a more concrete example would be helpful, please consider
submitting one that I can add. Thanks!
PEP: XXX Title: "Monkey Typing" for Agile Type Declarations Version: $Revision: X.XX $ Last-Modified: $Date: 2003/09/22 04:51:50 $ Author: Phillip J. Eby Status: Draft Type: Standards Track Python-Version: 2.5 Content-Type: text/x-rst Created: 15-Jan-2005 Post-History: 15-Jan-2005 Abstract ======== Python has always had "duck typing": a way of implicitly defining types by the methods an object provides. The name comes from the saying, "if it walks like a duck and quacks like a duck, it must *be* a duck". Duck typing has enormous practical benefits for small and prototype systems. For very large frameworks, however, or applications that comprise multiple frameworks, some limitations of duck typing can begin to show. This PEP proposes an extension to "duck typing" called "monkey typing", that preserves most of the benefits of duck typing, while adding new features to enhance inter-library and inter-framework compatibility. The name comes from the saying, "Monkey see, monkey do", because monkey typing works by stating how one object type may *mimic* specific behaviors of another object type. Monkey typing can also potentially form the basis for more sophisticated type analysis and improved program performance, as it is essentially a simplified form of concepts that are also found in languages like Dylan and Haskell. It is also a straightforward extension of Java casting and COM's QueryInterface, which should make it easier to represent those type systems' behaviors within Python as well. 
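The "duck typing" the abstract builds on can be shown in a few lines; the Duck/Robot names below are invented for illustration (monkey typing, per the rest of the PEP, would additionally let a type *declare* which duck behaviors it mimics):

```python
class Duck:
    def quack(self):
        return "Quack!"

class Robot:
    # unrelated to Duck by inheritance; it merely provides the same
    # method, which is all duck typing asks for
    def quack(self):
        return "Beep... quack."

def make_it_quack(bird):
    # no isinstance() check: anything that quacks like a duck will do
    return bird.quack()
```

Both `make_it_quack(Duck())` and `make_it_quack(Robot())` work, because the check is implicit in the attribute access rather than in any declared type.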
[see the web page above for the remaining text]

From aleax at aleax.it Sun Jan 16 09:23:26 2005
From: aleax at aleax.it (Alex Martelli)
Date: Sun Jan 16 09:23:33 2005
Subject: [Python-Dev] PEP 246: lossless and stateless
In-Reply-To: <5.1.1.6.0.20050115210716.034cb030@mail.telecommunity.com>
References: <82543A8B-6749-11D9-B46A-0003934AD54A@chello.se>
	<82543A8B-6749-11D9-B46A-0003934AD54A@chello.se>
	<5.1.1.6.0.20050115210716.034cb030@mail.telecommunity.com>
Message-ID: 

On 2005 Jan 16, at 03:17, Phillip J. Eby wrote:
...
> Uh oh. I just used "view" to describe an iterator as a view on an
> iterable, as distinct from an adapter that adapts a sequence so that
> it's iterable. :)
>
> I.e., using "view" in the MVC sense where a given Model might have
> multiple independent Views.

I think that in order to do that you need to draw a distinction between
two categories of iterables: so, again, a problem of terminology, but
one connected to a conceptual difference.

An iterator IS-AN iterable: it has __iter__. However, it can't have
"multiple independent views"... except maybe if you use itertools.tee
for that purpose.

Other iterables are, well, ``re-iterables'': each call to their
__iter__ makes a new fresh iterator, and using that iterator won't
alter the iterable's state. In this case, viewing multiple iterators on
the same re-iterable as akin to views on a model seems quite OK.

I can't think of any 3rd case -- an iterable that's not an iterator
(__iter__ does not return self) but neither is it seamlessly
re-iterable. Perhaps the ``file'' built-in type as it was in 2.2
suffered that problem, but it was a design problem, and is now fixed.

Alex

From robey at lag.net Sun Jan 16 10:21:51 2005
From: robey at lag.net (Robey Pointer)
Date: Sun Jan 16 10:22:26 2005
Subject: [Python-Dev] Exceptions *must*? be old-style classes?
In-Reply-To: References: Message-ID: <41EA322F.1080304@lag.net> Guido van Rossum wrote: >>The base of the Exception hierarchy happens to be a classic class. >>But why are they "required" to be classic? >> >>More to the point, is this a bug, a missing feature, or just a bug in >>the documentation for not mentioning the restriction? >> >> > >It's an unfortunate feature; it should be mentioned in the docs; it >should also be fixed, but fixing it isn't easy (believe me, or it >would have been fixed in Python 2.2). > >To be honest, I don't recall the exact reasons why this wasn't fixed >in 2.2; I believe it has something to do with the problem of >distinguishing between string and class exception, and between the >various forms of raise statements. > >I think the main ambiguity is raise "abc", which could be considered >short for raise str, "abc", but that would be incompatible with except >"abc". I also think that the right way out of there is to simply >hardcode a check that says that raise "abc" raises a string exception >and raising any other instance raises a class exception. But there's a >lot of code that has to be changed. > >It's been suggested that all exceptions should inherit from Exception, >but this would break tons of existing code, so we shouldn't enforce >that until 3.0. (Is there a PEP for this? I think there should be.) > > There's actually a bug open on the fact that exceptions can't be new-style classes: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=518846&group_id=5470 I added some comments to try to stir it up but there ended up being a lot of confusion and I don't think I helped much. The problem is that people want to solve the larger issues (raising strings, wanting to force all exceptions to be new-style, etc) but those all have long-term solutions, while the current bug just languishes. 
robey From martin at v.loewis.de Sun Jan 16 10:27:12 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun Jan 16 10:27:16 2005 Subject: [Python-Dev] Exceptions *must*? be old-style classes? In-Reply-To: <5C6D82A6-6763-11D9-BC9B-0003934AD54A@chello.se> References: <5C6D82A6-6763-11D9-BC9B-0003934AD54A@chello.se> Message-ID: <41EA3370.7000204@v.loewis.de> Simon Percivall wrote: > What would happen if Exception were made a new-style class, enforce > inheritance from Exception for all new-style exceptions, and allow all > old-style exceptions as before. string exceptions would break. In addition, code may break which assumes that exceptions are classic instances, e.g. that they are picklable, have an __dict__, and so on. > Am I wrong in assuming that only the > most esoteric exceptions inheriting from Exception would break by > Exception becoming new-style? Yes, I think so. Regards, Martin From martin at v.loewis.de Sun Jan 16 10:28:51 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun Jan 16 10:28:54 2005 Subject: [Python-Dev] Exceptions *must*? be old-style classes? In-Reply-To: <5.1.1.6.0.20050115211847.034ce0f0@mail.telecommunity.com> References: <5.1.1.6.0.20050115211847.034ce0f0@mail.telecommunity.com> Message-ID: <41EA33D3.8080102@v.loewis.de> Phillip J. Eby wrote: > Couldn't we require new-style exceptions to inherit from Exception? > Since there are no new-style exceptions that work now, this can't break > existing code. This would require to make Exception a new-style class, right? This, in itself, could break existing code. Regards, Martin From aleax at aleax.it Sun Jan 16 10:38:16 2005 From: aleax at aleax.it (Alex Martelli) Date: Sun Jan 16 10:38:22 2005 Subject: [Python-Dev] how to test behavior wrt an extension type? 
Message-ID: <59527618-67A2-11D9-ADA4-000A95EFAE9E@aleax.it>

copy.py, as recently discussed starting from a post by /F, has two
kinds of misbehaviors since 2.3 (possibly 2.2, haven't checked), both
connected to instance/type/metatype confusion (where do special methods
come from? in classic classes and types, from the instance, which may
delegate to the type/class; in new-style ones, from the class/type
which _must not_ delegate to the metaclass): type/metatype confusion,
and misbehavior with instances of extension types.

So, as per discussion here, I have prepared a patch (to the maintenance
branch of 2.3, to start with) which adds unit tests to highlight these
issues, and fixes them in copy.py. This patch should go in the
maintenance of 2.3 and 2.4, but in 2.5 a different approach based on
new special descriptors for special methods is envisaged (though
keeping compatibility with classic extension types may also require
some patching to copy.py along the lines of my patch).

Problem: to write unit tests showing that the current copy.py
misbehaves with a classic extension type, I need a classic extension
type which defines __copy__ and __deepcopy__ just like /F's
cElementTree does. So, I made one: a small trycopy.c and accompanying
setup.py whose only purpose in life is checking that instances of a
classic type get copied correctly, both shallowly and deeply. But now
-- where do I commit this extension type, so that the unit tests in
test_copy.py can do their job...?

Right now I've finessed the issue by having a try/except ImportError in
the two relevant unit tests (for copy and deepcopy) -- if the trycopy
module is not available I just don't test how its instances behave
under deep or shallow copying. However, if I just commit or send the
patch as-is, without putting trycopy.c and its setup.py somewhere, then
I'm basically doing a fix without unit tests to back it up, from the
point of view of anybody but myself.
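The try/except ImportError arrangement Alex describes looks roughly like this. This is a sketch: `trycopy` is his uncommitted extension module, so the import is expected to fail for everyone else, and the `TryCopy` type name is invented here for illustration.

```python
import copy
import unittest

try:
    import trycopy        # Alex's classic extension type; usually absent
except ImportError:
    trycopy = None        # not built/installed: checks become no-ops

class TestClassicExtensionCopy(unittest.TestCase):
    def test_copy(self):
        if trycopy is None:
            return        # silently skip, as described above
        obj = trycopy.TryCopy()   # hypothetical type name
        self.assertEqual(type(copy.copy(obj)), type(obj))

    def test_deepcopy(self):
        if trycopy is None:
            return
        obj = trycopy.TryCopy()
        self.assertEqual(type(copy.deepcopy(obj)), type(obj))
```

The drawback Alex raises is visible here: on any machine without the module built, both tests pass vacuously, so the fix is effectively untested for everyone else.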
I do not know what the recommended practice is for this kind of issue,
so, I'm asking for guidance (and specifically asking Anthony since my
case deals with 2.3 and 2.4 maintenance and he's release manager for
both, but, of course, everybody's welcome to help!). Surely this can't
be the first case in which a bug got triggered only by a certain
behavior in an extension type, but I couldn't find precedents.

Ideas, suggestions, ...?

Alex

From aleax at aleax.it Sun Jan 16 10:44:14 2005
From: aleax at aleax.it (Alex Martelli)
Date: Sun Jan 16 10:44:20 2005
Subject: [Python-Dev] Exceptions *must*? be old-style classes?
In-Reply-To: <41EA3370.7000204@v.loewis.de>
References: <5C6D82A6-6763-11D9-BC9B-0003934AD54A@chello.se>
	<41EA3370.7000204@v.loewis.de>
Message-ID: <2EEBC8AF-67A3-11D9-ADA4-000A95EFAE9E@aleax.it>

On 2005 Jan 16, at 10:27, Martin v. Löwis wrote:
> Simon Percivall wrote:
>> What would happen if Exception were made a new-style class, enforce
>> inheritance from Exception for all new-style exceptions, and allow all
>> old-style exceptions as before.
>
> string exceptions would break.

Couldn't we just specialcase strings specifically, to keep
grandfathering them in?

> In addition, code may break which assumes that exceptions are classic
> instances, e.g. that they are picklable, have an __dict__, and so on.

There would be no problem giving the new class Exception(object): ...
a __dict__ and the ability to get pickled, particularly since both come
by default. The "and so on" would presumably refer to whether special
methods should be looked up on the instance or the type. But as I
understand the question (raised in the threads about copy.py) the
planned solution is to make special methods their own kind of
descriptors, so even that esoteric issue could well be finessed.

> > Am I wrong in assuming that only the
>> most esoteric exceptions inheriting from Exception would break by
>> Exception becoming new-style?
>
> Yes, I think so.
It seems to me that if the new-style Exception is made very normally
and strings are grandfathered in, we ARE down to esoteric breakage
cases (potentially fixable by those new magic descriptors as above for
special methods).

Alex

From aleax at aleax.it Sun Jan 16 10:47:34 2005
From: aleax at aleax.it (Alex Martelli)
Date: Sun Jan 16 10:47:38 2005
Subject: [Python-Dev] Exceptions *must*? be old-style classes?
In-Reply-To: <41EA33D3.8080102@v.loewis.de>
References: <5.1.1.6.0.20050115211847.034ce0f0@mail.telecommunity.com>
	<41EA33D3.8080102@v.loewis.de>
Message-ID: 

On 2005 Jan 16, at 10:28, Martin v. Löwis wrote:
> Phillip J. Eby wrote:
>> Couldn't we require new-style exceptions to inherit from Exception?
>> Since there are no new-style exceptions that work now, this can't
>> break existing code.
>
> This would require to make Exception a new-style class, right?

Not necessarily, since Python supports multiple inheritance:

    class MyException(Exception, object):
        .....

there -- a newstyle exception class inheriting from oldstyle Exception.
(ClassType goes to quite some trouble to allow this, getting the
metaclass from _following_ bases if any). Without inheritance you might
similarly say:

    class AnotherOne(Exception):
        __metaclass__ = type
        ...

> This, in itself, could break existing code.

Not necessarily, see my previous post. But anyway, PJE's proposal is
less invasive than making Exception itself newstyle.

Alex

From martin at v.loewis.de Sun Jan 16 11:05:20 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun Jan 16 11:05:23 2005
Subject: [Python-Dev] Exceptions *must*? be old-style classes?
In-Reply-To: <2EEBC8AF-67A3-11D9-ADA4-000A95EFAE9E@aleax.it>
References: <5C6D82A6-6763-11D9-BC9B-0003934AD54A@chello.se>
	<41EA3370.7000204@v.loewis.de>
	<2EEBC8AF-67A3-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <41EA3C60.1060801@v.loewis.de>

Alex Martelli wrote:
>>> What would happen if Exception were made a new-style class, enforce
>>> inheritance from Exception for all new-style exceptions, and allow all
>>> old-style exceptions as before.
>>
>> string exceptions would break.
>
> Couldn't we just specialcase strings specifically, to keep
> grandfathering them in?

Sure. That just wouldn't be the change that Simon described, anymore.
You don't specify in which way you would like to specialcase strings.
Two alternatives are possible:

1. Throwing strings is still allowed, and to catch them, you need the
   identical string (i.e. the current behaviour)
2. Throwing strings is allowed, and they can be caught by either the
   identical string, or by catching str

In the context of Simon's proposal, the first alternative would be more
meaningful, I guess.

> The "and so on" would presumably refer to whether special methods
> should be looked up on the instance or the type.

Perhaps. That type(exc) changes might also cause problems.

> It seems to me that if the new-style Exception is made very normally
> and strings are grandfathered in, we ARE down to exoteric breakage
> cases (potentially fixable by those new magic descriptors as above
> for specialmethods).

This would be worth a try. Does anybody have a patch to implement it?

Regards,
Martin

From python at rcn.com Sun Jan 16 11:17:52 2005
From: python at rcn.com (Raymond Hettinger)
Date: Sun Jan 16 11:21:45 2005
Subject: [Python-Dev] how to test behavior wrt an extension type?
In-Reply-To: <59527618-67A2-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <000201c4fbb4$b31995e0$68fdcc97@oemcomputer>

[Alex]
> So, as per discussion here, I have prepared a patch (to the maintenance
> branch of 2.3, to start with) which adds unit tests to highlight these
> issues, and fixes them in copy.py. This patch should go in the
> maintenance of 2.3 and 2.4, but in 2.5 a different approach based on
> new special descriptors for special methods is envisaged (though
> keeping compatibility with classic extension types may also require
> some patching to copy.py along the lines of my patch).

For Py2.5, do you have in mind changing something other than copy.py?
If so, please outline your plan. I hope you're not planning on wrapping
all special method access as descriptor look-ups -- that would be a
somewhat radical change.

Raymond

From fredrik at pythonware.com Sun Jan 16 12:03:42 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sun Jan 16 12:03:34 2005
Subject: [Python-Dev] Re: how to test behavior wrt an extension type?
References: <59527618-67A2-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: 

Alex Martelli wrote:

> Problem: to write unit tests showing that the current copy.py misbehaves with a classic extension
> type, I need a classic extension type which defines __copy__ and __deepcopy__ just like /F's
> cElementTree does. So, I made one: a small trycopy.c and accompanying setup.py whose only purpose
> in life is checking that instances of a classic type get copied correctly, both shallowly and
> deeply. But now -- where do I commit this extension type, so that the unit tests in test_copy.py
> can do their job...?

Modules/_testcapimodule.c ?

(I'm using the C api to define an extension type, after all...)

From aleax at aleax.it Sun Jan 16 12:37:33 2005
From: aleax at aleax.it (Alex Martelli)
Date: Sun Jan 16 12:37:39 2005
Subject: [Python-Dev] how to test behavior wrt an extension type?
In-Reply-To: <000201c4fbb4$b31995e0$68fdcc97@oemcomputer> References: <000201c4fbb4$b31995e0$68fdcc97@oemcomputer> Message-ID: <0365E8C6-67B3-11D9-ADA4-000A95EFAE9E@aleax.it> On 2005 Jan 16, at 11:17, Raymond Hettinger wrote: > [Alex] >> So, as per discussion here, I have prepared a patch (to the > maintenance >> branch of 2.3, to start with) which adds unit tests to highlight these >> issues, and fixes them in copy.py. This patch should go in the >> maintenance of 2.3 and 2.4, but in 2.5 a different approach based on >> new special descriptors for special methods is envisaged (though >> keeping compatibility with classic extension types may also require >> some patching to copy.py along the lines of my patch). > > For Py2.5, do you have in mind changing something other than copy.py? > If so, please outline your plan. I hope your not planning on wrapping > all special method access as descriptor look-ups -- that would be a > somewhat radical change. The overall plan does appear to be exactly the "somewhat radical change" which you hope is not being proposed, except it's not my plan -- it's Guido's. Quoting his first relevant post on the subject: ''' From: gvanrossum@gmail.com Subject: Re: getting special from type, not instance (was Re: [Python-Dev] copy confusion) Date: 2005 January 12 18:59:13 CET ... I wonder if the following solution wouldn't be more useful (since less code will have to be changed). The descriptor for __getattr__ and other special attributes could claim to be a "data descriptor" which means that it gets first pick *even if there's also a matching entry in the instance __dict__*. 
Quick illustrative example:

>>> class C(object):
...     foo = property(lambda self: 42)  # a property is always a "data descriptor"
...
>>> a = C()
>>> a.foo
42
>>> a.__dict__["foo"] = "hello"
>>> a.foo
42
>>>

Normal methods are not data descriptors, so they can be overridden by
something in __dict__; but it makes some sense that for methods
implementing special operations like __getitem__ or __copy__, where the
instance __dict__ is already skipped when the operation is invoked
using its special syntax, it should also be skipped by explicit
attribute access (whether getattr(x, "__getitem__") or x.__getitem__ --
these are entirely equivalent).

We would need to introduce a new decorator so that classes overriding
these methods can also make those methods "data descriptors", and so
that users can define their own methods with this special behavior
(this would be needed for __copy__, probably).

I don't think this will cause any backwards compatibility problems --
since putting a __getitem__ in an instance __dict__ doesn't override
the x[y] syntax, it's unlikely that anybody would be using this.
"Ordinary" methods will still be overridable.

PS. The term "data descriptor" now feels odd, perhaps we can say "hard
descriptors" instead. Hard descriptors have a __set__ method in
addition to a __get__ method (though the __set__ method may always
raise an exception, to implement a read-only attribute).
'''

All following discussion was, I believe, in the same thread, mostly
among Guido, Phillip and Armin. I'm focusing on getting copy.py fixed
in 2.3 and 2.4, w/o any plan yet to implement Guido's idea.

If you dislike Guido's idea (which Phillip, Armin and I all liked, in
different degrees), it might be best for you to read that other thread
and explain the issues there, I think.

Alex

From aleax at aleax.it Sun Jan 16 12:39:49 2005
From: aleax at aleax.it (Alex Martelli)
Date: Sun Jan 16 12:39:53 2005
Subject: [Python-Dev] Re: how to test behavior wrt an extension type?
In-Reply-To: 
References: <59527618-67A2-11D9-ADA4-000A95EFAE9E@aleax.it>
Message-ID: <540CDCFE-67B3-11D9-ADA4-000A95EFAE9E@aleax.it>

On 2005 Jan 16, at 12:03, Fredrik Lundh wrote:
> Alex Martelli wrote:
>
>> Problem: to write unit tests showing that the current copy.py
>> misbehaves with a classic extension
>> type, I need a classic extension type which defines __copy__ and
>> __deepcopy__ just like /F's
>> cElementTree does. So, I made one: a small trycopy.c and
>> accompanying setup.py whose only purpose
>> in life is checking that instances of a classic type get copied
>> correctly, both shallowly and
>> deeply. But now -- where do I commit this extension type, so that
>> the unit tests in test_copy.py
>> can do their job...?
>
> Modules/_testcapimodule.c ?
>
> (I'm using the C api to define an extension type, after all...)

Fine with me, if there are no objections I'll alter the patch
accordingly and submit it that way.

Thanks,

Alex

From pje at telecommunity.com Sun Jan 16 16:18:21 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun Jan 16 16:16:50 2005
Subject: [Python-Dev] Exceptions *must*? be old-style classes?
In-Reply-To: <41EA33D3.8080102@v.loewis.de>
References: <5.1.1.6.0.20050115211847.034ce0f0@mail.telecommunity.com>
	<5.1.1.6.0.20050115211847.034ce0f0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050116101727.030e9a30@mail.telecommunity.com>

At 10:28 AM 1/16/05 +0100, Martin v. Löwis wrote:
>Phillip J. Eby wrote:
>>Couldn't we require new-style exceptions to inherit from Exception?
>>Since there are no new-style exceptions that work now, this can't break
>>existing code.
>
>This would require to make Exception a new-style class, right?

>>> class MyException(Exception, object): pass
>>>

Not as far as I can see, no.
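Requiring inheritance from a single root class, as Phillip suggests, is straightforward to enforce at raise time. A sketch of that check as modern Python applies it (where, as it eventually turned out, the root is BaseException rather than Exception, and the whole hierarchy is new-style):

```python
class MyError(Exception):      # inherits from the built-in exception root
    pass

class NotAnError:              # an ordinary class outside the hierarchy
    pass

def raisable(cls):
    # the test the interpreter effectively applies at 'raise' time
    return isinstance(cls, type) and issubclass(cls, BaseException)

# raising something outside the hierarchy is rejected outright
try:
    raise NotAnError()         # TypeError: must derive from BaseException
except TypeError:
    rejected = True
```

String exceptions, the other sticking point in this thread, fail the same `raisable` test, which is how they were ultimately retired rather than grandfathered in.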
From irmen at xs4all.nl Sun Jan 16 17:08:54 2005
From: irmen at xs4all.nl (Irmen de Jong)
Date: Sun Jan 16 17:08:52 2005
Subject: [Python-Dev] a bunch of Patch reviews
Message-ID: <41EA9196.1020709@xs4all.nl>

Hello

I've looked at one bug and a bunch of patches and added a comment to
them:

(bug) [ 1102649 ] pickle files should be opened in binary mode
    Added a comment about a possible different solution

[ 946207 ] Non-blocking Socket Server
    Useless, what are the mixins for? Recommend close

[ 756021 ] Allow socket.inet_aton('255.255.255.255') on Windows
    Looks good but added suggestion about when to test for special case

[ 740827 ] add urldecode() method to urllib
    I think it's better to group these things into urlparse

[ 579435 ] Shadow Password Support Module
    Would be nice to have, I recently just couldn't do the user
    authentication that I wanted: based on the users' unix passwords

[ 1093468 ] socket leak in SocketServer
    Trivial and looks harmless, but don't the sockets get garbage
    collected once the request is done?

[ 1049151 ] adding bool support to xdrlib.py
    Simple patch and 2.4 is out now, so...

It would be nice if somebody could have a look at my own patches or
help me a bit with them:

[ 1102879 ] Fix for 926423: socket timeouts + Ctrl-C don't play nice
[ 1103213 ] Adding the missing socket.recvall() method
[ 1103350 ] send/recv SEGMENT_SIZE should be used more in socketmodule
[ 1062014 ] fix for 764437 AF_UNIX socket special linux socket names
[ 1062060 ] fix for 1016880 urllib.urlretrieve silently truncates dwnld

Some of them come from the last Python Bug Day, see
http://www.python.org/moin/PythonBugDayStatus

Thank you !
Regards,

--Irmen de Jong

From gvanrossum at gmail.com Sun Jan 16 18:15:35 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sun Jan 16 18:15:38 2005
Subject: [Python-Dev] PEP 246: let's reset
In-Reply-To: 
References: <82543A8B-6749-11D9-B46A-0003934AD54A@chello.se>
	<5.1.1.6.0.20050115210716.034cb030@mail.telecommunity.com>
Message-ID: 

The various PEP 246 threads are dead AFAIC -- I won't ever have the
time to read them in full length, and because I haven't followed them
I don't get much of the discussion that's still going on. I hear that
Clark and Alex are going to do a revision of the PEP; I'm looking
forward to the results.

In the mean time, here's a proposal to reduce the worries about
implicit adaptation: let's not do it!

Someone posted a new suggestion to my blog: it would be good if an
optimizing compiler (or a lazy one) would be allowed to ignore all
type declarations, and the program should behave the same (except for
things like catching TypeError). This rules out using adapt() for type
declarations, and we're back to pure typechecking.

Given the many and various issues with automatic adaptation
(transitivity, lossiness, statelessness, and apparently more still)
that might be a better approach.

Typechecking can be trivially defined in terms of adaptation:

def typecheck(x, T):
    y = adapt(x, T)
    if y is x:
        return y
    raise TypeError("...")

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com Sun Jan 16 19:00:27 2005
From: pje at telecommunity.com (Phillip J.
Eby) Date: Sun Jan 16 18:58:58 2005 Subject: [Python-Dev] PEP 246: let's reset In-Reply-To: References: <82543A8B-6749-11D9-B46A-0003934AD54A@chello.se> <5.1.1.6.0.20050115210716.034cb030@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050116124501.0349f7c0@mail.telecommunity.com> At 09:15 AM 1/16/05 -0800, Guido van Rossum wrote: >Given the many and various issues with automamtic adaptation >(transitivity, lossiness, statelessness, and apparently more still) >that might be a better approach. Actually, I think Clark, Alex, and I are rapidly converging on a relatively simple common model to explain all this stuff, with only two kinds of adaptation covering everything we've discussed to date in a reasonable way. My most recent version of my pre-PEP (not yet posted) explains the two kinds of adaptation in this way: """One type is the "extender", whose purpose is to extend the capability of an object or allow it to masquerade as another type of object. An "extender" is not truly an object unto itself, merely a kind of "alternate personality" for the object it adapts. For example, a power transformer might be considered an "extender" for a power outlet, because it allows the power to be used with different devices than it would otherwise be usable for. By contrast, an "independent adapter" is an object that provides entirely different capabilities from the object it adapts, and therefore is truly an object in its own right. While it only makes sense to have one extender of a given type for a given base object, you may have as many instances of an independent adapter as you like for the same base object. For example, Python iterators are independent adapters, as are views in a model-view-controller framework, since each iterable may have many iterators in existence, each with its own independent state. 
Resuming the previous analogy of a power outlet, you may consider independent adapters to be like appliances: you can plug more than one lamp into the same outlet, and different lamps may be on or off at a given point in time. Many appliances may come and go over the lifetime of the power outlet -- there is no inherent connection between them because the appliances are independent objects rather than mere extensions of the power outlet.""" I then go on to propose that extenders be automatically allowed for use with type declaration, but that independent adapters should require additional red tape (e.g. an option when registering) to do so. (An explicit 'adapt()' call should allow either kind of adapter, however.) Meanwhile, no adapt() call should adapt an extender; it should instead adapt the extended object. Clark and Alex have proposed changes to PEP 246 that would support these proposals within the scope of the 'adapt()' system, and I have pre-PEPped an add-on to their system that allows extenders to be automatically assembled from "@like" decorators sprinkled over methods or extension routines. My proposal also does away with the need to have a special interface type to support extender-style adaptation. (I.e., it could supersede PEP 245, because interfaces can then simply be abstract classes or just "like" concrete classes.) From pje at telecommunity.com Sun Jan 16 19:11:14 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun Jan 16 19:09:45 2005 Subject: [Python-Dev] Updated Monkey Typing pre-PEP Message-ID: <5.1.1.6.0.20050116130723.034a10d0@mail.telecommunity.com> I've revised the draft today to simplify the terminology, discussing only two broad classes of adapters. Since Clark's pending proposals for PEP 246 align well with the concept of "extenders" vs. "independent adapters", I've refocused my PEP to focus exclusively on adding support for "extenders", since PEP 246 already provides everything needed for independent adapters.
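The extender/independent-adapter distinction above can be shown with a small sketch; the extend() cache and AsDuck class are invented for illustration (a real implementation would not key a cache on id(), which is unsafe across object lifetimes):

```python
# Sketch of the two adapter kinds: one cached "extender" per base
# object, versus arbitrarily many independent adapters (iterators).

class AsDuck:
    """Extender: an 'alternate personality' for the object it adapts."""
    def __init__(self, obj):
        self.obj = obj

_extenders = {}  # at most one extender of a given kind per base object

def extend(obj):
    key = id(obj)  # illustration only; unsafe in real code
    if key not in _extenders:
        _extenders[key] = AsDuck(obj)
    return _extenders[key]

outlet = object()
assert extend(outlet) is extend(outlet)  # same "personality" each time

# Independent adapters: iterators are the canonical example -- many can
# exist for one iterable, each with its own independent state.
items = [1, 2, 3]
it1, it2 = iter(items), iter(items)
next(it1)                # advancing one iterator...
assert next(it2) == 1    # ...does not affect the other
```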
The new draft is here: http://peak.telecommunity.com/DevCenter/MonkeyTyping And you can view diffs from the previous version(s) here: http://peak.telecommunity.com/DevCenter/MonkeyTyping?action=info From kbk at shore.net Sun Jan 16 20:24:37 2005 From: kbk at shore.net (Kurt B. Kaiser) Date: Sun Jan 16 20:25:07 2005 Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200501161924.j0GJObqo029011@bayview.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 272 open ( +5) / 2737 closed (+10) / 3009 total (+15) Bugs : 793 open ( -5) / 4777 closed (+29) / 5570 total (+24) RFE : 165 open ( +0) / 141 closed ( +1) / 306 total ( +1) New / Reopened Patches ______________________ Enhance tracebacks and stack traces with vars (2005-01-08) http://python.org/sf/1098732 opened by Skip Montanaro Single-line option to pygettext.py (2005-01-09) http://python.org/sf/1098749 opened by Martin Blais improved smtp connect debugging (2005-01-11) CLOSED http://python.org/sf/1100140 opened by Wummel Log gc times when DEBUG_STATS set (2005-01-11) http://python.org/sf/1100294 opened by Skip Montanaro deepcopying listlike and dictlike objects (2005-01-12) http://python.org/sf/1100562 opened by Björn Lindqvist ast-branch: fix for coredump from new import grammar (2005-01-11) http://python.org/sf/1100563 opened by logistix datetime.strptime constructor added (2005-01-12) http://python.org/sf/1100942 opened by Josh Feed style codec API (2005-01-12) http://python.org/sf/1101097 opened by Walter Dörwald Patch for potential buffer overrun in tokenizer.c (2005-01-13) http://python.org/sf/1101726 opened by Greg Chapman ast-branch: hacks so asdl_c.py generates compilable code (2005-01-14) http://python.org/sf/1102710 opened by logistix Fix for 926423: socket timeouts + Ctrl-C don't play nice (2005-01-15) http://python.org/sf/1102879 opened by Irmen de Jong Boxing up PyDECREF correctly (2005-01-15) CLOSED http://python.org/sf/1103046 opened by Norbert Nemec AF_NETLINK sockets
basic support (2005-01-15) http://python.org/sf/1103116 opened by Philippe Biondi Adding the missing socket.recvall() method (2005-01-16) http://python.org/sf/1103213 opened by Irmen de Jong tarfile.py: fix for bug #1100429 (2005-01-16) http://python.org/sf/1103407 opened by Lars Gustäbel Patches Closed ______________ pydoc data descriptor unification (2004-04-17) http://python.org/sf/936774 closed by jlgijsbers xml.dom missing API docs (bugs 1010196, 1013525) (2004-10-21) http://python.org/sf/1051321 closed by jlgijsbers Fix for bug 1017546 (2004-08-27) http://python.org/sf/1017550 closed by jlgijsbers fixes urllib2 digest to allow arbitrary methods (2005-01-04) http://python.org/sf/1095362 closed by jlgijsbers Bug fix 548176: urlparse('http://foo?blah') errs (2003-03-30) http://python.org/sf/712317 closed by jlgijsbers bug fix 702858: deepcopying reflexive objects (2003-03-22) http://python.org/sf/707900 closed by jlgijsbers minor codeop fixes (2003-05-15) http://python.org/sf/737999 closed by jlgijsbers SimpleHTTPServer reports wrong content-length for text files (2003-11-10) http://python.org/sf/839496 closed by jlgijsbers improved smtp connect debugging (2005-01-11) http://python.org/sf/1100140 closed by jlgijsbers Boxing up PyDECREF correctly (2005-01-15) http://python.org/sf/1103046 closed by rhettinger New / Reopened Bugs ___________________ socket.setdefaulttimeout() breaks smtplib.starttls() (2005-01-08) http://python.org/sf/1098618 opened by Matthew Cowles set objects cannot be marshalled (2005-01-09) CLOSED http://python.org/sf/1098985 opened by Gregory H.
Ball codec readline() splits lines apart (2005-01-09) CLOSED http://python.org/sf/1098990 opened by Irmen de Jong Optik OptionParse important undocumented option (2005-01-10) http://python.org/sf/1099324 opened by ncouture refman doesn't know about universal newlines (2005-01-10) http://python.org/sf/1099363 opened by Jack Jansen raw_input() displays wrong unicode prompt (2005-01-10) http://python.org/sf/1099364 opened by Petr Prikryl tempfile files not types.FileType (2005-01-10) CLOSED http://python.org/sf/1099516 opened by Frans van Nieuwenhoven copy.deepcopy barfs when copying a class derived from dict (2005-01-10) http://python.org/sf/1099746 opened by Doug Winter Cross-site scripting on BaseHTTPServer (2005-01-11) http://python.org/sf/1100201 opened by Paul Johnston Scripts started with CGIHTTPServer: missing cgi environment (2005-01-11) http://python.org/sf/1100235 opened by pacote Frame does not receive configure event on move (2005-01-11) http://python.org/sf/1100366 opened by Anand Kameswaran Wrong "type()" syntax in docs (2005-01-11) http://python.org/sf/1100368 opened by Facundo Batista TarFile iteration can break (on Windows) if file has links (2005-01-11) http://python.org/sf/1100429 opened by Greg Chapman Python Interpreter shell is crashed (2005-01-12) http://python.org/sf/1100673 opened by abhishek test_fcntl fails on netbsd2 (2005-01-12) http://python.org/sf/1101233 opened by Mike Howard test_shutil fails on NetBSD 2.0 (2005-01-12) CLOSED http://python.org/sf/1101236 opened by Mike Howard dict subclass breaks cPickle noload() (2005-01-13) http://python.org/sf/1101399 opened by Neil Schemenauer popen3 on windows loses environment variables (2005-01-13) http://python.org/sf/1101667 opened by June Kim popen4/cygwin ssh hangs (2005-01-13) http://python.org/sf/1101756 opened by Ph.E % operator bug (2005-01-14) CLOSED http://python.org/sf/1102141 opened by ChrisF rfc822 Deprecated since release 2.3? 
(2005-01-14) http://python.org/sf/1102469 opened by Wai Yip Tung pickle files should be opened in binary mode (2005-01-15) http://python.org/sf/1102649 opened by John Machin Incorrect RFC 2231 decoding (2005-01-15) http://python.org/sf/1102973 opened by Barry A. Warsaw raw_input problem with readline and UTF8 (2005-01-15) http://python.org/sf/1103023 opened by Casey Crabb send/recv SEGMENT_SIZE should be used more in socketmodule (2005-01-16) http://python.org/sf/1103350 opened by Irmen de Jong Bugs Closed ___________ typo in "Python Tutorial": 1. Whetting your appetite (2005-01-08) http://python.org/sf/1098497 closed by jlgijsbers xml.dom documentation omits hasAttribute, hasAttributeNS (2004-08-16) http://python.org/sf/1010196 closed by jlgijsbers xml.dom documentation omits createDocument, ...DocumentType (2004-08-21) http://python.org/sf/1013525 closed by jlgijsbers Documentation of DOMImplmentation lacking (2004-07-15) http://python.org/sf/991805 closed by jlgijsbers wrong documentation for popen2 (2004-01-29) http://python.org/sf/886619 closed by jlgijsbers test_inspect.py fails to clean up upon failure (2004-08-27) http://python.org/sf/1017546 closed by jlgijsbers weird/buggy inspect.getsource behavious (2003-07-11) http://python.org/sf/769569 closed by jlgijsbers SimpleHTTPServer sends wrong Content-Length header (2005-01-07) http://python.org/sf/1097597 closed by jlgijsbers urllib2: improper capitalization of headers (2004-07-19) http://python.org/sf/994101 closed by jlgijsbers urlparse doesn't handle host?bla (2002-04-24) http://python.org/sf/548176 closed by jlgijsbers set objects cannot be marshalled (2005-01-09) http://python.org/sf/1098985 closed by rhettinger codec readline() splits lines apart (2005-01-09) http://python.org/sf/1098990 closed by doerwalter tempfile files not types.FileType (2005-01-10) http://python.org/sf/1099516 closed by rhettinger for lin in file: file.tell() tells wrong (2002-11-29) http://python.org/sf/645594 closed by 
facundobatista Py_Main() does not perform to spec (2003-01-21) http://python.org/sf/672035 closed by facundobatista Incorrect permissions set in lib-dynload. (2003-02-04) http://python.org/sf/680379 closed by facundobatista Apple-installed Python fails to build extensions (2005-01-04) http://python.org/sf/1095822 closed by jackjansen test_shutil fails on NetBSD 2.0 (2005-01-12) http://python.org/sf/1101236 closed by jlgijsbers CSV reader does not parse Mac line endings (2003-08-16) http://python.org/sf/789519 closed by andrewmcnamara Bugs in _csv module - lineterminator (2004-11-24) http://python.org/sf/1072404 closed by andrewmcnamara % operator bug (2005-01-14) http://python.org/sf/1102141 closed by rhettinger test_atexit fails in directories with spaces (2003-03-18) http://python.org/sf/705792 closed by facundobatista SEEK_{SET,CUR,END} missing in 2.2.2 (2003-03-29) http://python.org/sf/711830 closed by loewis CGIHTTPServer cannot manage cgi in sub directories (2003-07-28) http://python.org/sf/778804 closed by facundobatista double symlinking corrupts sys.path[0] (2003-08-24) http://python.org/sf/794291 closed by facundobatista popen3 under threads reports different stderr results (2003-12-09) http://python.org/sf/856706 closed by facundobatista Signals discard one level of exception handling (2003-07-15) http://python.org/sf/771429 closed by facundobatista build does not respect --prefix (2002-10-27) http://python.org/sf/629345 closed by facundobatista urllib2 proxyhandle won't work. (2001-11-30) http://python.org/sf/487471 closed by facundobatista RFE Closed __________ popen does not like filenames with spaces (2003-07-20) http://python.org/sf/774546 closed by rhettinger From pje at telecommunity.com Sun Jan 16 05:51:59 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Mon Jan 17 00:14:57 2005 Subject: [Python-Dev] "Monkey Typing" pre-PEP, partial draft Message-ID: <5.1.1.6.0.20050115170350.020fe6a0@mail.telecommunity.com> This is only a partial first draft, but the Motivation section nonetheless attempts to briefly summarize huge portions of the various discussions regarding adaptation, and to coin a hopefully more useful terminology than some of our older working adjectives like "sticky" and "stateless" and such. And the specification gets as far as defining a simple decorator-based syntax for creating operational (prev. "stateless") and extension (prev. "per-object stateful") adapters. I stopped when I got to the API for declaring volatile (prev. "per-adapter stateful") adapters, and for enabling them to be used with type declarations, because Clark's post on his revisions-in-progress seems to indicate that this can probably be handled within the scope of PEP 246 itself. As such, this PEP should then be viewed more as an attempt to formulate how "intrinsic" adapters can be defined in Python code, without the need to manually create adapter classes for the majority of type-compatibility and "extension" use cases. In other words, the implementation described herein could probably become part of the front-end for the PEP 246 adapter registry. Feedback and corrections (e.g. if I've repeated myself somewhere, spelling, etc.) would be greatly appreciated. This uses ReST markup heavily, so if you'd prefer to read an HTML version, please see: http://peak.telecommunity.com/DevCenter/MonkeyTyping But I'd prefer that corrections/discussion quote the relevant section so I know what parts you're talking about. Also, if you find a place where a more concrete example would be helpful, please consider submitting one that I can add. Thanks! PEP: XXX Title: "Monkey Typing" for Agile Type Declarations Version: $Revision: X.XX $ Last-Modified: $Date: 2003/09/22 04:51:50 $ Author: Phillip J.
Eby Status: Draft Type: Standards Track Python-Version: 2.5 Content-Type: text/x-rst Created: 15-Jan-2005 Post-History: 15-Jan-2005 Abstract ======== Python has always had "duck typing": a way of implicitly defining types by the methods an object provides. The name comes from the saying, "if it walks like a duck and quacks like a duck, it must *be* a duck". Duck typing has enormous practical benefits for small and prototype systems. For very large frameworks, however, or applications that comprise multiple frameworks, some limitations of duck typing can begin to show. This PEP proposes an extension to "duck typing" called "monkey typing", that preserves most of the benefits of duck typing, while adding new features to enhance inter-library and inter-framework compatibility. The name comes from the saying, "Monkey see, monkey do", because monkey typing works by stating how one object type may *mimic* specific behaviors of another object type. Monkey typing can also potentially form the basis for more sophisticated type analysis and improved program performance, as it is essentially a simplified form of concepts that are also found in languages like Dylan and Haskell. It is also a straightforward extension of Java casting and COM's QueryInterface, which should make it easier to represent those type systems' behaviors within Python as well. Motivation ========== Many interface and static type declaration mechanisms have been proposed for Python over the years, but few have met with great success. As Guido has said recently [1]_: One of my hesitations about adding adapt() and interfaces to the core language has always been that it would change the "flavor" of much of the Python programming we do and that we'd have to relearn how to write good code. 
Even for widely-used Python interface systems (such as the one provided by Zope), interfaces and adapters seem to require this change in "flavor", and can require a fair amount of learning in order to use them well and avoid various potential pitfalls inherent in their use. Thus, spurred by a discussion on PEP 246 and its possible use for optional type declarations in Python [2]_, this PEP is an attempt to propose a semantic basis for optional type declarations that retains the "flavor" of Python, and prevents users from having to "relearn how to write good code" in order to use the new features successfully. Of course, given the number of previous failed attempts to create a type declaration system for Python, this PEP is an act of extreme optimism, and it will not be altogether surprising if it, too, ultimately fails. However, if only because the record of its failure will be useful to the community, it is worth at least making an attempt. (It would also not be altogether surprising if this PEP results in the ironic twist of convincing Guido not to include type declarations in Python at all!) Although this PEP will attempt to make adaptation easy, safe, and flexible, the discussion of *how* it will do that must necessarily delve into many detailed aspects of different use cases for adaptation, and the possible pitfalls thereof. It's important to understand, however, that developers do *not* need to understand more than a tiny fraction of what is in this PEP, in order to effectively use the features it proposes. Otherwise, you may gain the impression that this proposal is overly complex for the benefits it provides, even though virtually none of that complexity is visible to the developer making use of the proposed facilities. That is, the value of this PEP's implementation lies in how much of this PEP will *not* need to be thought about by a developer using it! 
Therefore, if you would prefer an uncorrupted "developer first impression" of the proposal, please skip the remainder of this Motivation and proceed directly to the `Specification`_ section, which presents the usage and implementation. However, if you've been involved in the Python-Dev discussion regarding PEP 246, you probably already know too much about the subject to have an uncorrupted first impression, so you should instead read the rest of this Motivation and check that I have not misrepresented your point of view before proceeding to the Specification. :) Why Adaptation for Type Declarations? ------------------------------------- As Guido acknowledged in his optional static typing proposals, having type declarations check argument types based purely on concrete type or conformance to interfaces would stifle much of Python's agility and flexibility. However, if type declarations are used instead to *adapt* objects to an interface expected by the receiver, Python's flexibility could in fact be *improved* by type declarations. PEP 246 presents a basic implementation model for automatically finding an appropriate adapter so that one type can conform to the interface of another. However, in recent discussions on the Python developers' mailing list, it came out that there were many open issues about what sort of adapters would be useful (or dangerous) in the context of type declarations. PEP 246 was originally proposed for an explicit adaptation model where an ``adapt()`` function is called to retrieve an "adapter". So, in this model the adapting code potentially has access to both the "original" object and the adapted version of the object. Also, PEP 246 permitted either the caller of a function or the called function to perform the adaptation, meaning that the scope and lifetime of the resulting adapter could be explicitly controlled in a straightforward way. 
By contrast, type declarations would perform adaptation at the boundary between caller and callee, making it impossible for the caller to control the adapter's lifetime, or for the callee to obtain the "original" object. Many options for reducing or controlling these effects were discussed. By and large, it is possible for an adapter author to address these issues with due care and attention. However, it also became clear from the discussion that most persons new to the use of adaptation are often eager to use it for things that lead rather directly to potentially problematic adapter behaviors. Also, by the very nature of ubiquitous adaptation via type declarations, these potentially problematic behaviors can spread throughout a program, and just because one developer did not create a problematic adaptation, it does not mean he or she will be immune to the effects of those created by others. So, rather than attempt to make all possible Python developers "relearn how to write good code", this PEP seeks to make the safer forms of adaptation easier to learn and use than the less-safe forms. (Which is the reverse of the current situation, where less-safe adapters are often easier to write than some safer ones!)

Kinds of Adaptation
-------------------

Specifically, the three forms of type adaptation we will discuss here are:

Operational Conformance
    Providing operations required by a target interface, using the operations and state available on the adapted type. This is the simplest category of adaptation, because it introduces no new state information. It is simply a specification of how an instance of one type can be adapted to act as if it were an instance of another type.

Extension/Extender
    The same as operational conformance, but with additional required state. This extra state, however, "belongs" to the original object, in the sense that it should exist as long as the original object exists.
An extension, in other words, is intended to extend the capabilities of the original object when needed, not to be an independently created object with its own lifetime. Each time an adapter is requested for the target interface, an extension instance with the "same" state should be returned.

Volatile/View/Accessory
    Volatile adapters are used to provide functionality that may require multiple independent adapters for the same adapted object. For example, a "view" in a model-view-controller (MVC) framework can be seen as a volatile adapter on a model, because more than one view may exist for the same model, with each view having its own independent state (such as window position, etc.).

Volatile adaptation is not an ideal match for type declaration, because it is often important to explicitly control when each new volatile adapter is created, and to whom it is being passed. For example, in an MVC framework one would not normally wish to pass a model to methods expecting views, and wind up having new views created (e.g. windows opened) automatically! Naturally, there *are* cases where opening a new window for some model object *is* what you want. However, using an implicit adaptation (via type declaration) also means that passing a model to *any* method expecting a view would result in this happening. So, it is generally better to have the methods that desire this behavior explicitly request it, e.g. by calling the PEP 246 ``adapt()`` function, rather than having it happen implicitly by way of a type declaration.

So, this PEP seeks to:

1. Make it easy to define operational and extension adapters.

2. Make it possible to define volatile adapters, but only by explicitly declaring them as such in the adapter's definition.

3. Make it possible to have a type declaration result in creation of a volatile adapter, but only by explicitly declaring in the adapter's definition that type declarations are allowed to implicitly create instances.
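The multiple-independent-instances property that makes volatile adapters a poor fit for implicit type declarations can be seen in a toy sketch; ``Model`` and ``View`` here are hypothetical stand-ins, not classes from any real MVC framework:

```python
# Toy illustration of a volatile ("view") adapter: many independent
# views may exist for one model, each carrying its own state.

class Model:
    def __init__(self, value):
        self.value = value

class View:
    """Each view is an independent object with its own state."""
    def __init__(self, model):
        self.model = model
        self.scroll = 0  # per-view state, e.g. window position

m = Model(42)
v1, v2 = View(m), View(m)
v1.scroll = 10
assert v2.scroll == 0        # the views do not share their state...
assert v1.model is v2.model  # ...though they adapt the same model
```

Because each View creation is an observable event (a window opening, say), you would not want one created silently every time a Model crosses a type-declared boundary.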
By doing this, the language can gently steer developers away from unintentionally creating adapters whose implicit behavior is difficult to understand, or is not as they intended, by making it easier to do safer forms of adaptation, and suggesting (via declaration requirements) that other forms may need a bit more thought to use correctly.

Adapter Composition
-------------------

One other issue that was discussed heavily on Python-Dev regarding PEP 246 was adapter composition. That is, adapting an already-adapted object. Many people spoke out against implicit adapter composition (which was referred to as transitive adaptation), because it introduces potentially unpredictable emergent behavior. That is, a local change to a program could have unintended effects at a more global scale. Using adaptation for type declarations can produce unintended adapter composition. Take this code, for example::

    def foo(bar: Baz):
        whack(bar)

    def whack(ping: Whee):
        ping.pong()

If a ``Baz`` instance is passed to ``foo()``, it is not wrapped in an adapter, but is then passed to ``whack()``, which must then adapt it to the ``Whee`` type. However, if an instance of a different type is passed to ``foo()``, then ``foo()`` will receive an adapter to make that object act like a ``Baz`` instance. This adapter is then passed to ``whack()``, which further adapts it to a ``Whee`` instance, thereby composing a second adapter onto the first, or perhaps failing with a type error because there is no adapter available to adapt the already-adapted object. (There can be other side effects as well, such as when attempting to compare implicitly adapted objects or use them as dictionary keys.) Therefore, this proposal seeks to have adaptation performed via type declarations avoid implicit adapter composition, by never adapting an operational or extension adapter. Instead, the original object will be retrieved from the adapter, and then adapted to the new target interface.
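One way to picture the unwrap-then-readapt rule: every name here, including the ``__adapted__`` attribute and ``declare_adapt()``, is invented for illustration and is not PEP 246's actual machinery:

```python
# Sketch: type-declaration adaptation that never stacks one adapter on
# another -- it unwraps back to the original object first.

class Adapter:
    def __init__(self, obj):
        self.__adapted__ = obj  # remember the original object

def declare_adapt(obj, target, registry):
    # If obj is itself an adapter, retrieve the original object rather
    # than composing a second adapter on top of the first.
    while isinstance(obj, Adapter):
        obj = obj.__adapted__
    if isinstance(obj, target):
        return obj
    factory = registry.get(target)
    if factory is None:
        raise TypeError("no adapter to %s" % target.__name__)
    return factory(obj)

class Whee: pass
class WheeAdapter(Adapter, Whee): pass

registry = {Whee: WheeAdapter}
obj = object()
a1 = declare_adapt(obj, Whee, registry)
a2 = declare_adapt(a1, Whee, registry)  # re-adapting the adapter...
assert a2.__adapted__ is obj            # ...still wraps the original
```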
Volatile adapters, however, are independent objects from the object they adapt, so they must always be considered an "original object" in their own right. (So, volatile adapters are also more volatile than other adapters with respect to transitive adaptation.) However, since volatile adapters must be declared as such, and require an additional declaration to allow them to be implicitly created, the developer at least has some warning that their behavior will be more difficult to predict in the presence of type declarations.

Interfaces vs. Duck Typing
--------------------------

An "interface" is generally recognized as a collection of operations that an object may perform, or that may be performed on it. Type declarations are then used in many languages to indicate what interface is required of an object that is supplied to a routine, or what interface is provided by the routine's return value(s). The problem with this concept is that interface implementations are typically expected to be complete. In Java, for example, you can't say that your class implements an interface unless you actually add all of the required methods, even if some of them aren't needed in your program yet. A second problem with this is that incompatible interfaces tend to proliferate among libraries and frameworks, even when they deal with the same basic concepts and operations. Just the fact that people might choose different names for otherwise-identical operations makes it considerably less likely that two interfaces will be compatible with each other! There are two missing things here:

1. Just because you want to have an object of a given type (interface) doesn't mean you will use all possible operations on it.

2. It'd be really nice to be able to map operations from one interface onto another, without having to write wrapper classes and possibly having to write dummy implementations for operations you don't need, and perhaps can't even implement at all!
On the other hand, the *idea* of an interface as a collection of operations isn't a bad idea. And if you're the one *using* the interface's operations, it's a convenient way to do it. This proposal seeks to retain this useful property, while ditching much of the "baggage" that otherwise comes with it. What we would like to do, then, is allow any object that can perform operations "like" those of a target interface, to be used as if it were an object of the type that the interface suggests. As an example, consider the notion of a "file-like" object, which is often referred to in the discussion of Python programs. It basically means, "an object that has methods whose semantics roughly correspond to the same-named methods of the built-in ``file`` type." It does *not* mean that the object must be an instance of a subclass of ``file``, or that it must be of a class that declares it "implements the ``file`` interface". It simply means that the object's *namespace* mirrors the *meaning* of a ``file`` instance's namespace. In a phrase, it is "duck typing": if it walks like a duck and quacks like a duck, it must *be* a duck. Traditional interface systems, however, rapidly break down when you attempt to apply them to this concept. One repeatedly used measuring stick for proposed Python interface systems has been, "How do I say I want a file-like object?" To date, no proposed interface system for Python (that this author knows about, anyway) has had a good answer for this question, because they have all been based on completely implementing the operations defined by an interface object, distinct from the concrete ``file`` type. Note, however, that this alienation between "file-like" interfaces and the ``file`` type, leads to a proliferation of incompatible interfaces being created by different packages, each declaring a different subset of the total operations provided by the ``file`` type. 
This then leads further to the need to somehow reconcile the incompatibilities between these diverse interfaces. Therefore, in this proposal we will turn both of those assumptions upside down, by proposing to declare conformance to *individual operations* of a target type, whether the type is concrete or abstract. That is, one may define the notion of "file-like" without reference to any interface at all, by simply declaring that certain operations on an object are "like" the operations provided by the ``file`` type. This idea will (hopefully) better match the uncorrupted intuition of a Python programmer who has not yet adopted traditional static interface concepts, or of a Python programmer who rebels against the limitations of those concepts (as many Python developers do). And, the approach corresponds fairly closely to concepts in other languages with more sophisticated type systems (like Haskell typeclasses or Dylan protocols), while still being a straightforward extension of more rigid type systems like those of Java or Microsoft's COM (Component Object Model). This PEP directly competes with PEP 245, which proposes a syntax for Python interfaces. If some form of this proposal is accepted, it would be unnecessary for a special interface type or syntax to be added to Python, since normal classes and partially or completely abstract classes will be routinely usable as interfaces. Some packages or frameworks, of course, may have additional requirements for interface features, but they can use metaclasses to implement such enhanced interfaces without impeding their ability to be used as interfaces by this PEP's adaptation system. 
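The "file-like" notion itself is easy to demonstrate: any object whose namespace mirrors the meaning of ``file``'s namespace works, with no subclassing or interface declaration required. ``UpperReader`` and ``shout()`` below are invented examples:

```python
import io

# Duck typing in action: "file-like" means the object's namespace
# mirrors the *meaning* of a file's namespace, nothing more.

class UpperReader:
    """Not a file subclass; merely provides a file-like read()."""
    def __init__(self, text):
        self.text = text
    def read(self):
        return self.text.upper()

def shout(f):
    # Accepts anything that reads like a file -- no isinstance checks,
    # no interface declarations.
    return f.read()

assert shout(io.StringIO("quack")) == "quack"  # a real file-like object
assert shout(UpperReader("quack")) == "QUACK"  # a duck-typed stand-in
```

Traditional interface systems would require both classes to declare conformance to a shared "Readable" interface; duck typing asks only that ``read()`` mean the right thing.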
Specification ============= For "file-like" objects, the standard library already has a type which may form the basis for compatible interfacing between packages; if each package denotes the relationship between its types' operations and the operations of the ``file`` type, then those packages can accept other packages' objects as parameters declared as requiring a ``file`` instance. However, the standard library cannot contain base versions of all possible operations for which multiple implementations might exist, so different packages are bound to create different renderings of the same basic operations. For example, one package's ``Duck`` class might have ``walk()`` and ``quack()`` methods, where another package might have a ``Mallard`` class (a kind of duck) with ``waddle()`` and ``honk()`` methods. And perhaps another package might have a class with ``moveLeftLeg()`` and ``moveRightLeg()`` methods that must be combined in order to offer an operation equivalent to ``Duck.walk()``. Assuming that the package containing ``Duck`` has a function like this (using Guido's proposed optional typing syntax [2]_)::

    def walkTheDuck(duck: Duck):
        duck.walk()

This function expects a ``Duck`` instance, but what if we wish to use a ``Mallard`` from the other package? The simple answer is to allow Python programs to explicitly state that an operation (i.e. function or method) of one type has semantics that roughly correspond to those of an operation possessed by a different type. That is, we want to be able to say that ``Mallard.waddle()`` is "like" the method ``Duck.walk()``. (For our examples, we'll use decorators to declare this "like"-ness, but of course Python's syntax could also be extended if desired.) If we are the author of the ``Mallard`` class, we can declare our compatibility like this::

    class Mallard(Waterfowl):

        @like(Duck.walk)
        def waddle(self):
            # walk like a duck!
            ...

This is an example of declaring the similarity *inside* the class to be adapted.
In many cases, however, you can't do this because you don't control the implementation of the class you want to use, or even if you do, you don't wish to introduce a dependency on the foreign package. In that case, you can create what we'll call an "external operation", which is just a function that's declared outside the class it applies to. It's almost identical to the "internal operation" we declared inside the ``Mallard`` class, but it has to call the ``waddle()`` method, since it doesn't also implement waddling:: @like(Duck.walk, for_type=Mallard) def duckwalk_by_waddling(self): self.waddle() Whichever way the operation correspondence is registered, we should now be able to successfully call ``walkTheDuck(Mallard())``. Python will then automatically create a "proxy" or "adapter" object that wraps the ``Mallard`` instance with a ``Duck``-like interface. That adapter will have a ``walk()`` method that is just a renamed version of the ``Mallard`` instance's ``waddle()`` method (or of the ``duckwalk_by_waddling`` external operation). For any methods of ``Duck`` that have no corresponding ``Mallard`` operation, the adapter will omit that attribute, thereby maintaining backward compatibility with code that uses attribute introspection or traps ``AttributeError`` to control optional behaviors. In other words, if we have a ``MuteMallard`` class that has no ability to ``quack()``, but has an operation corresponding to ``walk()``, we can still safely pass its instances to ``walkTheDuck()``, but if we pass a ``MuteMallard`` to a routine that tries to make it ``quack``, that routine will get an ``AttributeError``. Adapter Creation ---------------- Note, however, that even though a different adapter class is needed for different adapted types, it is not necessary to create an adapter class "from scratch" every time a ``Mallard`` is used as a ``Duck``. 
Instead, the implementation need only create a ``MallardAsDuck`` adapter class once, and then cache it for repeated uses.  Adapter instances can also be quite small in size, because in the general case they only need to contain a reference to the object instance that they are adapting.  (Except for "extension" adapters, which need storage for their added "state" attributes.  More on this later, in the section on `Adapters That Extend`_, below.)

In order to be able to create these adapter classes, we need to be able to determine the correspondence between the target ``Duck`` operations and the operations of a ``Mallard``.  This is done by traversing the ``Duck`` operation namespace, and retrieving methods and attribute descriptors.  These descriptors are then looked up in a registry keyed by descriptor (method or property) and source type (``Mallard``).  The found operation is then placed in the adapter class' namespace under the name given to it by the ``Duck`` type.

So, as we go through the ``Duck`` methods, we find a ``walk()`` method descriptor, and we look into a registry for the key ``(Duck.walk,Mallard)``.  (Note that this is keyed by the actual ``Duck.walk`` method, not by the *name* ``"Duck.walk"``.  This means that an operation inherited unchanged by a subclass of ``Duck`` can reuse operations declared "like" that operation.)  If we find the entry, ``duckwalk_by_waddling`` (the function object, not its name), then we simply place that object in the adapter class' dictionary under the name ``"walk"``, wrapped in a descriptor that substitutes the original object as the method's ``self`` parameter.  Thus, when the function is invoked via an adapter instance's ``walk()`` method, it will receive the adapted ``Mallard`` as its ``self``, and thus be able to call the ``waddle()`` operation.

However, operations declared in a class work somewhat differently.
If we directly declared that ``waddle()`` is "like" ``Duck.walk`` in the body of the ``Mallard`` class, then the ``@like`` decorator will register the method name ``"waddle"`` as the operation in the registry. So, we would then look up that name on the source type in order to implement the operation on the adapter. For the ``Mallard`` class, this doesn't make any difference, but if we were adapting a subclass of ``Mallard`` this would allow us to pick up the subclass' implementation of ``waddle()`` instead. So, we have our ``walk()`` method, so now let's add a ``quack()`` method. But wait, we haven't declared one for ``Mallard``, so there's no entry for ``(Duck.quack,Mallard)`` in our registry. So, we proceed through the ``__mro__`` (method resolution order) of ``Mallard`` in order to see if there is an operation corresponding to ``quack`` that ``Mallard`` inherited from one of its base classes. If no method is found, we simply do not put anything in the adapter class for a ``"quack"`` method, which will cause an ``AttributeError`` if somebody tries to call it. Finally, if our attempt at creating an adapter winds up having *no* operations specific to the ``Duck`` type, then a ``TypeError`` is raised. Thus if we had passed an instance of ``Pig`` to the ``walkTheDuck`` function, and ``Pig`` had no methods corresponding to any ``Duck`` methods, this would result in a ``TypeError`` -- even if the ``Pig`` type has a method named ``walk()``! -- because we haven't said anywhere that a pig walks like a duck. Of course, if all we wanted was for ``walkTheDuck`` to accept any object with a method *named* ``walk()``, we could've left off the type declaration in the first place! The purpose of the type declaration is to say that we *only* want objects that claim to walk like ducks, assuming that they walk at all. This approach is not perfect, of course. 
If we passed in a ``LeglessDuck`` to ``walkTheDuck()``, it is not going to work, even though it will pass the ``Duck`` type check (because it can still ``quack()`` like a ``Duck``). However, as with normal Python "duck typing", it suffices to run the program to find that error. The key here is that type declarations should facilitate using *different* objects, perhaps provided by other authors following different naming conventions or using different operation granularities. Inheritance ----------- By default, this system assumes that subclasses are "substitutable" for their base classes. That is, we assume that a method of a given name in a subclass is "like" (i.e. is substitutable for) the correspondingly-named method in a base class. However, sometimes this is *not* the case; a subclass may have stricter requirements on routine parameters. For example, suppose we have a ``Mallard`` subclass like this one:: class SpeedyMallard(Mallard): def waddle(self, speed): # waddle at given speed This class is *not* substitutable for Mallard, because it requires an extra parameter for the ``waddle()`` method. In this case, the system should *not* consider ``SpeedyMallard.waddle`` to be "like" ``Mallard.waddle``, and it therefore should not be usable as a ``Duck.walk`` operation. In other words, when inheriting an operation definition from a base class, the subclass' operation signature must be checked against that of the base class, and rejected if it is not compatible. (Where "compatible" means that the subclass method will accept as many arguments as the base class method will, and that any extra arguments taken by the subclass method are optional ones.) Note that Python cannot tell, however, if a subclass changes the *meaning* of an operation, without changing its name or signature. 
Doing so is arguably bad style, of course, but it could easily be supported anyway by using an additional decorator, perhaps something like ``@unlike(Mallard.waddle)`` to claim that no operation correspondences should remain, or perhaps ``@unlike(Duck.walk)`` to indicate that only that operation no longer applies.

In any case, when a substitutability error like this occurs, it should ideally give the developer an error message that explains what is happening, perhaps something like "waddle() signature changed in class Mallard, but replacement operation for Duck.walk has not been defined."  This error can then be silenced with an explicit ``@unlike`` decorator (or by a standalone ``unlike`` call if the class cannot be changed).

External Operations and Method Dependencies
-------------------------------------------

So far, we've been dealing only with simple examples of method renaming, so let's now look at more complex integration needs.  For example, the Python ``dict`` type allows you to set one item at a time (using ``__setitem__``) or to set multiple items using ``update()``.  If you have an object that you'd like to pass to a routine accepting "dictionary-like" objects, what if your object only has a ``__setitem__`` operation, but the routine wants to use ``update()``?

As you may recall, we follow the source type's ``__mro__`` to look for an operation possibly "inherited" from a base class.  This means that it's possible to register an "external operation" under ``(dict.update,object)`` that implements a dictionary-like ``update()`` method by repeatedly calling ``__setitem__``.
We can do so like this:: @like(dict.update, for_type=object, needs=[dict.__setitem__]) def do_update(self:dict, other:dict): for key,value in other.items(): self[key] = value Thus, if a given type doesn't have a more specific implementation of ``dict.update``, then types that implement a ``dict.__setitem__`` method can automatically have this ``update()`` method added to their ``dict`` adapter class. While building the adapter class, we simply keep track of the needed operations, and remove any operations with unmet or circular dependencies. By the way, even though technically the ``needs`` argument to ``@like`` could be omitted since the information is present in the method body, it's actually helpful for documentation purposes to present the external operation's requirements up-front. However, if the programmer fails to accurately state the method's needs, the result will either be an ``AttributeError`` at a deeper point in the code, or a stack overflow exception caused by looping between mutually recursive operations. (E.g. if an external ``dict.__setitem__`` is defined in terms of ``dict.update``, and a particular adapted type supports neither operation directly.) Neither of these ways of revealing the error is particularly problematic, and is easily fixed when discovered, so ``needs`` is still intended more for the reader of the code than for the adaptation system. 
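As a rough illustration of what the adapter builder would do with such a registration, the following sketch (the helper and class names are invented here, not part of the proposal) synthesizes an ``update()`` out of ``__setitem__`` when no more specific operation is available::

```python
# Hypothetical sketch: fall back to building update() from __setitem__,
# mirroring the needs=[dict.__setitem__] declaration above.

def make_update(cls):
    if hasattr(cls, "update"):           # a more specific operation exists
        return cls.update
    if hasattr(cls, "__setitem__"):      # the declared dependency is met
        def update(self, other):
            for key, value in other.items():
                self[key] = value
        return update
    raise AttributeError("update: unmet dependency on __setitem__")

class TinyMap:
    """A minimal mapping-like class with only __setitem__."""
    def __init__(self):
        self.data = {}
    def __setitem__(self, key, value):
        self.data[key] = value

TinyMap.update = make_update(TinyMap)
m = TinyMap()
m.update({"a": 1, "b": 2})
assert m.data == {"a": 1, "b": 2}
```

The real machinery would do this per adapter class rather than by patching the source type, and would drop the operation entirely (rather than raising) when the dependency is unmet.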
By the way, if we look again at one of our earliest examples, where we externally declared a method correspondence from ``Mallard.waddle`` to ``Duck.walk``:: @like(Duck.walk, for_type=Mallard) def walk_like_a_duck(self): self.waddle() we can see that this is actually an external operation being declared; it's just that we didn't give the (optional) full declarations:: @like(Duck.walk, for_type=Mallard, needs=[Mallard.waddle]) def walk_like_a_duck(self:Mallard): self.waddle() When you register an external operation, the actual function object given is registered, because the operation doesn't correspond to a method on the adapted type. In contrast, "internal operations" declared within the adapted type cause the method *name* to be registered, so that subclasses can inherit the "likeness" of the base class' methods. Adapters That Extend -------------------- One big difference between external operations and ones created within a class, is that a class' internal operations can easily add extra attributes if needed. An external operation, however, is not in a good position to do that. It *could* just stick additional attributes onto the original object, but this would be considered bad style at best, even if it used mangled attribute names to avoid collisions with other external operations' attributes. So let's look at an example of how to handle adaptation that needs more state information than is available in the adapted object. Suppose, for example, we have a new ``DuckDodgers`` class, representing a duck who is also a test pilot. He can therefore be used as a rocket-powered vehicle by strapping on a ``JetPack``, which we can have happen automatically:: @like(Rocket.launch, for_type=DuckDodgers, using=JetPack) def launch(jetpack, self): jetpack.activate() print "Up, up, and away!" The type given as the ``using`` parameter must be instantiable without arguments. That is, ``JetPack()`` must create a valid instance. 
When a ``DuckDodgers`` instance is being used as a ``Rocket`` instance, and this ``launch`` method is invoked, it will attempt to create a ``JetPack`` instance for the ``DuckDodgers`` instance (if one has not already been created and cached). The same ``JetPack`` will be used for all external operations that request to use a ``JetPack`` for that specific ``DuckDodgers`` instance. (Which only makes sense, because Dodgers can wear only one jet pack at a time, and adding more jet packs will not allow him to fly to several places at once!) It's also necessary to keep reusing the *same* ``JetPack`` instance for a given ``DuckDodgers`` instance, even if it is adapted many times to different rocketry-related interfaces. Otherwise, we might create a new ``JetPack`` during flight, which would then be confused about how much fuel it had or whether it was currently in flight! This pattern of adaptation is referred to in the `Motivation`_ section as "extension" or "extender" adaptation, because it allows you to dynamically extend the capabilities of an existing class at runtime, as opposed to just recasting its existing operations in a form that's compatible with another type. In this case, the ``JetPack`` is the extension, and our ``launch`` method defines part of the adapter. Note, by the way that ``JetPack`` is a completely independent class here. It does not have to know anything about ``DuckDodgers`` or its use as an adapter, nor does ``DuckDodgers`` need to know about ``JetPack``. In fact, neither object should be given a reference to the other, or this will create a circularity that may be difficult to garbage collect. Python's adaptation machinery will use a weak-key dictionary mapping from adapted objects to their "extensions", so that our ``JetPack`` instance will hang around until the associated ``DuckDodgers`` instance goes away. 
Then, when external operations using ``JetPack`` are invoked, they simply request a ``JetPack`` instance from this dictionary, for the given ``DuckDodgers`` instance, and then the operation is invoked with references to both objects. Of course, this mechanism is not available for adapting types whose instances cannot be weak-referenced, such as strings and integers. If you need to extend such a type, you must fall back to using a volatile adapter, even if you would prefer to have a state that remains consistent across adaptations. (See the `Volatile Adaptation`_ section below.) Using Multiple Extenders ------------------------ Each external operation can use a different ``using`` type to store its state. For example, a ``DuckDodgers`` instance might be able to be used as a ``Soldier``, provided that he has a ``RayGun``:: @like(Soldier.fight, for_type=DuckDodgers, using=RayGun) def fight(raygun, self, enemy:Martian): while enemy.isAlive(): raygun.fireAt(enemy) In the event that two operations covering a given ``for_type`` type have ``using`` types with a common base class (other than ``object``), the most-derived type is used for both operations. This rule ensures that extenders do not end up with more than one copy of the same state, divided between a base type and a derived type. Notice that our examples of ``using=JetPack`` and ``using=RayGun`` do not interact, as long as ``RayGun`` and ``JetPack`` do not share a common base class other than ``object``. However, if we had defined one operation ``using=JetPack`` and another as ``using=HypersonicJetPack``, then both operations would receive a ``HypersonicJetPack`` if ``HypersonicJetPack`` is a subclass of ``JetPack``. This ensures that we don't end up with two jet packs, but instead use the best jetpack possible for the operations we're going to perform. 
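This "most-derived ``using`` type" rule can be sketched in a few lines; the helper name here is hypothetical and not part of the proposal::

```python
# Hypothetical sketch of the rule above: among the 'using' types declared
# for one adapted type, pick the single class that derives from all the
# others; fail (like a metaclass conflict) when no such class exists.

def most_derived(using_types):
    for candidate in using_types:
        if all(issubclass(candidate, other) for other in using_types):
            return candidate
    raise TypeError("conflicting 'using' types: no single most-derived class")

class JetPack: pass
class HypersonicJetPack(JetPack): pass

# Operations declared with using=JetPack and using=HypersonicJetPack
# end up sharing one HypersonicJetPack instance:
assert most_derived([JetPack, HypersonicJetPack]) is HypersonicJetPack
```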
However, if we *also* have an operation using a ``BrokenJetPack``, and that's also a subclass of ``JetPack``, then we have a conflict, because there's no way to reconcile a ``HypersonicJetPack`` with a ``BrokenJetPack``, without first creating a ``BrokenHypersonicJetPack`` that derives from both, and using it in at least one of the operations.  If it is not possible to determine a single "most-derived" type among a set of operations for a given adapted type, then an error is raised, similar to that raised when deriving a class from classes with incompatible metaclasses.  As with that kind of error, this error can be resolved just by adding another ``using`` type that inherits from the conflicting types.

Volatile Adaptation
-------------------

Volatile adapters are not the same thing as operational adapters or extenders.  Indeed, some strongly question whether they should be called "adapters" at all, because to do so weakens the term.  For example, in the model-view-controller pattern, does it make sense to call a view an "adapter"?  What about iterators?  Are they "adapters", too?  At some point, one is reduced to calling any object an adapter, as long as it mainly performs operations on one other object.  This seems like a questionable practice, and it's a much broader term than is used in the context of the GoF "Adapter Pattern" [3]_.

Indeed, it could be argued that these other "adapters" are actually extensions of the GoF "Abstract Factory" pattern [4]_.  An Abstract Factory is a way of creating an object whose interface is known, but whose concrete type is not.  PEP 246 adaptation can basically be viewed as an all-purpose Abstract Factory that takes a source object and a destination interface.  This is a valuable tool for many purposes, but it is not really the same thing as adaptation.
Shortly after I began writing this section, Clark Evans posted a request for feedback on changes to PEP 246, that suggests PEP 246 will provide adequate solutions of its own for defining volatile adapters, including options for declaring an adapter volatile, and whether it is safe for use with type declarations. So, for now, this PEP will assume that volatile adapters will fall strictly under the jurisdiction of PEP 246, leaving this PEP to deal only with the previously-covered styles of adaptation that are by definition safe for use with type declarations. (Because they only cast an object in a different role, rather than creating an independent object.) Miscellaneous ------------- XXX property get/set/del as three "operations" XXX binary operators XXX level-confusing operators: comparison, repr/str, equality/hashing XXX other special methods Backward Compatibility ====================== XXX explain Java cast and COM QueryInterface as proper subsets of adaptation Reference Implementation ======================== TODO Acknowledgments =============== Many thanks to Alex Martelli, Clark Evans, and the many others who participated in the Great Adaptation Debate of 2005. Special thanks also go to folks like Ian Bicking, Paramjit Oberoi, Steven Bethard, Carlos Ribeiro, Glyph Lefkowitz and others whose brief comments in a single message sometimes provided more insight than could be found in a megabyte or two of debate between myself and Alex; this PEP would not have been possible without all of your input. Last, but not least, Ka-Ping Yee is to be thanked for pushing the idea of "partially abstract" interfaces, for which idea I have here attempted to specify a practical implementation. Oh, and finally, an extra special thanks to Guido for not banning me from the Python-Dev list when Alex and I were posting megabytes of adapter-related discussion each day. ;) References ========== .. 
[1] Guido's Python-Dev posting on "PEP 246: lossless and stateless" (http://mail.python.org/pipermail/python-dev/2005-January/051053.html) .. [2] Optional Static Typing -- Stop the Flames! (http://www.artima.com/weblogs/viewpost.jsp?thread=87182) .. [3] XXX Adapter Pattern .. [4] XXX Abstract Factory Pattern Copyright ========= This document has been placed in the public domain. .. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 End: From pje at telecommunity.com Mon Jan 17 00:46:08 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon Jan 17 00:44:39 2005 Subject: [Python-Dev] "Monkey Typing" pre-PEP, partial draft In-Reply-To: <5.1.1.6.0.20050115170350.020fe6a0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050116184546.030afc00@mail.telecommunity.com> Oops. I forgot to cancel this posting; it's an older version. At 11:51 PM 1/15/05 -0500, Phillip J. Eby wrote: >This is only a partial first draft, but the Motivation section nonetheless >attempts to briefly summarize huge portions of the various discussions >regarding adaptation, and to coin a hopefully more useful terminology than >some of our older working adjectives like "sticky" and "stateless" and >such. And the specification gets as far as defining a simple >decorator-based syntax for creating operational (prev. "stateless") and >extension (prev. "per-object stateful") adapters. From kbk at shore.net Mon Jan 17 05:01:19 2005 From: kbk at shore.net (Kurt B. Kaiser) Date: Mon Jan 17 05:02:00 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python pythonrun.c, 2.161.2.15, 2.161.2.16 References: <87brc17xbg.fsf@hydra.bayview.thirdcreek.com> <877jmo93qv.fsf@hydra.bayview.thirdcreek.com> <873bxc8m6k.fsf@hydra.bayview.thirdcreek.com> Message-ID: <87llasvh4g.fsf@hydra.bayview.thirdcreek.com> kbk@shore.net (Kurt B. Kaiser) writes: > kbk@shore.net (Kurt B. 
Kaiser) writes:

>> [JH]
>>> ../Python/symtable.c:193: structure has no member named `st_tmpname'
>>>
>>> Do you see that?
>>
>> Yeah, the merge eliminated it from the symtable struct in symtable.h.
>> You moved it to symtable_entry at rev 2.12 in MAIN :-)

[...]

I checked in a change which adds the st_tmpname element back to symtable. Temporary until someone gets time to evaluate the situation.

[...]

> Apparently the $(AST_H) $(AST_C): target ran and Python-ast.c was
> recreated (without the changes).  It's not clear to me how/why that
> happened.  I did start with a clean checkout, but it seems that the
> target only runs if Python-ast.c and/or its .h are missing (they
> should have been in the checkout), or older than Python.asdl, which
> they are not.  I don't see them in the .cvsignore.

I believe the problem was caused by the fact that the dates in the local tree aren't the repository dates, so it happened that Parser/Python.asdl had a newer date than Python-ast.[ch]. I did a clean install on my Debian system and got around the issue by touching Python-ast.[c,h] before the build.

IMO ASDLGEN s/b a .PHONY target, run manually as needed by the AST developer. Otherwise there will be no end of trouble when people try to build from CVS after the merge. Absent objection, I'll check in such a change.

==

The tree compiles, but there is a segfault when make tries to run Python on setup.py. Failure occurs when trying to import site.py

==

Neal has fixed the import issue and several others!! I was bitten by the Python.asdl / Python-ast.[ch] timing again when updating to his changes.... Branch now builds and python can be started. There are a number of test failures remaining.
-- KBK From gvanrossum at gmail.com Mon Jan 17 06:42:33 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon Jan 17 06:42:36 2005 Subject: [Python-Dev] PEP 246, Feedback Request In-Reply-To: <20050116040424.GA76191@prometheusresearch.com> References: <20050116040424.GA76191@prometheusresearch.com> Message-ID: > - protocol means any object, usually a type or class or interface, > which guides the construction of an adapter Then what do we call the abstract *concept* of a protocol? > - adaptee-class refers to the adaptee's class Please make it explicit that this is a.__class__, not type(a). > - factory refers to a function, f(adaptee) -> adapter, where > the resulting adapter complies with a given protocol Make this adapter factory -- factory by itself is too commonly used. > - First, the registry is checked for a suitable adapter How about checking whether adaptee.__class__ is equal to the protocol even before this? It would be perverse to declare an adapter from a protocol to itself that wasn't the identity adapter. > - PEP 246 will ask for a `adapt' module, with an `adapt' function. Please don't give both the same name. This practice has caused enough problems in the past. The module can be called adaptation, or adapting (cf. threading; but it doesn't feel right so I guess adaptation is better). > - At any stage of adaptation, if None is returned, the adaptation > continues to the next stage. Maybe use NotImplemented instead? I could imagine that occasionally None would be a valid adapter. (And what do we do when asked adapt None to a protocol? I think it should be left alone but not considered an error.) > - At any stage of adaption, if adapt.AdaptException(TypeError) is > raised, then the adaptation process stops, as if None had been > returned from each stage. Why are there two ways to signal an error? TOOWTDI! > - One can also register a None factory from A->B for the > purpose of marking it transitive. 
In this circumstance, > the composite adapter is built through __conform__ and > __adapt__. The None registration is just a place holder > to signal that a given path exists. Sounds overkill; the None feels too magical. An explicit adapter can't be too difficult to come up with? > There is a problem with the default isinstance() behavior when > someone derives a class from another to re-use implementation, > but with a different 'concept'. A mechanism to disable > isinstance() is needed for this particular case. Do we really have to care about this case? Has someone found that things go harebrained when this case is not handled? It sounds very much like a theoretical problem only. I don't mean that subclasses that reuse implementation without being substitutable are rare (I've seen and written plenty); I mean that I don't expect this to cause additional problems due to incorrect adaptation. > Guido would like his type declaration syntax (see blog entry) to > be equivalent to a call to adapt() without any additional > arguments. However, not all adapters should be created in the > context of a declaration -- some should be created more > explicitly. We propose a mechanism where an adapter factory can > register itself as not suitable for the declaration syntax. I'm considering a retraction of this proposal, given that adaptation appears to be so subtle and fraught with controversies and pitfalls; but more particularly given the possible requirement (which someone added in a response to a blog) that it should be possible to remove or ignore type declarations without changing the meaning of programs that run correctly (except for code that catches TypeError). > - adapt( , intrinsic_only = False) will enable both sorts of adapters, That's one ugly keyword parameter. If we really need this, I'd like to see something that defaults to False and can be switched on by passing mumble=True on the call. But if we could have only one kind that would be much more attractive. 
Adaptation looks like it is going to fail the KISS test. > There was discussion as to how to get back to the original > object from an adapter. Is this in scope of PEP 246? Seems too complex. Again, KISS. > Sticky adapters, that is, ones where there is only one instance > per adaptee is a common use case. Should the registry of PEP 246 > provide this feature? Ditto. If you really need this, __adapt__ and __conform__ could use a cache. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Mon Jan 17 07:12:37 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon Jan 17 07:12:40 2005 Subject: [Python-Dev] Getting rid of unbound methods: patch available Message-ID: https://sourceforge.net/tracker/index.php?func=detail&aid=1103689&group_id=5470&atid=305470 Here's a patch that gets rid of unbound methods, as discussed here before. A function's __get__ method now returns the function unchanged when called without an instance, instead of returning an unbound method object. I couldn't remove support for unbound methods completely, since they were used by the built-in exceptions. (We can get rid of that use once we convert to new-style exceptions.) For backward compatibility, functions now have read-only im_self and im_func attributes; im_self is always None, im_func is always the function itself. (These should issue warnings, but I haven't added that yet.) The test suite passes. (I have only tried "make test" on a Linux box.) What do people think? (My main motivation for this, as stated before, is that it adds complexity without much benefit.) 
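The proposed semantics are easy to demonstrate; this snippet runs unchanged on Python 3, which is where the change ultimately landed (the read-only im_self/im_func compatibility attributes mentioned in the patch are not shown):

```python
# Demonstration of the proposed semantics (standard behavior in Python 3,
# which adopted this change): class-attribute access returns the plain
# function, and __get__ without an instance is the identity.

class C:
    def foo(self):
        return "called"

f = C.foo
assert type(f).__name__ == "function"      # no unbound-method wrapper
assert f.__get__(None, C) is f             # no instance: function unchanged
assert C.foo(C()) == "called"              # call with an explicit "self"
assert type(C().foo).__name__ == "method"  # bound methods are unaffected
```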
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From glyph at divmod.com Mon Jan 17 07:49:07 2005 From: glyph at divmod.com (Glyph Lefkowitz) Date: Mon Jan 17 07:44:31 2005 Subject: [Python-Dev] PEP 246: let's reset In-Reply-To: <5.1.1.6.0.20050116124501.0349f7c0@mail.telecommunity.com> References: <82543A8B-6749-11D9-B46A-0003934AD54A@chello.se> <5.1.1.6.0.20050115210716.034cb030@mail.telecommunity.com> <5.1.1.6.0.20050116124501.0349f7c0@mail.telecommunity.com> Message-ID: <1105944547.30052.21.camel@localhost> On Sun, 2005-01-16 at 13:00 -0500, Phillip J. Eby wrote: > """One type is the "extender", ... > By contrast, an "independent adapter" ... I really like the way this part of the PEP is sounding, since it really captures two almost, but not quite, completely different use-cases, the confusion between which generated all the discussion here in the first place. The terminology seems a bit cumbersome though. I'd like to propose that an "extender" be called a "transformer", since it provides a transformation for an underlying object - it changes the shape of the underlying object so it will fit somewhere else, without creating a new object. Similarly, the cumbersome "independent adapter" might be called a "converter", since it converts A into B, where B is some new kind of thing. Although the words are almost synonyms, their implications seem to match up with what's trying to be communicated here. A "transformer" is generally used in the electrical sense - it is a device which changes voltage, and only voltage. It takes in one flavor of current and produces one, and exactly one other. Used in the electrical sense, a "converter" is far more general, since it has no technical meaning that I'm aware of - it might change anything about the current. However, other things are also called converters, such as currency converters, which take one kind of currency and produce another, separate currency. 
Similar to "independent adapters", this conversion is dependent on a moment in time for the conversion - after the conversion, each currency may gain or lose value relative to the other. If nobody likes this idea, it would seem a bit more symmetric to have "dependent" and "independent" adapters, rather than "extenders" and "independent adapters". As it is I'm left wondering what the concept of dependency in an adapter is. From glyph at divmod.com Mon Jan 17 07:56:59 2005 From: glyph at divmod.com (Glyph Lefkowitz) Date: Mon Jan 17 07:52:23 2005 Subject: [Python-Dev] Getting rid of unbound methods: patch available In-Reply-To: References: Message-ID: <1105945019.30052.26.camel@localhost> On Sun, 2005-01-16 at 22:12 -0800, Guido van Rossum wrote: > What do people think? (My main motivation for this, as stated before, > is that it adds complexity without much benefit.) > *************** > *** 331,339 **** > def test_im_class(): > class C: > def foo(self): pass > - verify(C.foo.im_class is C) ^ Without this, as JP Calderone pointed out earlier, you can't serialize unbound methods. I wouldn't mind that so much, but you can't tell that they're any different from regular functions until you're *de*-serializing them. In general I like the patch, but what is the rationale for removing im_class from functions defined within classes? From tjreedy at udel.edu Mon Jan 17 08:39:46 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Mon Jan 17 08:39:55 2005 Subject: [Python-Dev] Re: Getting rid of unbound methods: patch available References: Message-ID: "Guido van Rossum" wrote in message news:ca471dc20501162212446e63b5@mail.gmail.com... > What do people think? (My main motivation for this, as stated before, > is that it adds complexity without much benefit.) >From the viewpoint of learning and explaining Python, this is a plus. I never understood why functions were wrapped as unbounds until this proposal was put forth and discussed. Terry J. 
Reedy From ncoghlan at iinet.net.au Mon Jan 17 11:01:27 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Mon Jan 17 11:01:31 2005 Subject: [Python-Dev] Getting rid of unbound methods: patch available In-Reply-To: References: Message-ID: <41EB8CF7.9010002@iinet.net.au> Guido van Rossum wrote: > What do people think? (My main motivation for this, as stated before, > is that it adds complexity without much benefit.) I'm in favour, since it removes the "an unbound method is almost like a bare function, only not quite as useful" distinction. It would allow things like str.join(sep, seq) to work correctly for a Unicode separator. It also allows 'borrowing' of method implementations without inheritance. I'm a little concerned about the modification to pyclbr_input.py, though (since it presumably worked before the patch). Was the input file tweaked before or after the test itself was fixed? (I'll probably get around to trying out the patch myself, but that will be on Linux as well, so I doubt my results will differ from yours). The other question is the pickling example - an unbound method currently stores meaningful data in im_class, whereas a standard function doesn't have that association. Any code which makes use of im_class on unbound methods (even without involving pickling)is going to have trouble with the change. (Someone else will need to provide a real-life use case though, since I certainly don't have one). Regards, Nick. -- Nick Coghlan | ncoghlan@email.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From arigo at tunes.org Mon Jan 17 11:52:19 2005 From: arigo at tunes.org (Armin Rigo) Date: Mon Jan 17 12:04:03 2005 Subject: [Python-Dev] Exceptions *must*? be old-style classes? 
In-Reply-To: 
References: 
Message-ID: <20050117105219.GA12763@vicky.ecs.soton.ac.uk>

Hi,

On Fri, Jan 14, 2005 at 07:20:31PM -0500, Jim Jewett wrote:
> The base of the Exception hierarchy happens to be a classic class.
> But why are they "required" to be classic?

For reference, PyPy doesn't have old-style classes at all so far, so we
had to come up with something about exceptions.  After some feedback from
python-dev it appears that the following scheme works reasonably well.
Actually it's surprising how few problems we actually encountered by
removing the old-/new-style distinction (particularly when compared with
the extremely obscure workarounds we had to go through in PyPy itself,
e.g. precisely because we wanted exceptions that are members of some
(new-style) class hierarchy).

Because a bit of Python code tells more than long and verbose
explanations, here it is:

def app_normalize_exception(etype, value, tb):
    """Normalize an (exc_type, exc_value) pair:
    exc_value will be an exception instance and exc_type its class.
    """
    # mistakes here usually show up as infinite recursion, which is fun.
while isinstance(etype, tuple): etype = etype[0] if isinstance(etype, type): if not isinstance(value, etype): if value is None: # raise Type: we assume we have to instantiate Type value = etype() elif isinstance(value, tuple): # raise Type, Tuple: assume Tuple contains the constructor # args value = etype(*value) else: # raise Type, X: assume X is the constructor argument value = etype(value) # raise Type, Instance: let etype be the exact type of value etype = value.__class__ elif type(etype) is str: # XXX warn -- deprecated if value is not None and type(value) is not str: raise TypeError("string exceptions can only have a string value") else: # raise X: we assume that X is an already-built instance if value is not None: raise TypeError("instance exception may not have a separate" " value") value = etype etype = value.__class__ # for the sake of language consistency we should not allow # things like 'raise 1', but it's probably fine (i.e. # not ambiguous) to allow them in the explicit form 'raise int, 1' if not hasattr(value, '__dict__') and not hasattr(value, '__slots__'): raise TypeError("raising built-in objects can be ambiguous, " "use 'raise type, value' instead") return etype, value, tb Armin From ncoghlan at iinet.net.au Mon Jan 17 12:49:42 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Mon Jan 17 12:49:46 2005 Subject: [Python-Dev] PEP 246: let's reset In-Reply-To: References: <82543A8B-6749-11D9-B46A-0003934AD54A@chello.se> <5.1.1.6.0.20050115210716.034cb030@mail.telecommunity.com> Message-ID: <41EBA656.8070409@iinet.net.au> Guido van Rossum wrote: > Typechecking can be trivially defined in terms of adaptation: > > def typecheck(x, T): > y = adapt(x, T) > if y is x: > return y > raise TypeError("...") Assuming the type error displayed contains information on T, the caller can then trivially correct the type error by invoking adapt(arg, T) at the call point (assuming the argument actually *is* adaptable to the desired protocol). 
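Sketched concretely (PEP 246's adapt() never made it into the standard
library, so the toy adapt() below, identity for already-matching types
followed by a __conform__ hook, is purely illustrative), the division of
labor looks like this:

```python
class AdaptationError(TypeError):
    pass

def adapt(obj, protocol):
    """Toy subset of PEP 246 adapt(): identity, then __conform__."""
    if isinstance(obj, protocol):
        return obj  # already the right type: returned unchanged
    conform = getattr(obj, '__conform__', None)
    if conform is not None:
        result = conform(protocol)
        if result is not None:
            return result
    raise AdaptationError("cannot adapt %r to %r" % (obj, protocol))

def typecheck(x, T):
    # The pattern quoted above: accept only objects that need no adaptation.
    y = adapt(x, T)
    if y is x:
        return y
    raise TypeError("%r is not a %r (but could be adapted to one)" % (x, T))

class Fraction:
    """Hypothetical example type that conforms to float via adaptation."""
    def __init__(self, num, den):
        self.num, self.den = num, den
    def __conform__(self, protocol):
        if protocol is float:
            return self.num / self.den

assert typecheck(3, int) == 3        # passes: no adaptation needed
half = Fraction(1, 2)
try:
    typecheck(half, float)           # adaptable, but not already a float
except TypeError:
    pass
assert adapt(half, float) == 0.5     # the caller adapts explicitly instead
```

The TypeError from typecheck() tells the caller exactly which protocol to
adapt to, so the fix at the call site is mechanical: wrap the argument in
adapt(arg, T) before passing it in.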
The code inside the function still gets to assume the supplied object has the correct type - the only difference is that if adaptation is actually needed, the onus is on the caller to provide it explicitly (and they will get a specific error telling them so). This strikes me as quite an elegant solution. Regards, Nick. -- Nick Coghlan | ncoghlan@email.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From mal at egenix.com Mon Jan 17 13:11:19 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Mon Jan 17 13:11:24 2005 Subject: [Python-Dev] Getting rid of unbound methods: patch available In-Reply-To: <41EB8CF7.9010002@iinet.net.au> References: <41EB8CF7.9010002@iinet.net.au> Message-ID: <41EBAB67.2080907@egenix.com> Nick Coghlan wrote: > Guido van Rossum wrote: > >> What do people think? (My main motivation for this, as stated before, >> is that it adds complexity without much benefit.) > > > I'm in favour, since it removes the "an unbound method is almost like a > bare function, only not quite as useful" distinction. It would allow > things like str.join(sep, seq) to work correctly for a Unicode > separator. This won't work. Strings and Unicode are two different types, not subclasses of one another. > It also allows 'borrowing' of method implementations without > inheritance. > > I'm a little concerned about the modification to pyclbr_input.py, though > (since it presumably worked before the patch). Was the input file > tweaked before or after the test itself was fixed? (I'll probably get > around to trying out the patch myself, but that will be on Linux as > well, so I doubt my results will differ from yours). > > The other question is the pickling example - an unbound method currently > stores meaningful data in im_class, whereas a standard function doesn't > have that association. 
Any code which makes use of im_class on unbound
> methods (even without involving pickling) is going to have trouble with
> the change. (Someone else will need to provide a real-life use case
> though, since I certainly don't have one).

I don't think there's much to worry about.  At the C level, bound and
unbound methods are the same type.  The only difference is that bound
methods have the object attribute im_self set to an instance object,
while unbound methods have it set to NULL.

Given that the two are already the same type, I don't really see much
benefit from dropping the printing of "unbound" in case im_self is
NULL... perhaps I'm missing something.

As for real life examples: basemethod() in mxTools uses .im_class to
figure the right base method to use (contrary to super(), basemethod()
also works for old-style classes).  basemethod() is in turn used in quite
a few applications to deal with overriding methods in mixin classes.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jan 10 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From anthony at interlink.com.au  Mon Jan 17 14:45:09 2005
From: anthony at interlink.com.au (Anthony Baxter)
Date: Mon Jan 17 14:44:57 2005
Subject: [Python-Dev] Re: how to test behavior wrt an extension type?
In-Reply-To: <59527618-67A2-11D9-ADA4-000A95EFAE9E@aleax.it> References: <59527618-67A2-11D9-ADA4-000A95EFAE9E@aleax.it> Message-ID: <200501180045.10439.anthony@interlink.com.au> On Sunday 16 January 2005 20:38, Alex Martelli wrote: > Problem: to write unit tests showing that the current copy.py > misbehaves with a classic extension type, I need a classic extension > type which defines __copy__ and __deepcopy__ just like /F's > cElementTree does. So, I made one: a small trycopy.c and accompanying > setup.py whose only purpose in life is checking that instances of a > classic type get copied correctly, both shallowly and deeply. But now > -- where do I commit this extension type, so that the unit tests in > test_copy.py can do their job...? > I do not know what the recommended practice is for this kind of issues, > so, I'm asking for guidance (and specifically asking Anthony since my > case deals with 2.3 and 2.4 maintenance and he's release manager for > both, but, of course, everybody's welcome to help!). Surely this can't > be the first case in which a bug got triggered only by a certain > behavior in an extension type, but I couldn't find precedents. Ideas, > suggestions, ...? Beats me - worst comes to worst, I guess we ship the unittest code there with a try/except around the ImportError on the new 'copytest' module, and the test skips if it's not built. Then we don't build it by default, but if someone wants to build it and check it, they can. I don't like this much, but I can't think of a better alternative. Shipping a new extension module just for this unittest seems like a bad idea. Anthony -- Anthony Baxter It's never too late to have a happy childhood. 
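Concretely, the skip-on-ImportError pattern looks like this.  The module
name `_copytest` and its `ClassicType` attribute are hypothetical stand-ins
for the test-only extension being discussed, and `unittest.skipUnless` is
the modern spelling of the idea (the 2.3/2.4 test suite used its own
TestSkipped mechanism instead):

```python
import copy
import unittest

# If the optional extension was not built, the import fails and the
# test below is reported as skipped instead of erroring out.
try:
    import _copytest  # hypothetical test-only extension module
except ImportError:
    _copytest = None

class CopyExtensionTest(unittest.TestCase):
    @unittest.skipUnless(_copytest is not None,
                         "_copytest extension module not built")
    def test_classic_type_shallow_copy(self):
        obj = _copytest.ClassicType()
        # copy.copy() should produce a distinct object, not return obj itself
        self.assertIsNot(copy.copy(obj), obj)
```

Run under a plain test driver, the case shows up as skipped rather than
failed on builds that omit the extension, which keeps "make test" green by
default while still letting anyone who builds the module exercise the check.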
From FBatista at uniFON.com.ar  Mon Jan 17 14:48:10 2005
From: FBatista at uniFON.com.ar (Batista, Facundo)
Date: Mon Jan 17 14:51:46 2005
Subject: [Python-Dev] Deprecating old bugs
Message-ID: 

As I discussed in this list, in the "Policy about old Python versions"
thread at 8-Nov-2004, I started verifying the old bugs.  Here are the
results for 2.1.*.  Maybe this should be put in an informational PEP.

When I verified the bug, I filled in the following fields:

- Group: the bug's group at verifying time.
- Bug #: the bug number.
- Verified: the date when I checked the bug.
- Action: what I did then.

If the bug survived the verification, the next two fields are applicable
(if not, I put a dash; the idea is to keep this info easily parseable):

- Final: the action taken by the person who eliminated the bug from that
  category (closed, moved to Py2.4, etc).
- By: the person who did the final action.

Group: 2.1.1
Bug #: 1020605
Verified: 08-Nov-2004
Action: Closed: Invalid. Was a Mailman issue, not a Python one.
Final: -
By: -

Group: 2.1.2
Bug #: 771429
Verified: 08-Nov-2004
Action: Deprecation alerted. I can not try it, don't have that context.
Final: Closed: Won't fix.
By: facundobatista

Group: 2.1.2
Bug #: 629345
Verified: 08-Nov-2004
Action: Deprecation alerted. Can't discern if it's really a bug or not.
Final: Closed: Won't fix.
By: facundobatista

Group: 2.1.2
Bug #: 589149
Verified: 08-Nov-2004
Action: Closed: Fixed. The problem is solved from Python 2.3a1, as the
submitter posted.
Final: -
By: -

I included here only 2.1.* because there were only four, so it's a good
trial.  If you think I should change the format or add more information,
please let me know ASAP.

The next chapter of this story will cover 2.2 bugs.

Regards,

.    Facundo

Bitácora De Vuelo: http://www.taniquetil.com.ar/plog
PyAr - Python Argentina: http://pyar.decode.com.ar/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20050117/ba12c730/attachment.htm

From aleax at aleax.it  Mon Jan 17 15:03:46 2005
From: aleax at aleax.it (Alex Martelli)
Date: Mon Jan 17 15:03:54 2005
Subject: [Python-Dev] Re: how to test behavior wrt an extension type?
In-Reply-To: <200501180045.10439.anthony@interlink.com.au>
References: <59527618-67A2-11D9-ADA4-000A95EFAE9E@aleax.it> <200501180045.10439.anthony@interlink.com.au>
Message-ID: <9AC0ACCE-6890-11D9-9DED-000A95EFAE9E@aleax.it>

On 2005 Jan 17, at 14:45, Anthony Baxter wrote:
   ...
>> both, but, of course, everybody's welcome to help!). Surely this can't
>> be the first case in which a bug got triggered only by a certain
>> behavior in an extension type, but I couldn't find precedents. Ideas,
>> suggestions, ...?
> > Beats me - worst comes to worst, I guess we ship the unittest code > there with a try/except around the ImportError on the new 'copytest' > module, and the test skips if it's not built. Then we don't build it by > default, but if someone wants to build it and check it, they can. I > don't > like this much, but I can't think of a better alternative. Shipping a > new > extension module just for this unittest seems like a bad idea. Agreed about this issue not warranting the shipping of a new extension module -- however, in the patch (to the 2.3 maintenance branch) which I uploaded (and assigned to you), I followed the effbot's suggestion, and added the type needed for testing to the already existing "extension module for testing purposes", namely Modules/_testcapi.c -- I don't think it can do any harm there, and lets test/test_copy.py do all of its testing blissfully well. I haven't even made the compilation of the part of Modules/_testcapi.c which hold the new type conditional upon anything, because I don't think that having it there unconditionally can possibly break anything anyway... _testcapi IS only used for testing, after all...! Alex From mwh at python.net Mon Jan 17 15:06:39 2005 From: mwh at python.net (Michael Hudson) Date: Mon Jan 17 15:06:41 2005 Subject: [Python-Dev] Exceptions *must*? be old-style classes? In-Reply-To: (Guido van Rossum's message of "Sat, 15 Jan 2005 17:57:53 -0800") References: Message-ID: <2mmzv8cfps.fsf@starship.python.net> Guido van Rossum writes: > To be honest, I don't recall the exact reasons why this wasn't fixed > in 2.2; I believe it has something to do with the problem of > distinguishing between string and class exception, and between the > various forms of raise statements. A few months back I hacked an attempt to make all exceptions new-style. It's not especially hard, but it's tedious. There's lots of code (more than I expected, anyway) to change and my attempt ended up being pretty messy. 
I suspect allowing both old- and new-style classes would be no harder, but even more tedious and messy. It would still be worth doing, IMHO. Cheers, mwh -- If you are anal, and you love to be right all the time, C++ gives you a multitude of mostly untimportant details to fret about so you can feel good about yourself for getting them "right", while missing the big picture entirely -- from Twisted.Quotes From anthony at interlink.com.au Mon Jan 17 14:41:01 2005 From: anthony at interlink.com.au (Anthony Baxter) Date: Mon Jan 17 15:30:26 2005 Subject: [Python-Dev] 2.3.5 delayed til next week Message-ID: <200501180041.03195.anthony@interlink.com.au> As I'd kinda feared, my return to work has left me completely buried this week, and so I'm going to have to push 2.3.5 until next week. Thomas and Fred: does one of the days in the range 25-27 January suit you? The 26th is a public holiday here, and so that's the day that's most likely for me... Anthony -- Anthony Baxter It's never too late to have a happy childhood. From gvanrossum at gmail.com Mon Jan 17 16:27:33 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon Jan 17 16:27:36 2005 Subject: [Python-Dev] Exceptions *must*? be old-style classes? In-Reply-To: <20050117105219.GA12763@vicky.ecs.soton.ac.uk> References: <20050117105219.GA12763@vicky.ecs.soton.ac.uk> Message-ID: [Armin] > For reference, PyPy doesn't have old-style classes at all so far, so we had to > come up with something about exceptions. After some feedback from python-dev > it appears that the following scheme works reasonably well. Actually it's > surprizing how little problems we actually encountered by removing the > old-/new-style distinction (particularly when compared with the extremely > obscure workarounds we had to go through in PyPy itself, e.g. precisely > because we wanted exceptions that are member of some (new-style) class > hierarchy). 
> > Because a bit of Python code tells more than long and verbose explanations, > here it is: > > def app_normalize_exception(etype, value, tb): [...] > elif type(etype) is str: > # XXX warn -- deprecated > if value is not None and type(value) is not str: > raise TypeError("string exceptions can only have a string value") That is stricter than classic Python though -- it allows the value to be anything (and you get the value back unadorned in the except 's', x: clause). [Michael] > It would still be worth doing, IMHO. Then let's do it. Care to resurrect your patch? (And yes, classic classes should also be allowed for b/w compatibility.) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Mon Jan 17 16:39:30 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon Jan 17 16:38:10 2005 Subject: [Python-Dev] Getting rid of unbound methods: patch available In-Reply-To: Message-ID: <5.1.1.6.0.20050117103315.02f70a10@mail.telecommunity.com> At 10:12 PM 1/16/05 -0800, Guido van Rossum wrote: >I couldn't remove support for unbound methods >completely, since they were used by the built-in >exceptions. (We can get rid of that use once we convert >to new-style exceptions.) Will it still be possible to create an unbound method with new.instancemethod? (I know the patch doesn't change this, I mean, is it planned to remove the facility from the instancemethod type?) From gvanrossum at gmail.com Mon Jan 17 16:43:18 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon Jan 17 16:43:21 2005 Subject: [Python-Dev] Getting rid of unbound methods: patch available In-Reply-To: <1105945019.30052.26.camel@localhost> References: <1105945019.30052.26.camel@localhost> Message-ID: [Guido] > > def test_im_class(): > > class C: > > def foo(self): pass > > - verify(C.foo.im_class is C) [Glyph] > ^ Without this, as JP Calderone pointed out earlier, you can't serialize > unbound methods. 
I wouldn't mind that so much, but you can't tell that > they're any different from regular functions until you're > *de*-serializing them. Note that you can't pickle unbound methods anyway unless you write specific suppport code to do that; it's not supported by pickle itself. I think that use case is weak. If you really have the need to pickle an individual unbound method, it's less work to create a global helper function and pickle that, than to write the additional pickling support for picking unbound methods. > In general I like the patch, but what is the rationale for removing > im_class from functions defined within classes? The information isn't easily available to the function. I could go around and change the parser to make this info available, but that would require changes in many places currently untouched by the patch. [Nick] > I'm a little concerned about the modification to pyclbr_input.py, though (since > it presumably worked before the patch). Was the input file tweaked before or > after the test itself was fixed? (I'll probably get around to trying out the > patch myself, but that will be on Linux as well, so I doubt my results will > differ from yours). It is just a work-around for stupidity in the test code, which tries to filter out cases like "om = Other.om" because the pyclbr code doesn't consider these. pyclbr.py hasn't changed, and still doesn't consider these (since it parses the source code); but the clever test in the test code no longer works. > The other question is the pickling example - an unbound method currently stores > meaningful data in im_class, whereas a standard function doesn't have that > association. Any code which makes use of im_class on unbound methods (even > without involving pickling)is going to have trouble with the change. (Someone > else will need to provide a real-life use case though, since I certainly don't > have one). 
Apart from the tests that were testing the behavior of im_class, I found only a single piece of code in the standard library that used im_class of an unbound method object (the clever test in the pyclbr test). Uses of im_self and im_func were more widespread. Given the level of cleverness in the pyclbr test (and the fact that I wrote it myself) I'm not worried about widespread use of im_class on unbound methods. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Mon Jan 17 16:45:26 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon Jan 17 16:44:02 2005 Subject: [Python-Dev] PEP 246: let's reset In-Reply-To: <1105944547.30052.21.camel@localhost> References: <5.1.1.6.0.20050116124501.0349f7c0@mail.telecommunity.com> <82543A8B-6749-11D9-B46A-0003934AD54A@chello.se> <5.1.1.6.0.20050115210716.034cb030@mail.telecommunity.com> <5.1.1.6.0.20050116124501.0349f7c0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050117104114.02f79060@mail.telecommunity.com> At 01:49 AM 1/17/05 -0500, Glyph Lefkowitz wrote: >On Sun, 2005-01-16 at 13:00 -0500, Phillip J. Eby wrote: > > > """One type is the "extender", ... > > > By contrast, an "independent adapter" ... > >I really like the way this part of the PEP is sounding, since it really >captures two almost, but not quite, completely different use-cases, the >confusion between which generated all the discussion here in the first >place. The terminology seems a bit cumbersome though. > >I'd like to propose that an "extender" be called a "transformer", since >it provides a transformation for an underlying object - it changes the >shape of the underlying object so it will fit somewhere else, without >creating a new object. Similarly, the cumbersome "independent adapter" >might be called a "converter", since it converts A into B, where B is >some new kind of thing. Heh. As long as you're going to continue the electrical metaphor, why not just call them transformers and appliances? 
Appliances "convert" electricity into useful non-electricity things, and it's obvious that you can have more than one, they're independent objects, etc. Whereas a transformer or converter would be something you use in order to be able to change the electricity itself. Calling views and iterators "appliances" might be a little weird at first, but it fits. (At one point, I thought about calling them "accessories".) >If nobody likes this idea, it would seem a bit more symmetric to have >"dependent" and "independent" adapters, rather than "extenders" and >"independent adapters". As it is I'm left wondering what the concept of >dependency in an adapter is. It's that independent adapters each have state independent from other independent adapters of the same type for the same object. (vs. extenders having shared state amongst themselves, even if you have more than one) From gvanrossum at gmail.com Mon Jan 17 16:45:52 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon Jan 17 16:45:55 2005 Subject: [Python-Dev] Getting rid of unbound methods: patch available In-Reply-To: <5.1.1.6.0.20050117103315.02f70a10@mail.telecommunity.com> References: <5.1.1.6.0.20050117103315.02f70a10@mail.telecommunity.com> Message-ID: > Will it still be possible to create an unbound method with > new.instancemethod? (I know the patch doesn't change this, I mean, is it > planned to remove the facility from the instancemethod type?) I was hoping to be able to get rid of this as soon as the built-in exceptions code no longer depends on it. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From theller at python.net Mon Jan 17 16:50:03 2005 From: theller at python.net (Thomas Heller) Date: Mon Jan 17 16:48:34 2005 Subject: [Python-Dev] Re: 2.3.5 delayed til next week In-Reply-To: <200501180041.03195.anthony@interlink.com.au> (Anthony Baxter's message of "Tue, 18 Jan 2005 00:41:01 +1100") References: <200501180041.03195.anthony@interlink.com.au> Message-ID: Anthony Baxter writes: > As I'd kinda feared, my return to work has left me completely > buried this week, and so I'm going to have to push 2.3.5 until > next week. Thomas and Fred: does one of the days in the > range 25-27 January suit you? The 26th is a public holiday here, > and so that's the day that's most likely for me... > 25-27 January are all ok for me. Will there be a lot of backports, or are they already in place? If they are already there, I can build the installer as soon as Fred has built the html docs. Thomas From arigo at tunes.org Mon Jan 17 16:49:07 2005 From: arigo at tunes.org (Armin Rigo) Date: Mon Jan 17 17:00:48 2005 Subject: [Python-Dev] Exceptions *must*? be old-style classes? In-Reply-To: References: <20050117105219.GA12763@vicky.ecs.soton.ac.uk> Message-ID: <20050117154907.GA7853@vicky.ecs.soton.ac.uk> Hi Guido, On Mon, Jan 17, 2005 at 07:27:33AM -0800, Guido van Rossum wrote: > That is stricter than classic Python though -- it allows the value to > be anything (and you get the value back unadorned in the except 's', > x: clause). Thanks for the note ! Armin From gjc at inescporto.pt Mon Jan 17 16:05:08 2005 From: gjc at inescporto.pt (Gustavo J. A. M. 
Carneiro) Date: Mon Jan 17 17:02:04 2005 Subject: [Python-Dev] a bunch of Patch reviews In-Reply-To: <41EA9196.1020709@xs4all.nl> References: <41EA9196.1020709@xs4all.nl> Message-ID: <1105974308.17513.1.camel@localhost> If someone could take a look at: [ 1069624 ] incomplete support for AF_PACKET in socketmodule.c I have to ship my own patched copy of the socket module because of this... :| On Sun, 2005-01-16 at 17:08 +0100, Irmen de Jong wrote: > Hello > I've looked at one bug and a bunch of patches and > added a comment to them: > > (bug) [ 1102649 ] pickle files should be opened in binary mode > Added a comment about a possible different solution > > [ 946207 ] Non-blocking Socket Server > Useless, what are the mixins for? Recommend close > > [ 756021 ] Allow socket.inet_aton('255.255.255.255') on Windows > Looks good but added suggestion about when to test for special case > > [ 740827 ] add urldecode() method to urllib > I think it's better to group these things into urlparse > > [ 579435 ] Shadow Password Support Module > Would be nice to have, I recently just couldn't do the user > authentication that I wanted: based on the users' unix passwords > > [ 1093468 ] socket leak in SocketServer > Trivial and looks harmless, but don't the sockets > get garbage collected once the request is done? > > [ 1049151 ] adding bool support to xdrlib.py > Simple patch and 2.4 is out now, so... > > > > It would be nice if somebody could have a look at my > own patches or help me a bit with them: > > [ 1102879 ] Fix for 926423: socket timeouts + Ctrl-C don't play nice > [ 1103213 ] Adding the missing socket.recvall() method > [ 1103350 ] send/recv SEGMENT_SIZE should be used more in socketmodule > [ 1062014 ] fix for 764437 AF_UNIX socket special linux socket names > [ 1062060 ] fix for 1016880 urllib.urlretrieve silently truncates dwnld > > Some of them come from the last Python Bug Day, see > http://www.python.org/moin/PythonBugDayStatus > > > Thank you ! 
> > Regards, > > --Irmen de Jong > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/gjc%40inescporto.pt -- Gustavo J. A. M. Carneiro The universe is always one step beyond logic. -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/x-pkcs7-signature Size: 3086 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20050117/8fb01c9b/smime-0001.bin From just at letterror.com Mon Jan 17 17:06:04 2005 From: just at letterror.com (Just van Rossum) Date: Mon Jan 17 17:06:12 2005 Subject: [Python-Dev] PEP 246: let's reset In-Reply-To: <5.1.1.6.0.20050117104114.02f79060@mail.telecommunity.com> Message-ID: Phillip J. Eby wrote: > Heh. As long as you're going to continue the electrical metaphor, > why not just call them transformers and appliances? [ ... ] Next we'll see Appliance-Oriented Programming ;-) Just From mwh at python.net Mon Jan 17 17:06:40 2005 From: mwh at python.net (Michael Hudson) Date: Mon Jan 17 17:06:44 2005 Subject: [Python-Dev] Exceptions *must*? be old-style classes? In-Reply-To: (Guido van Rossum's message of "Mon, 17 Jan 2005 07:27:33 -0800") References: <20050117105219.GA12763@vicky.ecs.soton.ac.uk> Message-ID: <2mbrboca5r.fsf@starship.python.net> Guido van Rossum writes: > [Michael] >> It would still be worth doing, IMHO. > > Then let's do it. Care to resurrect your patch? (And yes, classic > classes should also be allowed for b/w compatibility.) I found it and uploaded it here: http://starship.python.net/crew/mwh/new-style-exception-hacking.diff The change to type_str was the sort of unexpected change I was talking about. 
TBH, I'm not sure it's really worth working from my patch, a more sensible course would be to just do the work again, but paying a bit more attention to getting a maintainable result. Questions: a) Is Exception to be new-style? b) Somewhat but not entirely independently, would demanding that all new-style exceptions inherit from Exception be reasonable? Cheers, mwh -- ZAPHOD: You know what I'm thinking? FORD: No. ZAPHOD: Neither do I. Frightening isn't it? -- The Hitch-Hikers Guide to the Galaxy, Episode 11 From fdrake at acm.org Mon Jan 17 17:09:57 2005 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Mon Jan 17 17:10:04 2005 Subject: [Python-Dev] Re: 2.3.5 delayed til next week In-Reply-To: <200501180041.03195.anthony@interlink.com.au> References: <200501180041.03195.anthony@interlink.com.au> Message-ID: <200501171109.04797.fdrake@acm.org> On Monday 17 January 2005 08:41, Anthony Baxter wrote: > As I'd kinda feared, my return to work has left me completely > buried this week, and so I'm going to have to push 2.3.5 until > next week. Thomas and Fred: does one of the days in the > range 25-27 January suit you? The 26th is a public holiday here, > and so that's the day that's most likely for me... Sounds good to me. Anything in that range is equally doable. -Fred -- Fred L. Drake, Jr. From pje at telecommunity.com Mon Jan 17 17:35:53 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon Jan 17 17:34:31 2005 Subject: [Python-Dev] Exceptions *must*? be old-style classes? In-Reply-To: <2mbrboca5r.fsf@starship.python.net> References: <20050117105219.GA12763@vicky.ecs.soton.ac.uk> Message-ID: <5.1.1.6.0.20050117113419.03972d20@mail.telecommunity.com> At 04:06 PM 1/17/05 +0000, Michael Hudson wrote: >a) Is Exception to be new-style? Probably not in 2.5; Martin and others have suggested that this could introduce instability for users' existing exception classes. 
>b) Somewhat but not entirely independently, would demanding that all > new-style exceptions inherit from Exception be reasonable? Yes. Right now you can't have a new-style exception at all, so it would be quite reasonable to require new ones to inherit from Exception. From gvanrossum at gmail.com Mon Jan 17 19:16:45 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon Jan 17 19:16:49 2005 Subject: [Python-Dev] Exceptions *must*? be old-style classes? In-Reply-To: <5.1.1.6.0.20050117113419.03972d20@mail.telecommunity.com> References: <20050117105219.GA12763@vicky.ecs.soton.ac.uk> <2mbrboca5r.fsf@starship.python.net> <5.1.1.6.0.20050117113419.03972d20@mail.telecommunity.com> Message-ID: On Mon, 17 Jan 2005 11:35:53 -0500, Phillip J. Eby wrote: > At 04:06 PM 1/17/05 +0000, Michael Hudson wrote: > >a) Is Exception to be new-style? > > Probably not in 2.5; Martin and others have suggested that this could > introduce instability for users' existing exception classes. Really? I thought that was eventually decided to be a very small amount of code. > >b) Somewhat but not entirely independently, would demanding that all > > new-style exceptions inherit from Exception be reasonable? > > Yes. Right now you can't have a new-style exception at all, so it would be > quite reasonable to require new ones to inherit from Exception. That would be much more reasonable if Exception itself was a new-style class. As long as it isn't, you'd have to declare new-style classes like this: class MyError(Exception, object): ... which is ugly. 
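The end state being argued for here is how Python eventually shipped it: Exception became new-style, and anything raised must derive from BaseException, so the extra `object` base became unnecessary. A minimal sketch of that behavior (Python 3 semantics, not any 2.x patch):

```python
class MyError(Exception):
    # With Exception itself new-style, no explicit 'object' base is needed.
    pass

try:
    raise MyError("boom")
except MyError as exc:
    caught = str(exc)

class NotAnException(object):
    pass

rejected = False
try:
    raise NotAnException()  # not derived from BaseException
except TypeError:
    # Raising a class outside the exception hierarchy is rejected.
    rejected = True

print(caught, rejected)
```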
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Mon Jan 17 19:21:13 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon Jan 17 19:21:17 2005 Subject: [Python-Dev] PEP 246: let's reset In-Reply-To: <5.1.1.6.0.20050117104114.02f79060@mail.telecommunity.com> References: <82543A8B-6749-11D9-B46A-0003934AD54A@chello.se> <5.1.1.6.0.20050115210716.034cb030@mail.telecommunity.com> <5.1.1.6.0.20050116124501.0349f7c0@mail.telecommunity.com> <1105944547.30052.21.camel@localhost> <5.1.1.6.0.20050117104114.02f79060@mail.telecommunity.com> Message-ID: > Heh. As long as you're going to continue the electrical metaphor, why not > just call them transformers and appliances? Please don't. Transformer is commonly used in all sorts of contexts. But appliances applies mostly to kitchenware and the occasional marketing term for cheap computers. The electrical metaphor is cute, but doesn't cut it IMO. Adapter, converter and transformer all sound to me like they imply an "as a" relationship rather than "has a". The "has a" kind feels more like a power tool to me. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From kbk at shore.net Mon Jan 17 19:32:23 2005 From: kbk at shore.net (Kurt B. Kaiser) Date: Mon Jan 17 19:32:44 2005 Subject: [Python-Dev] Re: 2.3.5 delayed til next week In-Reply-To: (Thomas Heller's message of "Mon, 17 Jan 2005 16:50:03 +0100") References: <200501180041.03195.anthony@interlink.com.au> Message-ID: <87d5w3vrd4.fsf@hydra.bayview.thirdcreek.com> Thomas Heller writes: > 25-27 January are all ok for me. Will there be a lot of backports, or > are they already in place? If they are already there, I can build the > installer as soon as Fred has built the html docs. I've got a couple, I'll get them in by tomorrow. -- KBK From pje at telecommunity.com Mon Jan 17 19:34:30 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon Jan 17 19:33:08 2005 Subject: [Python-Dev] Exceptions *must*? 
be old-style classes? In-Reply-To: References: <5.1.1.6.0.20050117113419.03972d20@mail.telecommunity.com> <20050117105219.GA12763@vicky.ecs.soton.ac.uk> <2mbrboca5r.fsf@starship.python.net> <5.1.1.6.0.20050117113419.03972d20@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050117133028.030bf570@mail.telecommunity.com> At 10:16 AM 1/17/05 -0800, Guido van Rossum wrote: >On Mon, 17 Jan 2005 11:35:53 -0500, Phillip J. Eby > wrote: > > At 04:06 PM 1/17/05 +0000, Michael Hudson wrote: > > >a) Is Exception to be new-style? > > > > Probably not in 2.5; Martin and others have suggested that this could > > introduce instability for users' existing exception classes. > >Really? I thought that was eventually decided to be a very small amount of >code. Guess I missed that part of the thread in the ongoing flood of PEP 246 stuff. :) >That would be much more reasonable if Exception itself was a new-style >class. As long as it isn't, you'd have to declare new-style classes >like this: > >class MyError(Exception, object): > ... > >which is ugly. I was thinking the use case was that you were having to add 'Exception', not that you were adding 'object'. The two times in the past that I wanted to make a new-style class an exception, I *first* made it a new-style class, and *then* tried to make it an exception. I believe the OP on this thread described the same thing. But whatever; as long as it's *possible*, I don't care much how it's done, and I can't think of anything in my code that would break from making Exception new-style. From pje at telecommunity.com Mon Jan 17 19:42:54 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Mon Jan 17 19:41:31 2005 Subject: [Python-Dev] PEP 246: let's reset In-Reply-To: References: <5.1.1.6.0.20050117104114.02f79060@mail.telecommunity.com> <82543A8B-6749-11D9-B46A-0003934AD54A@chello.se> <5.1.1.6.0.20050115210716.034cb030@mail.telecommunity.com> <5.1.1.6.0.20050116124501.0349f7c0@mail.telecommunity.com> <1105944547.30052.21.camel@localhost> <5.1.1.6.0.20050117104114.02f79060@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050117133747.02c66ec0@mail.telecommunity.com> At 10:21 AM 1/17/05 -0800, Guido van Rossum wrote: > > Heh. As long as you're going to continue the electrical metaphor, why not > > just call them transformers and appliances? > >Please don't. Transformer is commonly used in all sorts of contexts. >But appliances applies mostly to kitchenware and the occasional >marketing term for cheap computers. > >The electrical metaphor is cute, but doesn't cut it IMO. Adapter, >converter and transformer all sound to me like they imply an "as a" >relationship rather than "has a". The "has a" kind feels more like a >power tool to me. By the way, another use case for type declarations supporting dynamic "as-a" adapters... Chandler's data model has a notion of "kinds" that a single object can be, like Email, Appointment, etc. A single object can be of multiple kinds, sort of like per-instance multiple-inheritance. Which means that passing the same object to routines taking different types would "do the right thing" with such an object if they adapted to the desired kind, and if such adaptation removed the existing kind-adapter and replaced it with the destination kind-adapter. So, there's an underlying object that just represents the identity, and then everything else is "as-a" adaptation. From gvanrossum at gmail.com Mon Jan 17 19:44:54 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon Jan 17 19:44:57 2005 Subject: [Python-Dev] Exceptions *must*? be old-style classes? 
In-Reply-To: <5.1.1.6.0.20050117133028.030bf570@mail.telecommunity.com> References: <20050117105219.GA12763@vicky.ecs.soton.ac.uk> <2mbrboca5r.fsf@starship.python.net> <5.1.1.6.0.20050117113419.03972d20@mail.telecommunity.com> <5.1.1.6.0.20050117133028.030bf570@mail.telecommunity.com> Message-ID: > >That would be much more reasonable if Exception itself was a new-style > >class. As long as it isn't, you'd have to declare new-style classes > >like this: > > > >class MyError(Exception, object): > > ... > > > >which is ugly. > > I was thinking the use case was that you were having to add 'Exception', > not that you were adding 'object'. The two times in the past that I wanted > to make a new-style class an exception, I *first* made it a new-style > class, and *then* tried to make it an exception. I believe the OP on this > thread described the same thing. > > But whatever; as long as it's *possible*, I don't care much how it's done, > and I can't think of anything in my code that would break from making > Exception new-style. Well, right now you would only want to make an exception a new style class if you had a very specific use case for wanting the new style class. But once we allow new-style exceptions *and* require them to inherit from Exception, we pretty much send the message "if you're not using new-style exceptions derived from Exception your code is out of date" and that means it should be as simple as possible to make code conform. And that means IMO making Exception a new style class. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Mon Jan 17 23:12:25 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon Jan 17 23:12:26 2005 Subject: [Python-Dev] a bunch of Patch reviews In-Reply-To: <1105974308.17513.1.camel@localhost> References: <41EA9196.1020709@xs4all.nl> <1105974308.17513.1.camel@localhost> Message-ID: <41EC3849.8040503@v.loewis.de> Gustavo J. A. M. 
Carneiro wrote: > If someone could take a look at: > > [ 1069624 ] incomplete support for AF_PACKET in socketmodule.c The rule applies: five reviews, with results posted to python-dev, and I will review your patch. Regards, Martin From martin at v.loewis.de Mon Jan 17 23:14:54 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon Jan 17 23:14:55 2005 Subject: [Python-Dev] Exceptions *must*? be old-style classes? In-Reply-To: References: <20050117105219.GA12763@vicky.ecs.soton.ac.uk> <2mbrboca5r.fsf@starship.python.net> <5.1.1.6.0.20050117113419.03972d20@mail.telecommunity.com> Message-ID: <41EC38DE.8080603@v.loewis.de> Guido van Rossum wrote: >>>a) Is Exception to be new-style? >> >>Probably not in 2.5; Martin and others have suggested that this could >>introduce instability for users' existing exception classes. > > > Really? I thought that was eventually decided to be a very small amount of code. I still think that only an experiment could decide: somebody should come up with a patch that does that, and we will see what breaks. I still have the *feeling* that this has significant impact, but I could not pin-point this to any specific problem I anticipate. Regards, Martin From gjc at inescporto.pt Mon Jan 17 23:27:52 2005 From: gjc at inescporto.pt (Gustavo J. A. M. Carneiro) Date: Mon Jan 17 23:28:02 2005 Subject: [Python-Dev] a bunch of Patch reviews In-Reply-To: <41EC3849.8040503@v.loewis.de> References: <41EA9196.1020709@xs4all.nl> <1105974308.17513.1.camel@localhost> <41EC3849.8040503@v.loewis.de> Message-ID: <1106000872.5931.6.camel@emperor> On Mon, 2005-01-17 at 23:12 +0100, "Martin v. L?wis" wrote: > Gustavo J. A. M. Carneiro wrote: > > If someone could take a look at: > > > > [ 1069624 ] incomplete support for AF_PACKET in socketmodule.c > > > The rule applies: five reviews, with results posted to python-dev, > and I will review your patch. Oh... sorry, I didn't know about any rules. /me hides in shame. -- Gustavo J. A. 
M. Carneiro The universe is always one step beyond logic From martin at v.loewis.de Mon Jan 17 23:46:44 2005 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon Jan 17 23:46:47 2005 Subject: [Python-Dev] a bunch of Patch reviews In-Reply-To: <1106000872.5931.6.camel@emperor> References: <41EA9196.1020709@xs4all.nl> <1105974308.17513.1.camel@localhost> <41EC3849.8040503@v.loewis.de> <1106000872.5931.6.camel@emperor> Message-ID: <41EC4054.6000908@v.loewis.de> Gustavo J. A. M. Carneiro wrote: > Oh... sorry, I didn't know about any rules. My apologies - I had announced this (personal) rule a few times, so I thought everybody on python-dev knew. If you really want to push a patch, you can do so by doing your own share of work, namely by reviewing others' patches. If you don't, someone will apply your patch when he finds the time to do so. So if you can wait, it might be best to wait a few months (this won't go into 2.4 patch releases, anyway). I think Brett Cannon now also follows this rule; in practice it rarely comes into play, because (almost) nobody wants his patch in badly enough to put in the work of reviewing other patches. Regards, Martin From mal at egenix.com Mon Jan 17 23:58:34 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Mon Jan 17 23:58:37 2005 Subject: [Python-Dev] Getting rid of unbound methods: patch available In-Reply-To: References: <1105945019.30052.26.camel@localhost> Message-ID: <41EC431A.90204@egenix.com> Guido van Rossum wrote: > Apart from the tests that were testing the behavior of im_class, I > found only a single piece of code in the standard library that used > im_class of an unbound method object (the clever test in the pyclbr > test). Uses of im_self and im_func were more widespread. Given the > level of cleverness in the pyclbr test (and the fact that I wrote it > myself) I'm not worried about widespread use of im_class on unbound > methods.
I guess this depends on how you define widespread use. I'm using this feature a lot via the basemethod() function in mxTools for calling the base method of an overridden method in mixin classes (basemethod() predates super() and unlike the latter works for old-style classes). What I don't understand in your reasoning is that you are talking about making an unbound method look more like a function. Unbound methods and bound methods are objects of the same type - the method object. By turning an unbound method into a function type, you break code that tests for MethodType in Python or does a PyMethod_Check() at C level. If you want to make methods look more like functions, the method object should become a subclass of the function object (function + added im_* attributes). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 10 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From glyph at divmod.com Tue Jan 18 00:33:34 2005 From: glyph at divmod.com (Glyph Lefkowitz) Date: Tue Jan 18 00:28:49 2005 Subject: [Python-Dev] Getting rid of unbound methods: patch available In-Reply-To: References: <1105945019.30052.26.camel@localhost> Message-ID: <1106004814.30052.80.camel@localhost> On Mon, 2005-01-17 at 07:43 -0800, Guido van Rossum wrote: > Note that you can't pickle unbound methods anyway unless you write > specific suppport code to do that; it's not supported by pickle > itself. It's supported by Twisted. Alternatively, replace "pickle" with "python object serializer of my design" - I am concerned both about useful information being removed, and about specific features of Pickle. 
Twisted's .tap files have always pushed the limits of pickle. I don't remember why my users wanted this specific feature - the code in question is almost 3 years old - but when you think of a pickle as a self-contained universe of running Python objects, any plausible reason why one might want a reference to an unbound method in code becomes a reason to want to serialize one. The only time I've used it myself was to pickle attributes of interfaces, which I no longer need to do since zope.interface has its own object types for that, so it's not really _that_ important to me. On the other hand, if PJE's "monkey typing" PEP is accepted, there will probably be lots more reasons to serialize unbound methods, for descriptive purposes. > I think that use case is weak. It's not the strongest use-case in the world, but is the impetus to remove unbound method objects from Python that much stronger? I like the fact that it's simpler, but it's a small amount of extra simplicity, it doesn't seem to enable any new use-cases, and it breaks the potential for serialization. In general, Pickle handles other esoteric, uncommon use-cases pretty well: >>> x = [] >>> y = (x,) >>> x.append(y) >>> import cPickle >>> cPickle.dumps(x) '(lp1\n(g1\ntp2\na.' >>> x [([...],)] since when you need 'em, you really need 'em. Method objects were previously unsupported, which is fine because they're pretty uncommon. Not only would this patch continue to not support them, though, it makes the problem impossible to fix in 3rd-party code. By removing the unbound method type, it becomes an issue that has to be fixed in the standard library, assuming that 3rd-party code will not be able to change the way functions are pickled and unpickled in cPickle in python2.5. Ironically, I think that this use case is also going to become more common if the patch goes in, because then it is going to be possible to "borrow" functionality without going around a method's back to grab its im_func.
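For bound methods, at least, third-party code can already teach the pickler what to do by registering a reducer. A sketch in modern spelling (Python 3 names `copyreg`, `__self__`, `__func__`; `Service` is a hypothetical class, and this is not Twisted's actual mechanism):

```python
import copyreg
import pickle
import types

def _reduce_method(m):
    # Serialize a bound method as (owner object, method name);
    # getattr re-binds it at load time.
    return getattr, (m.__self__, m.__func__.__name__)

copyreg.pickle(types.MethodType, _reduce_method)

class Service:
    def ping(self):
        return "pong"

svc = Service()
callback = pickle.loads(pickle.dumps(svc.ping))
print(callback())  # -> pong
```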
> If you really have the need to pickle an individual unbound method, > it's less work to create a global helper function and pickle that, > than to write the additional pickling support for picking unbound > methods. This isn't true if you've already got the code written, which I do ;-). From glyph at divmod.com Tue Jan 18 00:35:23 2005 From: glyph at divmod.com (Glyph Lefkowitz) Date: Tue Jan 18 00:30:36 2005 Subject: [Python-Dev] Getting rid of unbound methods: patch available In-Reply-To: <41EC431A.90204@egenix.com> References: <1105945019.30052.26.camel@localhost> <41EC431A.90204@egenix.com> Message-ID: <1106004923.30052.82.camel@localhost> On Mon, 2005-01-17 at 23:58 +0100, M.-A. Lemburg wrote: > If you want to make methods look more like functions, > the method object should become a subclass of the function > object (function + added im_* attributes). I think this suggestion would fix my serialization problem as well... but does it actually buy enough extra simplicity to make it worthwhile? From barry at python.org Tue Jan 18 01:29:35 2005 From: barry at python.org (Barry Warsaw) Date: Tue Jan 18 01:29:42 2005 Subject: [Python-Dev] Getting rid of unbound methods: patch available In-Reply-To: <41EC431A.90204@egenix.com> References: <1105945019.30052.26.camel@localhost> <41EC431A.90204@egenix.com> Message-ID: <1106008175.20172.115.camel@geddy.wooz.org> On Mon, 2005-01-17 at 17:58, M.-A. Lemburg wrote: > If you want to make methods look more like functions, > the method object should become a subclass of the function > object (function + added im_* attributes). I have no personal use cases, but it does make me vaguely uncomfortable to lose im_class. Isn't it possible to preserve this attribute? -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050117/ac818a02/attachment.pgp From bob at redivi.com Tue Jan 18 02:07:13 2005 From: bob at redivi.com (Bob Ippolito) Date: Tue Jan 18 02:07:18 2005 Subject: [Python-Dev] Getting rid of unbound methods: patch available In-Reply-To: <1106004814.30052.80.camel@localhost> References: <1105945019.30052.26.camel@localhost> <1106004814.30052.80.camel@localhost> Message-ID: <49A0D307-68ED-11D9-A13E-000A95BA5446@redivi.com> On Jan 17, 2005, at 18:33, Glyph Lefkowitz wrote: > It's not the strongest use-case in the world, but is the impetus to > remove unbound method objects from Python that much stronger? I like > the fact that it's simpler, but it's a small amount of extra > simplicity, > it doesn't seem to enable any new use-cases, and it breaks the > potential > for serialization. Well, it lets you meaningfully do: class Foo: def someMethod(self): pass class Bar: someMethod = Foo.someMethod Where now you have to do: class Bar: someMethod = Foo.someMethod.im_func I'm not sure how useful this actually is, though. -bob From gvanrossum at gmail.com Tue Jan 18 06:15:42 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Jan 18 06:15:48 2005 Subject: [Python-Dev] Getting rid of unbound methods: patch available In-Reply-To: <41EC431A.90204@egenix.com> References: <1105945019.30052.26.camel@localhost> <41EC431A.90204@egenix.com> Message-ID: [Guido] > > Apart from the tests that were testing the behavior of im_class, I > > found only a single piece of code in the standard library that used > > im_class of an unbound method object (the clever test in the pyclbr > > test). Uses of im_self and im_func were more widespread. Given the > > level of cleverness in the pyclbr test (and the fact that I wrote it > > myself) I'm not worried about widespread use of im_class on unbound > > methods. 
[Marc-Andre] > I guess this depends on how you define widespread use. I'm using > this feature a lot via the basemethod() function in mxTools for > calling the base method of an overridden method in mixin classes > (basemethod() predates super() and unlike the latter works for > old-style classes). I'm not sure I understand how basemethod is supposed to work; I can't find docs for it using Google (only three hits for the query mxTools basemethod). How does it depend on im_class? > What I don't understand in your reasoning is that you are talking > about making an unbound method look more like a function. That's a strange interpretation. I'm getting rid of the unbound method object altogether. > Unbound > methods and bound methods are objects of the same type - > the method object. Yeah I know that. :-) And it is one of the problems -- the two uses are quite distinct and yet it's the same object, which is confusing. > By turning an unbound method into a function > type, you break code that tests for MethodType in Python > or does a PyMethod_Check() at C level. My expectation is that there is very little code like that. Almost all the code that I found doing that in the core Python code (none in C BTW) was in the test suite. > If you want to make methods look more like functions, > the method object should become a subclass of the function > object (function + added im_* attributes). Can't do that, since the (un)bound method object supports binding other callables besides functions. [Glyph] > On the other hand, if PJE's "monkey typing" PEP is accepted, there will > probably be lots more reasons to serialize unbound methods, for > descriptive purposes. Let's cross that bridge when we get to it. > It's not the strongest use-case in the world, but is the impetus to > remove unbound method objects from Python that much stronger? Perhaps not, but we have to strive for simplicity whenever we can, to counteract the inevitable growth in complexity of the language elsewhere. 
> I like > the fact that it's simpler, but it's a small amount of extra simplicity, > it doesn't seem to enable any new use-cases, I think it does. You will be able to get a method out of a class and put it into another unrelated class. Previously, you would have to use __dict__ or im_func to do that. Also, I've always liked the explanation of method calls that C().f() is the same as C.f(C()) and to illustrate this it would be nice to say "look, C.f is just a function". > and it breaks the potential for serialization. For which you seem to have no use yourself. The fact that you support it doesn't prove that it's used -- large software packages tend to accrete lots of unused features over time, because it's safer to keep it in than to remove it. This is a trend I'd like to buck with Python. There's lots of dead code in Python's own standard library, and one day it will bite the dust. [Barry] > I have no personal use cases, but it does make me vaguely uncomfortable > to lose im_class. Isn't it possible to preserve this attribute? That vague uncomfort is called FUD until proven otherwise. :-) Keeping im_class would be tricky -- the information isn't easily available when the function is defined, and adding it would require changing unrelated code that the patch so far didn't have to get near. Also, it would not be compatible -- the unbound method sets im_class to whichever class was used to retrieve the attribute, not the class in which the function was defined. -- --Guido van Rossum (home page: http://www.python.org/~guido/)> From gvanrossum at gmail.com Tue Jan 18 06:17:44 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Jan 18 06:17:48 2005 Subject: [Python-Dev] Exceptions *must*? be old-style classes? 
In-Reply-To: <41EC38DE.8080603@v.loewis.de> References: <20050117105219.GA12763@vicky.ecs.soton.ac.uk> <2mbrboca5r.fsf@starship.python.net> <5.1.1.6.0.20050117113419.03972d20@mail.telecommunity.com> <41EC38DE.8080603@v.loewis.de> Message-ID: > I still think that only an experiment could decide: somebody should > come up with a patch that does that, and we will see what breaks. > > I still have the *feeling* that this has significant impact, but > I could not pin-point this to any specific problem I anticipate. This sounds like a good approach. We should do this now in 2.5, and as alpha and beta testing progresses we can decide whether to roll it back of what kind of backwards compatibility to provide. (Most exceptions are very short classes with very limited behavior, so I expect that in the large majority of cases it won't matter. The question is of course how small the remaining minority is.) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tdelaney at avaya.com Tue Jan 18 06:56:11 2005 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Tue Jan 18 06:56:17 2005 Subject: [Python-Dev] Getting rid of unbound methods: patch available Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DE721231@au3010avexu1.global.avaya.com> Guido van Rossum wrote: > Keeping im_class would be tricky -- the information isn't easily > available when the function is defined, and adding it would require > changing unrelated code that the patch so far didn't have to get near. > Also, it would not be compatible -- the unbound method sets im_class > to whichever class was used to retrieve the attribute, not the class > in which the function was defined. I actually do have a use case for im_class, but not in its current incarnation. It would be useful if im_class was set (permanently) to the class in which the function was defined. My use case is my autosuper recipe. 
Currently I have to trawl through the MRO, comparing code objects to find out which class I'm currently in. Most annoyingly, I have to trawl *beyond* where I first find the function, in case it's actually come from a base class (otherwise infinite recursion can result ;) If im_class were set to the class where the function was defined, I could definitely avoid the second part of the trawling (not sure about the first yet, since I need to get at the function object). Cheers. Tim Delaney From ncoghlan at iinet.net.au Tue Jan 18 09:37:05 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Tue Jan 18 09:37:11 2005 Subject: [Python-Dev] Getting rid of unbound methods: patch available In-Reply-To: <41EBAB67.2080907@egenix.com> References: <41EB8CF7.9010002@iinet.net.au> <41EBAB67.2080907@egenix.com> Message-ID: <41ECCAB1.5080303@iinet.net.au> M.-A. Lemburg wrote: > Nick Coghlan wrote: > >> Guido van Rossum wrote: >> >>> What do people think? (My main motivation for this, as stated before, >>> is that it adds complexity without much benefit.) >> >> >> >> I'm in favour, since it removes the "an unbound method is almost like >> a bare function, only not quite as useful" distinction. It would allow >> things like str.join(sep, seq) to work correctly for a Unicode separator. > > > This won't work. Strings and Unicode are two different types, > not subclasses of one another. My comment was based on misremembering how str.join actually works. It automatically flips to Unicode when it finds a Unicode string in the sequence - however, it doesn't do that for the separator, since that should already have been determined to be a string by the method lookup machinery. However, looking at the code for string_join suggests another possible issue with removing unbound methods. The function doesn't check the type of the first argument - it just assumes it is a PyStringObject.
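The guard at issue — rejecting a wrong-typed 'self' before any str-specific C code runs — can be observed from pure Python in modern CPython, where the method descriptor performs the check; a small sketch:

```python
# Calling join through the class: the first argument plays the role of 'self'.
joined = str.join("-", ["a", "b", "c"])  # 'a-b-c'

self_checked = False
try:
    str.join(123, ["a", "b"])
except TypeError:
    # The descriptor rejects a non-str 'self' instead of letting
    # str-only C macros touch an arbitrary object.
    self_checked = True

print(joined, self_checked)
```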
PyString_Join adds the typecheck that is normally performed by the method wrapper when str.join is invoked from Python. The issue is that, if the unbound method wrapper is removed, str.join(some_unicode_str, seq) will lead to PyString_AS_STRING being invoked on a PyUnicode object. Ditto for getting the arguments out of order, as in str.join(seq, separator). At the moment, the unbound method wrapper takes care of raising a TypeError in both of these cases. Without it, we get an unsafe PyString macro being applied to an arbitrary type. I wonder how many other C methods make the same assumption, and skip type checking on the 'self' argument? It certainly seems to be the standard approach in stringobject.c. Regards, Nick. -- Nick Coghlan | ncoghlan@email.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From mal at egenix.com Tue Jan 18 10:28:57 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Tue Jan 18 10:29:00 2005 Subject: [Python-Dev] Getting rid of unbound methods: patch available In-Reply-To: References: <1105945019.30052.26.camel@localhost> <41EC431A.90204@egenix.com> Message-ID: <41ECD6D9.9000001@egenix.com> Guido van Rossum wrote: > [Guido] > >>>Apart from the tests that were testing the behavior of im_class, I >>>found only a single piece of code in the standard library that used >>>im_class of an unbound method object (the clever test in the pyclbr >>>test). Uses of im_self and im_func were more widespread. Given the >>>level of cleverness in the pyclbr test (and the fact that I wrote it >>>myself) I'm not worried about widespread use of im_class on unbound >>>methods. > > > [Marc-Andre] > >>I guess this depends on how you define widespread use. I'm using >>this feature a lot via the basemethod() function in mxTools for >>calling the base method of an overridden method in mixin classes >>(basemethod() predates super() and unlike the latter works for >>old-style classes). 
> > [Marc-Andre] > >>I guess this depends on how you define widespread use. I'm using >>this feature a lot via the basemethod() function in mxTools for >>calling the base method of an overridden method in mixin classes >>(basemethod() predates super() and unlike the latter works for >>old-style classes). > > > I'm not sure I understand how basemethod is supposed to work; I can't > find docs for it using Google (only three hits for the query mxTools > basemethod). How does it depend on im_class? It uses im_class to find the class defining the (unbound) method: def basemethod(object,method=None): """ Return the unbound method that is defined *after* method in the inheritance order of object with the same name as method (usually called base method or overridden method). object can be an instance, class or bound method. method, if given, may be a bound or unbound method. If it is not given, object must be bound method. Note: Unbound methods must be called with an instance as first argument. The function uses a cache to speed up processing. Changes done to the class structure after the first hit will not be noticed by the function. """ ... This is how it is used in mixin classes to call the base method of the overridden method in the inheritance tree (of old-style classes): class RequestListboxMixin: def __init__(self,name,viewname,viewdb,context=None,use_clipboard=0, size=None,width=None,monospaced=1,events=None): # Call base method mx.Tools.basemethod(self, RequestListboxMixin.__init__)\ (self,name,size,width,monospaced,None,events) ... Without .im_class for the unbound method, basemethod would cease to work since it uses this attribute to figure out the class object defining the overriding method. I can send you the code if you don't have egenix-mx-base installed somewhere (it's in mx/Tools/Tools.py). >>What I don't understand in your reasoning is that you are talking >>about making an unbound method look more like a function. > > That's a strange interpretation. I'm getting rid of the unbound method > object altogether. Well, you do have to assign some other type to the object that is returned by "myClass.myMethod" and as I understood your proposal, the returned object should be of the FunctionType. So from an application point of view, you are changing the type of an object.
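For comparison, the same base-method lookup can be done without im_class by walking the MRO explicitly and searching past the defining class. A rough sketch of an equivalent helper (hypothetical code for new-style classes, not mx.Tools itself):

```python
def base_method(cls, name, defining_cls):
    # Search the MRO of cls *after* defining_cls for the
    # implementation of `name` that defining_cls overrides.
    mro = cls.__mro__
    for base in mro[mro.index(defining_cls) + 1:]:
        if name in vars(base):
            return vars(base)[name]
    raise AttributeError(name)

class A:
    def hello(self):
        return "A.hello"

class B(A):
    def hello(self):
        # Call the implementation that B.hello overrides, super()-style.
        return "B wraps " + base_method(type(self), "hello", B)(self)

print(B().hello())  # -> B wraps A.hello
```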
So from an application point of view, you are changing the type of an object. >>Unbound >>methods and bound methods are objects of the same type - >>the method object. > > Yeah I know that. :-) > > And it is one of the problems -- the two uses are quite distinct and > yet it's the same object, which is confusing. Hmm, I have a hard time seeing how you can get rid of unbound methods while keeping bound methods - since both are the same type :-) >>By turning an unbound method into a function >>type, you break code that tests for MethodType in Python >>or does a PyMethod_Check() at C level. > > > My expectation is that there is very little code like that. Almost > all the code that I found doing that in the core Python code (none in > C BTW) was in the test suite. I'm using PyMethod_Check() in mxProxy to automatically wrap methods of proxied object in order to prevent references to the object class or the object itself to slip by the proxy. Changing the type to function object and placing the class information into a function attribute would break this approach. Apart from that the type change (by itself) would not affect the eGenix code base. I would expect code in the following areas to make use of the type check: * language interface code (e.g. Java, .NET bridges) * security code that tries to implement object access control * RPC applications that use introspection to generate interface definitions (e.g. WSDL service definitions) * debugging tools (e.g. IDEs) Perhaps a few others could scan their code base as well ?! >>If you want to make methods look more like functions, >>the method object should become a subclass of the function >>object (function + added im_* attributes). > > Can't do that, since the (un)bound method object supports binding > other callables besides functions. Is this feature used anywhere ? -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 10 2005) >>> Python/Zope Consulting and Support ...
http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From arigo at tunes.org Tue Jan 18 13:59:14 2005 From: arigo at tunes.org (Armin Rigo) Date: Tue Jan 18 14:11:03 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <20050114174132.GA46344@prometheusresearch.com> References: <0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it> <7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it> <20050114010307.GA51446@prometheusresearch.com> <5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com> <5.1.1.6.0.20050114101514.0384ddc0@mail.telecommunity.com> <20050114163900.GA21005@vicky.ecs.soton.ac.uk> <20050114174132.GA46344@prometheusresearch.com> Message-ID: <20050118125914.GA28380@vicky.ecs.soton.ac.uk> Hi Clark, On Fri, Jan 14, 2005 at 12:41:32PM -0500, Clark C. Evans wrote: > Imagine enhancing the stack-trace with additional information about > what adaptations were made; > > Traceback (most recent call last): > File "xxx", line 1, in foo > Adapting x to File > File "yyy", line 384, in bar > Adapting x to FileName > etc. More thoughts should be devoted to this, because it would be very precious. There should also be a way to know why a given call to adapt() returned an unexpected object even if it didn't crash. Given the nature of the problem, it seems not only "nice" but essential to have a good way to debug it. > How can we express your thoughts so that they fit into a narrative > describing how adapt() should and should not be used? I'm attaching a longer, hopefully easier reformulation... Armin -------------- next part -------------- A view on adaptation ==================== Adaptation is a tool to help exchange data between two pieces of code; a very powerful tool, even. 
But it is easy to misunderstand its aim, and unlike other features of a programming language, misusing adaptation will quickly lead into intricate debugging nightmares. Here is the point of view on adaptation which I defend, and which I believe should be kept in mind. Let's take an example. You want to call a function in the Python standard library to do something interesting, like pickling (saving) a number of instances to a file with the ``pickle`` module. You might remember that there is a function ``pickle.dump(obj, file)``, which saves the object ``obj`` to the file ``file``, and another function ``pickle.load(file)`` which reads back the object from ``file``. (Adaptation doesn't help you to figure this out; you have to be at least a bit familiar with the standard library to know that this feature exists.) Let's take the example of ``pickle.load(file)``. Even if you remember about it, you might still have to look up the documentation if you don't remember exactly what kind of object ``file`` is supposed to be. Is it an open file object, or a file name? All you know is that ``file`` is meant to somehow "be", or "stand for", the file. Now there are at least two commonly used ways to "stand for" a file: the file path as a string, or the file object directly. Actually, it might even not be a file at all, but just a string containing the already-loaded binary data. This gives a third alternative. The point here is that the person who wrote the ``pickle.load(x)`` function also knew that the argument was supposed to "stand for" a source of binary data to read from, and he had to make a choice for one of the three common representations: file path, file object, or raw data in a string. The "source of binary data" is what both the author of the function and you would easily agree on; the formal choice of representation is more arbitrary. This is where adaptation is supposed to help. 
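The pickle round-trip described above, in runnable form (Python 3 syntax; an in-memory buffer plays the role of ``file``, which keeps the example self-contained):

```python
import io
import pickle

# Round-trip as described: dump an object to a (here in-memory) binary
# file, then load it back.
buf = io.BytesIO()
pickle.dump({"answer": 42}, buf)   # pickle.dump(obj, file)
buf.seek(0)
restored = pickle.load(buf)        # pickle.load(file)
assert restored == {"answer": 42}
```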
With properly set up adaptation, you can pass to ``pickle.load()`` either a file name or a file object, or possibly anything else that "reasonably stands for" an input file, and it will just work. But to understand it more fully, we need to look a bit closer. Imagine yourself as the author of functions like ``pickle.load()`` and ``pickle.dump()``. You decide if you want to use adaptation or not. Adaptation should be used in this case, and ONLY in this kind of case: there is some generally agreed concept on what a particular object -- typically an argument of a function -- should represent, but not on precisely HOW it should represent it. If your function expects a "place to write the data to", it can typically be an open file or just a file name; in this case, the function would be defined like this:: def dump_data_into(target): file = adapt(target, TargetAsFile) file.write('hello') with ``TargetAsFile`` being suitably defined -- i.e. having a correct ``__adapt__()`` special method -- so that the adaptation will accept either a file or a string, and in the latter case open the named file for writing. Surely, you think that ``TargetAsFile`` is a strange name for an interface if you think about adaptation in terms of interfaces. Well, for the purpose of this argument, don't. Forget about interfaces. This special object ``TargetAsFile`` means not one but two things at once: that the input argument ``target`` represents the place into which data should be written; and that the result ``file`` of the adaptation, as used within the function itself, must be more precisely a file object. This two-level distinction is important to keep in mind, especially when adapting built-in objects like strings and files. For example, the adaptation that would be used in ``pickle.load(source)`` is more difficult to get right, because there are two common ways that a string object can stand for a source of data: either as the name of a file, or as raw binary data.
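A runnable sketch of the ``TargetAsFile`` idea just described (Python 3 syntax; the ``adapt()``/``__adapt__`` machinery here is a bare-bones stand-in for the PEP 246 protocol, not its reference implementation):

```python
import io

class TargetAsFile:
    """The 'place to write data to' concept: a minimal __adapt__ that
    accepts a file-like object directly, or a string naming a file."""
    @staticmethod
    def __adapt__(obj):
        if hasattr(obj, "write"):      # already file-like: pass through
            return obj
        if isinstance(obj, str):       # a string stands for a file name
            return open(obj, "w")
        return None                    # cannot adapt

def adapt(obj, protocol):
    """Minimal stand-in for PEP 246's adapt(): ask the protocol object."""
    result = protocol.__adapt__(obj)
    if result is None:
        raise TypeError("cannot adapt %r to %s" % (obj, protocol.__name__))
    return result

def dump_data_into(target):
    file = adapt(target, TargetAsFile)
    file.write("hello")
```

Callers may now pass either an open file or a file name; anything else fails loudly with a TypeError instead of being misinterpreted.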
It is not possible to distinguish between these two different uses of ``str`` automatically. In other words, strings are very versatile and low-level objects which can have various meanings in various contexts, and sometimes these meanings even conflict in the same context! More concretely, it is not possible to use adaptation to write a function ``pickle.load(source)`` which accepts either a file, a file name, or a raw binary string. You have to make a choice. For symmetry with the case of ``TargetAsFile``, a ``SourceAsFile`` would probably interpret a string as a file name, and the caller still has to explicitly turn a raw string into a file-like object -- by wrapping it in a ``StringIO()``. However, it would be possible to extend our adapters to accept URLs, say, because it's possible to distinguish between a local file name and a URL. Similarly, various other object types could unambiguously refer to, respectively, a "source" or "target" of data. The essential point is: the criterion to keep in mind for knowing when it is reasonable or not to add new adaptation paths is whether the object you are adapting "clearly stands" for the **high-level concept** that you are adapting to, and **not** for whatever resulting type or interface the adapted object should have. It **makes no sense** to adapt a string to a file or a file-like object. *Never define an adapter from the string type to the file type!!* A string and a file are two low-level concepts that mean different things. It only makes sense to adapt a string to a "source of data" which is then represented as a file. This subtle distinction is essential when adapting built-in types. In large frameworks, it is perhaps more common to adapt to interfaces or between classes specific to your framework. These interfaces and classes merge both roles: one class is both a concrete object in the Python sense -- a type -- and a single embodied concept.
In this case, the difference between a concrete instance and the concept it stands for is not so important. This is why we can often think about adaptation as creating an adapter object on top of an instance, to provide a different interface for the object. If you adapt an instance to an interface ``I`` you really mean that there is a common concept behind the instance and ``I``, and you want to change from the representation given by the instance to the one given by ``I``. I believe it is useful to keep in mind that adaptation is really about converting between different concrete representations ("str", "file") of a common abstract concept ("source of data"). You have at least to realize which abstract concept you want to adapt representations of, before you define your own adapters. If you do, then properties like the transitivity of adaptation (i.e. automatically finding longer adaptation paths A -> B -> C when asked to adapt from A to C) become desirable, because the intermediate steps are merely changes in representation for the same abstract concept ("it's the same source of data all along"). If you don't, then transitivity becomes the Source Of All Nightmares :-) From pje at telecommunity.com Tue Jan 18 15:38:51 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Tue Jan 18 15:37:44 2005 Subject: [Python-Dev] PEP 246: lossless and stateless In-Reply-To: <20050118125914.GA28380@vicky.ecs.soton.ac.uk> References: <20050114174132.GA46344@prometheusresearch.com> <0B08934A-659B-11D9-ADA4-000A95EFAE9E@aleax.it> <7495EF6B-65AE-11D9-ADA4-000A95EFAE9E@aleax.it> <20050114010307.GA51446@prometheusresearch.com> <5.1.1.6.0.20050113232251.03d9a850@mail.telecommunity.com> <5.1.1.6.0.20050114101514.0384ddc0@mail.telecommunity.com> <20050114163900.GA21005@vicky.ecs.soton.ac.uk> <20050114174132.GA46344@prometheusresearch.com> Message-ID: <5.1.1.6.0.20050118093756.042e6020@mail.telecommunity.com> At 12:59 PM 1/18/05 +0000, Armin Rigo wrote: > > How can we express your thoughts so that they fit into a narrative > > describing how adapt() should and should not be used? > >I'm attaching a longer, hopefully easier reformulation... Well said! You've explained my "interface per use case" theory much better than I ever have. From walter at livinglogic.de Tue Jan 18 16:05:42 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Tue Jan 18 16:05:50 2005 Subject: [Python-Dev] __str__ vs. __unicode__ Message-ID: <41ED25C6.80603@livinglogic.de> __str__ and __unicode__ seem to behave differently. A __str__ overwrite in a str subclass is used when calling str(), a __unicode__ overwrite in a unicode subclass is *not* used when calling unicode(): ------------------------------- class str2(str): def __str__(self): return "foo" x = str2("bar") print str(x) class unicode2(unicode): def __unicode__(self): return u"foo" x = unicode2(u"bar") print unicode(x) ------------------------------- This outputs: foo bar IMHO this should be fixed so that __unicode__() is used in the second case too. 
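For comparison, in today's Python 3, where str *is* the Unicode string type, the subclass hook is honored by the conversion, which is the behavior being asked for here:

```python
class Str2(str):
    def __str__(self):
        return "foo"

x = Str2("bar")
# The override in the str subclass is picked up by str():
assert str(x) == "foo"
```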
Bye, Walter D?rwald From jhylton at gmail.com Tue Jan 18 17:06:58 2005 From: jhylton at gmail.com (Jeremy Hylton) Date: Tue Jan 18 17:07:02 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python ceval.c, 2.420, 2.421 In-Reply-To: References: Message-ID: On Tue, 18 Jan 2005 07:56:19 -0800, mwh@users.sourceforge.net wrote: > Update of /cvsroot/python/python/dist/src/Python > In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv4034/Python > > Modified Files: > ceval.c > Log Message: > Change the name of the macro used by --with-tsc builds to the less > inscrutable READ_TIMESTAMP. An obvious improvement. Thanks! Jeremy From mwh at python.net Tue Jan 18 18:00:45 2005 From: mwh at python.net (Michael Hudson) Date: Tue Jan 18 18:00:46 2005 Subject: [Python-Dev] Exceptions *must*? be old-style classes? In-Reply-To: <41EC38DE.8080603@v.loewis.de> ( =?iso-8859-1?q?Martin_v._L=F6wis's_message_of?= "Mon, 17 Jan 2005 23:14:54 +0100") References: <20050117105219.GA12763@vicky.ecs.soton.ac.uk> <2mbrboca5r.fsf@starship.python.net> <5.1.1.6.0.20050117113419.03972d20@mail.telecommunity.com> <41EC38DE.8080603@v.loewis.de> Message-ID: <2my8eqbrk2.fsf@starship.python.net> "Martin v. L?wis" writes: > Guido van Rossum wrote: >>>>a) Is Exception to be new-style? >>> >>>Probably not in 2.5; Martin and others have suggested that this could >>>introduce instability for users' existing exception classes. >> Really? I thought that was eventually decided to be a very small >> amount of code. > > I still think that only an experiment could decide: somebody should > come up with a patch that does that, and we will see what breaks. > > I still have the *feeling* that this has significant impact, but > I could not pin-point this to any specific problem I anticipate. 
Well, some code is certainly going to break such as this from warnings.py: assert isinstance(category, types.ClassType), "category must be a class" or this from traceback.py: if type(etype) == types.ClassType: stype = etype.__name__ else: stype = etype I hope to have a new patch (which makes PyExc_Exception new-style, but allows arbitrary old-style classes as exceptions) "soon". It may even pass bits of "make test" :) Cheers, mwh -- SPIDER: 'Scuse me. [scuttles off] ZAPHOD: One huge spider. FORD: Polite though. -- The Hitch-Hikers Guide to the Galaxy, Episode 11 From gvanrossum at gmail.com Tue Jan 18 18:17:48 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Jan 18 18:17:51 2005 Subject: [Python-Dev] Getting rid of unbound methods: patch available In-Reply-To: <338366A6D2E2CA4C9DAEAE652E12A1DE721231@au3010avexu1.global.avaya.com> References: <338366A6D2E2CA4C9DAEAE652E12A1DE721231@au3010avexu1.global.avaya.com> Message-ID: [Timothy Delaney] > If im_func were set to the class where the function was defined, I could > definitely avoid the second part of the trawling (not sure about the > first yet, since I need to get at the function object). Instead of waiting for unbound methods to change their functionality, just create a metaclass that sticks the attribute you want on the function objects. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Tue Jan 18 18:32:49 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Jan 18 18:32:55 2005 Subject: [Python-Dev] Getting rid of unbound methods: patch available In-Reply-To: <41ECD6D9.9000001@egenix.com> References: <1105945019.30052.26.camel@localhost> <41EC431A.90204@egenix.com> <41ECD6D9.9000001@egenix.com> Message-ID: [me] > > I'm not sure I understand how basemethod is supposed to work; I can't > > find docs for it using Google (only three hits for the query mxTools > > basemethod). How does it depend on im_class? 
[Marc-Andre] > It uses im_class to find the class defining the (unbound) method: > > def basemethod(object,method=None): > > """ Return the unbound method that is defined *after* method in the > inheritance order of object with the same name as method > (usually called base method or overridden method). > > object can be an instance, class or bound method. method, if > given, may be a bound or unbound method. If it is not given, > object must be bound method. > > Note: Unbound methods must be called with an instance as first > argument. > > The function uses a cache to speed up processing. Changes done > to the class structure after the first hit will not be noticed > by the function. > > """ > ... > > This is how it is used in mixin classes to call the base > method of the overridden method in the inheritance tree (of > old-style classes): > > class RequestListboxMixin: > > def __init__(self,name,viewname,viewdb,context=None,use_clipboard=0, > size=None,width=None,monospaced=1,events=None): > > # Call base method > mx.Tools.basemethod(self, RequestListboxMixin.__init__)\ > (self,name,size,width,monospaced,None,events) > > ... > > Without .im_class for the unbound method, basemethod would > cease to work since it uses this attribute to figure out > the class object defining the overriding method. Well, you could always do what Timothy Delaney's autosuper recipe does: crawl the class structure starting from object.__class__ until you find the requested method. Since you're using a cache the extra cost should be minimal. I realize that this requires you to issue a new release of mxTools to support this, but you probably want to do one anyway to support other 2.5 features. > Hmm, I have a hard time seeing how you can get rid > off unbound methods while keeping bound methods - since > both are the same type :-) Easy. There is a lot of code in the instance method type specifically to support the case where im_self is NULL. 
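The class-structure crawl suggested above could look roughly like this in modern Python (a sketch under Python 3 semantics, where ``Mixin.setup`` is a plain function; the name ``basemethod_mro`` and its details are illustrative, not mxTools code, and the caching mxTools does is omitted):

```python
def basemethod_mro(obj, method):
    """Find the next implementation of `method` along the MRO of obj's
    class -- the im_class-free alternative: locate the defining class
    by scanning, then return the definition that comes after it."""
    name = method.__name__
    mro = type(obj).__mro__
    for i, cls in enumerate(mro):
        if cls.__dict__.get(name) is method:
            # Found the defining class; return the next definition.
            for base in mro[i + 1:]:
                if name in base.__dict__:
                    return base.__dict__[name]
    raise AttributeError("no base implementation of %r" % name)


class Mixin:
    def setup(self):
        # Call the base method without hard-coding the base class:
        return "mixin+" + basemethod_mro(self, Mixin.setup)(self)

class Core:
    def setup(self):
        return "core"

class App(Mixin, Core):
    pass
```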
All that code can be deleted (once built-in exceptions stop using it). > I'm using PyMethod_Check() in mxProxy to automatically > wrap methods of proxied object in order to prevent references > to the object class or the object itself to slip by the > proxy. Changing the type to function object and placing > the class information into a function attribute would break > this approach. Apart from that the type change (by itself) > would not affect the eGenix code base. Isn't mxProxy a weak referencing scheme? Is it still useful given Python's own support for weak references? > I would expect code in the following areas to make use > of the type check: > * language interface code (e.g. Java, .NET bridges) Java doesn't have the concept of unbound methods, so I doubt it's useful there. Remember that as far as how you call it, the unbound method has no advantages over the function! > * security code that tries to implement object access control Security code should handle plain functions just as well as (un)bound methods anyway. > * RPC applications that use introspection to generate > interface definitions (e.g. WSDL service definitions) Why would those care about unbound methods? > * debugging tools (e.g. IDEs) Hopefully those will use the filename + line number information in the function object. Remember, by the time the function is called, the (un)bound method object is unavailable. > >>If you want to make methods look more like functions, > >>the method object should become a subclass of the function > >>object (function + added im_* attributes). > > > > Can't do that, since the (un)bound method object supports binding > other callables besides functions. > > Is this feature used anywhere ? Yes, by the built-in exception code. (It surprised me too; I think in modern days it would have been done using a custom descriptor.)
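Guido's earlier suggestion of a metaclass that sticks the defining class onto function objects can be sketched as follows (Python 3 class syntax rather than the ``__metaclass__`` hook of the era; the attribute name ``__defining_class__`` is made up for illustration):

```python
import types

class MarkDefiningClass(type):
    """Stamp each function defined in a class body with the class that
    defines it, as a substitute for im_class on unbound methods."""
    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        for obj in namespace.values():
            if isinstance(obj, types.FunctionType):
                obj.__defining_class__ = cls

class Base(metaclass=MarkDefiningClass):
    def ping(self):
        return "ping"

class Derived(Base):
    def pong(self):
        return "pong"
```

Inherited methods keep the stamp of the class that actually defined them, which is exactly the information im_class used to carry.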
BTW, decorators and other descriptors are one reason why approaches that insist on im_class being there will have a diminishing value in the future. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From mal at egenix.com Tue Jan 18 18:38:34 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Tue Jan 18 18:38:36 2005 Subject: [Python-Dev] __str__ vs. __unicode__ In-Reply-To: <41ED25C6.80603@livinglogic.de> References: <41ED25C6.80603@livinglogic.de> Message-ID: <41ED499A.1050206@egenix.com> Walter D?rwald wrote: > __str__ and __unicode__ seem to behave differently. A __str__ > overwrite in a str subclass is used when calling str(), a __unicode__ > overwrite in a unicode subclass is *not* used when calling unicode(): > > ------------------------------- > class str2(str): > def __str__(self): > return "foo" > > x = str2("bar") > print str(x) > > class unicode2(unicode): > def __unicode__(self): > return u"foo" > > x = unicode2(u"bar") > print unicode(x) > ------------------------------- > > This outputs: > foo > bar > > IMHO this should be fixed so that __unicode__() is used in the > second case too. If you drop the base class for unicode, this already works. This code in object.c:PyObject_Unicode() is responsible for the sub-class version not doing what you'd expect: if (PyUnicode_Check(v)) { /* For a Unicode subtype that's not a Unicode object, return a true Unicode object with the same data. */ return PyUnicode_FromUnicode(PyUnicode_AS_UNICODE(v), PyUnicode_GET_SIZE(v)); } So the question is whether conversion of a Unicode sub-type to a true Unicode object should honor __unicode__ or not. The same question can be asked for many other types, e.g. floats (and __float__), integers (and __int__), etc. >>> class float2(float): ... def __float__(self): ... return 3.141 ... >>> float(float2(1.23)) 1.23 >>> class int2(int): ... def __int__(self): ... return 42 ... 
>>> int(int2(123)) 123 I think we need general consensus on what the strategy should be: honor these special hooks in conversions to base types or not ? Maybe the string case is the real problem ... :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 10 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From mwh at python.net Tue Jan 18 19:13:29 2005 From: mwh at python.net (Michael Hudson) Date: Tue Jan 18 19:13:32 2005 Subject: [Python-Dev] Exceptions *must*? be old-style classes? In-Reply-To: <2my8eqbrk2.fsf@starship.python.net> (Michael Hudson's message of "Tue, 18 Jan 2005 17:00:45 +0000") References: <20050117105219.GA12763@vicky.ecs.soton.ac.uk> <2mbrboca5r.fsf@starship.python.net> <5.1.1.6.0.20050117113419.03972d20@mail.telecommunity.com> <41EC38DE.8080603@v.loewis.de> <2my8eqbrk2.fsf@starship.python.net> Message-ID: <2mu0pebo6u.fsf@starship.python.net> Michael Hudson writes: > I hope to have a new patch (which makes PyExc_Exception new-style, but > allows arbitrary old-style classes as exceptions) "soon". It may even > pass bits of "make test" :) Done: http://www.python.org/sf/1104669 It passed 'make test' apart from failures I really don't think are my fault. I'll run "regrtest -uall" overnight... Cheers, mwh -- [1] If you're lost in the woods, just bury some fibre in the ground carrying data. Fairly soon a JCB will be along to cut it for you - follow the JCB back to civilsation/hitch a lift. 
-- Simon Burr, cam.misc From irmen at xs4all.nl Tue Jan 18 20:32:43 2005 From: irmen at xs4all.nl (Irmen de Jong) Date: Tue Jan 18 20:32:45 2005 Subject: [Python-Dev] a bunch of Patch reviews In-Reply-To: <41EA9196.1020709@xs4all.nl> References: <41EA9196.1020709@xs4all.nl> Message-ID: <41ED645B.40709@xs4all.nl> Irmen de Jong wrote: > Hello > I've looked at one bug and a bunch of patches and > added a comment to them: [...] > [ 579435 ] Shadow Password Support Module > Would be nice to have, I recently just couldn't do the user > authentication that I wanted: based on the users' unix passwords I'm almost done with completing this thing. (including doc and unittest). However: 1- I can't add new files to this tracker item. Should I open a new patch and refer to it? 2- As shadow passwords can only be retrieved when you are root, is a unit test module even useful? 3- Should the order of the chapters in the documentation be preserved? I'd rather add spwd below pwd, but this pushes the other unix modules "1 down"... --Irmen From martin at v.loewis.de Tue Jan 18 23:17:46 2005 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue Jan 18 23:17:45 2005 Subject: [Python-Dev] a bunch of Patch reviews In-Reply-To: <41ED645B.40709@xs4all.nl> References: <41EA9196.1020709@xs4all.nl> <41ED645B.40709@xs4all.nl> Message-ID: <41ED8B0A.7050201@v.loewis.de> Irmen de Jong wrote: > 1- I can't add new files to this tracker item. > Should I open a new patch and refer to it? Depends on whether you want tracker admin access (i.e. become a SF python project member). If you do, you could attach patches to bug reports not written by you. > 2- As shadow passwords can only be retrieved when > you are root, is a unit test module even useful? Probably not. Alternatively, introduce a "root" resource, and make that test depend on the presence of the root resource. > 3- Should the order of the chapters in the documentation > be preserved? 
I'd rather add spwd below pwd, but > this pushes the other unix modules "1 down"... You could make it a subsection (e.g. "spwd -- shadow passwords") Not sure whether this would be supported by the processing tools; if not, inserting the module in the middle might be acceptable. In any case, what is important is that the documentation is added - it can always be rearranged later. Regards, Martin From fdrake at acm.org Tue Jan 18 23:23:34 2005 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Tue Jan 18 23:23:47 2005 Subject: [Python-Dev] a bunch of Patch reviews In-Reply-To: <41ED8B0A.7050201@v.loewis.de> References: <41EA9196.1020709@xs4all.nl> <41ED645B.40709@xs4all.nl> <41ED8B0A.7050201@v.loewis.de> Message-ID: <200501181723.35070.fdrake@acm.org> Irmen de Jong wrote: > 3- Should the order of the chapters in the documentation > be preserved? I'd rather add spwd below pwd, but > this pushes the other unix modules "1 down"... On Tuesday 18 January 2005 17:17, Martin v. L?wis wrote: > You could make it a subsection (e.g. "spwd -- shadow passwords") > Not sure whether this would be supported by the processing > tools; if not, inserting the module in the middle might be > acceptable. I see no reason not to insert it right after pwd module docs. The order of the sections is not a backward compatibility concern. :-) -Fred -- Fred L. Drake, Jr. From tdelaney at avaya.com Tue Jan 18 23:54:59 2005 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Tue Jan 18 23:55:25 2005 Subject: [Python-Dev] Getting rid of unbound methods: patch available Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DE025202BC@au3010avexu1.global.avaya.com> Guido van Rossum wrote: > [Timothy Delaney] >> If im_func were set to the class where the function was defined, I >> could definitely avoid the second part of the trawling (not sure >> about the first yet, since I need to get at the function object). 
> > Instead of waiting for unbound methods to change their functionality, > just create a metaclass that sticks the attribute you want on the > function objects. Yep - that's one approach I've considered. I've also thought about modifying the code objects, which would mean I could grab the base class directly. It's definitely not the most compelling use case in the world ;) Tim Delaney From irmen at xs4all.nl Wed Jan 19 00:25:54 2005 From: irmen at xs4all.nl (Irmen de Jong) Date: Wed Jan 19 00:25:56 2005 Subject: [Python-Dev] a bunch of Patch reviews In-Reply-To: <41ED8B0A.7050201@v.loewis.de> References: <41EA9196.1020709@xs4all.nl> <41ED645B.40709@xs4all.nl> <41ED8B0A.7050201@v.loewis.de> Message-ID: <41ED9B02.2040908@xs4all.nl> Martin, > Irmen de Jong wrote: > >> 1- I can't add new files to this tracker item. >> Should I open a new patch and refer to it? > > > Depends on whether you want tracker admin access (i.e. > become a SF python project member). If you do, > you could attach patches to bug reports not > written by you. That sounds very convenient, thanks. Does the status of 'python project member' come with certain expectations that must be complied with ? ;-) >> 2- As shadow passwords can only be retrieved when >> you are root, is a unit test module even useful? > > > Probably not. Alternatively, introduce a "root" resource, > and make that test depend on the presence of the root resource. I'm not sure what this "resource" is actually. I have seen them pass on my screen when executing the regression tests (resource "network" is not enabled, etc) but never paid much attention to them. Are they used to select optional parts of the test suite that can only be run in certain conditions? > In any case, what is important is that the documentation is > added - it can always be rearranged later. I've copied and adapted the "pwd" module chapter. I'll try to have a complete patch ready tomorrow night. Bye, -Irmen. 
From tim.peters at gmail.com Wed Jan 19 01:03:17 2005 From: tim.peters at gmail.com (Tim Peters) Date: Wed Jan 19 01:03:24 2005 Subject: [Python-Dev] a bunch of Patch reviews In-Reply-To: <41ED9B02.2040908@xs4all.nl> References: <41EA9196.1020709@xs4all.nl> <41ED645B.40709@xs4all.nl> <41ED8B0A.7050201@v.loewis.de> <41ED9B02.2040908@xs4all.nl> Message-ID: <1f7befae0501181603732388f0@mail.gmail.com> [Martin asks whether Irmen wants to be a tracker admin on SF] [Irmen de Jong] > That sounds very convenient, thanks. > Does the status of 'python project member' come with > certain expectations that must be complied with ? ;-) If you're using Python, you're already required to comply with all of Guido's demands, this would just make it more official. Kinda like the difference in sanctifying cohabitation with a marriage ceremony . OK, really, the minimum required of Python project members is that they pay some attention to Python-Dev. >>> 2- As shadow passwords can only be retrieved when >>> you are root, is a unit test module even useful? >> Probably not. Alternatively, introduce a "root" resource, >> and make that test depend on the presence of the root resource. > I'm not sure what this "resource" is actually. > I have seen them pass on my screen when executing the > regression tests (resource "network" is not enabled, etc) > but never paid much attention to them. > Are they used to select optional parts of the test suite > that can only be run in certain conditions? That's right, where "the condition" is precisely that you tell regrtest.py to enable a (one or more) named resource. There's no intelligence involved. "Resource names" are arbitrary, and can be passed to regrtest.py's -u argument. See regrtest's docstring for details. For example, to run the tests that require the network resource, pass "-u network". Then it will run network tests, and regardless of whether a network is actually available. Passing "-u all" makes it try to run all tests. 
From mdehoon at ims.u-tokyo.ac.jp Wed Jan 19 04:03:30 2005 From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon) Date: Wed Jan 19 04:00:08 2005 Subject: [Python-Dev] Patch review [ 684500 ] extending readline functionality Message-ID: <41EDCE02.3020505@ims.u-tokyo.ac.jp> Patch review [ 684500 ] (extending readline functionality) This patch is a duplicate of patch [ 675551 ] (extending readline functionality), which was first submitted against stable python version 2.2.2. After the resubmitted patch [ 684500 ] against Python 2.3a1 was accepted (Modules/readline.c revision 2.73 and Doc/lib/libreadline.tex revision 1.16), the original patch [ 675551 ] was closed but patch [ 684500 ] was not. I have added a comment to patch [ 684500 ] that it can be closed. --Michiel. -- Michiel de Hoon, Assistant Professor University of Tokyo, Institute of Medical Science Human Genome Center 4-6-1 Shirokane-dai, Minato-ku Tokyo 108-8639 Japan http://bonsai.ims.u-tokyo.ac.jp/~mdehoon From abo at minkirri.apana.org.au Wed Jan 19 06:16:09 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Wed Jan 19 06:16:47 2005 Subject: [Python-Dev] Strange segfault in Python threads and linux kernel 2.6 Message-ID: <1106111769.3822.52.camel@schizo> G'day, I've Cc'ed this to zope-coders as it might affect other Zope developers and it had me stumped for ages. I couldn't find anything on it anywhere, so I figured it would be good to get something into google :-). We are developing a Zope2.7 application on Debian GNU/Linux that is using fop to generate pdf's from xml-fo data. fop is a java thing, and we are using popen2.Popen3(), non-blocking mode, and select loop to write/read stdin/stdout/stderr. This was all working fine. Then over the Christmas chaos, various things on my development system were apt-get updated, and I noticed that java/fop had started segfaulting. I tried running fop with the exact same input data from the command line; it worked. 
I wrote a python script that invoked fop in exactly the same way as we were invoking it inside zope; it worked. It only segfaulted when invoked inside Zope. I googled and tried everything... switched from j2re1.4 to kaffe, rolled back to a previous version of python, re-built Zope, upgraded Zope from 2.7.2 to 2.7.4, nothing helped. Then I went back from a linux 2.6.8 kernel to a 2.4.27 kernel; it worked! After googling around, I found references to recent attempts to resolve some signal handling problems in Python threads. There was one post that mentioned subtle differences between how Linux 2.4 and Linux 2.6 did signals to threads. So it seems this is a problem with Python threads and Linux kernel 2.6. The attached program demonstrates that it has nothing to do with Zope. Using it to run "fop-test /usr/bin/fop ..." on a machine with fop installed will show the segfault. Running the same thing on a machine with 2.4 kernel will instead get the fop "usage" message. It is not a generic fop/java problem with 2.6 because the commented un-threaded line works fine. It doesn't seem to segfault for any command... "cat -" works OK, so it must be something about java contributing. After searching the Python bugs, the closest I could find was #971213. Is this the same bug? Should I submit a new bug report? Is there any other way I can help resolve this? BTW, built in file objects really could use better non-blocking support... I've got a half-drafted PEP for it... anyone interested in it? -- Donovan Baarda http://minkirri.apana.org.au/~abo/ -------------- next part -------------- A non-text attachment was scrubbed... Name: test-fop.py Type: application/x-python Size: 1685 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20050119/1f7dd103/test-fop-0001.bin From walter at livinglogic.de Wed Jan 19 10:40:46 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Wed Jan 19 10:40:49 2005 Subject: [Python-Dev] __str__ vs. __unicode__ In-Reply-To: <41ED499A.1050206@egenix.com> References: <41ED25C6.80603@livinglogic.de> <41ED499A.1050206@egenix.com> Message-ID: <41EE2B1E.8030209@livinglogic.de> M.-A. Lemburg wrote: > Walter Dörwald wrote: > >> __str__ and __unicode__ seem to behave differently.
A __str__ >> overwrite in a str subclass is used when calling str(), a __unicode__ >> overwrite in a unicode subclass is *not* used when calling unicode(): >> >> [...] > > If you drop the base class for unicode, this already works. That's cheating! ;) My use case is an XML DOM API: __unicode__() should extract the character data from the DOM. For Text nodes this is the text, for comments and processing instructions this is u"" etc. To reduce memory footprint and to inherit all the unicode methods, it would be good if Text, Comment and ProcessingInstruction could be subclasses of unicode. > This code in object.c:PyObject_Unicode() is responsible for > the sub-class version not doing what you'd expect: > > if (PyUnicode_Check(v)) { > /* For a Unicode subtype that's not a Unicode object, > return a true Unicode object with the same data. */ > return PyUnicode_FromUnicode(PyUnicode_AS_UNICODE(v), > PyUnicode_GET_SIZE(v)); > } > > So the question is whether conversion of a Unicode sub-type > to a true Unicode object should honor __unicode__ or not. > > The same question can be asked for many other types, e.g. > floats (and __float__), integers (and __int__), etc. > > >>> class float2(float): > ... def __float__(self): > ... return 3.141 > ... > >>> float(float2(1.23)) > 1.23 > >>> class int2(int): > ... def __int__(self): > ... return 42 > ... > >>> int(int2(123)) > 123 > > I think we need general consensus on what the strategy > should be: honor these special hooks in conversions > to base types or not ? I'd say, these hooks should be honored, because it gives us more possibilities: If you want the original value, simply don't implement the hook. > Maybe the string case is the real problem ... :-) At least it seems that the string case is the exception. So if we fix __str__ this would be a bugfix for 2.4.1. If we fix the rest, this would be a new feature for 2.5. 
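The control flow under discussion can be modeled in a few lines of present-day Python. This is a sketch, not CPython code: str stands in for unicode, and __special__ is a made-up hook name standing in for __unicode__:

```python
class Text(str):
    # hypothetical DOM-ish node, loosely following Walter's use case
    def __special__(self):
        return "character data"

def convert_skipping_hook(v):
    # models the current C code: the isinstance check fires first,
    # so for subclasses of the base type the hook is never consulted
    if isinstance(v, str):
        return str(v)  # plain copy of the underlying data
    if hasattr(type(v), "__special__"):
        return type(v).__special__(v)
    return str(v)

def convert_honoring_hook(v):
    # models the proposed behaviour: try the hook first,
    # fall back to the plain copy only if it is absent
    if hasattr(type(v), "__special__"):
        return type(v).__special__(v)
    return str(v)

t = Text("raw markup")
print(convert_skipping_hook(t))   # raw markup     (hook bypassed)
print(convert_honoring_hook(t))   # character data (hook honored)
```

The difference between the two functions is exactly the difference being debated: whether the early type check or the hook lookup should win for subclasses.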
Bye, Walter Dörwald From bob at redivi.com Wed Jan 19 11:10:36 2005 From: bob at redivi.com (Bob Ippolito) Date: Wed Jan 19 11:10:42 2005 Subject: [Python-Dev] __str__ vs. __unicode__ In-Reply-To: <41EE2B1E.8030209@livinglogic.de> References: <41ED25C6.80603@livinglogic.de> <41ED499A.1050206@egenix.com> <41EE2B1E.8030209@livinglogic.de> Message-ID: <5D1927D4-6A02-11D9-BB54-000A95BA5446@redivi.com> On Jan 19, 2005, at 4:40, Walter Dörwald wrote: > M.-A. Lemburg wrote: > >> Walter Dörwald wrote: >>> __str__ and __unicode__ seem to behave differently. A __str__ >>> overwrite in a str subclass is used when calling str(), a __unicode__ >>> overwrite in a unicode subclass is *not* used when calling unicode(): >>> >>> [...] >> If you drop the base class for unicode, this already works. > > That's cheating! ;) > > My use case is an XML DOM API: __unicode__() should extract the > character data from the DOM. For Text nodes this is the text, > for comments and processing instructions this is u"" etc. To > reduce memory footprint and to inherit all the unicode methods, > it would be good if Text, Comment and ProcessingInstruction could > be subclasses of unicode. It sounds like a really bad idea to have a class that supports both of these properties: - unicode as a base class - non-trivial result from unicode(foo) Do you REALLY think this should be True?! isinstance(foo, unicode) and foo != unicode(foo) Why don't you just call this "extract character data" method something other than __unicode__? That way, you get the reduced memory footprint and convenience methods of unicode, with none of the craziness. -bob From walter at livinglogic.de Wed Jan 19 12:19:14 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Wed Jan 19 12:19:17 2005 Subject: [Python-Dev] __str__ vs.
__unicode__ In-Reply-To: <5D1927D4-6A02-11D9-BB54-000A95BA5446@redivi.com> References: <41ED25C6.80603@livinglogic.de> <41ED499A.1050206@egenix.com> <41EE2B1E.8030209@livinglogic.de> <5D1927D4-6A02-11D9-BB54-000A95BA5446@redivi.com> Message-ID: <41EE4232.9070409@livinglogic.de> Bob Ippolito wrote: > On Jan 19, 2005, at 4:40, Walter Dörwald wrote: > >> [...] >> That's cheating! ;) >> >> My use case is an XML DOM API: __unicode__() should extract the >> character data from the DOM. For Text nodes this is the text, >> for comments and processing instructions this is u"" etc. To >> reduce memory footprint and to inherit all the unicode methods, >> it would be good if Text, Comment and ProcessingInstruction could >> be subclasses of unicode. > > It sounds like a really bad idea to have a class that supports both of > these properties: > - unicode as a base class > - non-trivial result from unicode(foo) > > Do you REALLY think this should be True?! > isinstance(foo, unicode) and foo != unicode(foo) > > Why don't you just call this "extract character data" method something > other than __unicode__? IMHO __unicode__ is the most natural and logical choice. isinstance(foo, unicode) is just an implementation detail. But you're right: the consequences of this can be a bit scary. > That way, you get the reduced memory footprint > and convenience methods of unicode, with none of the craziness. Without this craziness we wouldn't have discovered the problem. ;) Whether this craziness gets implemented, depends on the solution to this problem. Bye, Walter Dörwald From aleax at aleax.it Wed Jan 19 12:22:44 2005 From: aleax at aleax.it (Alex Martelli) Date: Wed Jan 19 12:23:03 2005 Subject: [Python-Dev] __str__ vs.
__unicode__ In-Reply-To: <5D1927D4-6A02-11D9-BB54-000A95BA5446@redivi.com> References: <41ED25C6.80603@livinglogic.de> <41ED499A.1050206@egenix.com> <41EE2B1E.8030209@livinglogic.de> <5D1927D4-6A02-11D9-BB54-000A95BA5446@redivi.com> Message-ID: <708E5DA6-6A0C-11D9-9DED-000A95EFAE9E@aleax.it> On 2005 Jan 19, at 11:10, Bob Ippolito wrote: > Do you REALLY think this should be True?! > isinstance(foo, unicode) and foo != unicode(foo) Hmmmm -- why not? In the generic case, talking about some class B, it certainly violates no programming principle known to me that "isinstance(foo, B) and foo != B(foo)"; it seems a rather common case -- ``casting to the base class'' (in C++ terminology, I guess) ``slices off'' some parts of foo, and thus equality does not hold. If this is specifically a bad idea for the specific case where B is unicode, OK, that's surely possible, but if so it seems it should be possible to explain this in terms of particular properties of type unicode. Alex From mal at egenix.com Wed Jan 19 12:27:29 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Wed Jan 19 12:27:32 2005 Subject: [Python-Dev] Getting rid of unbound methods: patch available In-Reply-To: References: <1105945019.30052.26.camel@localhost> <41EC431A.90204@egenix.com> <41ECD6D9.9000001@egenix.com> Message-ID: <41EE4421.80303@egenix.com> Guido van Rossum wrote: > [me] > >>>I'm not sure I understand how basemethod is supposed to work; I can't >>>find docs for it using Google (only three hits for the query mxTools >>>basemethod). How does it depend on im_class? > > > [Marc-Andre] > >>It uses im_class to find the class defining the (unbound) method: >> >>def basemethod(object,method=None): >> >> """ Return the unbound method that is defined *after* method in the >> inheritance order of object with the same name as method >> (usually called base method or overridden method). >> >> object can be an instance, class or bound method. method, if >> given, may be a bound or unbound method. 
If it is not given, >> object must be a bound method. >> >> Note: Unbound methods must be called with an instance as first >> argument. >> >> The function uses a cache to speed up processing. Changes done >> to the class structure after the first hit will not be noticed >> by the function. >> >> """ >> ... >> >>This is how it is used in mixin classes to call the base >>method of the overridden method in the inheritance tree (of >>old-style classes): >> >>class RequestListboxMixin: >> >> def __init__(self,name,viewname,viewdb,context=None,use_clipboard=0, >> size=None,width=None,monospaced=1,events=None): >> >> # Call base method >> mx.Tools.basemethod(self, RequestListboxMixin.__init__)\ >> (self,name,size,width,monospaced,None,events) >> >> ... >> >>Without .im_class for the unbound method, basemethod would >>cease to work since it uses this attribute to figure out >>the class object defining the overriding method. > > > Well, you could always do what Timothy Delaney's autosuper recipe > does: crawl the class structure starting from object.__class__ until > you find the requested method. Since you're using a cache the extra > cost should be minimal. That won't work, since basemethod() is intended for standard classes (not new-style ones). > I realize that this requires you to issue a new release of mxTools to > support this, but you probably want to do one anyway to support other > 2.5 features. A new release wouldn't be much trouble, but I don't see any way to fix the basemethod() implementation without also requiring a change to the function arguments. Current usage is basemethod(instance, unbound_method). In order for basemethod to still be able to find the right class and method name, I'd have to change that to basemethod(instance, unbound_method_or_function, class_defining_method). This would require all users of mx.Tools.basemethod() to update their code base.
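For comparison, the three-argument lookup Marc-Andre describes can be sketched against new-style classes using the MRO. This is a hypothetical illustration of the proposed basemethod(instance, method, defining_class) signature, not mx.Tools code (the real implementation also supports classic classes and caches its results); the method is named by a string here to keep the sketch short:

```python
def basemethod_sketch(instance, name, defining_class):
    # walk the MRO and return the implementation of `name` defined
    # *after* defining_class, i.e. the overridden/base method
    past_definer = False
    for cls in type(instance).__mro__:
        if cls is defining_class:
            past_definer = True
            continue
        if past_definer and name in cls.__dict__:
            return cls.__dict__[name]
    raise AttributeError(name)

class Base:
    def greet(self):
        return "base"

class Mixin(Base):
    def greet(self):
        # call the implementation we override, super()-style
        return "mixin+" + basemethod_sketch(self, "greet", Mixin)(self)

print(Mixin().greet())  # mixin+base
```

Note how the defining class has to be passed explicitly, which is exactly the redundancy Marc-Andre objects to: without im_class, the caller must name the class twice.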
Users will probably not understand why this change is necessary, since they'd have to write the class twice: mx.Tools.basemethod(self, RequestListboxMixin.__init__, RequestListboxMixin)\ (self,name,size,width,monospaced,None,events) Dropping the unbound method basically loses expressiveness: without extra help from function attributes or other descriptors, it would no longer be possible to tell whether a function is to be used as method or as function. The defining namespace of the method would also not be available anymore. >>Hmm, I have a hard time seeing how you can get rid >>of unbound methods while keeping bound methods - since >>both are the same type :-) > > Easy. There is a lot of code in the instance method type specifically > to support the case where im_self is NULL. All that code can be > deleted (once built-in exceptions stop using it). So this is not about removing a type, but about removing extra code. You'd still keep bound methods as a separate type. >>I'm using PyMethod_Check() in mxProxy to automatically >>wrap methods of proxied object in order to prevent references >>to the object class or the object itself to slip by the >>proxy. Changing the type to function object and placing >>the class information into a function attribute would break >>this approach. Apart from that the type change (by itself) >>would not affect the eGenix code base. > > > Isn't mxProxy a weak referencing scheme? Is it still useful given > Python's own support for weak references? Sure. First of all, mxProxy is more than just a weak referencing scheme (in fact, that was only an add-on feature). mxProxy allows you to wrap any Python object in a way that hides the object from the rest of the Python interpreter, putting access to the object under fine-grained and strict control. This is the main application space for mxProxy. The weak reference feature was added later on, to work around problems with circular references.
Unlike the Python weak referencing scheme, mxProxy allows creating weak references to all Python objects (not just the ones that support the Python weak reference protocol). >>I would expect code in the following areas to make use >>of the type check: >>* language interface code (e.g. Java, .NET bridges) > > Java doesn't have the concept of unbound methods, so I doubt it's > useful there. Remember that as far as how you call it, the unbound > method has no advantages over the function! True, but you know that it's a method and not just a function. That can make a difference in how you implement the call. >>* security code that tries to implement object access control > > > Security code should handle plain functions just as well as (un)bound > methods anyway. It is very important for security code to know which attributes are available on an object, e.g. an unbound method includes the class object, which again has a reference to the module, the builtins, etc. Functions currently don't have this problem (but will once you add the im_class attribute ;-). >>* RPC applications that use introspection to generate >> interface definitions (e.g. WSDL service definitions) > > > Why would those care about unbound methods? I was thinking of iterating over the list of methods in a class: To find out which of the entries in the class dict are methods, a tool would have to check whether myClass.myMethod maps to an unbound method (e.g. non-function callables are currently not wrapped as unbound methods). Example: >>> class C: ... def test(self): ... print 'hello' ... test1 = dict ... >>> C.test1 <type 'dict'> >>> C.test <unbound method C.test> >>* debugging tools (e.g. IDEs) > > Hopefully those will use the filename + line number information in the > function object. Remember, by the time the function is called, the > (un)bound method object is unavailable. I was thinking more in terms of being able to tell whether a function is a method or not.
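The class-dict introspection Marc-Andre describes can be done without relying on unbound methods at all. A small illustration in modern Python (where C.test is a plain function and the callable test1 attribute is not a method), using inspect to tell the two apart:

```python
import inspect

class C:
    def test(self):
        print("hello")
    test1 = dict  # a callable class attribute that is *not* a method

def methods_of(cls):
    # keep only plain functions from the class dict; other callables
    # (such as the `dict` type stored under test1) are skipped
    return sorted(name for name, obj in vars(cls).items()
                  if inspect.isfunction(obj))

print(methods_of(C))  # ['test']
```

This is essentially the explicit check that the unbound-method wrapper used to provide for free.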
IDEs might want to help the user by placing a "(self," right after she types the name of an unbound method. >>>>If you want to make methods look more like functions, >>>>the method object should become a subclass of the function >>>>object (function + added im_* attributes). >>> >>>Can't do that, since the (un)bound method object supports binding >>>other callables besides functions. >> >>Is this feature used anywhere ? > > Yes, by the built-in exception code. (It surprised me too; I think in > modern days it would have been done using a custom descriptor.) > > BTW, decorators and other descriptors are one reason why approaches > that insist on im_class being there will have a diminishing value in > the future. True, as long as you put the information from im_class somewhere else (where it's easily accessible). However, I wouldn't want to start writing @method def funcname(self, arg0, arg1): return 42 just to tell Python that this particular function will only be used as method ;-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 10 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From mal at egenix.com Wed Jan 19 12:42:15 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Wed Jan 19 12:42:17 2005 Subject: [Python-Dev] __str__ vs. __unicode__ In-Reply-To: <41EE2B1E.8030209@livinglogic.de> References: <41ED25C6.80603@livinglogic.de> <41ED499A.1050206@egenix.com> <41EE2B1E.8030209@livinglogic.de> Message-ID: <41EE4797.6030105@egenix.com> Walter Dörwald wrote: > M.-A. Lemburg wrote: >> So the question is whether conversion of a Unicode sub-type >> to a true Unicode object should honor __unicode__ or not.
>> >> The same question can be asked for many other types, e.g. >> floats (and __float__), integers (and __int__), etc. >> >> >>> class float2(float): >> ... def __float__(self): >> ... return 3.141 >> ... >> >>> float(float2(1.23)) >> 1.23 >> >>> class int2(int): >> ... def __int__(self): >> ... return 42 >> ... >> >>> int(int2(123)) >> 123 >> >> I think we need general consensus on what the strategy >> should be: honor these special hooks in conversions >> to base types or not ? > > > I'd say, these hooks should be honored, because it gives > us more possibilities: If you want the original value, > simply don't implement the hook. > >> Maybe the string case is the real problem ... :-) > > > At least it seems that the string case is the exception. Indeed. > So if we fix __str__ this would be a bugfix for 2.4.1. > If we fix the rest, this would be a new feature for 2.5. I have a feeling that we're better off with the bug fix than the new feature. __str__ and __unicode__ as well as the other hooks were specifically added for the type constructors to use. However, these were added at a time where sub-classing of types was not possible, so it's time now to reconsider whether this functionality should be extended to sub-classes as well. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 10 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From ncoghlan at iinet.net.au Wed Jan 19 13:26:14 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Wed Jan 19 13:26:23 2005 Subject: [Python-Dev] __str__ vs. 
__unicode__ In-Reply-To: <41EE4797.6030105@egenix.com> References: <41ED25C6.80603@livinglogic.de> <41ED499A.1050206@egenix.com> <41EE2B1E.8030209@livinglogic.de> <41EE4797.6030105@egenix.com> Message-ID: <41EE51E6.3090708@iinet.net.au> M.-A. Lemburg wrote: >> So if we fix __str__ this would be a bugfix for 2.4.1. >> If we fix the rest, this would be a new feature for 2.5. > > > I have a feeling that we're better off with the bug fix than > the new feature. > > __str__ and __unicode__ as well as the other hooks were > specifically added for the type constructors to use. > However, these were added at a time where sub-classing > of types was not possible, so it's time now to reconsider > whether this functionality should be extended to sub-classes > as well. It seems oddly inconsistent though: """Define __str__ to determine what your class returns for str(x). NOTE: This won't work if your class directly or indirectly inherits from str. If that is the case, you cannot alter the results of str(x).""" At present, most of the type constructors need the caveat, whereas __str__ actually agrees with the simple explanation in the first line. Going back to PyUnicode, PyObject_Unicode's handling of subclasses of builtins is decidedly odd: Py> class C(str): ... def __str__(self): return "I am a string!" ... def __unicode__(self): return "I am not unicode!" ... Py> c = C() Py> str(c) 'I am a string!' Py> unicode(c) u'' Cheers, Nick. -- Nick Coghlan | ncoghlan@email.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From mal at egenix.com Wed Jan 19 13:50:04 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Wed Jan 19 13:50:07 2005 Subject: [Python-Dev] __str__ vs. 
__unicode__ In-Reply-To: <41EE51E6.3090708@iinet.net.au> References: <41ED25C6.80603@livinglogic.de> <41ED499A.1050206@egenix.com> <41EE2B1E.8030209@livinglogic.de> <41EE4797.6030105@egenix.com> <41EE51E6.3090708@iinet.net.au> Message-ID: <41EE577C.3010405@egenix.com> Nick Coghlan wrote: > M.-A. Lemburg wrote: > >>> So if we fix __str__ this would be a bugfix for 2.4.1. >>> If we fix the rest, this would be a new feature for 2.5. >> >> >> >> I have a feeling that we're better off with the bug fix than >> the new feature. >> >> __str__ and __unicode__ as well as the other hooks were >> specifically added for the type constructors to use. >> However, these were added at a time where sub-classing >> of types was not possible, so it's time now to reconsider >> whether this functionality should be extended to sub-classes >> as well. > > > It seems oddly inconsistent though: > > """Define __str__ to determine what your class returns for str(x). > > NOTE: This won't work if your class directly or indirectly inherits from > str. If that is the case, you cannot alter the results of str(x).""" > > At present, most of the type constructors need the caveat, whereas > __str__ actually agrees with the simple explanation in the first line. > > Going back to PyUnicode, PyObject_Unicode's handling of subclasses of > builtins is decidedly odd: Those APIs were all written long before there were sub-classes of types. > Py> class C(str): > ... def __str__(self): return "I am a string!" > ... def __unicode__(self): return "I am not unicode!" > ... > Py> c = C() > Py> str(c) > 'I am a string!' > Py> unicode(c) > u'' Ah, looks as if the function needs a general overhaul :-) This section should be do a PyString_CheckExact(): if (PyString_Check(v)) { Py_INCREF(v); res = v; } But before we start hacking the function, we need a general picture of what we think is right. Note, BTW, that there is also a tp_str slot that serves as hook. 
The overall solution to this apparent mess should be consistent for all hooks (__str__, tp_str, __unicode__ and a future tp_unicode). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 10 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From ncoghlan at iinet.net.au Wed Jan 19 14:27:43 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Wed Jan 19 14:27:49 2005 Subject: [Python-Dev] __str__ vs. __unicode__ In-Reply-To: <41EE577C.3010405@egenix.com> References: <41ED25C6.80603@livinglogic.de> <41ED499A.1050206@egenix.com> <41EE2B1E.8030209@livinglogic.de> <41EE4797.6030105@egenix.com> <41EE51E6.3090708@iinet.net.au> <41EE577C.3010405@egenix.com> Message-ID: <41EE604F.3000006@iinet.net.au> M.-A. Lemburg wrote: > Those APIs were all written long before there were sub-classes > of types. Understood. PyObject_Unicode certainly looked like an 'evolved' piece of code :) > But before we start hacking the function, we need a general > picture of what we think is right. Aye. > Note, BTW, that there is also a tp_str slot that serves > as hook. The overall solution to this apparent mess should > be consistent for all hooks (__str__, tp_str, __unicode__ > and a future tp_unicode). I imagine many people are like me, with __str__ being the only one of these hooks they use frequently (Helping out with the Decimal implementation is the only time I can recall using the slots for the numeric types, and I rarely need to deal with Unicode). 
Anyway, their heavy use suggests to me that __str__ and str() are likely to provide a good model for the desired behaviour - they're the ones that are likely to have been nudged in the most useful direction by bug reports and the like. Regards, Nick. -- Nick Coghlan | ncoghlan@email.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From mwh at python.net Wed Jan 19 14:37:11 2005 From: mwh at python.net (Michael Hudson) Date: Wed Jan 19 14:37:14 2005 Subject: [Python-Dev] Strange segfault in Python threads and linux kernel 2.6 In-Reply-To: <1106111769.3822.52.camel@schizo> (Donovan Baarda's message of "Wed, 19 Jan 2005 16:16:09 +1100") References: <1106111769.3822.52.camel@schizo> Message-ID: <2mpt01bkvs.fsf@starship.python.net> Donovan Baarda writes: > G'day, > > I've Cc'ed this to zope-coders as it might affect other Zope developers > and it had me stumped for ages. I couldn't find anything on it anywhere, > so I figured it would be good to get something into google :-). > > We are developing a Zope2.7 application on Debian GNU/Linux that is > using fop to generate pdf's from xml-fo data. fop is a java thing, and > we are using popen2.Popen3(), non-blocking mode, and select loop to > write/read stdin/stdout/stderr. This was all working fine. > > Then over the Christmas chaos, various things on my development system > were apt-get updated, and I noticed that java/fop had started > segfaulting. I tried running fop with the exact same input data from the > command line; it worked. I wrote a python script that invoked fop in > exactly the same way as we were invoking it inside zope; it worked. It > only segfaulted when invoked inside Zope. > > I googled and tried everything... switched from j2re1.4 to kaffe, rolled > back to a previous version of python, re-built Zope, upgraded Zope from > 2.7.2 to 2.7.4, nothing helped. Then I went back from a linux 2.6.8 > kernel to a 2.4.27 kernel; it worked! > > After googling around, I found references to recent attempts to resolve > some signal handling problems in Python threads. There was one post that > mentioned subtle differences between how Linux 2.4 and Linux 2.6 did > signals to threads.
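An aside on the signal-handling differences quoted above: one way to probe whether a blocked-signals mask inherited by child processes is the culprit is to clear the mask in the child just before exec. This is a hedged sketch using modern POSIX-only APIs (subprocess's preexec_fn and signal.pthread_sigmask, which did not exist in the Python of this thread); it mirrors the idea of a small C setprocmask-then-exec wrapper:

```python
import signal
import subprocess
import sys

def run_with_clear_sigmask(argv):
    # POSIX only: preexec_fn runs in the child between fork() and exec(),
    # so the exec'ed program starts with an empty signal mask
    return subprocess.call(
        argv,
        preexec_fn=lambda: signal.pthread_sigmask(signal.SIG_SETMASK, set()),
    )

rc = run_with_clear_sigmask([sys.executable, "-c", "print('child ran')"])
print("exit status:", rc)
```

If a JVM-based child like fop stops segfaulting when launched this way, the inherited signal mask is the likely cause.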
Then I went back from a linux 2.6.8 > kernel to a 2.4.27 kernel; it worked! > > After googling around, I found references to recent attempts to resolve > some signal handling problems in Python threads. There was one post that > mentioned subtle differences between how Linux 2.4 and Linux 2.6 did > signals to threads. You've left out a very important piece of information: which version of Python you are using. I'm guessing 2.3.4. Can you try 2.4? > So it seems this is a problem with Python threads and Linux kernel 2.6. > The attached program demonstrates that it has nothing to do with Zope. > Using it to run "fop-test /usr/bin/fop fop installed will show the segfault. Running the same thing on a > machine with 2.4 kernel will instead get the fop "usage" message. It is > not a generic fop/java problem with 2.6 because the commented > un-threaded line works fine. It doesn't seem to segfault for any > command... "cat -" works OK, so it must be something about java > contributing. > > After searching the Python bugs, the closest I could find was > #971213 > . Is > this the same bug? Should I submit a new bug report? Is there any > other way I can help resolve this? I'd be astonished if this is the same bug. The main oddness about python threads (before 2.3) is that they run with all signals masked. You could play with a C wrapper (call setprocmask, then exec fop) to see if this is what is causing the problem. But please try 2.4. > BTW, built in file objects really could use better non-blocking > support... I've got a half-drafted PEP for it... anyone interested in > it? Err, this probably should be in a different mail :) Cheers, mwh -- If trees could scream, would we be so cavalier about cutting them down? We might, if they screamed all the time, for no good reason. -- Jack Handey From aahz at pythoncraft.com Wed Jan 19 16:04:29 2005 From: aahz at pythoncraft.com (Aahz) Date: Wed Jan 19 16:04:31 2005 Subject: [Python-Dev] __str__ vs. 
__unicode__ In-Reply-To: <41EE2B1E.8030209@livinglogic.de> References: <41ED25C6.80603@livinglogic.de> <41ED499A.1050206@egenix.com> <41EE2B1E.8030209@livinglogic.de> Message-ID: <20050119150428.GB25472@panix.com> On Wed, Jan 19, 2005, Walter Dörwald wrote: > M.-A. Lemburg wrote: >> >>Maybe the string case is the real problem ... :-) > > At least it seems that the string case is the exception. > So if we fix __str__ this would be a bugfix for 2.4.1. Nope. Unless you're claiming the __str__ behavior is new in 2.4? (Haven't been following the thread closely.) -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "19. A language that doesn't affect the way you think about programming, is not worth knowing." --Alan Perlis From walter at livinglogic.de Wed Jan 19 18:31:25 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Wed Jan 19 18:31:28 2005 Subject: [Python-Dev] __str__ vs. __unicode__ In-Reply-To: <41EE604F.3000006@iinet.net.au> References: <41ED25C6.80603@livinglogic.de> <41ED499A.1050206@egenix.com> <41EE2B1E.8030209@livinglogic.de> <41EE4797.6030105@egenix.com> <41EE51E6.3090708@iinet.net.au> <41EE577C.3010405@egenix.com> <41EE604F.3000006@iinet.net.au> Message-ID: <41EE996D.6040601@livinglogic.de> Nick Coghlan wrote: > [...] > I imagine many people are like me, with __str__ being the only one of > these hooks they use frequently (Helping out with the Decimal > implementation is the only time I can recall using the slots for the > numeric types, and I rarely need to deal with Unicode). > > Anyway, their heavy use suggests to me that __str__ and str() are > likely to provide a good model for the desired behaviour - they're the > ones that are likely to have been nudged in the most useful direction by > bug reports and the like. +1 __foo__ provides conversion to foo, no matter whether foo is among the direct or indirect base classes.
Simply moving the PyUnicode_Check() call in PyObject_Unicode() after the __unicode__ call (after the PyErr_Clear() call) will implement this (but does not fix Nick's bug). Running the test suite with this change reveals no other problems. Bye, Walter Dörwald From bac at OCF.Berkeley.EDU Wed Jan 19 22:43:01 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Wed Jan 19 22:43:31 2005 Subject: [Python-Dev] a bunch of Patch reviews In-Reply-To: <41EC4054.6000908@v.loewis.de> References: <41EA9196.1020709@xs4all.nl> <1105974308.17513.1.camel@localhost> <41EC3849.8040503@v.loewis.de> <1106000872.5931.6.camel@emperor> <41EC4054.6000908@v.loewis.de> Message-ID: <41EED465.1020305@ocf.berkeley.edu> Martin v. Löwis wrote: > I think Brett Cannon now also follows this rule; it > really falls short enough in practice because (almost) > nobody really wants to push his patch bad enough to > put some work into it to review other patches. > Yes, I am trying to support the rule, but my schedule is nutty right now so my turn-around time is rather long at the moment. -Brett From stuart at stuartbishop.net Thu Jan 20 00:06:33 2005 From: stuart at stuartbishop.net (Stuart Bishop) Date: Thu Jan 20 00:06:41 2005 Subject: [Python-Dev] Unix line endings required for PyRun* breaking embedded Python Message-ID: <41EEE7F9.7000902@stuartbishop.net> There is a discussion going on at the moment in postgresql-general about plpythonu (which allows you to write stored procedures in Python) and line endings. The discussion starts here: http://archives.postgresql.org/pgsql-general/2005-01/msg00792.php The problem appears to be that things are working as documented in PEP-278: There is no support for universal newlines in strings passed to eval() or exec. It is envisioned that such strings always have the standard \n line feed, if the strings come from a file that file can be read with universal newlines.
So what happens is that if a Windows or Mac user tries to create a Python stored procedure, it will go through to the server with Windows line endings and the embedded Python interpreter will raise a syntax error for everything except single line functions. I don't think it is possible for plpythonu to fix this by simply translating the line endings, as this would require significant knowledge of Python syntax to do correctly (triple quoted strings and character escaping I think). The timing of this thread is very unfortunate, as PostgreSQL 8.0 is being released this weekend and the (hopefully) last release of the 2.3 series next week :-( -- Stuart Bishop http://www.stuartbishop.net/ From fredrik at pythonware.com Thu Jan 20 00:14:55 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu Jan 20 00:14:50 2005 Subject: [Python-Dev] Re: Unix line endings required for PyRun* breaking embedded Python References: <41EEE7F9.7000902@stuartbishop.net> Message-ID:

Stuart Bishop wrote:

> I don't think it is possible for plpythonu to fix this by simply translating the line endings, as
> this would require significant knowledge of Python syntax to do correctly (triple quoted strings
> and character escaping I think).

of course it's possible: that's what the interpreter does when it loads a script or module, after all... or in other words,

    print repr("""
    """)

always prints "\n" (at least on Unix (\n) and Windows (\r\n)).
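Fredrik's point is easy to check from Python in a current interpreter, where compile() applies the same normalization to source strings that module loading applies to files, so a \r\n inside a triple-quoted literal comes out as '\n'. (In the 2.3/2.4 era under discussion, compile() and the PyRun* functions required plain \n, which is exactly the bug of this thread; the sketch below shows today's behaviour, and the "<plpythonu>" filename is just illustrative.)

```python
# A \r\n line break inside a triple-quoted literal in the *source*
# text; the tokenizer normalizes it before the literal is built.
src = 'text = """\r\n"""\r\n'
ns = {}
exec(compile(src, "<plpythonu>", "exec"), ns)
assert ns["text"] == "\n"
```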
From aleax at aleax.it Thu Jan 20 00:32:19 2005 From: aleax at aleax.it (Alex Martelli) Date: Thu Jan 20 00:32:25 2005 Subject: [Python-Dev] Re: Unix line endings required for PyRun* breaking embedded Python In-Reply-To: References: <41EEE7F9.7000902@stuartbishop.net> Message-ID: <5CC7C030-6A72-11D9-9DED-000A95EFAE9E@aleax.it> On 2005 Jan 20, at 00:14, Fredrik Lundh wrote:

> Stuart Bishop wrote:
>
>> I don't think it is possible for plpythonu to fix this by simply
>> translating the line endings, as
>> this would require significant knowledge of Python syntax to do
>> correctly (triple quoted strings
>> and character escaping I think).
>
> of course it's possible: that's what the interpreter does when it loads
> a script or module, after all... or in other words,
>
> print repr("""
> """)
>
> always prints "\n" (at least on Unix (\n) and Windows (\r\n)).

Mac, too (but then, that IS Unix to all intents and purposes, nowadays). Alex From firemoth at gmail.com Thu Jan 20 01:03:25 2005 From: firemoth at gmail.com (Timothy Fitz) Date: Thu Jan 20 01:03:28 2005 Subject: [Python-Dev] Re: Zen of Python In-Reply-To: <3e8ca5c8050119150358c71728@mail.gmail.com> References: <972ec5bd050119111359e358f5@mail.gmail.com> <3e8ca5c8050119150358c71728@mail.gmail.com> Message-ID: <972ec5bd05011916033242179@mail.gmail.com> On Thu, 20 Jan 2005 09:03:30 +1000, Stephen Thorne wrote: > "Flat is better than nested" has one foot in concise powerful > programming, the other foot in optimisation. > > foo.bar.baz.arr involves 4 hashtable lookups. arr is just one hashtable lookup. I find it amazingly hard to believe that this is implying optimization over functionality or clarity. There has to be another reason, yet I can't think of any. From pje at telecommunity.com Thu Jan 20 01:14:47 2005 From: pje at telecommunity.com (Phillip J.
Eby) Date: Thu Jan 20 01:14:26 2005 Subject: [Python-Dev] Re: Zen of Python In-Reply-To: <972ec5bd05011916033242179@mail.gmail.com> References: <3e8ca5c8050119150358c71728@mail.gmail.com> <972ec5bd050119111359e358f5@mail.gmail.com> <3e8ca5c8050119150358c71728@mail.gmail.com> Message-ID: <5.1.1.6.0.20050119190842.032382a0@mail.telecommunity.com> At 07:03 PM 1/19/05 -0500, Timothy Fitz wrote: >On Thu, 20 Jan 2005 09:03:30 +1000, Stephen Thorne > wrote: > > "Flat is better than nested" has one foot in concise powerful > > programming, the other foot in optimisation. > > > > foo.bar.baz.arr involves 4 hashtable lookups. arr is just one hashtable > lookup. > >I find it amazingly hard to believe that this is implying optimization >over functionality or clarity. There has to be another reason, yet I >can't think of any. Actually, this is one of those rare cases where optimization and clarity go hand in hand. Human brains just don't handle nesting that well. It's easy to visualize two levels of nested structure, but three is a stretch unless you can abstract at least one of the layers. For example, I can remember 'peak.binding.attributes' because the 'peak' is the same for all the packages in PEAK. I can also handle 'peak.binding.tests.test_foo' because 'tests' is also always the same. But that's pretty much the limit of my mental stack, which is why PEAK's namespaces are organized so that APIs are normally accessed as 'binding.doSomething' or 'naming.fooBar', instead of requiring people to type 'peak.binding.attributes.doSomething'. Clearly Java developers have this brain-stack issue as well, in that you usually see Java imports set up to have a flat namespace within the given module... er, class. You don't often see people creating org.apache.jakarta.foo.bar.Baz instances in their method bodies. From bac at OCF.Berkeley.EDU Thu Jan 20 02:12:59 2005 From: bac at OCF.Berkeley.EDU (Brett C.) 
Date: Thu Jan 20 02:13:17 2005 Subject: [Python-Dev] python-dev Summary for 2004-12-01 through 2004-12-15 [draft] Message-ID: <41EF059B.5090908@ocf.berkeley.edu> Uh, life has been busy. Will probably send this one out this weekend some time so please get corrections in before then. ------------------------------------ ===================== Summary Announcements ===================== PyCon_ 2005 is well underway. The schedule is in the process of being finalized (just figuring out the order of the talks). And there is still time for the early-bird registration price of $175 ($125 students) before it expires on January 28th. Some day I will be all caught up with the Summaries... .. _PyCon: http://www.pycon.org ========= Summaries ========= ---------------------------------- PEPS: those existing and gestating ---------------------------------- [for emails on PEP updates, subscribe to python-checkins_ and choose the 'PEP' topic] A proto-PEP covering the __source__ proposal from the `last summary`_ has been posted to python-dev. `PEP 338`_ proposes how to modify the '-m' switch so as to be able to execute modules contained within packages. .. _python-checkins: http://mail.python.org/mailman/listinfo/python-checkins .. _PEP 338: http://www.python.org/peps/pep-0338.html Contributing threads: - `PEP: __source__ proposal <>`__ - `PEP 338: Executing modules inside packages with '-m' <>`__ ------------------- Deprecating modules ------------------- The xmllib module was deprecated but not listed in `PEP 4`_. What does one do? Well, this led to a long discussion on how to handle module deprecation. With the 'warnings' module now in existence, PEP 4 seemed to be less important. It was generally agreed that listing modules in PEP 4 was no longer needed. It was also agreed that deleting deprecated modules was not needed; it breaks code and disk space is cheap.
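The deprecation mechanics being agreed on here amount to only a few lines per module; a sketch of the pattern (illustrative only, not the actual xmllib source):

```python
import warnings

def deprecated_module(name, replacement):
    # Sketch: a deprecated module would run something like this at the
    # top of its body, so old users get told to move on while the docs
    # simply stop advertising the module to new users.
    warnings.warn("the %s module is deprecated; use %s instead"
                  % (name, replacement), DeprecationWarning, stacklevel=2)

# e.g. at the top of xmllib.py one might write:
# deprecated_module("xmllib", "a real XML parser such as xml.sax")
```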
It seems that no longer listing documentation and adding a deprecation warning is what is needed to properly deprecate a module. By no longer listing documentation new programmers will not use the code since they won't know about it. And adding the warning will let old users know that they should be using something else. .. _PEP 4: http://www.python.org/peps/pep-0004.html Contributing threads: - `Deprecated xmllib module <>`__ - `Rewriting PEP4 <>`__ ------------------------------------------ PR to fight the idea that Python is "slow" ------------------------------------------ An article_ in ACM TechNews that covered 2.4 had several mentions that Python was "slow" while justifying the slowness (whether it be flexibility or being fast enough). Guido (rightfully) didn't love all of the "slow" mentions which I am sure we have all heard at some point or another. The suggestions started to pour in on how to combat this. The initial one was to have a native compiler. The thinking was that if we compiled to a native executable that people psychologically would stop the association of Python being interpreted which is supposed to be slow. Some people didn't love this idea since a native compiler is not an easy thing. Others suggested including Pyrex with CPython, but didn't catch on (maintenance issue plus one might say Pyrex is not the most Pythonic solution). This didn't get anywhere in the end beyond the idea of a SIG about the various bundling tools (py2app, py2exe, etc.). The other idea was to just stop worrying about speed and move on stomping out bugs and making Python functionally more useful. With modules in the stdlib being rewritten in C for performance reasons it was suggested we are putting out the perception that performance is important to us. Several other people also suggested that we just not mention speed as a big deal in release notes and such. 
This also tied into the idea that managers don't worry about speed so much as about being able to hire a bunch of Python programmers. This led to the suggestion of also emphasizing that Python is very easy to learn, which makes the hiring concern largely moot. There are a good number of Python programmers, though; Stephan Deibel had some rough calculations that put the number at about 750K Python developers worldwide (give or take; rough middle point of two different calculations). .. _article: http://gcn.com/vol1_no1/daily-updates/28026-1.html Contributing threads: - `2.4 news reaches interesting places <>`__ =============== Skipped Threads =============== - MS VC compiler versions - Any reason why CPPFLAGS not used in compiling? Extension modules now compile with directories specified in the LDFLAGS and CPPFLAGS env vars - adding key argument to min and max min and max now have a 'key' argument like list.sort - Unicode in doctests - SRE bug and notifications - PyInt_FromLong returning NULL - PyOS_InputHook enhancement proposal - The other Py2.4 issue - MinGW And The other Py2.4 issue - Supporting Third Party Modules - Python in education From abo at minkirri.apana.org.au Thu Jan 20 02:43:43 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Thu Jan 20 02:44:21 2005 Subject: [Python-Dev] Strange segfault in Python threads and linux kernel 2.6 In-Reply-To: <2mpt01bkvs.fsf@starship.python.net> References: <1106111769.3822.52.camel@schizo> <2mpt01bkvs.fsf@starship.python.net> Message-ID: <1106185423.3784.26.camel@schizo> On Wed, 2005-01-19 at 13:37 +0000, Michael Hudson wrote: > Donovan Baarda writes: [...] > You've left out a very important piece of information: which version > of Python you are using. I'm guessing 2.3.4. Can you try 2.4? Debian Python2.3 (2.3.4-18), Debian kernel-image-2.6.8-1-686 (2.6.8-10), and Debian kernel-image-2.4.27-1-686 (2.4.27-6) > I'd be astonished if this is the same bug.
> > The main oddness about python threads (before 2.3) is that they run > with all signals masked. You could play with a C wrapper (call > setprocmask, then exec fop) to see if this is what is causing the > problem. But please try 2.4. Python 2.4 does indeed fix the problem. Unfortunately we are using Zope 2.7.4, and I'm a bit wary of attempting to migrate it all from 2.3 to 2.4. Is there any way this "Fix" can be back-ported to 2.3? Note that this problem is being triggered when using Popen3() in a thread. Popen3() simply uses os.fork() and os.execvp(). The segfault is occurring in the execvp'ed process. I'm sure there must be plenty of cases where this could happen. I think most people manage to avoid it because the processes they are popen'ing or exec'ing happen to not use signals. After testing a bit, it seems the fork() in Popen3 is not a contributing factor. The problem occurs whenever os.execvp() is executed in a thread. It looks like the exec'ed command inherits the masked signals from the thread. I'm not sure what the correct behaviour should be. The fact that it works in python2.4 feels more like a byproduct of the thread mask change than correct behaviour. To me it seems like execvp() should be setting the signal mask back to defaults or at least the mask of the main process before doing the exec. > > BTW, built in file objects really could use better non-blocking > support... I've got a half-drafted PEP for it... anyone interested in > it? > > Err, this probably should be in a different mail :) The verboseness of the attached test code because of this issue prompted that comment...
so vaguely related :-) -- Donovan Baarda http://minkirri.apana.org.au/~abo/ From skip at pobox.com Thu Jan 20 02:42:03 2005 From: skip at pobox.com (Skip Montanaro) Date: Thu Jan 20 03:00:09 2005 Subject: [Python-Dev] Unix line endings required for PyRun* breaking embedded Python In-Reply-To: <41EEE7F9.7000902@stuartbishop.net> References: <41EEE7F9.7000902@stuartbishop.net> Message-ID: <16879.3179.793126.174549@montanaro.dyndns.org>

    Stuart> I don't think it is possible for plpythonu to fix this by simply
    Stuart> translating the line endings, as this would require significant
    Stuart> knowledge of Python syntax to do correctly (triple quoted
    Stuart> strings and character escaping I think).

I don't see why not. If you treat the string as a file in text mode, I think you'd replace all [\r\n]+ with \n, even if it was embedded in a string:

    >>> s
    'from math import pi\r\n"""triple-quoted string embedding CR:\rrest of string"""\r\nprint 2*pi*7\r'
    >>> open("foo", "w").write(s)
    >>> open("foo", "rU").read()
    'from math import pi\n"""triple-quoted string embedding CR:\nrest of string"""\nprint 2*pi*7\n'

Just re.sub("[\r\n]+", "\n", s) and I think you're good to go. Skip From skip at pobox.com Thu Jan 20 02:47:02 2005 From: skip at pobox.com (Skip Montanaro) Date: Thu Jan 20 03:00:15 2005 Subject: [Python-Dev] Re: Zen of Python In-Reply-To: <5.1.1.6.0.20050119190842.032382a0@mail.telecommunity.com> References: <3e8ca5c8050119150358c71728@mail.gmail.com> <972ec5bd050119111359e358f5@mail.gmail.com> <5.1.1.6.0.20050119190842.032382a0@mail.telecommunity.com> Message-ID: <16879.3478.511912.271438@montanaro.dyndns.org>

    Phillip> Actually, this is one of those rare cases where optimization
    Phillip> and clarity go hand in hand. Human brains just don't handle
    Phillip> nesting that well. It's easy to visualize two levels of nested
    Phillip> structure, but three is a stretch unless you can abstract at
    Phillip> least one of the layers.
Also, if you think about nesting in a class/instance context, something like self.attr.foo.xyz() says you are noodling around in the implementation details of self.attr (you know it has a data attribute called "foo"). This provides for some very tight coupling between the implementation of whatever self.attr is and your code. If there is a reason for you to get at whatever xyz() returns, it's probably best to publish a method as part of the api for self.attr. Skip From stephen.thorne at gmail.com Thu Jan 20 01:14:55 2005 From: stephen.thorne at gmail.com (Stephen Thorne) Date: Thu Jan 20 07:47:19 2005 Subject: [Python-Dev] Re: Zen of Python In-Reply-To: <972ec5bd05011916033242179@mail.gmail.com> References: <972ec5bd050119111359e358f5@mail.gmail.com> <3e8ca5c8050119150358c71728@mail.gmail.com> <972ec5bd05011916033242179@mail.gmail.com> Message-ID: <3e8ca5c805011916144ce90c27@mail.gmail.com> On Wed, 19 Jan 2005 19:03:25 -0500, Timothy Fitz wrote: > On Thu, 20 Jan 2005 09:03:30 +1000, Stephen Thorne > wrote: > > "Flat is better than nested" has one foot in concise powerful > > programming, the other foot in optimisation. > > > > foo.bar.baz.arr involves 4 hashtable lookups. arr is just one hashtable lookup. > > I find it amazingly hard to believe that this is implying optimization > over functionality or clarity. There has to be another reason, yet I > can't think of any. What I meant to say was, 'flat is better than nested' allows you to write more concise code, while also writing faster code. Stephen. 
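The flat-versus-nested lookup cost that Stephen and Skip describe comes down to a small idiom: bind the deeply nested attribute to a local name once instead of re-walking the chain on every use. All names below are made up for illustration:

```python
# Each dot in foo.bar.baz.arr is a separate attribute lookup, so
# hoisting the innermost object into a local name both flattens the
# code and skips the repeated traversals.
class Baz:
    def __init__(self):
        self.arr = [1, 2, 3]

class Bar:
    def __init__(self):
        self.baz = Baz()

class Foo:
    def __init__(self):
        self.bar = Bar()

foo = Foo()

def sum_nested(n):
    # walks foo.bar.baz.arr on every iteration
    return sum(foo.bar.baz.arr[0] for _ in range(n))

def sum_flat(n):
    arr = foo.bar.baz.arr   # one traversal, bound to a local
    return sum(arr[0] for _ in range(n))

assert sum_nested(5) == sum_flat(5) == 5
```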
From aleax at aleax.it Thu Jan 20 09:09:36 2005 From: aleax at aleax.it (Alex Martelli) Date: Thu Jan 20 09:09:42 2005 Subject: [Python-Dev] Re: Zen of Python In-Reply-To: <16879.3478.511912.271438@montanaro.dyndns.org> References: <3e8ca5c8050119150358c71728@mail.gmail.com> <972ec5bd050119111359e358f5@mail.gmail.com> <5.1.1.6.0.20050119190842.032382a0@mail.telecommunity.com> <16879.3478.511912.271438@montanaro.dyndns.org> Message-ID: <9FDAA05E-6ABA-11D9-9DED-000A95EFAE9E@aleax.it> On 2005 Jan 20, at 02:47, Skip Montanaro wrote: > Phillip> Actually, this is one of those rare cases where > optimization > Phillip> and clarity go hand in hand. Human brains just don't > handle > Phillip> nesting that well. It's easy to visualize two levels of > nested > Phillip> structure, but three is a stretch unless you can abstract > at > Phillip> least one of the layers. > > Also, if you think about nesting in a class/instance context, > something like > > self.attr.foo.xyz() > > says you are noodling around in the implementation details of > self.attr (you > know it has a data attribute called "foo"). This provides for some > very > tight coupling between the implementation of whatever self.attr is and > your > code. If there is a reason for you to get at whatever xyz() returns, > it's > probably best to publish a method as part of the api for self.attr. Good point: this is also known as "Law of Demeter" and relevant summaries and links are for example at http://www.ccs.neu.edu/home/lieber/LoD.html . Alex From just at letterror.com Thu Jan 20 09:48:41 2005 From: just at letterror.com (Just van Rossum) Date: Thu Jan 20 09:48:48 2005 Subject: [Python-Dev] Unix line endings required for PyRun* breaking embedded Python In-Reply-To: <16879.3179.793126.174549@montanaro.dyndns.org> Message-ID: Skip Montanaro wrote: > Just re.sub("[\r\n]+", "\n", s) and I think you're good to go. I don't think that in general you want to fold multiple empty lines into one. 
This would be my preferred regex:

    s = re.sub(r"\r\n?", "\n", s)

Catches both DOS and old-style Mac line endings. Alternatively, you can use s.splitlines():

    s = "\n".join(s.splitlines()) + "\n"

This also makes sure the string ends with a \n, which may or may not be a good thing, depending on your application. Just From fredrik at pythonware.com Thu Jan 20 09:57:50 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu Jan 20 09:57:43 2005 Subject: [Python-Dev] Re: Unix line endings required for PyRun* breaking embedded Python References: <16879.3179.793126.174549@montanaro.dyndns.org> Message-ID:

Just van Rossum wrote:
> I don't think that in general you want to fold multiple empty lines into
> one. This would be my preferred regex:
>
> s = re.sub(r"\r\n?", "\n", s)
>
> Catches both DOS and old-style Mac line endings. Alternatively, you can
> use s.splitlines():
>
> s = "\n".join(s.splitlines()) + "\n"
>
> This also makes sure the string ends with a \n, which may or may not be
> a good thing, depending on your application.

s = s.replace("\r", "\n"["\n" in s:])

From gvanrossum at gmail.com Thu Jan 20 12:07:35 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu Jan 20 12:07:41 2005 Subject: [Python-Dev] Updated Monkey Typing pre-PEP In-Reply-To: <5.1.1.6.0.20050116130723.034a10d0@mail.telecommunity.com> References: <5.1.1.6.0.20050116130723.034a10d0@mail.telecommunity.com> Message-ID:

[Phillip J. Eby]
> I've revised the draft today to simplify the terminology, discussing only
> two broad classes of adapters. Since Clark's pending proposals for PEP 246
> align well with the concept of "extenders" vs. "independent adapters", I've
> refocused my PEP to focus exclusively on adding support for "extenders",
> since PEP 246 already provides everything needed for independent adapters.
> > The new draft is here: > http://peak.telecommunity.com/DevCenter/MonkeyTyping On the plane to the Amazon.com internal developers conference in Seattle (a cool crowd BTW) I finally got to read this. I didn't see a way to attach comments to Phillip's draft, so here's my response. (And no, it hasn't affected my ideas about optional typing. :) The Monkey Typing proposal is trying to do too much, I believe. There are two or three separate problems, and I think it would be better to deal with each separately. The first problem is what I'd call incomplete duck typing. There is a function that takes a sequence argument, and you have an object that partially implements the sequence protocol. What do you do? In current Python, you just pass the object and pray -- if the function only uses the methods that your object implements, it works, otherwise you'll get a relatively clean AttributeError (e.g. "Foo instance has no attribute '__setitem__'"). Phillip worries that solving this with interfaces would cause a proliferation of "partial sequence" interfaces representing the needs of various libraries. Part of his proposal comes down to having a way to declare that some class C implements some interface I, even if C doesn't implement all of I's methods (as long as it implements at least one). I like having this ability, but I think this fits in the existing proposals for declaring interface conformance: there's no reason why C couldn't have a __conform__ method that claims it conforms to I even if it doesn't implement all methods. Or if you don't want to modify C, you can do the same thing using the external adapter registry. I'd also like to explore ways of creating partial interfaces on the fly. For example, if we need only the read() and readlines() methods of the file protocol, maybe we could declare that as follows::

    def foo(f: file['read', 'readlines']): ...
I find the quoting inelegant, so maybe this would be better::

    file[file.read, file.readlines]

Yet another idea (which places a bigger burden on the typecheck() function presumed by the type declaration notation, see my blog on Artima.com) would be to just use a list of the needed methods::

    [file.read, file.readlines]

All this would work better if file weren't a concrete type but an interface. Now on to the other problems Phillip is trying to solve with his proposal. He says, sometimes there's a class that has the functionality that you need, but it's packaged differently. I'm not happy with his proposal for solving this by declaring various adapting functions one at a time, and I'd much rather see this done without adding new machinery or declarations: when you're using adaptation, just write an adapter class and register it; without adaptation, you can still write the adapter class and explicitly instantiate it. I have to admit that I totally lost track of the proposal when it started to talk about JetPacks. I believe that this is trying to deal with stateful adapters. I hope that Phillip can write something up about these separately from all the other issues, maybe then it's clearer. There's one other problem that Phillip tries to tackle in his proposal: how to implement the "rich" version of an interface if all you've got is a partial implementation (e.g. you might have readline() but you need readlines()). I think this problem is worthy of a solution, but I think the solution could be found, again, in a traditional adapter class.
Here's a sketch::

    class RichFile:
        def __init__(self, ref):
            self.__ref = ref
            if not hasattr(ref, 'readlines'):
                self.readlines = self.__readlines
                # Other forms of this magic are conceivable

        def __readlines(self):
            # Ignoring the rarely used optional argument
            # It's tempting to use [line for line in self.__ref] here
            # but that doesn't use readline()
            lines = []
            while True:
                line = self.__ref.readline()
                if not line:
                    break
                lines.append(line)
            return lines

        def __getattr__(self, name):
            # Delegate all other attributes to the underlying object
            return getattr(self.__ref, name)

Phillip's proposal reduces the amount of boilerplate in this class somewhat (mostly the constructor and the __getattr__() method), but apart from that it doesn't really seem to do a lot except let you put pieces of the adapter in different places, which doesn't strike me as such a great idea.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From marktrussell at btopenworld.com Thu Jan 20 13:33:05 2005 From: marktrussell at btopenworld.com (Mark Russell) Date: Thu Jan 20 13:33:08 2005 Subject: [Python-Dev] Updated Monkey Typing pre-PEP In-Reply-To: References: <5.1.1.6.0.20050116130723.034a10d0@mail.telecommunity.com> Message-ID: <1106224384.5347.6.camel@localhost> On Thu, 2005-01-20 at 11:07, Guido van Rossum wrote:

> I'd also like to explore ways of creating partial interfaces on the
> fly. For example, if we need only the read() and readlines() methods
> of the file protocol, maybe we could declare that as follows::
>
> def foo(f: file['read', 'readlines']): ...
>
> I find the quoting inelegant, so maybe this would be better::
>
> file[file.read, file.readlines]

Could you not just have a builtin which constructs an interface on the fly, so you could write:

    def foo(f: interface(file.read, file.readlines)): ...

For commonly used subsets of course you'd do something like:

    IInputStream = interface(file.read, file.readlines)

    def foo(f: IInputStream): ...
I can't see that interface() would need much magic - I would guess you could implement it in python with ordinary introspection. Mark Russell From arigo at tunes.org Thu Jan 20 13:38:26 2005 From: arigo at tunes.org (Armin Rigo) Date: Thu Jan 20 13:50:19 2005 Subject: [Python-Dev] Getting rid of unbound methods: patch available In-Reply-To: <41ECD6D9.9000001@egenix.com> References: <1105945019.30052.26.camel@localhost> <41EC431A.90204@egenix.com> <41ECD6D9.9000001@egenix.com> Message-ID: <20050120123826.GA30873@vicky.ecs.soton.ac.uk> Hi, Removing unbound methods also breaks the 'py' lib quite a bit. The 'py.test' framework handles function and bound/unbound method objects all over the place, and uses introspection on them, as they are the objects defining the tests to run. It's nothing that can't be repaired, and at places the fix even looks nicer than the original code, but I thought that it points to large-scale breakage. I'm expecting any code that relies on introspection to break at least here or there. My bet is that even if it's just for fixes a couple of lines long everyone will have to upgrade a number of their packages when switching to Python 2.5 -- unheard of ! For reference, the issues I got with the py lib are described at http://codespeak.net/pipermail/py-dev/2005-January/000159.html Armin From skip at pobox.com Thu Jan 20 13:51:41 2005 From: skip at pobox.com (Skip Montanaro) Date: Thu Jan 20 14:00:16 2005 Subject: [Python-Dev] Re: Unix line endings required for PyRun* breaking embedded Python In-Reply-To: References: <16879.3179.793126.174549@montanaro.dyndns.org> Message-ID: <16879.43357.357745.467891@montanaro.dyndns.org>

    Fredrik> s = s.replace("\r", "\n"["\n" in s:])

This fails on admittedly weird strings that mix line endings:

    >>> s = "abc\rdef\r\n"
    >>> s = s.replace("\r", "\n"["\n" in s:])
    >>> s
    'abcdef\n'

where universal newline mode or Just's re.sub() gadget would work.
Skip From skip at pobox.com Thu Jan 20 13:44:00 2005 From: skip at pobox.com (Skip Montanaro) Date: Thu Jan 20 14:00:18 2005 Subject: [Python-Dev] Unix line endings required for PyRun* breaking embedded Python In-Reply-To: References: <16879.3179.793126.174549@montanaro.dyndns.org> Message-ID: <16879.42896.49304.693682@montanaro.dyndns.org>

    Just> Skip Montanaro wrote:
    >> Just re.sub("[\r\n]+", "\n", s) and I think you're good to go.
    Just> I don't think that in general you want to fold multiple empty
    Just> lines into one.

Whoops. Yes. Skip From mdehoon at ims.u-tokyo.ac.jp Thu Jan 20 14:18:52 2005 From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon) Date: Thu Jan 20 14:14:54 2005 Subject: [Python-Dev] Patch review [ 723201 ] PyArg_ParseTuple problem with 'L' format Message-ID: <41EFAFBC.8000203@ims.u-tokyo.ac.jp> Patch review [ 723201 ] PyArg_ParseTuple problem with 'L' format The PyArg_ParseTuple function (PyObject *args, char *format, ...) parses the arguments args and stores them in the variables specified following the format argument. If format=="i", indicating an integer, but the corresponding Python object in args is not a Python int or long, a TypeError is thrown:

    TypeError: an integer is required

For the "L" format, indicating a long long, instead a SystemError is thrown:

    SystemError: Objects/longobject.c:788: bad argument to internal function

The submitted patch fixes this, however I think it is not the best way to do it.
The original code (part of the convertsimple function in Python/getargs.c) is

    case 'L': {/* PY_LONG_LONG */
        PY_LONG_LONG *p = va_arg( *p_va, PY_LONG_LONG * );
        PY_LONG_LONG ival = PyLong_AsLongLong( arg );
        if( ival == (PY_LONG_LONG)-1 && PyErr_Occurred() ) {
            return converterr("long", arg, msgbuf, bufsize);
        } else {
            *p = ival;
        }
        break;
    }

In the patch, a PyLong_Check and a PyInt_Check are added:

    case 'L': {/* PY_LONG_LONG */
        PY_LONG_LONG *p = va_arg(*p_va, PY_LONG_LONG *);
        PY_LONG_LONG ival;
        /* ********** patch starts here ********** */
        if (!PyLong_Check(arg) && !PyInt_Check(arg))
            return converterr("long", arg, msgbuf, bufsize);
        /* ********** patch ends here ********** */
        ival = PyLong_AsLongLong(arg);
        if (ival == (PY_LONG_LONG)-1 && PyErr_Occurred()) {
            return converterr("long", arg, msgbuf, bufsize);
        } else {
            *p = ival;
        }
        break;
    }

However, the PyLong_AsLongLong function (in Objects/longobject.c) also contains a call to PyLong_Check and PyInt_Check, so there should be no need for another such check here:

    PY_LONG_LONG
    PyLong_AsLongLong(PyObject *vv)
    {
        PY_LONG_LONG bytes;
        int one = 1;
        int res;

        if (vv == NULL) {
            PyErr_BadInternalCall();
            return -1;
        }
        if (!PyLong_Check(vv)) {
            if (PyInt_Check(vv))
                return (PY_LONG_LONG)PyInt_AsLong(vv);
            PyErr_BadInternalCall();
            return -1;
        }

A better solution would be to replace the PyErr_BadInternalCall() in the PyLong_AsLongLong function by

    PyErr_SetString(PyExc_TypeError, "an integer is required");

This would make it consistent with PyInt_AsLong in Objects/intobject.c:

    long
    PyInt_AsLong(register PyObject *op)
    {
        PyNumberMethods *nb;
        PyIntObject *io;
        long val;

        if (op && PyInt_Check(op))
            return PyInt_AS_LONG((PyIntObject*) op);
        if (op == NULL || (nb = op->ob_type->tp_as_number) == NULL ||
            nb->nb_int == NULL) {
            PyErr_SetString(PyExc_TypeError, "an integer is required");
            return -1;
        }

By the way, I noticed that a Python float is converted to an int (with a deprecation warning), while trying to convert a Python float into a long long int results
in a TypeError. Also, I'm not sure about the function of the calls to converterr (in various places in the convertsimple function); none of the argument type errors seem to lead to the warning messages created by converterr. --Michiel. -- Michiel de Hoon, Assistant Professor University of Tokyo, Institute of Medical Science Human Genome Center 4-6-1 Shirokane-dai, Minato-ku Tokyo 108-8639 Japan http://bonsai.ims.u-tokyo.ac.jp/~mdehoon From mwh at python.net Thu Jan 20 15:12:37 2005 From: mwh at python.net (Michael Hudson) Date: Thu Jan 20 15:12:40 2005 Subject: [Python-Dev] Strange segfault in Python threads and linux kernel 2.6 In-Reply-To: <1106185423.3784.26.camel@schizo> (Donovan Baarda's message of "Thu, 20 Jan 2005 12:43:43 +1100") References: <1106111769.3822.52.camel@schizo> <2mpt01bkvs.fsf@starship.python.net> <1106185423.3784.26.camel@schizo> Message-ID: <2mllaob356.fsf@starship.python.net> Donovan Baarda writes: > On Wed, 2005-01-19 at 13:37 +0000, Michael Hudson wrote: >> Donovan Baarda writes: > [...] >> You've left out a very important piece of information: which version >> of Python you are using. I'm guessing 2.3.4. Can you try 2.4? > > Debian Python2.3 (2.3.4-18), Debian kernel-image-2.6.8-1-686 (2.6.8-10), > and Debian kernel-image-2.4.27-1-686 (2.4.27-6) > >> I'd be astonished if this is the same bug. >> >> The main oddness about python threads (before 2.3) is that they run >> with all signals masked. You could play with a C wrapper (call >> setprocmask, then exec fop) to see if this is what is causing the >> problem. But please try 2.4. > > Python 2.4 does indeed fix the problem. That's good to hear. > Unfortunately we are using Zope 2.7.4, and I'm a bit wary of > attempting to migrate it all from 2.3 to 2.4. That's not so good to hear, albeit unsurprising. > Is there any way this "Fix" can be back-ported to 2.3? Probably not. It was quite invasive and a bit scary. OTOH, it hasn't been the cause of any bug reports yet, so it can't be all bad. 
> Note that this problem is being triggered when using > Popen3() in a thread. Popen3() simply uses os.fork() and os.execvp(). > The segfault is occurring in the excecvp'ed process. I'm sure there must > be plenty of cases where this could happen. I think most people manage > to avoid it because the processes they are popen'ing or exec'ing happen > to not use signals. Indeed. > After testing a bit, it seems the fork() in Popen3 is not a contributing > factor. The problem occurs whenever os.execvp() is executed in a thread. > It looks like the exec'ed command inherits the masked signals from the > thread. Yeah. I could have told you that, sorry :) > I'm not sure what the correct behaviour should be. The fact that it > works in python2.4 feels more like a byproduct of the thread mask change > than correct behaviour. Well, getting rid of the thread mask changes was one of the goals of the change. > To me it seems like execvp() should be setting the signal mask back > to defaults or at least the mask of the main process before doing > the exec. Possibly. I think the 2.4 change -- not fiddling the process mask at all -- is the Right Thing, but that doesn't help 2.3 users. This has all been discussed before at some length, on python-dev and in various bug reports on SF. In your situation, I think the simplest thing you can do is dig out an old patch of mine that exposes sigprocmask + co to Python and either make a custom Python incorporating the patch and use that, or put the code from the patch into an extension module. Then before execing fop, use the new code to set the signal mask to something sane. Not pretty, particularly, but it should work. >> > BTW, built in file objects really could use better non-blocking >> > support... I've got a half-drafted PEP for it... anyone interested in >> > it? >> >> Err, this probably should be in a different mail :) > > The verboseness of the attached test code because of this issue prompted > that comment... 
so vaguely related :-) Oh right :) Didn't actually read the test code, not having fop to hand... Cheers, mwh -- The ability to quote is a serviceable substitute for wit. -- W. Somerset Maugham From skip at pobox.com Thu Jan 20 15:22:09 2005 From: skip at pobox.com (Skip Montanaro) Date: Thu Jan 20 16:00:01 2005 Subject: [Python-Dev] ANN: Free Trac/Subversion hosting at Python-Hosting.com (fwd) Message-ID: <16879.48785.395695.742124@montanaro.dyndns.org> Thought I'd pass this along for people who don't read comp.lang.python. Skip -------------- next part -------------- An embedded message was scrubbed... From: remi@cherrypy.org (Remi Delon) Subject: ANN: Free Trac/Subversion hosting at Python-Hosting.com Date: 19 Jan 2005 10:15:02 -0800 Size: 5405 Url: http://mail.python.org/pipermail/python-dev/attachments/20050120/b0ad5d8f/attachment-0001.mht From pje at telecommunity.com Thu Jan 20 17:02:38 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu Jan 20 17:02:34 2005 Subject: [Python-Dev] Updated Monkey Typing pre-PEP In-Reply-To: References: <5.1.1.6.0.20050116130723.034a10d0@mail.telecommunity.com> <5.1.1.6.0.20050116130723.034a10d0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050120101741.0405fb30@mail.telecommunity.com> At 03:07 AM 1/20/05 -0800, Guido van Rossum wrote: >Phillip worries that solving this with interfaces would cause a >proliferation of "partial sequence" interfaces representing the needs >of various libraries. Part of his proposal comes down to having a way >to declare that some class C implements some interface I, even if C >doesn't implement all of I's methods (as long as implements at least >one). I like having this ability, but I think this fits in the >existing proposals for declaring interface conformance: there's no >reason why C couldn't have a __conform__ method that claims it >conforms to I even if it doesn't implement all methods. 
Or if you >don't want to modify C, you can do the same thing using the external >adapter registry. There are some additional things that it does in this area: 1. Avoids namespace collisions when an object has a method with the same name as one in an interface, but which doesn't do the same thing. (A common criticism of duck typing by static typing advocates; i.e. how do you know that 'read()' has the same semantics as what this routine expects?) 2. Provides a way to say that you conform, without writing a custom __conform__ method 3. Syntax for declaring conformance is the same as for adaptation 4. Allows *external* (third-party) code to declare a type's conformance, which is important for integrating existing code with code with type declarations >I'd also like to explore ways of creating partial interfaces on the >fly. For example, if we need only the read() and readlines() methods >of the file protocol, maybe we could declare that as follows:: > > def foo(f: file['read', 'readlines']): ... FYI, this is similar to the suggestion from Samuele Pedroni that lead to PyProtocols having a: protocols.protocolForType(file, ['read','readlines']) capability, that implements this idea. However, the problem with implementing it by actually having distinct protocols is that declaring as few as seven methods results in 127 different protocol objects with conformance relationships to manage. In practice, I've also personally never used this feature, and probably never would unless it had meaning for type declarations. Also, your proposal as shown would be tedious for the declarer compared to just saying 'file' and letting the chips fall where they may. >Now on to the other problems Phillip is trying to solve with his >proposal. He says, sometimes there's a class that has the >functionality that you need, but it's packaged differently. 
I'm not >happy with his proposal for solving this by declaring various adapting >functions one at a time, and I'd much rather see this done without >adding new machinery or declarations: when you're using adaptation, >just write an adapter class and register it; without adaptation, you >can still write the adapter class and explicitly instantiate it. In the common case (at least for my code) an adapter class has only one or two methods, but the additional code and declarations needed to make it an adapter can increase the code size by 20-50%. Using @like directly on an adapting method would result in a more compact expression in the common case. >I have to admit that I totally lost track of the proposal when it >started to talk about JetPacks. I believe that this is trying to deal >with stateful adapters. I hope that Phillip can write something up >about these separately from all the other issues, maybe then it's >clearer. Yes, it was for per-object ("as a") adapter state, rather than per-adapter ("has a") state, however. The PEP didn't try to tackle "has a" adapters at all. >Phillip's proposal reduces the amount of boilerplate in this class >somewhat (mostly the constructor and the __getattr__() method), Actually, it wouldn't implement the __getattr__; a major point of the proposal is that when adapting to an interface, you get *only* the attributes from the interface, and of those only the ones that the adaptee has implementations for. So, arbitrary __getattr__ doesn't pass down to the adapted item. > but >apart from that it doesn't really seem to do a lot except let you put >pieces of the adapter in different places, which doesn't strike me as >such a great idea. The use case for that is that you are writing a package which extends an interface IA to create interface IB, and there already exist numerous adapters to IA. As long as IB's additional methods can be defined in terms of IA, then you can extend all of those adapters at one stroke. 
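Guido's "write an adapter class and register it" suggestion quoted above can be sketched with a minimal, hypothetical registry. None of these names (register_adapter, adapt, Reader, StringReader) are PEP 246's actual API; this is an illustration only:

```python
# Minimal, hypothetical adapter registry illustrating "write an
# adapter class and register it".  All names here are illustrative.
_adapters = {}

def register_adapter(from_type, protocol, adapter):
    _adapters[(from_type, protocol)] = adapter

def adapt(obj, protocol):
    if isinstance(obj, protocol):
        return obj  # already conforms; no wrapping needed
    # Walk the MRO so subclasses pick up registered adapters.
    for klass in type(obj).__mro__:
        adapter = _adapters.get((klass, protocol))
        if adapter is not None:
            return adapter(obj)
    raise TypeError("cannot adapt %r" % (obj,))

class Reader:
    """The target protocol: something with a read(n) method."""
    def read(self, n):
        raise NotImplementedError

class StringReader:
    """Adapter giving a plain string a file-like read(n)."""
    def __init__(self, s):
        self._s, self._pos = s, 0
    def read(self, n):
        chunk = self._s[self._pos:self._pos + n]
        self._pos += n
        return chunk

register_adapter(str, Reader, StringReader)
r = adapt("hello world", Reader)
print(r.read(5))  # -> hello
```

The explicit instantiation Guido mentions is just StringReader("hello world") with no registry involved.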
In other words, external abstract operations are exactly equivalent to stateless, lossless, interface-to-interface adapters applied transitively. But the point of the proposal was to avoid having to explain to somebody what all those terms mean, while making it easier to do such an adaptation correctly and succinctly. One problem with using concrete adapter classes to full interfaces rather than partial interfaces is that it leads to situations like Alex's adapter diamond examples, because you end up with not-so-good adapters and few common adaptation targets. The idea of operation conformance and interface-as-namespace is to make it easier to have fewer interfaces and therefore fewer adapter diamonds. And, equally important, if you have only partial conformance you don't have to worry about claiming to have more information/ability than you actually have, which was the source of the problem in one class of Alex's examples. If you substitute per-operation adapters in Alex's PersonName example, the issue disappears because there isn't an adapter claiming to supply a middle name that it doesn't have; that operation or attribute simply doesn't appear on the dynamic adapter class in that case. By the way, this concept is also exactly equivalent to single-dispatched generic functions in a language like Dylan. In Dylan, a protocol consists of a set of abstract generic functions, not unlike the no-op methods in a Python interface. However, instead of adapting objects or declaring their conformance, you declare how those methods are implemented for a particular subject type, and that does not have to be in the class for the subject type, or in the class where the method is. And when you invoke the operation, you do the moral equivalent of 'file.read(file_like_object, bytes)', rather than 'file_like_object.read(bytes)', and the right implementation is looked up by the concrete type of 'file_like_object'. 
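The Dylan-style dispatch just described, where the implementation is looked up by the concrete type of the first argument, can be sketched in Python with an explicit type->implementation table (the 'stringify' operation and helper names are hypothetical):

```python
# Sketch of a single-dispatch generic function: an explicit
# type -> implementation table with explicit lookup, in the style
# pickle and copy use.  The 'stringify' operation is hypothetical.
_impls = {}

def register(typ, impl):
    _impls[typ] = impl

def stringify(obj):
    # Walk the MRO so subclasses inherit a registered implementation.
    for klass in type(obj).__mro__:
        if klass in _impls:
            return _impls[klass](obj)
    raise TypeError("no stringify implementation for %s"
                    % type(obj).__name__)

register(int, lambda n: "int:%d" % n)
register(list, lambda xs: "[%s]" % ", ".join(stringify(x) for x in xs))

print(stringify([1, 2]))  # -> [int:1, int:2]
```

Third-party code can call register() for types it does not own, which is the property the proposal relies on.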
Of course, that's not a very Pythonic style, so the idea of this PEP was to swap it around so the type declaration of 'file' is automatically turning 'filelike.read(bytes)' into 'file_interface.read(filelike,bytes)' internally. Pickling and copying and such in the stdlib are already generic functions of this kind. You have a dictionary of type->implementation for each of these operations. The table is explicit and the lookup is explicit, and adaptation doesn't come into it, but this is basically the same as what you'd do in Dylan by having the moral equivalent of a 'picklable' protocol with a 'pickle(ob,stream)' generic function, and implementations declared elsewhere. So, the concept of registering implementations of an operation in an interface for a given concrete type (that can happen from third-party code!) certainly isn't without precedent in Python. Once you look at it through that lens, then you will see that everything in the proposal that doesn't deal with stateful adaptation is just a straightforward way to flip from 'operation(ob,...)' to 'ob.operation(...)', where the original 'operation()' is a type-registered operation like 'pickle', but created automatically for existing operations like file.read. So if it "does too much", it's only because that one concept of a type-dispatched function in Python provides for many possibilities. :) From martin at v.loewis.de Thu Jan 20 17:22:53 2005 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu Jan 20 17:22:49 2005 Subject: [Python-Dev] a bunch of Patch reviews In-Reply-To: <41ED9B02.2040908@xs4all.nl> References: <41EA9196.1020709@xs4all.nl> <41ED645B.40709@xs4all.nl> <41ED8B0A.7050201@v.loewis.de> <41ED9B02.2040908@xs4all.nl> Message-ID: <41EFDADD.10701@v.loewis.de> Irmen de Jong wrote: > That sounds very convenient, thanks. Ok, welcome to the project! Please let me know whether it "works". 
> Does the status of 'python project member' come with > certain expectations that must be complied with ? ;-) There are a few conventions that are followed more or less stringently. You should be aware of the things in the developer FAQ, http://www.python.org/dev/devfaq.html Initially, "new" developers should follow a "write-after-approval" procedure, i.e. they should not commit anything until they got somebody's approval. Later, we commit things which we feel confident about, and post other things to SF. For CVS, I'm following a few more conventions which I think are not documented anywhere. - Always add a CVS commit message - Add an entry to Misc/NEWS, if there is a new feature, or if it is a bug fix for a maintenance branch (I personally don't list bug fixed in the HEAD revision, but others apparently do) - When committing configure.in, always remember to commit configure also (and pyconfig.h.in if it changed; remember to run autoheader) - Always run the test suite before committing - If you are committing a bug fix, consider to backport it to maintenance branches right away. If you don't backport it immediately, it likely won't appear in the next release. At the moment, backports to 2.4 are encouraged; backports to 2.3 are still possible for a few more days. If you chose not to backport for some reason, document that reason in the commit message. If you plan to backport, document that intention in the commit message (I usually say "Will backport to 2.x") - In the commit message, always refer to the SF tracker id. In the tracker item, always refer to CVS version numbers. I use the script attached to extract those numbers from the CVS commit message, to paste them into the SF tracker. 
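Martin's attached script is not reproduced in the archive; a hypothetical sketch of pulling file names and new revision numbers out of CVS checkin output, for pasting into an SF tracker item, might look like this (the regexes and function name are illustrative, not the real script):

```python
import re

# Hypothetical sketch; the actual attached script is not shown in the
# archive.  Extracts (path, new revision) pairs from CVS checkin output.
CHECKIN = re.compile(r"^(?P<path>\S+),v\s+<--\s+\S+\s*$", re.M)
REV = re.compile(r"^new revision: (?P<new>[\d.]+); "
                 r"previous revision: (?P<old>[\d.]+)", re.M)

def extract_revisions(message):
    paths = [m.group("path") for m in CHECKIN.finditer(message)]
    revs = [m.group("new") for m in REV.finditer(message)]
    return list(zip(paths, revs))

sample = """\
Checking in Lib/locale.py;
/cvsroot/python/python/dist/src/Lib/locale.py,v  <--  locale.py
new revision: 1.23; previous revision: 1.22
done
"""
print(extract_revisions(sample))
# -> [('/cvsroot/python/python/dist/src/Lib/locale.py', '1.23')]
```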
I probably forgot to mention a few things; you'll notice few enough :-) HTH, Martin From tim.peters at gmail.com Thu Jan 20 17:59:31 2005 From: tim.peters at gmail.com (Tim Peters) Date: Thu Jan 20 17:59:34 2005 Subject: [Python-Dev] a bunch of Patch reviews In-Reply-To: <41EFDADD.10701@v.loewis.de> References: <41EA9196.1020709@xs4all.nl> <41ED645B.40709@xs4all.nl> <41ED8B0A.7050201@v.loewis.de> <41ED9B02.2040908@xs4all.nl> <41EFDADD.10701@v.loewis.de> Message-ID: <1f7befae050120085919bdfd2f@mail.gmail.com> [Martin v. L?wis] ... > - Add an entry to Misc/NEWS, if there is a new feature, > or if it is a bug fix for a maintenance branch > (I personally don't list bug fixed in the HEAD revision, > but others apparently do) You should. In part this is to comply with license requirements: we're a derivative work from CNRI and BeOpen's Python releases, and their licenses require that we include "a brief summary of the changes made to Python". That certainly includes changes made to repair bugs. It's also extremely useful in practice to have a list of repaired bugs in NEWS! That saved me hours just yesterday, when trying to account for a Zope3 test that fails under Python 2.4 but works under 2.3.4. 2.4 NEWS pointed out that tuple hashing changed to close bug 942952, which I can't imagine how I would have remembered otherwise. From martin at v.loewis.de Thu Jan 20 18:29:19 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu Jan 20 18:29:15 2005 Subject: [Python-Dev] a bunch of Patch reviews In-Reply-To: <1f7befae050120085919bdfd2f@mail.gmail.com> References: <41EA9196.1020709@xs4all.nl> <41ED645B.40709@xs4all.nl> <41ED8B0A.7050201@v.loewis.de> <41ED9B02.2040908@xs4all.nl> <41EFDADD.10701@v.loewis.de> <1f7befae050120085919bdfd2f@mail.gmail.com> Message-ID: <41EFEA6F.7010600@v.loewis.de> Tim Peters wrote: > It's also extremely useful in practice to have a list of repaired bugs > in NEWS! 
I'm not convinced about that - it makes the NEWS file almost unreadable, as the noise is now so high if every tiny change is listed; it is very hard to see what the important changes are. Regards, Martin From tim.peters at gmail.com Thu Jan 20 18:44:56 2005 From: tim.peters at gmail.com (Tim Peters) Date: Thu Jan 20 18:44:59 2005 Subject: [Python-Dev] a bunch of Patch reviews In-Reply-To: <41EFEA6F.7010600@v.loewis.de> References: <41EA9196.1020709@xs4all.nl> <41ED645B.40709@xs4all.nl> <41ED8B0A.7050201@v.loewis.de> <41ED9B02.2040908@xs4all.nl> <41EFDADD.10701@v.loewis.de> <1f7befae050120085919bdfd2f@mail.gmail.com> <41EFEA6F.7010600@v.loewis.de> Message-ID: <1f7befae05012009444ad4c822@mail.gmail.com> [Tim Peters] >> It's also extremely useful in practice to have a list of repaired >> bugs in NEWS! [Martin v. L?wis] > I'm not convinced about that - it makes the NEWS file almost > unreadable, as the noise is now so high if every tiny change is > listed; it is very hard to see what the important changes are. My experience disagrees, and I gave a specific example from just the last day. High-level coverage of the important bits is served (and served well) by Andrew's "What's New in Python" doc. (Although I'll note that when I did releases, I tried to sort section contents in NEWS, to put the more important items at the top.) In any case, you snipped the other part here: a brief summary of changes is required by the licenses, and they don't distinguish between changes due to features or bugs. If, for example, we didn't note that tuple hashing changed in NEWS, we would be required to note that in some other file. NEWS is the historical place for it, and works fine for this purpose (according to me ). 
From irmen at xs4all.nl Thu Jan 20 19:05:55 2005 From: irmen at xs4all.nl (Irmen de Jong) Date: Thu Jan 20 19:05:56 2005 Subject: [Python-Dev] A short introduction In-Reply-To: <41EFDADD.10701@v.loewis.de> References: <41EA9196.1020709@xs4all.nl> <41ED645B.40709@xs4all.nl> <41ED8B0A.7050201@v.loewis.de> <41ED9B02.2040908@xs4all.nl> <41EFDADD.10701@v.loewis.de> Message-ID: <41EFF303.1060004@xs4all.nl> Martin v. L?wis wrote: > Irmen de Jong wrote: > >> That sounds very convenient, thanks. > > > Ok, welcome to the project! Please let me know whether > it "works". It looks that it works, I seem to be able to add a new attachment to the spwd patch- which I will do shortly. * Now that I'm part of the developers group I feel obliged to tell a little bit about myself. I'm a guy from the Netherlands, 30 years old, currently employed at a company designing and developing front- and mid-office web based applications (mostly). We do this in Java (j2ee). I'm using Python where the job allows it, which is much too little IMO :) and for my private stuff, or hobby if you wish. I've been introduced to Python in 1995-6, I think, wanted it at home too, ported it to AmigaDOS (voila AmigaPython), and got more and more involved ever since (starting mostly as a lurker in comp.lang.python). My interests are broad but there's two areas that I particularly like: internet/networking and web/browsers. Over the course of the past few years I developed Pyro, which many of you will probably know (still doing small improvements on that) and more recently, Snakelets and Frog (my own web server and blog app). My C/C++ skills are getting a bit rusty now because I do almost all of my programming in Java or Python, but it's still good enough to be able to contribute to (C)Python, I think. My interest in contributing to Python itself was sparked by the last bug day organized by Johannes Gijsbers. I hope to be able to find time to contribute more often. Well, that about sums it up I think ! 
>> Does the status of 'python project member' come with >> certain expectations that must be complied with ? ;-) > > > There are a few conventions that are followed more > or less stringently. You should be aware of the > things in the developer FAQ, > > http://www.python.org/dev/devfaq.html > > Initially, "new" developers should follow a > "write-after-approval" procedure, i.e. they should not > commit anything until they got somebody's approval. > Later, we commit things which we feel confident about, > and post other things to SF. I don't think I will be committing stuff any time soon. But thanks for mentioning. Bye -Irmen de Jong From martin at v.loewis.de Thu Jan 20 19:07:23 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu Jan 20 19:07:19 2005 Subject: [Python-Dev] a bunch of Patch reviews In-Reply-To: <1f7befae05012009444ad4c822@mail.gmail.com> References: <41EA9196.1020709@xs4all.nl> <41ED645B.40709@xs4all.nl> <41ED8B0A.7050201@v.loewis.de> <41ED9B02.2040908@xs4all.nl> <41EFDADD.10701@v.loewis.de> <1f7befae050120085919bdfd2f@mail.gmail.com> <41EFEA6F.7010600@v.loewis.de> <1f7befae05012009444ad4c822@mail.gmail.com> Message-ID: <41EFF35B.3020706@v.loewis.de> Tim Peters wrote: > My experience disagrees, and I gave a specific example from just the > last day. High-level coverage of the important bits is served (and > served well) by Andrew's "What's New in Python" doc. (Although I'll > note that when I did releases, I tried to sort section contents in > NEWS, to put the more important items at the top.) I long ago stopped reading the NEWS file, because it is just too much text. However, if it is desirable to list any change in the NEWS file, I'm willing to comply. 
Regards, Martin From python at rcn.com Fri Jan 21 00:21:17 2005 From: python at rcn.com (Raymond Hettinger) Date: Fri Jan 21 00:24:48 2005 Subject: [Python-Dev] Updated Monkey Typing pre-PEP In-Reply-To: Message-ID: <001d01c4ff46$bec52040$793ec797@oemcomputer>

[Guido van Rossum]
> There's one other problem that Phillip tries to tackle in his
> proposal: how to implement the "rich" version of an interface if all
> you've got is a partial implementation (e.g. you might have readline()
> but you need readlines()). I think this problem is worthy of a
> solution, but I think the solution could be found, again, in a
> traditional adapter class. Here's a sketch::
>
>     class RichFile:
>         def __init__(self, ref):
>             self.__ref = ref
>             if not hasattr(ref, 'readlines'):
>                 self.readlines = self.__readlines
>                 # Other forms of this magic are conceivably
>         def __readlines(self):  # Ignoring the rarely used optional argument
>             # It's tempting to use [line for line in self.__ref] here
>             # but that doesn't use readline()
>             lines = []
>             while True:
>                 line = self.__ref.readline()
>                 if not line:
>                     break
>                 lines.append(line)
>             return lines
>         def __getattr__(self, name):
>             # Delegate all other attributes to the underlying object
>             return getattr(self.__ref, name)

Instead of a __getattr__ solution, I recommend subclassing from a mixin:

    class RichMap(SomePartialMapping, UserDict.DictMixin): pass

    class RichFile(SomePartialFileClass, Mixins.FileMixin): pass

Raymond

From abo at minkirri.apana.org.au Fri Jan 21 00:56:00 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Fri Jan 21 00:56:39 2005 Subject: [Python-Dev] Strange segfault in Python threads and linux kernel 2.6 In-Reply-To: <2mllaob356.fsf@starship.python.net> References: <1106111769.3822.52.camel@schizo> <2mpt01bkvs.fsf@starship.python.net> <1106185423.3784.26.camel@schizo> <2mllaob356.fsf@starship.python.net> Message-ID: <1106265360.1537.24.camel@schizo> On Thu, 2005-01-20 at 14:12 +0000, Michael Hudson wrote: > Donovan
Baarda writes: > > > On Wed, 2005-01-19 at 13:37 +0000, Michael Hudson wrote: > >> Donovan Baarda writes: [...] > >> The main oddness about python threads (before 2.3) is that they run > >> with all signals masked. You could play with a C wrapper (call > >> setprocmask, then exec fop) to see if this is what is causing the > >> problem. But please try 2.4. > > > > Python 2.4 does indeed fix the problem. > > That's good to hear. [...] I still don't understand what Linux 2.4 vs Linux 2.6 had to do with it. Reading the man pages for execve(), pthread_sigmask() and sigprocmask(), I can see some ambiguities, but mostly only if you do things they warn against (ie, use sigprocmask() instead of pthread_sigmask() in a multi-threaded app). The man page for execve() says that the new process will inherit the "Process signal mask (see sigprocmask() )". This implies to me it will inherit the mask from the main process, not the thread's signal mask. It looks like Linux 2.4 uses the signal mask of the main thread or process for the execve(), whereas Linux 2.6 uses the thread's signal mask. Given that execve() replaces the whole process, including all threads, I dunno if using the thread's mask is right. Could this be a Linux 2.6 kernel bug? > > I'm not sure what the correct behaviour should be. The fact that it > > works in python2.4 feels more like a byproduct of the thread mask change > > than correct behaviour. > > Well, getting rid of the thread mask changes was one of the goals of > the change. I gathered that... which kinda means the fact that it fixed execvp in threads is a side effect...(though I also guess it fixed a lot of other things like this too). > > To me it seems like execvp() should be setting the signal mask back > > to defaults or at least the mask of the main process before doing > > the exec. > > Possibly. I think the 2.4 change -- not fiddling the process mask at > all -- is the Right Thing, but that doesn't help 2.3 users. 
This has all been discussed > before at some length, on python-dev and in various > bug reports on SF. Would a simple bug-fix for 2.3 be to have os.execvp() set the mask to something sane before executing C execvp()? Given that Python does not have any visibility of the procmask... This might be a good idea regardless as it will protect against this bug resurfacing in the future if someone decides fiddling with the mask for threads is a good idea again. > In your situation, I think the simplest thing you can do is dig out an > old patch of mine that exposes sigprocmask + co to Python and either > make a custom Python incorporating the patch and use that, or put the > code from the patch into an extension module. Then before execing > fop, use the new code to set the signal mask to something sane. Not > pretty, particularly, but it should work. The extension module that exposes sigprocmask() is probably best for now... -- Donovan Baarda http://minkirri.apana.org.au/~abo/ From pycon at python.org Fri Jan 21 01:06:05 2005 From: pycon at python.org (Steve Holden) Date: Fri Jan 21 01:06:06 2005 Subject: [Python-Dev] PyCon Preliminary Program Announced! Message-ID: <20050121000605.AAC3C1E400F@bag.python.org> Dear Python Colleague: You will be happy to know that the PyCon Program Committee, after lengthy deliberations, has now finalized the program for PyCon DC 2005. I can tell you that the decision-making was very difficult, as the standard of submissions was even higher than last year. You can see the preliminary program at http://www.python.org/pycon/2005/schedule.html and it's obvious that this year's PyCon is going to be even fuller than the last. One innovation is that there will be activities of a more social nature on the Wednesday (and perhaps the Thursday) evening, as well as keynote speeches from Guido and two other luminaries.
Remember that the early bird registration rates end in just over a week, so hurry on down to http://www.python.org/pycon/2005/register.html to be sure of your place in what will surely be the premier Python event of the year. As always, I would appreciate your help in getting the word out. Please forward this message to your favorite mailing lists and newsgroups to make sure that everyone has a chance to join in the fun! regards Steve Holden Chairman, PyCON DC 2005 -- PyCon DC 2005: The third Python Community Conference http://www.pycon.org/ http://www.python.org/pycon/ The scoop on Python implementations and applications From noamraph at gmail.com Fri Jan 21 01:50:01 2005 From: noamraph at gmail.com (Noam Raphael) Date: Fri Jan 21 01:50:04 2005 Subject: [Python-Dev] Getting rid of unbound methods: patch available In-Reply-To: References: Message-ID: Hello, I would like to add here another small thing which I encountered this week, and seems to follow the same logic as does Guido's proposal. It's about staticmethods. I was writing a class, and its pretty-printing method got a function for converting a value to a string as an argument. I wanted to supply a default function. I thought that it should be in the namespace of the class, since its main use lies there. So I made it a staticmethod. But - alas! After I declared the function a staticmethod, I couldn't make it a default argument for the method, since there's nothing to do with staticmethod instances. The minor solution for this is to make staticmethod objects callable. This would solve my problem. But I suggest a further step: I suggest that if this is done, it would be nice if classname.staticmethodname would return the classmethod instance, instead of the function itself. I know that this things seems to contradict Guido's proposal, since he suggests to return the function instead of a strange object, and I suggest to return a strange object instead of a function. 
But this is not true; Both are according to the idea that class attributes should be, when possible, the same objects that were created when defining the class. This is more consistent with the behaviour of modules (module attributes are the objects that were created when the code was run), and is more consistent with the general convention, that running A = B causes A == B to be true. Currently, Class.func = staticmethod(func), and Class.func = func, don't behave by this rule. If the suggestions are accepted, both will. I just think it's simpler and cleaner that way. Just making staticmethods callable would solve my practical problem too. Noam Raphael From gvanrossum at gmail.com Fri Jan 21 05:20:24 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Jan 21 05:20:26 2005 Subject: [Python-Dev] Getting rid of unbound methods: patch available In-Reply-To: References: Message-ID: > It's about staticmethods. I was writing a class, and its > pretty-printing method got a function for converting a value to a > string as an argument. I wanted to supply a default function. I > thought that it should be in the namespace of the class, since its > main use lies there. So I made it a staticmethod. > > But - alas! After I declared the function a staticmethod, I couldn't > make it a default argument for the method, since there's nothing to do > with staticmethod instances. > > The minor solution for this is to make staticmethod objects callable. > This would solve my problem. But I suggest a further step: I suggest > that if this is done, it would be nice if classname.staticmethodname > would return the classmethod instance, instead of the function itself. > I know that this things seems to contradict Guido's proposal, since he > suggests to return the function instead of a strange object, and I > suggest to return a strange object instead of a function. 
But this is > not true; Both are according to the idea that class attributes should > be, when possible, the same objects that were created when defining > the class. This is more consistent with the behaviour of modules > (module attributes are the objects that were created when the code was > run), and is more consistent with the general convention, that running > A = B > causes > A == B > to be true. Currently, Class.func = staticmethod(func), and Class.func > = func, don't behave by this rule. If the suggestions are accepted, > both will. Well, given that attribute assignment can be overloaded, you can't depend on that requirement all the time. > I just think it's simpler and cleaner that way. Just making > staticmethods callable would solve my practical problem too. The use case is fairly uncommon (though not invalid!), and making staticmethod callable would add more code without much benefits. I recommend that you work around it by setting the default to None and substituting the real default in the function. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Fri Jan 21 05:21:20 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Jan 21 05:21:22 2005 Subject: [Python-Dev] Getting rid of unbound methods: patch available In-Reply-To: <20050120123826.GA30873@vicky.ecs.soton.ac.uk> References: <1105945019.30052.26.camel@localhost> <41EC431A.90204@egenix.com> <41ECD6D9.9000001@egenix.com> <20050120123826.GA30873@vicky.ecs.soton.ac.uk> Message-ID: > Removing unbound methods also breaks the 'py' lib quite a bit. The 'py.test' > framework handles function and bound/unbound method objects all over the > place, and uses introspection on them, as they are the objects defining the > tests to run. OK, I'm convinced. Taking away im_class is going to break too much code. I hereby retract the patch. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Fri Jan 21 05:27:57 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Jan 21 05:28:03 2005 Subject: [Python-Dev] Updated Monkey Typing pre-PEP In-Reply-To: <5.1.1.6.0.20050120101741.0405fb30@mail.telecommunity.com> References: <5.1.1.6.0.20050116130723.034a10d0@mail.telecommunity.com> <5.1.1.6.0.20050120101741.0405fb30@mail.telecommunity.com> Message-ID: Phillip, it looks like you're not going to give up. :) I really don't want to accept your proposal into core Python, but I think you ought to be able to implement everything you propose as part of PEAK (or whatever other framework). Therefore, rather than continuing to argue over the merits of your proposal, I'd like to focus on what needs to be done so you can implement it. The basic environment you can assume: an adaptation module according to PEP 246, type declarations according to my latest blog (configurable per module or per class by defining __typecheck__, but defaulting to something conservative that either returns the original object or raises an exception). What do you need then? [My plane is about to leave, gotta run!] -- --Guido van Rossum (home page: http://www.python.org/~guido/) From mdehoon at ims.u-tokyo.ac.jp Fri Jan 21 06:38:50 2005 From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon) Date: Fri Jan 21 06:34:58 2005 Subject: [Python-Dev] Patch review [ 1093585 ] sanity check for readline remove/replace Message-ID: <41F0956A.4010606@ims.u-tokyo.ac.jp> Patch review [ 1093585 ] sanity check for readline remove/replace The functions remove_history_item and replace_history_item in the readline module respectively remove and replace an item in the history of commands. As outlined in bug [ 1086603 ], both functions cause a segmentation fault if the item index is negative. 
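The alternative fix Michiel mentions below -- interpreting negative indices list-style, counting from the end -- would amount to index normalization along these lines; `normalize_history_index` is a hypothetical helper for illustration, not part of the readline module:

```python
def normalize_history_index(pos, history_length):
    # Hypothetical: treat negative positions as counting from the end,
    # the way lists and strings do, instead of crashing or erroring.
    if pos < 0:
        pos += history_length
    if not 0 <= pos < history_length:
        raise ValueError("No history item at position %d" % pos)
    return pos
```

With this, remove_history_item(-1) would remove the most recently added item, matching the sequence convention.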
This is actually a bug in the corresponding functions in readline, which return a NULL pointer if the item index is larger than the size of the history, but do not check for the item index being negative. I sent a patch to bug-readline@gnu.org, so this will probably be fixed in future versions of readline. But for now, we need a workaround in Python. The patched code checks if the item index is negative, and issues an error message if so. I have run the test suite after applying this patch, and I found no problems with it. Note that there is one more way to fix this bug, which is to interpret negative indices as counting from the end (the same as lists and strings, for example). So remove_history_item(-1) removes the last item added to the history etc. In that case, get_history_item should change as well. Right now get_history_item(-1) returns None, so the patch introduces a small (and probably insignificant) inconsistency: get_history_item(-1) returns None but remove_history_item(-1) raises an error. --Michiel. -- Michiel de Hoon, Assistant Professor University of Tokyo, Institute of Medical Science Human Genome Center 4-6-1 Shirokane-dai, Minato-ku Tokyo 108-8639 Japan http://bonsai.ims.u-tokyo.ac.jp/~mdehoon From stuart at stuartbishop.net Fri Jan 21 08:18:59 2005 From: stuart at stuartbishop.net (Stuart Bishop) Date: Fri Jan 21 08:19:12 2005 Subject: [Python-Dev] Unix line endings required for PyRun* breaking embedded Python In-Reply-To: References: Message-ID: <41F0ACE3.8030002@stuartbishop.net> Just van Rossum wrote: > Skip Montanaro wrote: > > >>Just re.sub("[\r\n]+", "\n", s) and I think you're good to go. > > > I don't think that in general you want to fold multiple empty lines into > one. This would be my preferred regex: > > s = re.sub(r"\r\n?", "\n", s) > > Catches both DOS and old-style Mac line endings. 
Alternatively, you can > use s.splitlines(): > > s = "\n".join(s.splitlines()) + "\n" > > This also makes sure the string ends with a \n, which may or may not be > a good thing, depending on your application. Do people consider this a bug that should be fixed in Python 2.4.1 and Python 2.3.6 (if it ever exists), or is the responsibility for doing this transformation on the application that embeds Python? -- Stuart Bishop http://www.stuartbishop.net/ From fredrik at pythonware.com Fri Jan 21 08:27:39 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri Jan 21 08:27:31 2005 Subject: [Python-Dev] Re: Unix line endings required for PyRun* breaking embedded Python References: <41F0ACE3.8030002@stuartbishop.net> Message-ID: Stuart Bishop wrote: > Do people consider this a bug that should be fixed in Python 2.4.1 and Python 2.3.6 (if it ever > exists), or is the responsibility for doing this transformation on the application that embeds > Python? the text you quoted is pretty clear on this: It is envisioned that such strings always have the standard \n line feed, if the strings come from a file that file can be read with universal newlines. just add the fix, already (you don't want plpythonu to depend on a future release anyway) From noamraph at gmail.com Fri Jan 21 08:59:47 2005 From: noamraph at gmail.com (Noam Raphael) Date: Fri Jan 21 08:59:50 2005 Subject: [Python-Dev] Getting rid of unbound methods: patch available In-Reply-To: References: Message-ID: > > and is more consistent with the general convention, that running > > A = B > > causes > > A == B > > to be true. Currently, Class.func = staticmethod(func), and Class.func > > = func, don't behave by this rule. If the suggestions are accepted, > > both will. > > Well, given that attribute assignment can be overloaded, you can't > depend on that requirement all the time. > Yes, I know. For example, I don't know how you can make this work for classmethods. 
(although I have the idea that if nested scopes were including classes, and there was a way to assign names to a different scope, then there would be no need for them. But I have no idea how this can be done, so never mind.) I just think of it as a very common convention, and I don't find the exceptions "aesthetically pleasing". But of course, I accept practical reasons for not making it that way. > I recommend that you work around it by setting the default to None and > substituting the real default in the function. That's a good idea, I will probably use it. (I thought of a different way: don't use decorators, and wrap the function in a staticmethod after defining the function that uses it. But this is really ugly.) Thanks for your reply, Noam From walter at livinglogic.de Fri Jan 21 13:10:03 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Fri Jan 21 13:10:06 2005 Subject: [Python-Dev] __str__ vs. __unicode__ In-Reply-To: <41EE4797.6030105@egenix.com> References: <41ED25C6.80603@livinglogic.de> <41ED499A.1050206@egenix.com> <41EE2B1E.8030209@livinglogic.de> <41EE4797.6030105@egenix.com> Message-ID: <41F0F11B.8000600@livinglogic.de> M.-A. Lemburg wrote: > [...] > __str__ and __unicode__ as well as the other hooks were > specifically added for the type constructors to use. > However, these were added at a time where sub-classing > of types was not possible, so it's time now to reconsider > whether this functionality should be extended to sub-classes > as well. So can we reach consensus on this, or do we need a BDFL pronouncement? 
Bye, Walter Dörwald From Jack.Jansen at cwi.nl Fri Jan 21 13:36:55 2005 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Fri Jan 21 13:36:04 2005 Subject: [Python-Dev] Updated Monkey Typing pre-PEP In-Reply-To: References: <5.1.1.6.0.20050116130723.034a10d0@mail.telecommunity.com> Message-ID: <2268316A-6BA9-11D9-88A6-000A958D1666@cwi.nl> On 20 Jan 2005, at 12:07, Guido van Rossum wrote: > The first problem is what I'd call incomplete duck typing. Confit de canard-typing? -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From mwh at python.net Fri Jan 21 13:46:41 2005 From: mwh at python.net (Michael Hudson) Date: Fri Jan 21 13:46:43 2005 Subject: [Python-Dev] Strange segfault in Python threads and linux kernel 2.6 In-Reply-To: <1106265360.1537.24.camel@schizo> (Donovan Baarda's message of "Fri, 21 Jan 2005 10:56:00 +1100") References: <1106111769.3822.52.camel@schizo> <2mpt01bkvs.fsf@starship.python.net> <1106185423.3784.26.camel@schizo> <2mllaob356.fsf@starship.python.net> <1106265360.1537.24.camel@schizo> Message-ID: <2m651rar0u.fsf@starship.python.net> Donovan Baarda writes: > On Thu, 2005-01-20 at 14:12 +0000, Michael Hudson wrote: >> Donovan Baarda writes: >> >> > On Wed, 2005-01-19 at 13:37 +0000, Michael Hudson wrote: >> >> Donovan Baarda writes: > [...] >> >> The main oddness about python threads (before 2.3) is that they run >> >> with all signals masked. You could play with a C wrapper (call >> >> setprocmask, then exec fop) to see if this is what is causing the >> >> problem. But please try 2.4. >> > >> > Python 2.4 does indeed fix the problem. >> >> That's good to hear. > [...] > > I still don't understand what Linux 2.4 vs Linux 2.6 had to do with > it. I have to admit to not being that surprised that behaviour appears somewhat inexplicable. 
As you probably know, linux 2.6 has a more-or-less entirely different threads implementation (NPTL) than 2.4 (LinuxThreads) -- so changes in behaviour aren't exactly surprising. Whether they were intentional, a good thing, etc, I have a careful lack of opinion :) > Reading the man pages for execve(), pthread_sigmask() and sigprocmask(), > I can see some ambiguities, but mostly only if you do things they warn > against (ie, use sigprocmask() instead of pthread_sigmask() in a > multi-threaded app). Uh, I don't know how much I'd trust documentation in this situation. Really. Threads and signals are almost inherently incompatible, unfortunately. > The man page for execve() says that the new process will inherit the > "Process signal mask (see sigprocmask() )". This implies to me it will > inherit the mask from the main process, not the thread's signal mask. Um. Maybe. But this is the sort of thing I meant above -- if signals are delivered to threads, not processes, what does the "Process signal mask" mean? The signal mask of the thread that executed main()? I guess you could argue that, but I don't know how much I'd bet on it. > It looks like Linux 2.4 uses the signal mask of the main thread or > process for the execve(), whereas Linux 2.6 uses the thread's signal > mask. I'm not sure that this is the case -- I'm reasonably sure I saw problems caused by the signal masks before 2.6 was ever released. But I could be wrong. > Given that execve() replaces the whole process, including all > threads, I dunno if using the thread's mask is right. Could this be > a Linux 2.6 kernel bug? You could ask, certainly... Although I've done a certain amount of battle with these problems, I don't know what any published standards have to say about these things which is the only real criteria by which it could be called "a bug". >> > I'm not sure what the correct behaviour should be. 
The fact that it >> > works in python2.4 feels more like a byproduct of the thread mask change >> > than correct behaviour. >> >> Well, getting rid of the thread mask changes was one of the goals of >> the change. > > I gathered that... which kinda means the fact that it fixed execvp in > threads is a side effect...(though I also guess it fixed a lot of other > things like this too). Um. I meant "getting rid of the thread mask" was one of the goals *because* it would fix the problems with execve and system() and friends. >> > To me it seems like execvp() should be setting the signal mask back >> > to defaults or at least the mask of the main process before doing >> > the exec. >> >> Possibly. I think the 2.4 change -- not fiddling the process mask at >> all -- is the Right Thing, but that doesn't help 2.3 users. This has >> all been discussed before at some length, on python-dev and in various >> bug reports on SF. > > Would a simple bug-fix for 2.3 be to have os.execvp() set the mask to > something sane before executing C execvp()? Perhaps. I'm not sure I want to go fiddling there. Maybe someone else does. system(1) presents a problem too, though, which is harder to worm around unless we want to implement it ourselves, in practice. > Given that Python does not have any visibility of the procmask... > > This might be a good idea regardless as it will protect against this bug > resurfacing in the future if someone decides fiddling with the mask for > threads is a good idea again. In the long run, everyone will use 2.4. There are some other details to the changes in 2.4 that have a slight chance of breaking programs which is why I'm uneasy about putting them in 2.3.5 -- for a bug fix release it's much much worse to break a program that was working than to fail to fix one that wasn't. 
>> In your situation, I think the simplest thing you can do is dig out an >> old patch of mine that exposes sigprocmask + co to Python and either >> make a custom Python incorporating the patch and use that, or put the >> code from the patch into an extension module. Then before execing >> fop, use the new code to set the signal mask to something sane. Not >> pretty, particularly, but it should work. > > The extension module that exposes sigprocmask() is probably best for > now... I hope it helps! Cheers, mwh -- Jokes around here tend to get followed by implementations. -- from Twisted.Quotes From Jack.Jansen at cwi.nl Fri Jan 21 13:44:22 2005 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Fri Jan 21 13:48:32 2005 Subject: [Python-Dev] Unix line endings required for PyRun* breaking embedded Python In-Reply-To: <41F0ACE3.8030002@stuartbishop.net> References: <41F0ACE3.8030002@stuartbishop.net> Message-ID: <2C9A8DCC-6BAA-11D9-88A6-000A958D1666@cwi.nl> On 21 Jan 2005, at 08:18, Stuart Bishop wrote: > Just van Rossum wrote: >> Skip Montanaro wrote: >>> Just re.sub("[\r\n]+", "\n", s) and I think you're good to go. >> I don't think that in general you want to fold multiple empty lines >> into >> one. This would be my prefered regex: >> s = re.sub(r"\r\n?", "\n", s) >> Catches both DOS and old-style Mac line endings. Alternatively, you >> can >> use s.splitlines(): >> s = "\n".join(s.splitlines()) + "\n" >> This also makes sure the string ends with a \n, which may or may not >> be >> a good thing, depending on your application. > > Do people consider this a bug that should be fixed in Python 2.4.1 and > Python 2.3.6 (if it ever exists), or is the resposibility for doing > this transformation on the application that embeds Python? It could theoretically break something: a program that uses unix line-endings but embeds \r or \r\n in string data. But this is rather theoretical, I don't think I'd have a problem with fixing this. 
The real problem is: who will fix it, because the fix isn't going to be as trivial as the Python code posted here, I'm afraid... -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From ncoghlan at iinet.net.au Fri Jan 21 14:02:18 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Fri Jan 21 14:02:21 2005 Subject: [Python-Dev] PEP 246 - concrete assistance to developers of new adapter classes Message-ID: <41F0FD5A.1000206@iinet.net.au> Phillip's monkey-typing PEP (and his goal of making it easy to write well behaved adapters) got me wondering about the benefits of providing an adaptation.Adapter class that could be used to reduce the boilerplate required when developing new adapters. Inheriting from it wouldn't be *required* in any way - doing so would simply make it easier to write a good adapter by eliminating or simplifying some of the required code. Being written in Python, it could also serve as good documentation of recommended adapter behaviour. For instance, it could by default preserve a reference to the original object and use that for any further adaptation requests:

    class Adapter(object):
        def __init__(self, original):
            self.original = original

        def __conform__(self, protocol):
            return adapt(self.original, protocol)

Does anyone else (particularly those with PEAK and Zope interface experience) think such a class would be beneficial in encouraging good practices? Regards, Nick. 
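Nick's Adapter class depends on a PEP 246 adapt() that never landed in the standard library. To show the delegation behaviour concretely, here is the class paired with a toy adapt(); the adapt() below is a minimal stand-in for illustration, not the PEP's full protocol:

```python
def adapt(obj, protocol):
    # Toy PEP 246-style adapt(): accept instances of the protocol
    # directly, otherwise ask the object's __conform__ hook.
    if isinstance(obj, protocol):
        return obj
    conform = getattr(obj, "__conform__", None)
    if conform is not None:
        result = conform(protocol)
        if result is not None:
            return result
    raise TypeError("can't adapt %r to %s" % (obj, protocol.__name__))

class Adapter(object):
    def __init__(self, original):
        self.original = original

    def __conform__(self, protocol):
        # further adaptation requests fall through to the original object
        return adapt(self.original, protocol)
```

With this, adapt(Adapter(42), int) hands back the original 42 rather than the wrapper, which is exactly the "preserve a reference to the original" behaviour Nick describes.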
-- Nick Coghlan | ncoghlan@email.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From bob at redivi.com Fri Jan 21 14:07:40 2005 From: bob at redivi.com (Bob Ippolito) Date: Fri Jan 21 14:07:51 2005 Subject: [Python-Dev] Unix line endings required for PyRun* breaking embedded Python In-Reply-To: <2C9A8DCC-6BAA-11D9-88A6-000A958D1666@cwi.nl> References: <41F0ACE3.8030002@stuartbishop.net> <2C9A8DCC-6BAA-11D9-88A6-000A958D1666@cwi.nl> Message-ID: <6E57C958-6BAD-11D9-B12E-000A95BA5446@redivi.com> On Jan 21, 2005, at 7:44, Jack Jansen wrote: > > On 21 Jan 2005, at 08:18, Stuart Bishop wrote: > >> Just van Rossum wrote: >>> Skip Montanaro wrote: >>>> Just re.sub("[\r\n]+", "\n", s) and I think you're good to go. >>> I don't think that in general you want to fold multiple empty lines >>> into >>> one. This would be my prefered regex: >>> s = re.sub(r"\r\n?", "\n", s) >>> Catches both DOS and old-style Mac line endings. Alternatively, you >>> can >>> use s.splitlines(): >>> s = "\n".join(s.splitlines()) + "\n" >>> This also makes sure the string ends with a \n, which may or may not >>> be >>> a good thing, depending on your application. >> >> Do people consider this a bug that should be fixed in Python 2.4.1 >> and Python 2.3.6 (if it ever exists), or is the resposibility for >> doing this transformation on the application that embeds Python? > > It could theoretically break something: a program that uses unix > line-endings but embeds \r or \r\n in string data. > > But this is rather theoretical, I don't think I'd have a problem with > fixing this. The real problem is: who will fix it, because the fix > isn't going to be as trivial as the Python code posted here, I'm > afraid... Well, Python already does the right thing in Py_Main, but it does not do the right thing from the other places you can use to run code, surely it can't be that hard to fix if the code is already there? 
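The two normalization recipes quoted in this thread are easy to check; this sketch is just the Python-level transformation, not the C-level PyRun* fix being discussed:

```python
import re

def normalize_re(s):
    # Just van Rossum's regex: turn DOS (\r\n) and old-style Mac (\r)
    # endings into Unix (\n) without collapsing blank lines.
    return re.sub(r"\r\n?", "\n", s)

def normalize_splitlines(s):
    # The splitlines() variant; note it always forces a trailing \n.
    return "\n".join(s.splitlines()) + "\n"
```

The difference matters for input that does not already end in a newline: normalize_re leaves it alone, while normalize_splitlines appends one.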
-bob From aleax at aleax.it Fri Jan 21 14:29:04 2005 From: aleax at aleax.it (Alex Martelli) Date: Fri Jan 21 14:29:12 2005 Subject: [Python-Dev] PEP 246 - concrete assistance to developers of new adapter classes In-Reply-To: <41F0FD5A.1000206@iinet.net.au> References: <41F0FD5A.1000206@iinet.net.au> Message-ID: <6B31ACEE-6BB0-11D9-9DED-000A95EFAE9E@aleax.it> On 2005 Jan 21, at 14:02, Nick Coghlan wrote: > Phillip's monkey-typing PEP (and his goal of making it easy to write > well behaved adapters) got me wondering about the benefits of > providing an adaptation. Adapter class that could be used to reduce > the boiler plate required when developing new adapters. Inheriting > from it wouldn't be *required* in any way - doing so would simply make > it easier to write a good adapter by eliminating or simplifying some > of the required code. Being written in Python, it could also serve as > good documentation of recommended adapter behaviour. > > For instance, it could by default preserve a reference to the original > object and use that for any further adaptation requests: > > class Adapter(object): > def __init__(self, original): > self.original = original > > def __conform__(self, protocol): > return adapt(self.original, protocol) > > Does anyone else (particularly those with PEAK and Zope interface > experience) think such a class would be beneficial in encouraging good > practices? Yes, there was something just like that in Nevow (pre-move to the zope interfaces) and it sure didn't hurt. 
Alex From Jack.Jansen at cwi.nl Sat Jan 22 21:50:09 2005 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Sat Jan 22 21:49:24 2005 Subject: [Python-Dev] Unix line endings required for PyRun* breaking embedded Python In-Reply-To: <6E57C958-6BAD-11D9-B12E-000A95BA5446@redivi.com> References: <41F0ACE3.8030002@stuartbishop.net> <2C9A8DCC-6BAA-11D9-88A6-000A958D1666@cwi.nl> <6E57C958-6BAD-11D9-B12E-000A95BA5446@redivi.com> Message-ID: <341299DE-6CB7-11D9-82B4-000D934FF6B4@cwi.nl> On 21-jan-05, at 14:07, Bob Ippolito wrote: > > On Jan 21, 2005, at 7:44, Jack Jansen wrote: > >> >> On 21 Jan 2005, at 08:18, Stuart Bishop wrote: >> >>> Just van Rossum wrote: >>>> Skip Montanaro wrote: >>>>> Just re.sub("[\r\n]+", "\n", s) and I think you're good to go. >>>> I don't think that in general you want to fold multiple empty lines >>>> into >>>> one. This would be my prefered regex: >>>> s = re.sub(r"\r\n?", "\n", s) >>>> Catches both DOS and old-style Mac line endings. Alternatively, you >>>> can >>>> use s.splitlines(): >>>> s = "\n".join(s.splitlines()) + "\n" >>>> This also makes sure the string ends with a \n, which may or may >>>> not be >>>> a good thing, depending on your application. >>> >>> Do people consider this a bug that should be fixed in Python 2.4.1 >>> and Python 2.3.6 (if it ever exists), or is the resposibility for >>> doing this transformation on the application that embeds Python? >> >> It could theoretically break something: a program that uses unix >> line-endings but embeds \r or \r\n in string data. >> >> But this is rather theoretical, I don't think I'd have a problem with >> fixing this. The real problem is: who will fix it, because the fix >> isn't going to be as trivial as the Python code posted here, I'm >> afraid... > > Well, Python already does the right thing in Py_Main, but it does not > do the right thing from the other places you can use to run code, > surely it can't be that hard to fix if the code is already there? 
IIRC the universal newline support is in the file I/O routines, which I assume aren't used when you execute Python code from a string. -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From mal at egenix.com Sun Jan 23 15:27:59 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Sun Jan 23 15:28:11 2005 Subject: [Python-Dev] __str__ vs. __unicode__ In-Reply-To: <41F0F11B.8000600@livinglogic.de> References: <41ED25C6.80603@livinglogic.de> <41ED499A.1050206@egenix.com> <41EE2B1E.8030209@livinglogic.de> <41EE4797.6030105@egenix.com> <41F0F11B.8000600@livinglogic.de> Message-ID: <41F3B46F.5040205@egenix.com> Walter Dörwald wrote: > M.-A. Lemburg wrote: > > > [...] > >> __str__ and __unicode__ as well as the other hooks were >> specifically added for the type constructors to use. >> However, these were added at a time where sub-classing >> of types was not possible, so it's time now to reconsider >> whether this functionality should be extended to sub-classes >> as well. > > > So can we reach consensus on this, or do we need a > BDFL pronouncement? I don't have a clear picture of what the consensus currently looks like :-) If we're going for a solution that implements the hook awareness for all ____ hooks, I'd be +1 on that. If we only touch the __unicode__ case, we'd only be creating yet another special case. I'd vote -0 on that. Another solution would be to have all type constructors ignore the ____ hooks (which were originally added to provide classes with a way to mimic type behavior). In general, I think we should try to get rid of special cases and go for a clean solution (either way). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 23 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... 
http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From nnorwitz at gmail.com Sun Jan 23 19:39:42 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Sun Jan 23 19:39:45 2005 Subject: [Python-Dev] Speed up function calls Message-ID: I added a patch to SF: http://python.org/sf/1107887 I would like feedback on whether the approach is desirable. The patch adds a new method type (flags) METH_ARGS that is used in PyMethodDef. METH_ARGS means the min and max # of arguments are specified in the PyMethodDef by adding 2 new fields. This information can be used in ceval to call the method. No tuple packing/unpacking is required since the C stack is used.

The benefits are:
* faster function calls
* simplify function call machinery by removing METH_NOARGS, METH_O, and possibly METH_VARARGS
* more introspection info for C functions (ie, min/max arg count) (not implemented)

The drawbacks are:
* the defn of the MethodDef (# args) is separate from the function defn
* potentially more error prone to write C methods???

I've measured between 13-22% speed improvement (debug build on Opteron) when doing simple tests like: ./python ./Lib/timeit.py -v 'pow(3, 5)' I think the difference tends to be fairly constant at about .3 usec per loop. 
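Neal's measurement can be repeated with a few lines using the timeit module directly; the absolute numbers are machine- and build-dependent, so this sketch only checks that a per-call figure comes out at all:

```python
import timeit

# Time a simple builtin call, in the spirit of the
# "./python ./Lib/timeit.py -v 'pow(3, 5)'" measurement above.
n = 100000
per_call = min(timeit.repeat("pow(3, 5)", repeat=3, number=n)) / n
print("%.3f usec per call" % (per_call * 1e6))
```

Taking the minimum of several repeats is the usual way to reduce scheduling noise when comparing before/after numbers for a patch like this.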
Here's a portion of the patch to show the difference between conventions:

-builtin_filter(PyObject *self, PyObject *args)
+builtin_filter(PyObject *self, PyObject *func, PyObject *seq)
 {
-	PyObject *func, *seq, *result, *it, *arg;
+	PyObject *result, *it, *arg;
 	int len;   /* guess for result list size */
 	register int j;

-	if (!PyArg_UnpackTuple(args, "filter", 2, 2, &func, &seq))
-		return NULL;
-

# there are no other changes between METH_O and METH_ARGS

-	{"abs", builtin_abs, METH_O, abs_doc},
+	{"abs", builtin_abs, METH_ARGS, abs_doc, 1, 1},

-	{"filter", builtin_filter, METH_VARARGS, filter_doc},
+	{"filter", builtin_filter, METH_ARGS, filter_doc, 2, 2},

Neal From fredrik at pythonware.com Sun Jan 23 20:23:09 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sun Jan 23 20:23:12 2005 Subject: [Python-Dev] Re: Speed up function calls References: Message-ID: Neal Norwitz wrote: > The patch adds a new method type (flags) METH_ARGS that is used in > PyMethodDef. METH_ARGS means the min and max # of arguments are > specified in the PyMethodDef by adding 2 new fields. > * the defn of the MethodDef (# args) is separate from the function defn > * potentially more error prone to write C methods??? "potentially"? sounds like a recipe for disaster. but the patch is nice, and more speed never hurts. maybe it's time to write that module fixup preprocessor thing that Guido should have written some 15 years ago... ;-) From kbk at shore.net Sun Jan 23 21:15:45 2005 From: kbk at shore.net (Kurt B. 
Kaiser) Date: Sun Jan 23 21:16:01 2005 Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200501232015.j0NKFjhi001559@bayview.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 273 open ( +1) / 2746 closed ( +9) / 3019 total (+10) Bugs : 797 open ( +4) / 4789 closed (+12) / 5586 total (+16) RFE : 166 open ( +1) / 141 closed ( +0) / 307 total ( +1) New / Reopened Patches ______________________ fix distutils.install.dump_dirs() with negated options (2005-01-17) CLOSED http://python.org/sf/1103844 opened by Wummel Add O_SHLOCK/O_EXLOCK to posix (2005-01-17) http://python.org/sf/1103951 opened by Skip Montanaro setup.py --help and --help-commands altered. (2005-01-17) http://python.org/sf/1104111 opened by Titus Brown new-style exceptions (2005-01-18) http://python.org/sf/1104669 opened by Michael Hudson misc doc typos (2005-01-18) CLOSED http://python.org/sf/1104868 opened by DSM chr, ord, unichr documentation updates (2004-10-31) http://python.org/sf/1057588 reopened by mike_j_brown Faster commonprefix in macpath, ntpath, etc. 
(2005-01-19) http://python.org/sf/1105730 opened by Jimmy Retzlaff get rid of unbound methods (mostly) (2005-01-17) CLOSED http://python.org/sf/1103689 opened by Guido van Rossum Updated "Working on Cygwin" section (2005-01-22) http://python.org/sf/1107221 opened by Alan Green Add Thread.isActive() (2005-01-23) http://python.org/sf/1107656 opened by Alan Green Speed up function calls/can add more introspection info (2005-01-23) http://python.org/sf/1107887 opened by Neal Norwitz Patches Closed ______________ fix distutils.install.dump_dirs() with negated options (2005-01-17) http://python.org/sf/1103844 closed by theller ast-branch: fix for coredump from new import grammar (2005-01-11) http://python.org/sf/1100563 closed by kbk Shadow Password Support Module (2002-07-10) http://python.org/sf/579435 closed by loewis misc doc typos (2005-01-18) http://python.org/sf/1104868 closed by fdrake extending readline functionality (2003-02-11) http://python.org/sf/684500 closed by fdrake self.button.pack() in tkinter.tex example (2005-01-03) http://python.org/sf/1094815 closed by fdrake Clean up discussion of new C thread idiom (2004-09-20) http://python.org/sf/1031233 closed by fdrake Description of args to IMAP4.store() in imaplib (2004-12-12) http://python.org/sf/1084092 closed by fdrake get rid of unbound methods (mostly) (2005-01-17) http://python.org/sf/1103689 closed by gvanrossum New / Reopened Bugs ___________________ email.base64MIME.header_encode vs RFC 1522 (2005-01-17) http://python.org/sf/1103926 opened by Ucho wishlist: os.feed_urandom(input) (2005-01-17) http://python.org/sf/1104021 opened by Zooko O'Whielacronx configure doesn't set up CFLAGS properly (2005-01-17) http://python.org/sf/1104249 opened by Bryan O'Sullivan Bugs in _csv module - lineterminator (2004-11-24) http://python.org/sf/1072404 reopened by fresh Wrong expression with \w+? 
(2005-01-18) CLOSED http://python.org/sf/1104608 opened by rengel Bug in String rstrip method (2005-01-18) CLOSED http://python.org/sf/1104923 opened by Rick Coupland Undocumented implicit strip() in split(None) string method (2005-01-19) http://python.org/sf/1105286 opened by YoHell Warnings in Python.h with gcc 4.0.0 (2005-01-19) http://python.org/sf/1105699 opened by Bob Ippolito incorrect constant names in curses window objects page (2005-01-19) http://python.org/sf/1105706 opened by dcrosta null source chars handled oddly (2005-01-19) http://python.org/sf/1105770 opened by Reginald B. Charney bug with idle's stdout when executing load_source (2005-01-20) http://python.org/sf/1105950 opened by imperialfists os.stat int/float oddity (2005-01-20) CLOSED http://python.org/sf/1105998 opened by George Yoshida README of 2.4 source download says 2.4a3 (2005-01-20) http://python.org/sf/1106057 opened by Roger Erens semaphore errors from Python 2.3.x on AIX 5.2 (2005-01-20) http://python.org/sf/1106262 opened by The Written Word slightly easier way to debug from the exception handler (2005-01-20) http://python.org/sf/1106316 opened by Leonardo Rochael Almeida os.makedirs() ignores mode parameter (2005-01-21) http://python.org/sf/1106572 opened by Andreas Jung split() takes no keyword arguments (2005-01-21) http://python.org/sf/1106694 opened by Vinz os.pathsep is wrong on Mac OS X (2005-01-22) CLOSED http://python.org/sf/1107258 opened by Mac-arena the Bored Zo Bugs Closed ___________ --without-cxx flag of configure isn't documented. (2003-03-12) http://python.org/sf/702147 closed by bcannon presentation typo in lib: 6.21.4.2 How callbacks are called (2004-12-22) http://python.org/sf/1090139 closed by gward rfc822 Deprecated since release 2.3? (2005-01-15) http://python.org/sf/1102469 closed by anthonybaxter codecs.open and iterators (2003-03-20) http://python.org/sf/706595 closed by doerwalter Wrong expression with \w+? 
(2005-01-18) http://python.org/sf/1104608 closed by niemeyer Wrong expression with \w+? (2005-01-18) http://python.org/sf/1104608 closed by effbot Bug in String rstrip method (2005-01-18) http://python.org/sf/1104923 closed by tim_one No documentation for zipimport module (2003-12-03) http://python.org/sf/853800 closed by fdrake distutils/tests not installed (2004-12-30) http://python.org/sf/1093173 closed by fdrake urllib2 doesn't handle urls without a scheme (2005-01-07) http://python.org/sf/1097834 closed by fdrake vertical bar typeset horizontal in docs (2004-08-13) http://python.org/sf/1008998 closed by fdrake write failure ignored in Py_Finalize() (2004-11-27) http://python.org/sf/1074011 closed by loewis os.stat int/float oddity (2005-01-20) http://python.org/sf/1105998 closed by loewis os.pathsep is wrong on Mac OS X (2005-01-22) http://python.org/sf/1107258 closed by bcannon From pycon at python.org Sun Jan 23 22:06:58 2005 From: pycon at python.org (Steve Holden) Date: Sun Jan 23 22:06:59 2005 Subject: [Python-Dev] Microsoft to Provide PyCon Opening Keynote Message-ID: <20050123210658.BFB1E1E400B@bag.python.org> Dear Python Colleague: The PyCon Program Committee is happy to announce that the opening keynote speech, at 9:30 am on Wednesday March 23 will be: Python on the .NET Platform, by Jim Hugunin, Microsoft Corporation Jim Hugunin is well-known in the Python world for his pioneering work on JPython (now Jython), and more recently for the IronPython .NET implementation of Python. Jim joined Microsoft's Common Language Runtime team in August last year to continue his work on Iron Python and further improve the CLR's support for dynamic languages like Python. I look forward to hearing what Jim has to say, and hope that you will join me and the rest of the Python community at PyCon DC 2005, at George Washington University from March 23-25, with a four-day sprint starting on Saturday March 19. 
Early bird registration rates are still available for a few more days. Go to http://www.python.org/moin/PyConDC2005/Schedule for the current schedule, and register at http://www.python.org/pycon/2005/ regards Steve Holden Chairman, PyCON DC 2005 -- PyCon DC 2005: The third Python Community Conference http://www.pycon.org/ http://www.python.org/pycon/ The scoop on Python implementations and applications From ejones at uwaterloo.ca Sun Jan 23 23:19:22 2005 From: ejones at uwaterloo.ca (Evan Jones) Date: Sun Jan 23 23:19:01 2005 Subject: [Python-Dev] Improving the Python Memory Allocator Message-ID: This message is a follow up to a thread I started on python-dev back in October, archived here: http://mail.python.org/pipermail/python-dev/2004-October/049480.html Basically, the problem I am trying to solve is that the Python memory allocator never frees memory back to the operating system. I have attached a patch against obmalloc.c for discussion. The patch still has some rough edges and possibly some bugs, so I don't think it should be merged as is. However, I would appreciate any feedback on the chances for getting this implementation into the core. The rest of this message lists some disadvantages to this implementation, a description of the important changes, a benchmark, and my future plans if this change gets accepted. The patch works for any version of Python that uses obmalloc.c (which includes Python 2.3 and 2.4), but I did my testing with Python 2.5 from CVS under Linux and Mac OS X. This version of the allocator will actually free memory. It has two disadvantages: First, there is slightly more overhead with programs that allocate a lot of memory, release it, then reallocate it. The original allocator simply holds on to all the memory, allowing it to be efficiently reused. This allocator will call free(), so it also must call malloc() again when the memory is needed. 
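The hold-versus-release tradeoff can be sketched as a toy model. Every name and number below is invented for illustration; the real bookkeeping is C code in obmalloc.c, and this Python sketch only mirrors the idea of giving an arena back once every pool in it is free:

```python
# Toy model of per-arena pool tracking (illustrative only; the class
# names and the 64-pools-per-arena figure are invented here).

class Arena:
    POOLS_PER_ARENA = 64  # invented size for the sketch

    def __init__(self):
        self.free_pools = list(range(self.POOLS_PER_ARENA))

    def completely_unused(self):
        return len(self.free_pools) == self.POOLS_PER_ARENA


class ToyAllocator:
    def __init__(self):
        self.arenas = []

    def alloc_pool(self):
        candidates = [a for a in self.arenas if a.free_pools]
        if not candidates:
            arena = Arena()
            self.arenas.append(arena)
            candidates = [arena]
        # Take from the most-full arena (fewest free pools), so that
        # lightly used arenas can drain and become releasable.
        arena = min(candidates, key=lambda a: len(a.free_pools))
        return arena, arena.free_pools.pop()

    def free_pool(self, arena, pool):
        arena.free_pools.append(pool)
        if arena.completely_unused():
            # The key change: hand the arena back instead of hoarding it.
            self.arenas.remove(arena)


alloc = ToyAllocator()
held = [alloc.alloc_pool() for _ in range(130)]   # spans 3 toy arenas
assert len(alloc.arenas) == 3
for arena, pool in held:
    alloc.free_pool(arena, pool)
assert alloc.arenas == []   # everything handed back, nothing retained
```

The cost Evan describes is visible in the model: after the final loop, a new burst of allocations has to build fresh arenas (malloc again) instead of reusing hoarded ones.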
I have a "worst case" benchmark which shows that this cost isn't too significant, but it could be a problem for some workloads. If it is, I have an idea for how to work around it.

Second, the previous allocator went out of its way to permit a module to call PyObject_Free while another thread is executing PyObject_Malloc. Apparently, this was a backwards compatibility hack for old Python modules which erroneously call these functions without holding the GIL. These modules will have to be fixed if this implementation is accepted into the core.

Summary of the changes:

- Add an "arena_object" structure for tracking pages that belong to each 256kB arena.
- Change the "arenas" array from an array of pointers to an array of arena_object structures.
- When freeing a page (a pool), it is placed on a free pool list for the arena it belongs to, instead of a global free pool list.
- When freeing a page, if the arena is completely unused, the arena is deallocated.
- When allocating a page, it is taken from the arena that is the most full. This gives arenas that are almost completely unused a chance to be freed.

Benchmark: The only benchmark I have performed at the moment is the worst case for this allocator: a program that allocates 1 000 000 Python objects which occupy nearly 200MB, frees them, reallocates them, then quits. I ran the program four times, and discarded the initial time. Here is the object:

class Obj:
    def __init__( self ):
        self.dumb = "hello"

And here are the average execution times for this program:

Python 2.5:         real time: 16.304  user time: 16.016  system: 0.257
Python 2.5 + patch: real time: 16.062  user time: 15.593  system: 0.450

As expected, the patched version spends nearly twice as much system time as the original version. This is because it calls free() and malloc() twice as many times. However, this difference is offset by the fact that the user space execution time is actually *less* than with the original version. How is this possible?
The likely cause is that the original version defined the arenas pointer to be "volatile" in order to work when Free and Malloc were called simultaneously. Since this version breaks that, the pointer no longer needs to be volatile, which allows the value to be stored in a register instead of being read from memory on each operation.

Here are some graphs of the memory allocator behaviour running this benchmark.

Original: http://www.eng.uwaterloo.ca/~ejones/original.png
New: http://www.eng.uwaterloo.ca/~ejones/new.png

Future Plans:

- More detailed benchmarking.
- The "specialized" allocators for the basic types, such as ints, also need to free memory back to the system.
- Potentially the allocator should keep some amount of free memory around to improve the performance of programs that cyclically allocate and free large amounts of memory. This amount should be "self-tuned" to the application.

Thank you for your feedback,

Evan Jones

-------------- next part --------------
A non-text attachment was scrubbed...
Name: python-allocator.diff
Type: application/octet-stream
Size: 19080 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20050123/33ee017b/python-allocator-0001.obj

From python at rcn.com Mon Jan 24 09:11:05 2005
From: python at rcn.com (Raymond Hettinger)
Date: Mon Jan 24 09:14:37 2005
Subject: [Python-Dev] Speed up function calls
In-Reply-To: Message-ID: <001d01c501ec$40da86e0$5822a044@oemcomputer>

[Neal Norwitz]
> I would like feedback on whether the approach is desirable.
>
> The patch adds a new method type (flags) METH_ARGS that is used in
> PyMethodDef. METH_ARGS means the min and max # of arguments are
> specified in the PyMethodDef by adding 2 new fields. This information
> can be used in ceval to call the method. No tuple packing/unpacking
> is required since the C stack is used.
> > The benefits are: > * faster function calls > * simplify function call machinery by removing METH_NOARGS, METH_O, > and possibly METH_VARARGS > * more introspection info for C functions (ie, min/max arg count) > (not implemented) An additional benefit would be improving the C-API by allowing C calls without creating temporary argument tuples. Also, some small degree of introspection becomes possible when a method knows its own arity. Replacing METH_O and METH_NOARGS seems straight-forward, but METH_VARARGS has much broader capabilities. How would you handle the simple case of "O|OO"? How could you determine useful default values (NULL, 0, -1, -909, etc.)? If you solve the default value problem, then please also try to come up with a better flag name than METH_ARGS which I find to be indistinct from METH_VARARGS and also not very descriptive of its functionality. Perhaps something like METH_UNPACKED would be an improvement. > The drawbacks are: > * the defn of the MethodDef (# args) is separate from the function defn > * potentially more error prone to write C methods??? No worse than with METH_O or METH_NOARGS. > I've measured between 13-22% speed improvement (debug build on > Operton) when doing simple tests like: > > ./python ./Lib/timeit.py -v 'pow(3, 5)' > > I think the difference tends to be fairly constant at about .3 usec per > loop. If speed is the main advantage being sought, it would be worthwhile to conduct more extensive timing tests with a variety of code and not using a debug build. Running test.test_decimal would be a useful overall benchmark. In theory, I don't see how you could improve on METH_O and METH_NOARGS. The only saving is the time for the flag test (a predictable branch). Offsetting that savings is the additional time for checking min/max args and for constructing a C call with the appropriate number of args. I suspect there is no savings here and that the timings will get worse. 
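Raymond's "O|OO" question can be made concrete with a small Python model. Everything here is invented for illustration; the actual patch works at the C level, and Neal's reply below describes passing NULL for the missing slots, which None stands in for here:

```python
# Invented model: a dispatcher that knows a method's min/max argument
# counts and pads missing optionals with a sentinel (None here, standing
# in for the NULL a C callee would receive).

def make_unpacked(func, min_args, max_args):
    def call(*args):
        if not (min_args <= len(args) <= max_args):
            raise TypeError("%s() takes %d to %d arguments (%d given)"
                            % (func.__name__, min_args, max_args, len(args)))
        # Pad the optional slots so the callee sees a fixed signature.
        padded = args + (None,) * (max_args - len(args))
        return func(*padded)
    return call

# A callee with an "O|OO"-style signature: one required object, two optional.
def lookup(key, default, flags):
    if default is None:        # optional slot was not supplied
        default = "fallback"
    return (key, default, flags)

lookup_unpacked = make_unpacked(lookup, 1, 3)
assert lookup_unpacked("k") == ("k", "fallback", None)
assert lookup_unpacked("k", 0) == ("k", 0, None)
```

The open problem Raymond raises is exactly the sentinel choice: None/NULL works for object arguments, but gives no way to express defaults like 0, -1, or -909 for already-converted C values.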
In all likelihood, the only real opportunity for savings is replacing METH_VARARGS in cases that have already been sped-up using PyTuple_Unpack(). Those can be further improved by eliminating the time to build and unpack the temporary argument tuple. Even then, I don't see how to overcome the need to set useful default values for optional object arguments. Raymond Hettinger From ncoghlan at iinet.net.au Mon Jan 24 12:30:01 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Mon Jan 24 12:30:08 2005 Subject: [Python-Dev] Allowing slicing of iterators Message-ID: <41F4DC39.9020603@iinet.net.au> I just wrote a new C API function (PyItem_GetItem) that supports slicing for arbitrary iterators. A patch for current CVS is at http://www.python.org/sf/1108272 For simple indices it does the iteration manually, and for extended slices it returns an itertools.islice object. As a trivial example, here's how to skip the head of a zero-numbered list: for i, item in enumerate("ABCDEF")[1:]: print i, item Is this idea a non-starter, or should I spend my holiday on Wednesday finishing it off and writing the documentation and tests for it? Regards, Nick. -- Nick Coghlan | ncoghlan@email.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From rodsenra at gpr.com.br Mon Jan 24 13:00:10 2005 From: rodsenra at gpr.com.br (Rodrigo Dias Arruda Senra) Date: Mon Jan 24 12:58:35 2005 Subject: [Python-Dev] Improving the Python Memory Allocator In-Reply-To: References: Message-ID: <41F4E34A.1050101@gpr.com.br> Evan Jones wrote: > This message is a follow up to a thread I started on python-dev back in > October, archived here: > First, there is slightly more overhead with programs that allocate a lot > of memory, release it, then reallocate it. > Summary of the changes: > > - When freeing a page, if the arena is completely unused, the arena is > deallocated. 
Depending on the cost of arena allocation, it might help to define a lower threshold keeping a minimum of empty arena_objects permanently available. Do you think this can bring any speedup ? cheers, Senra -- Rodrigo Senra MSc Computer Engineer rodsenra@gpr.com.br GPr Sistemas Ltda http://www.gpr.com.br From ejones at uwaterloo.ca Mon Jan 24 14:50:19 2005 From: ejones at uwaterloo.ca (Evan Jones) Date: Mon Jan 24 14:50:20 2005 Subject: [Python-Dev] Improving the Python Memory Allocator In-Reply-To: <41F4E34A.1050101@gpr.com.br> References: <41F4E34A.1050101@gpr.com.br> Message-ID: On Jan 24, 2005, at 7:00, Rodrigo Dias Arruda Senra wrote: > Depending on the cost of arena allocation, it might help to define a > lower threshold keeping a minimum of empty arena_objects permanently > available. Do you think this can bring any speedup ? Yes, I think it might. I have to do some more benchmarking first, to try and figure out how expensive the allocations are. This is one of my "future work" items to work on if this change gets accepted. I have not implemented it yet, because I don't want to have to merge one *massive* patch. My rough idea is to do something like this: 1. Keep track of the largest number of pages in use at one time. 2. Every N memory operations (or some other measurement of "time"), reset this value and calculate a moving average of the number of pages. This estimates the current memory requirements of the application. 3. If (used + free) > average, free arenas until freeing one more arena would make (used + free) < average. This is better than a static scheme which says "keep X MB of free memory around" because it will self-tune to the application's requirements. If you have an applications that needs lots of RAM, it will keep lots of RAM. If it has very low RAM usage, it will be more aggressive in reclaiming free space. The challenge is how to determine a good measurement of "time." 
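The three numbered steps might look something like this in outline (a Python sketch with invented names and numbers, not the actual obmalloc bookkeeping):

```python
# Sketch of the three-step heuristic above (all names and numbers are
# invented; a real implementation would live inside obmalloc.c).

class PagePolicy:
    def __init__(self, alpha=0.5):
        self.alpha = alpha     # smoothing factor for the moving average
        self.peak = 0          # step 1: largest number of pages in use
        self.average = 0.0

    def note_usage(self, pages_in_use):
        self.peak = max(self.peak, pages_in_use)

    def end_of_period(self):
        # Step 2: fold this period's peak into a moving average, reset peak.
        self.average = (1 - self.alpha) * self.average + self.alpha * self.peak
        self.peak = 0

    def arenas_to_free(self, used_pages, free_pages, pages_per_arena):
        # Step 3: release whole arenas while (used + free) would still be
        # at or above the average after the release.
        n = 0
        while (free_pages >= pages_per_arena and
               used_pages + free_pages - pages_per_arena >= self.average):
            free_pages -= pages_per_arena
            n += 1
        return n

policy = PagePolicy()
for usage in (100, 80, 60):
    policy.note_usage(usage)
policy.end_of_period()
assert policy.average == 50.0   # 0.5 * 0 + 0.5 * peak(100)
# 40 pages used, 120 free, 20-page arenas: trim the excess back down.
assert policy.arenas_to_free(40, 120, 20) == 5
```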
Ideally, if the application was idle for a while, you would perform some housekeeping like this. Does Python's cyclic garbage collector currently do this? If so, I could hook this "management" stuff on to its calls to gc.collect() Evan Jones From rodsenra at gpr.com.br Mon Jan 24 15:21:52 2005 From: rodsenra at gpr.com.br (Rodrigo Dias Arruda Senra) Date: Mon Jan 24 15:20:16 2005 Subject: [Python-Dev] Improving the Python Memory Allocator In-Reply-To: References: <41F4E34A.1050101@gpr.com.br> Message-ID: <41F50480.1070003@gpr.com.br> [Evan Jones] : -------------- > 2. Every N memory operations (or some other measurement of "time"), > reset this value and calculate a moving average of the number of pages. > This estimates the current memory requirements of the application. > The challenge is how to determine a good measurement of "time." > Ideally, if the application was idle for a while, > you would perform some housekeeping like this. Does Python's cyclic > garbage collector currently do this? If so, I could hook this > "management" stuff on to its calls to gc.collect() IMVHO, any measurement of "time" chosen would hurt performance of non-memory greedy applications. OTOH, makes sense for the developers of memory greedy applications (they should be aware of it ) to call gc.collect() periodically. Therefore, *hooking* gc.collect() sounds about right to me, let the janitoring pace be defined by those who really care about it. Looking forward to see this evolve, Senra -- Rodrigo Senra MSc Computer Engineer rodsenra@gpr.com.br GPr Sistemas Ltda http://www.gpr.com.br From gvanrossum at gmail.com Mon Jan 24 16:25:27 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon Jan 24 16:25:54 2005 Subject: [Python-Dev] Allowing slicing of iterators In-Reply-To: <41F4DC39.9020603@iinet.net.au> References: <41F4DC39.9020603@iinet.net.au> Message-ID: > I just wrote a new C API function (PyItem_GetItem) that supports slicing for > arbitrary iterators. 
A patch for current CVS is at http://www.python.org/sf/1108272
>
> For simple indices it does the iteration manually, and for extended slices it
> returns an itertools.islice object.
>
> As a trivial example, here's how to skip the head of a zero-numbered list:
>
> for i, item in enumerate("ABCDEF")[1:]:
>     print i, item
>
> Is this idea a non-starter, or should I spend my holiday on Wednesday finishing
> it off and writing the documentation and tests for it?

Since we already have the islice iterator, what's the point? It seems to me that introducing this notation would mostly confuse users, since in most other places a slice produces an independent *copy* of the data.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From FBatista at uniFON.com.ar Mon Jan 24 16:34:25 2005
From: FBatista at uniFON.com.ar (Batista, Facundo)
Date: Mon Jan 24 16:38:04 2005
Subject: [Python-Dev] Allowing slicing of iterators
Message-ID:

[Guido van Rossum]
#- > As a trivial example, here's how to skip the head of a
#- zero-numbered list:
#- >
#- > for i, item in enumerate("ABCDEF")[1:]:
#- >     print i, item
#- >
#- > Is this idea a non-starter, or should I spend my holiday
#- on Wednesday finishing
#- > it off and writing the documentation and tests for it?
#-
#- Since we already have the islice iterator, what's the point? It seems
#- to me that introducing this notation would mostly confuse
#- users, since in most other places a slice produces an independent
#- *copy* of the data.

I think that breaking the common idiom...

for e in something[:]:
    something.remove(e)

is a no-no...

. Facundo

Bitácora De Vuelo: http://www.taniquetil.com.ar/plog
PyAr - Python Argentina: http://pyar.decode.com.ar/
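For reference, both halves of this exchange can be written out with today's tools: itertools.islice already provides the lazy behaviour the patch would attach to extended slices, while slicing a list genuinely copies it, which is what the idiom above relies on:

```python
from itertools import islice

# Skipping the head of an enumerated sequence, without new syntax:
pairs = list(islice(enumerate("ABCDEF"), 1, None))
assert pairs == [(1, 'B'), (2, 'C'), (3, 'D'), (4, 'E'), (5, 'F')]

# The idiom being defended: slicing a list makes an independent copy,
# so the copy can be iterated safely while the original is mutated.
something = [1, 2, 3]
for e in something[:]:
    something.remove(e)
assert something == []
```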
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20050124/67a73544/attachment.html

From martin at v.loewis.de Mon Jan 24 23:36:24 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Mon Jan 24 23:36:23 2005
Subject: [Python-Dev] Speed up function calls
In-Reply-To: References:
Message-ID: <41F57868.1010404@v.loewis.de>

Neal Norwitz wrote:
> I would like feedback on whether the approach is desirable.

I'm probably missing something really essential, but... Where are the Py_DECREFs done for the function arguments?

Also, changing PyArg_ParseTuple is likely incorrect. Currently, chr/unichr expects float values; with your change, I believe it won't anymore.

Apart from that, the change looks fine to me.
Regards, Martin

From nnorwitz at gmail.com Tue Jan 25 00:08:29 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Tue Jan 25 00:08:32 2005
Subject: [Python-Dev] Speed up function calls
In-Reply-To: <41F57868.1010404@v.loewis.de>
References: <41F57868.1010404@v.loewis.de>
Message-ID:

On Mon, 24 Jan 2005 23:36:24 +0100, "Martin v. Löwis" wrote:
> Neal Norwitz wrote:
> > I would like feedback on whether the approach is desirable.
>
> I'm probably missing something really essential, but...
>
> Where are the Py_DECREFs done for the function arguments?

The original code path still handles the Py_DECREFs. This is the while loop at the end of call_function(). I hope to refine the patch further in this area.

> Also, changing PyArg_ParseTuple is likely incorrect.
> Currently, chr/unichr expects float values; with your
> change, I believe it won't anymore.

You are correct; there is an unintended change in behaviour:

Python 2.5a0 (#51, Jan 23 2005, 18:54:53)
>>> chr(5.3)
'\x05'

Python 2.3.4 (#1, Dec 7 2004, 12:24:19)
>>> chr(5.3)
__main__:1: DeprecationWarning: integer argument expected, got float
'\x05'

This needs to be fixed.

Neal

From martin at v.loewis.de Tue Jan 25 00:16:24 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue Jan 25 00:16:24 2005
Subject: [Python-Dev] Improving the Python Memory Allocator
In-Reply-To: References:
Message-ID: <41F581C8.6070109@v.loewis.de>

Here are my comments, from more general to more subtle:

- please don't post patches here; post them to SF. You may ask for comments here after you posted them to SF.

- please follow Python coding style. In particular, don't write

  if ( available_arenas == NULL ) {

  but write

  if (available_arenas == NULL) {

> Second, the previous allocator went out of its way to permit a module to
> call PyObject_Free while another thread is executing PyObject_Malloc.
> Apparently, this was a backwards compatibility hack for old Python > modules which erroneously call these functions without holding the GIL. > These modules will have to be fixed if this implementation is accepted > into the core. I'm not certain it is acceptable to make this assumption. Why is it not possible to use the same approach that was previously used (i.e. leak the arenas array)? > - When allocating a page, it is taken from the arena that is the most > full. This gives arenas that are almost completely unused a chance to be > freed. It would be helpful if that was documented in the data structures somewhere. The fact that the nextarena list is sorted by nfreepools is only mentioned in the place where this property is preserved; it should be mentioned in the introductory comments as well. Regards, Martin From martin at v.loewis.de Tue Jan 25 00:30:44 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue Jan 25 00:30:44 2005 Subject: [Python-Dev] Speed up function calls In-Reply-To: References: <41F57868.1010404@v.loewis.de> Message-ID: <41F58524.1020600@v.loewis.de> Neal Norwitz wrote: >>Where are the Py_DECREFs done for the function arguments? > > > The original code path still handles the Py_DECREFs. > This is the while loop at the end of call_function(). Can you please elaborate? For METH_O and METH_ARGS, the arguments have already been popped off the stack, and the "What does this do" loop only pops off the function itself. So (without testing) methinks your code currently leaks references. 
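One quick way to probe for the kind of reference leak Martin suspects, without a debug build's total-refcount display, is to watch sys.getrefcount on an argument across many calls (a rough heuristic sketched here for illustration, not part of the patch):

```python
import sys

def refcount_delta(func, arg, trials=1000):
    """Rough leak probe: call func(arg) many times and report how the
    refcount of arg changed. A leak of one reference per call shows up
    as a delta near `trials`; a clean implementation reports 0."""
    before = sys.getrefcount(arg)
    for _ in range(trials):
        func(arg)
    return sys.getrefcount(arg) - before

marker = object()   # a private object nothing else holds a reference to
assert refcount_delta(lambda o: isinstance(o, int), marker) == 0
```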
Regards, Martin

From nnorwitz at gmail.com Tue Jan 25 00:37:04 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Tue Jan 25 00:37:07 2005
Subject: [Python-Dev] Speed up function calls
In-Reply-To: <001d01c501ec$40da86e0$5822a044@oemcomputer>
References: <001d01c501ec$40da86e0$5822a044@oemcomputer>
Message-ID:

On Mon, 24 Jan 2005 03:11:05 -0500, Raymond Hettinger wrote:
>
> Replacing METH_O and METH_NOARGS seems straight-forward, but
> METH_VARARGS has much broader capabilities. How would you handle the
> simple case of "O|OO"? How could you determine useful default values
> (NULL, 0, -1, -909, etc.)?

I have a new version of the patch that handles this condition. I pass NULLs for non-existent optional parameters. In your case above, the arguments passed would be: (obj, NULL, NULL). This is handled pretty cleanly in the callees, since it is pretty common to initialize optional params to NULL.

> If you solve the default value problem, then please also try to come up
> with a better flag name than METH_ARGS which I find to be indistinct
> from METH_VARARGS and also not very descriptive of its functionality.
> Perhaps something like METH_UNPACKED would be an improvement.

I agree METH_ARGS is a poor name. UNPACKED is fine with me. If I don't hear a better suggestion, I'll go with that.

> > The drawbacks are:
> > * the defn of the MethodDef (# args) is separate from the function defn
> > * potentially more error prone to write C methods???
>
> No worse than with METH_O or METH_NOARGS.

I agree, plus the signature changes if METH_KEYWORDS is used. I was interested if others viewed the change as better, worse, or about the same. I agree with /F that it could be a disaster if it really is more error prone. I don't view the change as much different. Do others view this as a real problem?

> If speed is the main advantage being sought, it would be worthwhile to
> conduct more extensive timing tests with a variety of code and not using
> a debug build.
> Running test.test_decimal would be a useful overall benchmark.

I was hoping others might try it out and see. I don't have access to Windows, Mac, or other arches. I only have x86 and amd64. It would also be interesting to test this on some real world code. I have tried various builtin functions and methods and the gain seems to be consistent across all of them. I tried things like dict.get, pow, isinstance. Since the overhead is fairly constant, I would expect functions with more arguments to have an even better improvement.

> In theory, I don't see how you could improve on METH_O and METH_NOARGS.
> The only saving is the time for the flag test (a predictable branch).
> Offsetting that savings is the additional time for checking min/max args
> and for constructing a C call with the appropriate number of args. I
> suspect there is no savings here and that the timings will get worse.

I think I tested a method I changed from METH_O to METH_ARGS and could not measure a difference. A benefit would be to consolidate METH_O, METH_NOARGS, and METH_VARARGS into a single case. This should make code simpler all around (IMO).

> In all likelihood, the only real opportunity for savings is replacing
> METH_VARARGS in cases that have already been sped-up using
> PyTuple_Unpack(). Those can be further improved by eliminating the time
> to build and unpack the temporary argument tuple.

Which this patch accomplishes.

> Even then, I don't see how to overcome the need to set useful default
> values for optional object arguments.

Take a look at the updated patch (#2). I still think it's pretty clean and an overall win. But I'd really like to know what others think. I also implemented most (all?) of METH_O and METH_NOARGS plus many METH_VARARGS, so benchmarkers can compare a difference with and without the patch.
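For anyone who takes Neal up on benchmarking, the command-line timeit runs can also be scripted; the statements below match calls Neal mentions trying (pow, isinstance, dict.get), and absolute numbers will of course vary by machine and build:

```python
import timeit

# Mirror runs like: ./python ./Lib/timeit.py -v 'pow(3, 5)'
statements = ("pow(3, 5)", "isinstance(5, int)", "{}.get('x', None)")
results = {}
for stmt in statements:
    # Best of three repeats, 10000 loops each; values are total seconds.
    results[stmt] = min(timeit.repeat(stmt, number=10_000, repeat=3))

for stmt, secs in results.items():
    print("%-22s %.3f usec/loop" % (stmt, secs / 10_000 * 1e6))
```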
Neal

From nnorwitz at gmail.com Tue Jan 25 00:48:51 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Tue Jan 25 00:48:53 2005
Subject: [Python-Dev] Speed up function calls
In-Reply-To: <41F58524.1020600@v.loewis.de>
References: <41F57868.1010404@v.loewis.de> <41F58524.1020600@v.loewis.de>
Message-ID:

On Tue, 25 Jan 2005 00:30:44 +0100, "Martin v. Löwis" wrote:
> Neal Norwitz wrote:
> >>Where are the Py_DECREFs done for the function arguments?
> >
> > The original code path still handles the Py_DECREFs.
> > This is the while loop at the end of call_function().
>
> Can you please elaborate?

I'll try. Do you really trust me, given my first explanation was so poor? :-)

EXT_POP() modifies stack_pointer on the stack. In call_function(), stack_pointer is PyObject ***. But in new_fast_function(), stack_pointer is only PyObject **. So the modifications by EXT_POP to stack_pointer (moving it down) are lost in new_fast_function(). So when it returns to call_function(), the stack_pointer is still at the top of the stack. The while loop pops off the arguments.

If there was a ref leak, this scenario should demonstrate the refs increasing:

>>> isinstance(5, int)
True
[25363 refs]
>>> isinstance(5, int)
True
[25363 refs]
>>> isinstance(5, int)
True
[25363 refs]

The current code is not optimal. new_fast_function() should take PyObject*** and it should also do the DECREF, but I had some bugs when I tried to get that working, so I've deferred fixing that. It ought to be fixed though.

HTH, Neal

From martin at v.loewis.de Tue Jan 25 01:00:34 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue Jan 25 01:00:34 2005
Subject: [Python-Dev] Speed up function calls
In-Reply-To: References: <41F57868.1010404@v.loewis.de> <41F58524.1020600@v.loewis.de>
Message-ID: <41F58C22.9080903@v.loewis.de>

Neal Norwitz wrote:
> EXT_POP() modifies stack_pointer on the stack. In call_function(),
> stack_pointer is PyObject ***.
But in new_fast_function(), stack_pointer
> is only PyObject **. So the modifications by EXT_POP to stack_pointer
> (moving it down) are lost in new_fast_function().

Thanks - that is the detail I was missing.

Regards, Martin

From ejones at uwaterloo.ca Tue Jan 25 01:33:22 2005
From: ejones at uwaterloo.ca (Evan Jones)
Date: Tue Jan 25 01:33:12 2005
Subject: [Python-Dev] Improving the Python Memory Allocator
In-Reply-To: <41F581C8.6070109@v.loewis.de>
References: <41F581C8.6070109@v.loewis.de>
Message-ID:

On Jan 24, 2005, at 18:16, Martin v. Löwis wrote:
> - please don't post patches here; post them to SF
> You may ask for comments here after you posted them to SF.

Sure. This should be done even for patches which should absolutely not be committed?

> - please follow Python coding style. In particular, don't write
> if ( available_arenas == NULL ) {
> but write
> if (available_arenas == NULL) {

Yikes! This is a "bad" habit of mine that is in the minority of coding style. Thank you for catching it.

>> Second, the previous allocator went out of its way to permit a module
>> to call PyObject_Free while another thread is executing
>> PyObject_Malloc. Apparently, this was a backwards compatibility hack
>> for old Python modules which erroneously call these functions without
>> holding the GIL. These modules will have to be fixed if this
>> implementation is accepted into the core.
> I'm not certain it is acceptable to make this assumption. Why is it
> not possible to use the same approach that was previously used (i.e.
> leak the arenas array)?

This is definitely a very important point of discussion. The main problem is that leaking the "arenas" array is not sufficient to make the memory allocator thread safe.
Back in October, Tim Peters suggested that it might be possible to make the breakage easily detectable: http://mail.python.org/pipermail/python-dev/2004-October/049502.html

> If we changed PyMem_{Free, FREE, Del, DEL} to map to the system
> free(), all would be golden (except for broken old code mixing
> PyObject_ with PyMem_ calls). If any such broken code still exists,
> that remapping would lead to dramatic failures, easy to reproduce; and
> old code broken in the other, infinitely more subtle way (calling
> PyMem_{Free, FREE, Del, DEL} when not holding the GIL) would continue
> to work fine.

I'll be honest, I only have a theoretical understanding of why this support is necessary, or why it is currently correct. For example, is it possible to call PyMem_Free from two threads simultaneously? Since the problem is that threads could call PyMem_Free without holding the GIL, it seems that it is possible. Shouldn't it also be supported? In the current memory allocator, I believe that situation can lead to inconsistent state. For example, see obmalloc.c:746, where it has been determined that a block needs to be put on the list of free blocks:

*(block **)p = lastfree = pool->freeblock;
pool->freeblock = (block *)p;

Imagine two threads are simultaneously freeing blocks that belong to the same pool. They both read the same value for pool->freeblock, and assign that same value to p. The changes to pool->freeblock will have some arbitrary ordering. The result? You have just leaked a block of memory. Basically, if a concurrent memory allocator is the requirement, then I think some other approach is necessary.

>> - When allocating a page, it is taken from the arena that is the most
>> full. This gives arenas that are almost completely unused a chance to
>> be freed.
> It would be helpful if that was documented in the data structures
> somewhere.
The fact that the nextarena list is sorted by nfreepools
> is only mentioned in the place where this property is preserved;
> it should be mentioned in the introductory comments as well.

This is one of those rough edges I mentioned before. If there is some consensus that these changes should be accepted, then I will need to severely edit the comments at the beginning of obmalloc.c.

Thanks for your feedback,

Evan Jones

From steve at holdenweb.com Tue Jan 25 00:24:21 2005
From: steve at holdenweb.com (Steve Holden)
Date: Tue Jan 25 02:34:54 2005
Subject: [Python-Dev] PyCon: The Spam Continues ;-)
Message-ID: <41F583A5.8020206@holdenweb.com>

Dear python-dev: The current (as of even date) summary of my recent contributions to python-dev appears to be spam about PyCon. Not being one to break habits, even not those of a lifetime sometimes, I spam you yet again to show you what a beautiful summary ActiveState have provided (I don't know whether this URL is cacheable or not): If I remember Trent Lott (?) described at an IPC the SQL Server database that drives this system, and it was a great example of open source technology driving a proprietary (but I expect (?) relatively portable) repository. Since I have your attention (and if I haven't then it really doesn't matter what I write hereafter, goodbye ...) I will also point out that the current top hit on Google for "Microsoft to Provide PyCon Opening Keynote" is [Python-Dev] Microsoft to Provide PyCon Opening Keynote by Steve Holden (you can repeat the search to see whether this assertion is true as you read this mail, and read the opening keynote announcement [I hope...]). Space at PyCon is again enlarged, but it certainly isn't infinite. I'd love to see it filled in my third and last year as chair. The program committee have worked incredibly hard to make sure we all have to choose between far more technical content than a single individual can possibly take in on their own.
They [disclaimer: I was program chair, but this should be kudos for the committee membership - without whom this conference would have failed in many dimensions] have succeeded so well we all, I hope, have to agonize between two sumptuous but equidistant technical bales of hay. Only by providing such rich choice can we ensure that an even broader community forms around Python, with free interchange between the technical communities of the proprietary and open source worlds, and equitable participation in the benefit. Sorry I haven't made many CVS contributions lately. We really should be showcasing more Python technologies via www.python.org. targeted-marketing-to-talented-professionals-ly y'rs - steve -- Steve Holden http://www.holdenweb.com/ Python Web Programming http://pydish.holdenweb.com/ Holden Web LLC +1 703 861 4237 +1 800 494 3119 From steve at holdenweb.com Tue Jan 25 00:40:04 2005 From: steve at holdenweb.com (Steve Holden) Date: Tue Jan 25 02:34:57 2005 Subject: [Python-Dev] Re: [PyCON-Organizers] PyCon: The Spam Continues ;-) In-Reply-To: <41F583A5.8020206@holdenweb.com> References: <41F583A5.8020206@holdenweb.com> Message-ID: <41F58754.1020607@holdenweb.com> Steve Holden wrote: [some things followed by] > > If I remember Trent Lott (?) described at an IPC the SQL Server database > that drives this system, and it was a great example of open source > technology driving a proprietary (but I expect (?) relatively portable) > repository. > Please forgive me for this not-so-talented Transatlantic confusion, since I mistook one famous name for another. I did of course mean Trent Mick at ActiveState. Apologies for the confusion.
regards Steve -- Steve Holden http://www.holdenweb.com/ Python Web Programming http://pydish.holdenweb.com/ Holden Web LLC +1 703 861 4237 +1 800 494 3119 From DavidA at ActiveState.com Tue Jan 25 02:43:16 2005 From: DavidA at ActiveState.com (David Ascher) Date: Tue Jan 25 02:45:14 2005 Subject: [Python-Dev] Re: [PyCON-Organizers] PyCon: The Spam Continues ;-) In-Reply-To: <41F583A5.8020206@holdenweb.com> References: <41F583A5.8020206@holdenweb.com> Message-ID: <41F5A434.1060905@ActiveState.com> Steve Holden wrote: > Dear python-dev: > > The current (as of even date) summary of my recent contributions to > python-dev appears to be spam about PyCon. > > Not being one to break habits, even not those of a lifetime sometimes, I > spam you yet again to show you what a beautiful summary ActiveState have > provided (I don't know whether this URL is cacheable or not): > > Yup, we try to make all our URLs portable and persistent. > If I remember Trent Lott (?) Nah, that's a US politician. T'was Trent Mick. > described at an IPC the SQL Server database > that drives this system, and it was a great example of open source > technology driving a proprietary (but I expect (?) relatively portable) > repository. Modulo some SQLServer features we're using. > Since I have your attention (and if I haven't then it really doesn't > matter what I write hereafter, goodbye ...) I will also point out that > the current top hit on Google for > > "Microsoft to Provide PyCon Opening Keynote" What a bizarre search. (note that some of your To's and Cc's were pretty strange...) --david From tjreedy at udel.edu Tue Jan 25 03:04:44 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Tue Jan 25 03:04:50 2005 Subject: [Python-Dev] Re: PyCon: The Spam Continues ;-) References: <41F583A5.8020206@holdenweb.com> Message-ID: Huh? I get a mostly blank page.
Perhaps there are no authors by that name. tjr From anthony at interlink.com.au Tue Jan 25 06:57:55 2005 From: anthony at interlink.com.au (Anthony Baxter) Date: Tue Jan 25 06:59:14 2005 Subject: [Python-Dev] Strange segfault in Python threads and linux kernel 2.6 In-Reply-To: <1106185423.3784.26.camel@schizo> References: <1106111769.3822.52.camel@schizo> <2mpt01bkvs.fsf@starship.python.net> <1106185423.3784.26.camel@schizo> Message-ID: <200501251657.57682.anthony@interlink.com.au> On Thursday 20 January 2005 12:43, Donovan Baarda wrote: > On Wed, 2005-01-19 at 13:37 +0000, Michael Hudson wrote: > > The main oddness about python threads (before 2.3) is that they run > > with all signals masked. You could play with a C wrapper (call > > setprocmask, then exec fop) to see if this is what is causing the > > problem. But please try 2.4. > > Python 2.4 does indeed fix the problem. Unfortunately we are using Zope > 2.7.4, and I'm a bit wary of attempting to migrate it all from 2.3 to > 2.4. Is there any way this "Fix" can be back-ported to 2.3? It's extremely unlikely - I couldn't make myself comfortable with it when attempting to figure out its backportedness. While the current behaviour on 2.3.4 is broken in some cases, I fear very much that the new behaviour will break other (working) code - and this is something I try very hard to avoid in a bugfix release, particularly in one that's probably the final one of a series. Fundamentally, the answer is "don't do signals+threads, you will get burned". For your application, you might want to instead try something where you write requests to a file in a spool directory, and have a python script that loops looking for requests, and generates responses. This is likely to be much simpler to debug and work with. Anthony -- Anthony Baxter It's never too late to have a happy childhood.
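Anthony's spool-directory suggestion is easy to sketch. Everything concrete below (the directory layout, the .req/.resp suffixes, the handler) is invented for illustration; the one standard precaution such a scheme needs is the write-then-rename step, so the worker never reads a half-written request.

```python
# Sketch of a signal-free, thread-free request/response spool.
# A client drops request files into a spool directory; a separate
# worker process loops, picks them up, and writes response files.
import os


def submit(spool_dir, name, payload):
    # Write to a temp name first, then rename: rename is atomic on
    # POSIX, so the worker never sees a partially written request.
    tmp = os.path.join(spool_dir, name + ".tmp")
    with open(tmp, "w") as f:
        f.write(payload)
    os.rename(tmp, os.path.join(spool_dir, name + ".req"))


def process_pending(spool_dir, handle):
    # One pass of the worker loop: answer every pending request.
    for fname in sorted(os.listdir(spool_dir)):
        if not fname.endswith(".req"):
            continue
        path = os.path.join(spool_dir, fname)
        with open(path) as f:
            payload = f.read()
        resp = os.path.join(spool_dir, fname[:-4] + ".resp")
        with open(resp, "w") as f:
            f.write(handle(payload))
        os.remove(path)


if __name__ == "__main__":
    import tempfile
    spool = tempfile.mkdtemp()
    submit(spool, "job1", "hello")
    process_pending(spool, str.upper)
    with open(os.path.join(spool, "job1.resp")) as f:
        print(f.read())  # HELLO
```

In a real deployment the worker would of course run `process_pending` in a sleep loop as its own daemon, which is exactly the decoupling Anthony is recommending.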
From raymond.hettinger at verizon.net Tue Jan 25 12:42:57 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Tue Jan 25 12:47:03 2005 Subject: [Python-Dev] Speed up function calls Message-ID: <000e01c502d3$0458a340$18fccc97@oemcomputer> > > In theory, I don't see how you could improve on METH_O and METH_NOARGS. > > The only saving is the time for the flag test (a predictable branch). > > Offsetting that savings is the additional time for checking min/max args > > and for constructing a C call with the appropriate number of args. I > > suspect there is no savings here and that the timings will get worse. > > I think I tested a method I changed from METH_O to METH_ARGS and could > not measure a difference. Something is probably wrong with the measurements. The new call does much more work than METH_O or METH_NOARGS. Those two common and essential cases cannot be faster and are likely slower on at least some compilers and some machines. If some timing shows differently, then it is likely a mirage (falling into an unsustainable local minimum). The patch introduces range checks, an extra C function call, nine variable initializations, and two additional unpredictable branches (the case statements). The only benefit (in terms of timing) is possibly saving a tuple allocation/deallocation. That benefit only kicks in for METH_VARARGS and even then only when the tuple free list is empty. I recommend not changing ANY of the METH_O and METH_NOARGS calls. These are already close to optimal. > A benefit would be to consolidate METH_O, > METH_NOARGS, and METH_VARARGS into a single case. This should > make code simpler all around (IMO). Will backwards compatibility allow those cases to be eliminated? It would be a bummer if most existing extensions could not compile with Py2.5. Also, METH_VARARGS will likely have to hang around unless a way can be found to handle more than nine arguments.
This patch appears to be taking on a life of its own and is being applied more broadly than is necessary or wise. The patch is extensive and introduces a new C API that cannot be taken back later, so we ought to be careful with it. For the time being, try not to touch the existing METH_O and METH_NOARGS methods. Focus on situations that do stand a chance of being improved (such as methods with a signature like "O|O"). That being said, I really like the concept. I just worry that many of the stated benefits won't materialize:

* having to keep the old versions for backwards compatibility,
* being slower than METH_O and METH_NOARGS,
* not handling more than nine arguments,
* separating function signature info from the function itself,
* the time to initialize all the argument variables to NULL,
* somewhat unattractive case stmt code for building the C function call.

Raymond From anthony at interlink.com.au Tue Jan 25 14:05:02 2005 From: anthony at interlink.com.au (Anthony Baxter) Date: Tue Jan 25 14:05:14 2005 Subject: [Python-Dev] 2.3 BRANCH FREEZE imminent! Message-ID: <200501260005.02705.anthony@interlink.com.au> As those of you playing along at home with python-checkins would know, we're going to be cutting a 2.3.5c1 shortly (in about 12 hours time). Can people not in the set of the normal release team (you know the drill) please hold off on checkins to the branch from about 0000 UTC, 26th January (in about 12 hours time). After that, we'll have a one-week delay from release candidate until the final 2.3.5 - until then, please be ultra-conservative with checkins to the 2.3 branch (unless you're also volunteering to cut an emergency 2.3.6). Assuming nothing horrible goes wrong, this will be the final release of Python 2.3. The next bugfix release will be 2.4.1, in a couple of months. (As usual - any questions, comments or whatever, let me know via email, or #python-dev on irc.freenode.net) Anthony -- Anthony Baxter It's never too late to have a happy childhood.
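Raymond's warning above about mirage timings is worth pairing with the usual measurement discipline: drive the call through timeit, repeat the run several times, and compare minima rather than single numbers (the minimum of several runs is the figure least polluted by scheduler noise). A sketch of such a harness; chr() here merely stands in for any METH_O-style builtin, it is not the patched function itself:

```python
# Minimal timing harness for micro-benchmarking a builtin call.
import timeit


def best_of(stmt, repeat=5, number=100000):
    # timeit.repeat runs `number` iterations `repeat` times and
    # returns a list of totals; take the minimum as the cleanest run.
    return min(timeit.repeat(stmt, repeat=repeat, number=number))


base = best_of("chr(65)")
other = best_of("ord('A')")
# Report a ratio rather than raw seconds: raw numbers vary per machine.
print("chr/ord per-call ratio: %.2f" % (base / other))
```

Comparing ratios before and after a calling-convention patch, on the same machine and build, is about the only way a "could not measure a difference" claim becomes checkable.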
From ncoghlan at iinet.net.au Tue Jan 25 14:36:23 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Tue Jan 25 14:36:30 2005 Subject: [Python-Dev] Allowing slicing of iterators In-Reply-To: References: <41F4DC39.9020603@iinet.net.au> Message-ID: <41F64B57.3040807@iinet.net.au> Guido van Rossum wrote: > Since we already have the islice iterator, what's the point? I'd like to see iterators become as easy to work with as lists are. At the moment, anything that returns an iterator forces you to use the relatively cumbersome itertools.islice mechanism, rather than Python's native slice syntax. In the example below (printing the first 3 items of a sequence), the fact that sorted() produces a new iterable list, while reversed() produces an iterator over the original list *should* be an irrelevant implementation detail from the programmer's point of view. However, the fact that iterators aren't natively sliceable throws this detail in the programmer's face, and forces them to alter their code to deal with it. The conversion from list comprehensions to generator expressions results in similar irritation - the things just aren't as convenient, because the syntactic support isn't there.

Py> lst = "1 5 23 1234 57 89 2 1 54 7".split()
Py> lst
['1', '5', '23', '1234', '57', '89', '2', '1', '54', '7']
Py> for i in sorted(lst)[:3]:
... print i
...
1
1
1234
Py> for i in reversed(lst)[:3]:
... print i
...
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: unsubscriptable object
Py> from itertools import islice
Py> for i in islice(reversed(lst), 3):
... print i
...
7
54
1

> It seems > to me that introducing this notation would mostly lead to confused > users, since in most other places a slice produces an independent > *copy* of the data. Well, certainly everything I can think of in the core that currently supports slicing produces a copy. Slicing on a numarray, however, gives you a view. The exact behaviour (view or copy) really depends on what is being sliced.
For iterators, I think Raymond's islice exemplifies the most natural slicing behaviour. Invoking itertools.tee() behind the scenes (to get copying semantics) would eliminate the iterator nature of the approach in many cases. I don't think native slicing support will introduce any worse problems with the iterator/iterable distinction than already exist (e.g. a for loop consumes an iterator, but leaves an iterable unchanged. Similarly, slicing does not alter an iterable, but consumes an iterator). Continuing the sorted() vs reversed() example from above:

Py> sortlst = sorted(lst)
Py> max(sortlst)
'89'
Py> len(sortlst)
10
Py> revlst = reversed(lst)
Py> max(revlst)
'89'
Py> len(revlst)
0

Regards, Nick. -- Nick Coghlan | ncoghlan@email.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From ncoghlan at iinet.net.au Tue Jan 25 14:41:32 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Tue Jan 25 14:41:40 2005 Subject: [Python-Dev] Allowing slicing of iterators In-Reply-To: References: Message-ID: <41F64C8C.6090902@iinet.net.au> Batista, Facundo wrote: > I think that breaking the common idiom... > > for e in something[:]: > something.remove(e) > > is a no-no... The patch doesn't change existing behaviour - anything which is already sliceable (e.g. lists) goes through the existing __getitem__ or __getslice__ code paths. All the patch adds is two additional checks (the first for an iterator, the second for an iterable) before PyObject_GetItem fails with the traditional "TypeError: unsubscriptable object". Defining __getitem__ also allows any given iterator or iterable type to override the default slicing behaviour if they so choose. Regards, Nick.
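The sorted()/reversed() asymmetry Nick describes comes down to this: a list slice is an independent copy, while islice over an iterator consumes it, so a second "slice" picks up where the first one stopped. In miniature, using his own sample data:

```python
# List slicing copies; islice over an iterator consumes.
from itertools import islice

lst = "1 5 23 1234 57 89 2 1 54 7".split()

first3 = sorted(lst)[:3]       # list slicing: an independent copy
print(first3)                  # ['1', '1', '1234']

rev = reversed(lst)            # an iterator over the original list
print(list(islice(rev, 3)))    # ['7', '54', '1']
print(list(islice(rev, 3)))    # ['2', '89', '57'] (continues, doesn't repeat)
```

Whether that consuming behaviour should hide behind the same `[lo:hi]` spelling as the copying one is exactly the point Guido pushes back on later in the thread.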
-- Nick Coghlan | ncoghlan@email.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From abo at minkirri.apana.org.au Tue Jan 25 15:01:55 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Tue Jan 25 15:02:19 2005 Subject: [Python-Dev] Strange segfault in Python threads and linux kernel 2.6 References: <1106111769.3822.52.camel@schizo> <2mpt01bkvs.fsf@starship.python.net> <1106185423.3784.26.camel@schizo> <200501251657.57682.anthony@interlink.com.au> Message-ID: <004b01c502e6$6db77380$24ed0ccb@apana.org.au> G'day, From: "Anthony Baxter" > On Thursday 20 January 2005 12:43, Donovan Baarda wrote: > > On Wed, 2005-01-19 at 13:37 +0000, Michael Hudson wrote: > > > The main oddness about python threads (before 2.3) is that they run > > > with all signals masked. You could play with a C wrapper (call > > > setprocmask, then exec fop) to see if this is what is causing the > > > problem. But please try 2.4. > > > > Python 2.4 does indeed fix the problem. Unfortunately we are using Zope > > 2.7.4, and I'm a bit wary of attempting to migrate it all from 2.3 to > > 2.4. Is there any way this "Fix" can be back-ported to 2.3? > > It's extremely unlikely - I couldn't make myself comfortable with it > when attempting to figure out its backportedness. While the current > behaviour on 2.3.4 is broken in some cases, I fear very much that > the new behaviour will break other (working) code - and this is > something I try very hard to avoid in a bugfix release, particularly > in one that's probably the final one of a series. > > Fundamentally, the answer is "don't do signals+threads, you will > get burned". For your application, you might want to instead try In this case it turns out to be "don't do exec() in a thread, because what you exec can have all its signals masked". That turns out to be a hell of a lot of things; popen, os.system, etc.
They all only work OK in a threaded application if what you are exec'ing doesn't use any signals. > something where you write requests to a file in a spool directory, > and have a python script that loops looking for requests, and > generates responses. This is likely to be much simpler to debug > and work with. Hmm, interprocess communications; great fun :-) And no spawning the process from within the zope application; it's gotta be a separate daemon. Actually, I've noticed that zope often has a sorta zombie "which" process which it spawns. I wonder if this is a stuck thread waiting for some signal... ---------------------------------------------------------------- Donovan Baarda http://minkirri.apana.org.au/~abo/ ---------------------------------------------------------------- From anthony at interlink.com.au Tue Jan 25 15:53:20 2005 From: anthony at interlink.com.au (Anthony Baxter) Date: Tue Jan 25 15:53:34 2005 Subject: [Python-Dev] Strange segfault in Python threads and linux kernel 2.6 In-Reply-To: <004b01c502e6$6db77380$24ed0ccb@apana.org.au> References: <1106111769.3822.52.camel@schizo> <200501251657.57682.anthony@interlink.com.au> <004b01c502e6$6db77380$24ed0ccb@apana.org.au> Message-ID: <200501260153.21672.anthony@interlink.com.au> On Wednesday 26 January 2005 01:01, Donovan Baarda wrote: > In this case it turns out to be "don't do exec() in a thread, because what > you exec can have all its signals masked". That turns out to be a hell of > a lot of things; popen, os.system, etc. They all only work OK in a > threaded application if what you are exec'ing doesn't use any signals. Yep. You just have to be aware of it. We do a bit of this at work, and we either spool via a database table, or a directory full of spool files. > Actually, I've noticed that zope often has a sorta zombie "which" process > which it spawns. I wonder if this is a stuck thread waiting for some > signal... Quite likely.
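Donovan's diagnosis, that a child exec'ed from a thread inherits a fully masked signal set, can be made visible directly on a modern POSIX Python (3.3+ exposes pthread_sigmask; the thread's Python 2.3 could not do this from Python code, which is why Michael suggested a C wrapper). The sketch blocks a signal, observes it in the mask, then clears the mask again, which is the same clearing a signal-safe exec wrapper would have to perform before execve:

```python
# POSIX-only sketch: inspect and reset the thread's signal mask.
import signal

# Block SIGUSR1, as a pre-2.4 Python thread would have left it for a child:
old = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})

# Query the current mask (blocking the empty set returns it unchanged):
masked = signal.pthread_sigmask(signal.SIG_BLOCK, set())
print(signal.SIGUSR1 in masked)  # True: the signal is now masked

# Restore the original mask, as a well-behaved exec wrapper would:
signal.pthread_sigmask(signal.SIG_SETMASK, old)
print(signal.pthread_sigmask(signal.SIG_BLOCK, set()) == old)  # True
```

A C wrapper around exec would do the equivalent with sigprocmask/pthread_sigmask before calling execve, which is exactly the "setprocmask, then exec" experiment Michael Hudson proposed earlier in the thread.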
-- Anthony Baxter It's never too late to have a happy childhood. From python at rcn.com Tue Jan 25 18:06:31 2005 From: python at rcn.com (Raymond Hettinger) Date: Tue Jan 25 18:11:03 2005 Subject: [Python-Dev] state of 2.4 final release In-Reply-To: <1f7befae04112919183144973b@mail.gmail.com> Message-ID: <000001c50300$37ad1320$18fccc97@oemcomputer> > [Anthony Baxter] > > I didn't see any replies to the last post, so I'll ask again with a > > better subject line - as I said last time, as far as I'm aware, I'm > > not aware of anyone having done a fix for the issue Tim identified > > ( http://www.python.org/sf/1069160 ) > > > > So, my question is: Is this important enough to delay a 2.4 final > > for? [Tim] > Not according to me; said before I'd be happy if everyone pretended I > hadn't filed that report until a month after 2.4 final was released. Any chance of this getting fixed before 2.4.1 goes out in February? Raymond From tim.peters at gmail.com Tue Jan 25 18:24:58 2005 From: tim.peters at gmail.com (Tim Peters) Date: Tue Jan 25 18:25:01 2005 Subject: [Python-Dev] state of 2.4 final release In-Reply-To: <000001c50300$37ad1320$18fccc97@oemcomputer> References: <1f7befae04112919183144973b@mail.gmail.com> <000001c50300$37ad1320$18fccc97@oemcomputer> Message-ID: <1f7befae05012509241b3164f3@mail.gmail.com> [Anthony Baxter] >>> I didn't see any replies to the last post, so I'll ask again with a >>> better subject line - as I said last time, as far as I'm aware, I'm >>> not aware of anyone having done a fix for the issue Tim identified >>> ( http://www.python.org/sf/1069160 ) >>> >>> So, my question is: Is this important enough to delay a 2.4 final >>> for? [Tim] >> Not according to me; said before I'd be happy if everyone pretended I >> hadn't filed that report until a month after 2.4 final was released. [Raymond Hettinger] > Any chance of this getting fixed before 2.4.1 goes out in February? It probably won't be fixed by me. 
It would be better if a Unix-head volunteered to repair it, because the most likely kind of thread race (explained in the bug report) has proven impossible to provoke on Windows (short of carefully inserting sleeps into Python's C code) any of the times this bug has been reported in the past (the same kind of bug has appeared several times in different parts of Python's threading code -- holding the GIL is not sufficient protection against concurrent mutation of the tstate chain, for reasons explained in the bug report). A fix is very simple (also explained in the bug report) -- acquire the damn mutex, don't trust to luck. From DavidA at ActiveState.com Tue Jan 25 18:29:59 2005 From: DavidA at ActiveState.com (David Ascher) Date: Tue Jan 25 18:31:59 2005 Subject: [Python-Dev] Re: [PyCON-Organizers] PyCon: The Spam Continues ;-) In-Reply-To: <41F67F43.4010403@holdenweb.com> References: <41F583A5.8020206@holdenweb.com> <41F5A434.1060905@ActiveState.com> <41F67F43.4010403@holdenweb.com> Message-ID: <41F68217.8000408@ActiveState.com> Steve Holden wrote: >> Modulo some SQLServer features we're using. >> > Well free-text indexing would be my first guess. Anything else of > interest? MySQL's free text indexing really sucks compared with SQL > Server's, which to my mind is a good justification for the Microsoft > product. Freetext search is one of them, but there may be others (I think there are some stored procedures in some MS language). I'm hardly a SQL expert, or an expert on our ASPN infrastructure.
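Both Evan Jones's pool->freeblock scenario earlier in this batch and Tim's "acquire the damn mutex" advice are instances of the same lost-update pattern: a two-step read/publish of a shared list head done without a lock. A Python simulation makes the leak deterministic (the bad interleaving is forced by hand, since the GIL hides real races in pure Python):

```python
# Simulate the two-step freelist push from obmalloc.c:
#   *(block **)p = pool->freeblock;   step 1: link freed block to old head
#   pool->freeblock = (block *)p;     step 2: publish freed block as new head
import threading


class Pool:
    def __init__(self):
        self.freeblock = None  # head of a singly linked free list


def count_free(pool):
    n, node = 0, pool.freeblock
    while node is not None:
        n += 1
        node = node["next"]
    return n


def racy_free(pool, block_a, block_b):
    # Worst-case interleaving: both threads do step 1 before either
    # does step 2, so the second publish overwrites the first.
    head = pool.freeblock
    block_a["next"] = head      # thread A, step 1
    block_b["next"] = head      # thread B, step 1
    pool.freeblock = block_a    # thread A, step 2
    pool.freeblock = block_b    # thread B, step 2: A's block is lost


lock = threading.Lock()


def locked_free(pool, block):
    with lock:                  # the whole two-step update is now atomic
        block["next"] = pool.freeblock
        pool.freeblock = block


pool = Pool()
racy_free(pool, {"next": None}, {"next": None})
print(count_free(pool))  # 1: one block leaked

pool = Pool()
for _ in range(2):
    locked_free(pool, {"next": None})
print(count_free(pool))  # 2: both blocks on the list
```

The fix Tim describes for the tstate chain is the same shape: take the existing mutex around the unlink, instead of relying on the GIL to serialize the mutation.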
--david From fredrik at pythonware.com Tue Jan 25 18:39:54 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue Jan 25 18:39:48 2005 Subject: [Python-Dev] Re: Allowing slicing of iterators References: <41F4DC39.9020603@iinet.net.au> Message-ID: Guido van Rossum wrote: >> As a trivial example, here's how to skip the head of a zero-numbered list: >> >> for i, item in enumerate("ABCDEF")[1:]: >> print i, item >> >> Is this idea a non-starter, or should I spend my holiday on Wednesday finishing >> it off and writing the documentation and tests for it? > > Since we already have the islice iterator, what's the point? readability? I don't have to import seqtools to work with traditional sequences, so why should I have to import itertools to be able to use the goodies in there? better leave that to the compiler. From bob at redivi.com Tue Jan 25 19:13:24 2005 From: bob at redivi.com (Bob Ippolito) Date: Tue Jan 25 19:13:30 2005 Subject: [Python-Dev] Re: [PyCON-Organizers] PyCon: The Spam Continues ; -) In-Reply-To: <41F68217.8000408@ActiveState.com> References: <41F583A5.8020206@holdenweb.com> <41F5A434.1060905@ActiveState.com> <41F67F43.4010403@holdenweb.com> <41F68217.8000408@ActiveState.com> Message-ID: On Jan 25, 2005, at 12:29, David Ascher wrote: > Steve Holden wrote: > >>> Modulo some SQLServer features we're using. >>> >> Well free-text indexing would be my first guess. Anything else of >> interest? MySQL's free text indexing really sucks compared with SQL >> Server's, which to my mind is a good justification for the Microsoft >> product. > > Freetext search is one of them, but there may be others (I think there > are some stored procedures in some MS language). I'm hardly a SQL > expert, or an expert on our ASPN infrastructure. There is OpenFTS for PostgreSQL. I'm not sure how it compares to SQL Server's or MySQL's, but it's been around a while so I expect it's pretty decent.
-bob From gvanrossum at gmail.com Tue Jan 25 19:30:39 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Jan 25 19:30:42 2005 Subject: [Python-Dev] Allowing slicing of iterators In-Reply-To: <41F64B57.3040807@iinet.net.au> References: <41F4DC39.9020603@iinet.net.au> <41F64B57.3040807@iinet.net.au> Message-ID: [me] > > Since we already have the islice iterator, what's the point? [Nick] > I'd like to see iterators become as easy to work with as lists are. At the > moment, anything that returns an iterator forces you to use the relatively > cumbersome itertools.islice mechanism, rather than Python's native slice syntax. Sorry. Still -1. I read your defense, and I'm not convinced. Even Fredrik's support didn't convince me. Iterators are for single sequential access. It's a feature that you have to import itertools (or at least that you have to invoke its special operations) -- iterators are not sequences and shouldn't be confused with such. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From steven.bethard at gmail.com Tue Jan 25 20:41:56 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Tue Jan 25 20:42:01 2005 Subject: [Python-Dev] Allowing slicing of iterators In-Reply-To: <41F64B57.3040807@iinet.net.au> References: <41F4DC39.9020603@iinet.net.au> <41F64B57.3040807@iinet.net.au> Message-ID: Nick Coghlan wrote: > In the example below (printing the first 3 items of a sequence), the fact that > sorted() produces a new iterable list, while reversed() produces an iterator > over the original list *should* be an irrelevant implementation detail from the > programmer's point of view. You have to be aware on some level of whether or not you're using a list when you use slice notation -- what would you do for iterators when given a negative step index? Presumably it would have to raise an exception, where doing so with lists would not... Steve -- You can wordify anything if you just verb it. 
--- Bucky Katt, Get Fuzzy From jcarlson at uci.edu Tue Jan 25 21:57:39 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue Jan 25 22:00:24 2005 Subject: [Python-Dev] Allowing slicing of iterators In-Reply-To: <41F64B57.3040807@iinet.net.au> References: <41F64B57.3040807@iinet.net.au> Message-ID: <20050125095456.992C.JCARLSON@uci.edu> Nick Coghlan wrote: > Guido van Rossum wrote: > > Since we already have the islice iterator, what's the point? > > I'd like to see iterators become as easy to work with as lists are. At the > moment, anything that returns an iterator forces you to use the relatively > cumbersome itertools.islice mechanism, rather than Python's native slice syntax. If you want to use full sequence slicing semantics, then make yourself a list or tuple. I promise it will take less typing than itertools.islice() (at least in the trivial case of list(iterable)). Using language syntax to pretend that an arbitrary iterable is a list or tuple may well lead to unexpected behavior, whether that behavior is data loss or a caching of results. Which behavior is desirable is generally application specific, and I don't believe that Python should make that assumption for the user or developer. - Josiah From python at rcn.com Tue Jan 25 22:05:37 2005 From: python at rcn.com (Raymond Hettinger) Date: Tue Jan 25 22:10:32 2005 Subject: [Python-Dev] Allowing slicing of iterators In-Reply-To: Message-ID: <000e01c50321$a20d1e60$18fccc97@oemcomputer> > Iterators are for single sequential access. It's a feature that you > have to import itertools (or at least that you have to invoke its > special operations) -- iterators are not sequences and shouldn't be > confused with such. FWIW, someone (Bengt Richter perhaps) once suggested syntactic support differentiated from sequences but less awkward than a call to itertools.islice(). itertools.islice(someseq, lo, hi) would be rendered as someseq'[lo:hi].
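The function-based spelling Steven Bethard floats a little later in the thread, a getslice() that works on both sequences and iterators, might dispatch along these lines. The name, the fallback-on-TypeError heuristic, and the ValueError for negative indices (one answer to Steven's earlier question about what iterators should do with a negative step) are all choices made for this sketch, not anything the thread settled on:

```python
# Sketch of a getslice() helper: native slicing when the object
# supports it, itertools.islice as the fallback for plain iterators.
from itertools import islice


def getslice(obj, lo, hi, step=None):
    try:
        return obj[lo:hi:step]          # lists, tuples, strings, ...
    except TypeError:
        pass                            # unsubscriptable: assume iterator
    # islice cannot look backwards, so reject what it cannot honour:
    if (lo is not None and lo < 0) or (hi is not None and hi < 0) \
            or (step is not None and step < 0):
        raise ValueError("negative indices need a sequence, not an iterator")
    return islice(obj, lo, hi, step)


print(getslice([10, 20, 30, 40], 1, 3))      # [20, 30]
print(list(getslice(iter("abcdef"), 1, 4)))  # ['b', 'c', 'd']
```

This keeps the copy/consume distinction visible in the spelling (a function call, not slice syntax), which is roughly the line Guido and Josiah defend against the syntax change.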
Raymond From python at rcn.com Tue Jan 25 22:09:32 2005 From: python at rcn.com (Raymond Hettinger) Date: Tue Jan 25 22:14:20 2005 Subject: [Python-Dev] state of 2.4 final release In-Reply-To: <1f7befae05012509241b3164f3@mail.gmail.com> Message-ID: <000f01c50322$2a91d960$18fccc97@oemcomputer> [Anthony Baxter] > >>> I'm > >>> not aware of anyone having done a fix for the issue Tim identified > >>> ( http://www.python.org/sf/1069160 ) [Raymond Hettinger] > > Any chance of this getting fixed before 2.4.1 goes out in February? [Timbot] > It probably won't be fixed by me. It would be better if a Unix-head > volunteered to repair it, because the most likely kind of thread race > (explained in the bug report) has proven impossible to provoke on > Windows (short of carefully inserting sleeps into Python's C code) any > of the times this bug has been reported in the past (the same kind of > bug has appeared several times in different parts of Python's > threading code -- holding the GIL is not sufficient protection against > concurrent mutation of the tstate chain, for reasons explained in the > bug report). > > A fix is very simple (also explained in the bug report) -- acquire the > damn mutex, don't trust to luck. Hey Unix-heads. Any takers? Raymond From steven.bethard at gmail.com Tue Jan 25 22:20:24 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Tue Jan 25 22:20:32 2005 Subject: [Python-Dev] Allowing slicing of iterators In-Reply-To: <000e01c50321$a20d1e60$18fccc97@oemcomputer> References: <000e01c50321$a20d1e60$18fccc97@oemcomputer> Message-ID: Raymond Hettinger wrote: > > FWIW, someone (Bengt Richter perhaps) once suggested syntactic support > differentiated from sequences but less awkward than a call to > itertools.islice(). > > itertools.islice(someseq, lo, hi) would be rendered as someseq'[lo:hi]. Just to make sure I'm reading this right, the difference between sequence slicing and iterator slicing is a single-quote? 
IMVHO, that's pretty hard to read... If we're really looking for a builtin, wouldn't it be better to go the route of getattr/setattr and have something like getslice that could operate on both lists and iterators? Then getslice(lst, lo, hi) would just be an alias for lst[lo:hi] and getslice(itr, lo, hi) would just be an alias for itertools.islice(itr, lo, hi) Steve -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy From jimjjewett at gmail.com Tue Jan 25 22:50:27 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue Jan 25 22:50:30 2005 Subject: [Python-Dev] Deprecating modules (python-dev summary for early Dec, 2004) Message-ID: > It was also agreed that deleting deprecated modules was not needed; it breaks > code and disk space is cheap. > It seems that no longer listing documentation and adding a deprecation warning > is what is needed to properly deprecate a module. By no longer listing > documentation new programmers will not use the code since they won't know > about it.[*] And adding the warning will let old users know that they should be > using something else. [* Unless they try to maintain old code. Hopefully, they know to find the documentation at python.org.] Would it make sense to add an attic (or even "deprecated") directory to the end of sys.path, and move old modules there? This would make the search for non-deprecated modules a bit faster, and would make it easier to verify that new code isn't depending (perhaps indirectly) on any deprecated features. New programmers may just browse the list of files for names that look right. They're more likely to take the first (possibly false) hit if the list is long. I'm not the only one who ended up using markupbase for that reason. Also note that some shouldn't-be-used modules don't (yet?) raise a deprecation warning. 
For instance, I'm pretty sure regex_syntax and reconvert are both fairly useless without deprecated regex, but they aren't deprecated on their own -- so they show up as tempting choices in a list of library files. (Though reconvert does something other than I expected, based on the name.) I understand not bothering to repeat the deprecation for someone who is using them correctly, but it would be nice to move them to an attic. Bastion and rexec should probably also raise Deprecation errors, if that becomes the right way to mark them deprecated. (They import fine; they just don't work -- which could be interpreted as merely an "XXX not done yet" comment.) -jJ From walter at livinglogic.de Tue Jan 25 23:13:08 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Tue Jan 25 23:13:11 2005 Subject: [Python-Dev] __str__ vs. __unicode__ In-Reply-To: <41F3B46F.5040205@egenix.com> References: <41ED25C6.80603@livinglogic.de> <41ED499A.1050206@egenix.com> <41EE2B1E.8030209@livinglogic.de> <41EE4797.6030105@egenix.com> <41F0F11B.8000600@livinglogic.de> <41F3B46F.5040205@egenix.com> Message-ID: <41F6C474.8030700@livinglogic.de> M.-A. Lemburg wrote: > Walter Dörwald wrote: > >> M.-A. Lemburg wrote: >> >> > [...] >> >>> __str__ and __unicode__ as well as the other hooks were >>> specifically added for the type constructors to use. >>> However, these were added at a time where sub-classing >>> of types was not possible, so it's time now to reconsider >>> whether this functionality should be extended to sub-classes >>> as well. >> >> So can we reach consensus on this, or do we need a >> BDFL pronouncement? > > I don't have a clear picture of what the consensus currently > looks like :-) > > If we're going for a solution that implements the hook > awareness for all ____ hooks, I'd be +1 on that. > If we only touch the __unicode__ case, we'd only be creating > yet another special case. I'd vote -0 on that. > [...]
Here's the patch that implements this for int/long/float/unicode: http://www.python.org/sf/1109424 Note that complex already did the right thing. For int/long/float this is implemented in the following way: Converting an instance of a subclass to the base class is done in the appropriate slot of the type (i.e. intobject.c::int_int() etc.) instead of in PyNumber_Int()/PyNumber_Long()/PyNumber_Float(). It's still possible for a conversion method to return an instance of a subclass of int/long/float. Bye, Walter Dörwald From skip at pobox.com Tue Jan 25 23:21:34 2005 From: skip at pobox.com (Skip Montanaro) Date: Tue Jan 25 23:58:23 2005 Subject: [Python-Dev] Deprecating modules (python-dev summary for early Dec, 2004) In-Reply-To: References: Message-ID: <16886.50798.56069.314227@montanaro.dyndns.org> Jim> Would it make sense to add an attic (or even "deprecated") Jim> directory to the end of sys.path, and move old modules there? This Jim> would make the search for non-deprecated modules a bit faster, and Jim> would make it easier to verify that new code isn't depending Jim> (perhaps indirectly) on any deprecated features. That's what lib-old is for. All people have to do is append it to sys.path to get access to its contents:

% python
Python 2.5a0 (#72, Jan 20 2005, 20:14:27)
[GCC 3.3 20030304 (Apple Computer, Inc. build 1493)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import glob
>>> for f in glob.glob("/Users/skip/local/lib/python2.5/lib-old/*.py"):
... print f
...
/Users/skip/local/lib/python2.5/lib-old/addpack.py /Users/skip/local/lib/python2.5/lib-old/cmp.py /Users/skip/local/lib/python2.5/lib-old/cmpcache.py /Users/skip/local/lib/python2.5/lib-old/codehack.py /Users/skip/local/lib/python2.5/lib-old/dircmp.py /Users/skip/local/lib/python2.5/lib-old/dump.py /Users/skip/local/lib/python2.5/lib-old/find.py /Users/skip/local/lib/python2.5/lib-old/fmt.py /Users/skip/local/lib/python2.5/lib-old/grep.py /Users/skip/local/lib/python2.5/lib-old/lockfile.py /Users/skip/local/lib/python2.5/lib-old/newdir.py /Users/skip/local/lib/python2.5/lib-old/ni.py /Users/skip/local/lib/python2.5/lib-old/packmail.py /Users/skip/local/lib/python2.5/lib-old/Para.py /Users/skip/local/lib/python2.5/lib-old/poly.py /Users/skip/local/lib/python2.5/lib-old/rand.py /Users/skip/local/lib/python2.5/lib-old/statcache.py /Users/skip/local/lib/python2.5/lib-old/tb.py /Users/skip/local/lib/python2.5/lib-old/tzparse.py /Users/skip/local/lib/python2.5/lib-old/util.py /Users/skip/local/lib/python2.5/lib-old/whatsound.py /Users/skip/local/lib/python2.5/lib-old/whrandom.py /Users/skip/local/lib/python2.5/lib-old/zmod.py That doesn't help for deprecated extension modules, but I think they are much less frequently candidates for deprecation. Skip From jimjjewett at gmail.com Wed Jan 26 00:26:58 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Wed Jan 26 00:27:01 2005 Subject: [Python-Dev] Deprecating modules (python-dev summary for early Dec, 2004) In-Reply-To: <16886.50798.56069.314227@montanaro.dyndns.org> References: <16886.50798.56069.314227@montanaro.dyndns.org> Message-ID: On Tue, 25 Jan 2005 16:21:34 -0600, Skip Montanaro wrote: > > Jim> Would it make sense to add an attic (or even "deprecated") > Jim> directory to the end of sys.path, and move old modules there? 
This > Jim> would make the search for non-deprecated modules a bit faster, and > Jim> would make it easier to verify that new code isn't depending > Jim> (perhaps indirectly) on any deprecated features. > That's what lib-old is for. All people have to do is append it to sys.path > to get access to its contents: That seems to be for "obsolete" modules. Should deprecated modules be moved there as well? I had proposed a middle ground, where they were moved to a separate directory, but that directory was (by default) included on the search path. Moving deprecated modules to lib-old (not on the search path at all) seems to risk breaking code. -jJ From nnorwitz at gmail.com Wed Jan 26 04:35:42 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Wed Jan 26 04:35:45 2005 Subject: [Python-Dev] Speed up function calls In-Reply-To: <000e01c502d3$0458a340$18fccc97@oemcomputer> References: <000e01c502d3$0458a340$18fccc97@oemcomputer> Message-ID: On Tue, 25 Jan 2005 06:42:57 -0500, Raymond Hettinger wrote: > > > > I think I tested a method I changed from METH_O to METH_ARGS and could > > not measure a difference. > > Something is probably wrong with the measurements. The new call does much more work than METH_O or METH_NOARGS. Those two common and essential cases cannot be faster and are likely slower on at least some compilers and some machines. If some timing shows differently, then it is likely a mirage (falling into an unsustainable local minimum). I tested w/chr() which Martin pointed out is broken in my patch.
I just tested with len('') and got these results (again on opteron): # without patch neal@janus clean $ ./python ./Lib/timeit.py -v "len('')" 10 loops -> 8.11e-06 secs 100 loops -> 6.7e-05 secs 1000 loops -> 0.000635 secs 10000 loops -> 0.00733 secs 100000 loops -> 0.0634 secs 1000000 loops -> 0.652 secs raw times: 0.654 0.652 0.654 1000000 loops, best of 3: 0.652 usec per loop # with patch neal@janus src $ ./python ./Lib/timeit.py -v "len('')" 10 loops -> 9.06e-06 secs 100 loops -> 7.01e-05 secs 1000 loops -> 0.000692 secs 10000 loops -> 0.00693 secs 100000 loops -> 0.0708 secs 1000000 loops -> 0.703 secs raw times: 0.712 0.714 0.713 1000000 loops, best of 3: 0.712 usec per loop So with the patch METH_O is .06 usec slower. I'd like to discuss this later after I explain a bit more about the direction I'm headed. I agree that METH_O and METH_NOARGS are near optimal wrt performance. But if we could have one METH_UNPACKED instead of 3 METH_*, I think that would be a win. > > A benefit would be to consolidate METH_O, > > METH_NOARGS, and METH_VARARGS into a single case. This should > > make code simpler all around (IMO). > > Will backwards compatibility allow those cases to be eliminated? It would be a bummer if most existing extensions could not compile with Py2.5. Also, METH_VARARGS will likely have to hang around unless a way can be found to handle more than nine arguments. Sorry, I meant eliminated w/3.0. METH_O couldn't be eliminated, but METH_NOARGS actually could since min/max args would be initialized to 0. so #define METH_NOARGS METH_UNPACKED would work. But I'm not proposing that, unless there is consensus that it's ok. > This patch appears to be taking on a life of its own and is being applied more broadly than is necessary or wise. The patch is extensive and introduces a new C API that cannot be taken back later, so we ought to be careful with it. I agree we should be careful. But it's all experimentation right now.
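[For reference, the measurement recipe above can be reproduced programmatically with the timeit module; this is a sketch of the methodology only -- absolute numbers depend entirely on the machine and build, and will not match the opteron figures quoted here.]

```python
import timeit

# Best-of-three timing of a cheap METH_O-style call, mirroring the
# "1000000 loops, best of 3" output of timeit's command-line mode.
timer = timeit.Timer("len('')")
best = min(timer.repeat(repeat=3, number=1000000))

# Total seconds for 1e6 loops equals microseconds per loop numerically.
print("1000000 loops, best of 3: %.3g usec per loop" % best)
```

Taking the minimum of the raw times (rather than the mean) is the same convention timeit itself uses: the fastest run is the one least perturbed by other system activity.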
The reason to modify METH_O and METH_NOARGS is to verify direction and various effects. It's not necessarily meant to be integrated. > That being said, I really like the concept. I just worry that many of the stated benefits won't materialize: > * having to keep the old versions for backwards compatibility, > * being slower than METH_O and METH_NOARGS, > * not handling more than nine arguments, There are very few functions I've found that take more than 2 arguments. Should 9 be lower, higher? I don't have a good feel. From what I've seen, 5 may be more reasonable as far as catching 90% of the cases. > * separating function signature info from the function itself, I haven't really seen any discussion on this point. I think Raymond pointed out this isn't really much different today with METH_NOARGS and METH_KEYWORDS. METH_O too if you consider how the arg is used even though the signature is still the same. > * the time to initialize all the argument variables to NULL, See below how this could be fixed. > * somewhat unattractive case stmt code for building the c function call. This is the python test coverage: http://coverage.livinglogic.de/coverage/web/selectEntry.do?template=2850&entryToSelect=182530 Note that VARARGS is over 3 times as likely as METH_O or METH_NOARGS. Plus we could get rid of a couple of if statements. So far it seems there aren't any specific problems with the approach. There are simply concerns. I'm not sure it would be best to modify this patch over many iterations and then make one huge checkin. I also don't want to lose the changes or the results. Perhaps I should make a branch for this work? It's easy to abandon it or take only the pieces we want if it should ever see the light of day. ---- Here's some thinking out loud. Raymond mentioned some of the warts of the current patch. In particular, all nine argument variables are initialized each time and there's a switch on the number of arguments.
Ultimately, I think we can speed things up more by having 9 different op codes, ie, one for each # of arguments. CALL_FUNCTION_0, CALL_FUNCTION_1, ... (9 is still arbitrary and subject to change) Then we would have N little functions, each with the exact # of parameters. Each would still need a switch to call the C function because there may be optional parameters. Ultimately, it's possible the code would be small enough to stick it into the eval_frame loop. Each of these steps would need to be tested, but that's a possible longer term direction. There would only be an if to check if it was a C function or not. Maybe we could even get rid of this by more fixup at import time. Neal From ncoghlan at iinet.net.au Wed Jan 26 04:59:12 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Wed Jan 26 04:59:19 2005 Subject: [Python-Dev] Speed up function calls In-Reply-To: References: <000e01c502d3$0458a340$18fccc97@oemcomputer> Message-ID: <41F71590.9090501@iinet.net.au> Neal Norwitz wrote: > So far it seems there isn't any specific problems with the approach. > There are simply concerns. I not sure it would be best to modify this > patch over many iterations and then make one huge checkin. I also > don't want to lose the changes or the results. Perhaps I should make > a branch for this work? It's easy to abondon it or take only the > pieces we want if it should ever see the light of day. A branch would seem the best way to allow other people to contribute to the experiment. I'll also note that this mechanism should make it easier to write C functions which are easily used both from Python and as direct entries in a C API. Cheers, Nick. 
-- Nick Coghlan | ncoghlan@email.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From ncoghlan at iinet.net.au Wed Jan 26 05:40:53 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Wed Jan 26 05:41:00 2005 Subject: [Python-Dev] Allowing slicing of iterators In-Reply-To: References: <000e01c50321$a20d1e60$18fccc97@oemcomputer> Message-ID: <41F71F55.8010104@iinet.net.au> Steven Bethard wrote: > If we're really looking for a builtin, wouldn't it be better to go the > route of getattr/setattr and have something like getslice that could > operate on both lists and iterators? Such a builtin should probably be getitem() rather than getslice() (since getitem(iterable, slice(start, stop, step)) covers the getslice() case). However, I don't really see the point of this, since "from itertools import islice" is nearly as good as such a builtin. More importantly, I don't see how this alters Guido's basic criticism that slicing a list and slicing an iterator represent fundamentally different concepts. (ie. if "itr[x]" is unacceptable, I don't see how changing the spelling to "getitem(itr, x)" could make it any more acceptable). If slicing is taken as representing random access to a data structure (which seems to be Guido's view), then using it to represent sequential access to an item in or region of an iterator is not appropriate. I'm not sure how compatible that viewpoint is with wanting Python 3k to be as heavily iterator based as 2.x is list based, but that's an issue for the future. For myself, I don't attach such specific semantics to slicing (I see it as highly dependent on the type of object being sliced), and consider it obvious syntactic sugar for the itertools islice operation. 
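[To make the comparison under discussion concrete, here is the islice spelling next to the proposed slice sugar; the bracketed `itr[:3]` form is hypothetical and was never adopted:]

```python
from itertools import islice

itr = iter(range(10))

# Today's spelling: take the first three items from the iterator.
first_three = list(islice(itr, 3))
# The proposed sugar would have read: first_three = list(itr[:3])
print(first_three)  # [0, 1, 2]

# This is also the crux of Guido's objection: unlike slicing a list,
# "slicing" an iterator consumes it -- the next item is now 3, not 0.
n = next(itr)
print(n)  # 3
```

The destructive behaviour in the last two lines is exactly the semantic difference between random access to a sequence and sequential access to an iterator that the thread keeps circling around.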
As mentioned in my previous message, I also think the iterator/iterable distinction should be able to be ignored as much as possible, and the lack of syntactic support for working with iterators is the major factor that throws the distinction into a programmer's face. It currently makes the fact that some builtins return lists and others iterators somewhat inconvenient. Those arguments have already failed to persuade Guido though, so I guess the idea is dead for the moment (unless/until someone comes up with a convincing argument that I haven't thought of). Given Guido's lack of enthusiasm for *this* idea though, I'm not even going to venture into the realms of "+" on iterators defaulting to itertools.chain or "*" to itertools.repeat. Cheers, Nick. -- Nick Coghlan | ncoghlan@email.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From skip at pobox.com Wed Jan 26 05:30:53 2005 From: skip at pobox.com (Skip Montanaro) Date: Wed Jan 26 05:59:59 2005 Subject: [Python-Dev] I think setup.py needs major rework Message-ID: <16887.7421.746339.594221@montanaro.dyndns.org> I just submitted a bug report for setup.py: http://python.org/sf/1109602 It begins: Python's setup.py has grown way out of control. I'm trying to build and install Python 2.4.0 on a Solaris system with Tcl/Tk installed in a non-standard place and I can't figure out the incantation to tell setup.py to look where they are installed. ... and ends: This might be an excellent sprint topic for PyCon. Skip From kbk at shore.net Wed Jan 26 06:29:26 2005 From: kbk at shore.net (Kurt B. 
Kaiser) Date: Wed Jan 26 06:29:45 2005 Subject: [Python-Dev] Speed up function calls In-Reply-To: (Neal Norwitz's message of "Tue, 25 Jan 2005 22:35:42 -0500") References: <000e01c502d3$0458a340$18fccc97@oemcomputer> Message-ID: <87is5kzrk9.fsf@hydra.bayview.thirdcreek.com> Neal Norwitz writes: >> * not handling more than nine arguments, > > There are very few functions I've found that take more than 2 > arguments. Should 9 be lower, higher? I don't have a good feel. > From what I've seen, 5 may be more reasonable as far as catching 90% > of the cases. Five is probably conservative. http://mail.python.org/pipermail/python-dev/2004-February/042847.html -- KBK From fdrake at acm.org Wed Jan 26 07:18:46 2005 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed Jan 26 07:18:55 2005 Subject: [Python-Dev] I think setup.py needs major rework In-Reply-To: <16887.7421.746339.594221@montanaro.dyndns.org> References: <16887.7421.746339.594221@montanaro.dyndns.org> Message-ID: <200501260118.47192.fdrake@acm.org> On Tuesday 25 January 2005 23:30, Skip Montanaro wrote: > Python's setup.py has grown way out of control. I'm trying to build > and install Python 2.4.0 on a Solaris system with Tcl/Tk installed in a > non-standard place and I can't figure out the incantation to tell setup.py > to look where they are installed. ... > This might be an excellent sprint topic for PyCon. Indeed it would be! -Fred -- Fred L. Drake, Jr. From ncoghlan at iinet.net.au Wed Jan 26 09:06:36 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Wed Jan 26 09:06:43 2005 Subject: [Python-Dev] Allowing slicing of iterators In-Reply-To: References: <41F4DC39.9020603@iinet.net.au> <41F64B57.3040807@iinet.net.au> Message-ID: <41F74F8C.5080100@iinet.net.au> Guido van Rossum wrote: > Iterators are for single sequential access. 
It's a feature that you > have to import itertools (or at least that you have to invoke its > special operations) -- iterators are not sequences and shouldn't be > confused with such. > I agree the semantic difference between an iterable and an iterator is important, but I am unclear on why that needs to translate to a syntactic difference for slicing, when it doesn't translate to such a difference for iteration (despite the *major* difference in the effect upon the object that is iterated over). Are the semantics of slicing really that much more exact than those for iteration? Also, would it make a difference if the ability to extract an individual item from an iterator through subscripting was disallowed? (i.e. getting the second item of an iterator being spelt "itr[2:3].next()" instead of "itr[2]") Regards, Nick. -- Nick Coghlan | ncoghlan@email.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From anthony at python.org Wed Jan 26 09:51:55 2005 From: anthony at python.org (Anthony Baxter) Date: Wed Jan 26 09:52:18 2005 Subject: [Python-Dev] RELEASED Python 2.3.5, release candidate 1 Message-ID: <200501261952.06150.anthony@python.org> On behalf of the Python development team and the Python community, I'm happy to announce the release of Python 2.3.5 (release candidate 1). Python 2.3.5 is a bug-fix release. See the release notes at the website (also available as Misc/NEWS in the source distribution) for details of the bugs squished in this release. Assuming no major problems crop up, a final release of Python 2.3.5 will follow in about a week's time. Python 2.3.5 is the last release in the Python 2.3 series, and is being released for those people who still need to use Python 2.3. Python 2.4 is a newer release, and should be preferred if possible. From here, bugfix releases are switching to the Python 2.4 branch - a 2.4.1 will follow 2.3.5 final. 
For more information on Python 2.3.5, including download links for various platforms, release notes, and known issues, please see: http://www.python.org/2.3.5 Highlights of this new release include: - Bug fixes. According to the release notes, more than 50 bugs have been fixed, including a couple of bugs that could cause Python to crash. Highlights of the previous major Python release (2.3) are available from the Python 2.3 page, at http://www.python.org/2.3/highlights.html Enjoy the new release, Anthony Anthony Baxter anthony@python.org Python Release Manager (on behalf of the entire python-dev team) -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20050126/769bc0ed/attachment.pgp From walter at livinglogic.de Wed Jan 26 12:47:11 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Wed Jan 26 12:47:17 2005 Subject: [Python-Dev] Speed up function calls In-Reply-To: References: <000e01c502d3$0458a340$18fccc97@oemcomputer> Message-ID: <41F7833F.90905@livinglogic.de> Neal Norwitz wrote: > [...] > This is the python test coverage: > http://coverage.livinglogic.de/coverage/web/selectEntry.do?template=2850&entryToSelect=182530 This link won't work because of session management. To get the coverage info of ceval.c go to http://coverage.livinglogic.de, click on the latest run, enter "ceval" in the "Filename" field, click "Search" and click on the one line in the search result. Bye, Walter Dörwald From python at rcn.com Wed Jan 26 15:47:41 2005 From: python at rcn.com (Raymond Hettinger) Date: Wed Jan 26 15:53:31 2005 Subject: [Python-Dev] Speed up function calls In-Reply-To: Message-ID: <004a01c503b5$fd167b00$18fccc97@oemcomputer> > I agree that METH_O and METH_NOARGS are near > optimal wrt performance.
But if we could have one METH_UNPACKED > instead of 3 METH_*, I think that would be a win. . . . > Sorry, I meant eliminated w/3.0. So, leave METH_O and METH_NOARGS alone. They can't be dropped until 3.0 and they can't be improved speedwise. > > * not handling more than nine arguments, > > There are very few functions I've found that take more than 2 arguments. It's not a matter of how few; it's a matter of imposing a new, arbitrary limit where none previously existed. This is not a positive point for the patch. > Ultimately, I think we can speed things up more by having 9 different > op codes, ie, one for each # of arguments. CALL_FUNCTION_0, > CALL_FUNCTION_1, ... > (9 is still arbitrary and subject to change) How is the compiler to know the arity of the target function? If I call pow(3,5), how would the compiler know that pow() can take an optional third argument which would need to be initialized to NULL? > Then we would have N little functions, each with the exact # of > parameters. Each would still need a switch to call the C function > because there may be optional parameters. Ultimately, it's possible > the code would be small enough to stick it into the eval_frame loop. > Each of these steps would need to be tested, but that's a possible > longer term direction. . . . > There would only be an if to check if it was a C function or not. > Maybe we could even get rid of this by more fixup at import time. This is what I mean about the patch taking on a life of its own. It's an optimization patch that slows down METH_O and METH_NOARGS. It's an incremental change that throws away backwards compatibility. It's a simplification that introduces a bazillion new code paths. It's a simplification that can't be realized until 3.0. It's a minor change that entails new opcodes, compiler changes, and changes in all extensions that have ever been written. IOW, this patch has lost its focus (or innocence).
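[Raymond's pow() example is easy to check interactively -- the optional third argument really does change the call's arity at run time, which is exactly what a fixed-arity CALL_FUNCTION_&lt;n&gt; opcode could not see at compile time:]

```python
# pow() accepts two or three arguments; the compiler cannot know which
# form a given call site will hit without knowing pow()'s signature.
two_arg = pow(3, 5)        # plain exponentiation
three_arg = pow(3, 5, 7)   # modular exponentiation: (3 ** 5) % 7
print(two_arg, three_arg)  # 243 5
```

The same call syntax thus maps onto two different C-level signatures, which is why the unused slot would need to be initialized to NULL before dispatch.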
That can be recovered by limiting the scope to improving the call time for methods with signatures like "O|O". That is an achievable goal that doesn't impact backwards compatibility, doesn't negatively impact existing near-optimal METH_O and METH_NOARGS code, doesn't mess with the compiler, doesn't introduce new opcodes, doesn't alter import logic, and doesn't muck up existing extensions. Raymond "Until next week, keep your feet on the ground and keep reaching for the stars." -- Casey Kasem From fredrik at pythonware.com Wed Jan 26 18:44:32 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed Jan 26 18:44:24 2005 Subject: [Python-Dev] Re: Allowing slicing of iterators References: <41F4DC39.9020603@iinet.net.au><41F64B57.3040807@iinet.net.au> Message-ID: Guido van Rossum wrote: >> I'd like to see iterators become as easy to work with as lists are. At the >> moment, anything that returns an iterator forces you to use the relatively >> cumbersome itertools.islice mechanism, rather than Python's native slice syntax. > > Sorry. Still -1. can we perhaps persuade you into changing that to a -0.1, so we can continue playing with the idea? > iterators are not sequences and shouldn't be confused with such. the for-in statement disagrees with you, I think. on the other hand, I'm not sure I trust that statement any more; I'm really disappointed that it won't let me write my loops as: for item in item for item in item for item in item for item in seq: ...
From steve at holdenweb.com Tue Jan 25 18:17:55 2005 From: steve at holdenweb.com (Steve Holden) Date: Wed Jan 26 20:07:14 2005 Subject: [Python-Dev] Re: [PyCON-Organizers] PyCon: The Spam Continues ;-) In-Reply-To: <41F5A434.1060905@ActiveState.com> References: <41F583A5.8020206@holdenweb.com> <41F5A434.1060905@ActiveState.com> Message-ID: <41F67F43.4010403@holdenweb.com> David Ascher wrote: > Steve Holden wrote: > >> Dear python-dev: >> >> The current (as of even date) summary of my recent contributions to >> Python -dev appears to be spam about PyCon. >> >> Not being one to break habits, even not those of a lifetime sometimes, >> I spam you yet again to show you what a beautiful summary ActiveState >> have provided (I don't know whether this URL is cacheable or not): >> >> > > > > Yup, we try to make all our URLs portable and persistent. > Good for you. >> If I remember Trent Lott (?) > > > Nah, that's a US politician. T'was Trent Mick. > Indeed, and already corrected. >> described at an IPC the SQL Server database that drives this system, >> and it was a great example of open source technology driving a >> proprietary (but I expect (?) relatively portable) repository. > > > Modulo some SQLServer features we're using. > Well free-text indexing would be my first guess. Anything else of interest? MySQL's free text indexing really sucks compared with SQL Server's, which to my mind is a good justification for the Microsoft product. >> Since I have your attention (and if I haven't then it really doesn't >> matter what I write hereafter, goodbye ...) I will also point out that >> the current top hit on Google for >> >> "Microsoft to Provide PyCon Opening Keynote" > > > What a bizarre search. > Oh, no, people run that search all the time :-) It's actually hit #3 for "Microsoft PyCon" as well, so I guess that's not too bad. > (note that some of your To's and Cc's were pretty strange... > Hmm, yes, I cringed when I got the bounces. That information didn't belong there. 
If only there were a way to take emails back ... regards Steve -- Steve Holden http://www.holdenweb.com/ Python Web Programming http://pydish.holdenweb.com/ Holden Web LLC +1 703 861 4237 +1 800 494 3119 From kbk at shore.net Wed Jan 26 21:50:54 2005 From: kbk at shore.net (Kurt B. Kaiser) Date: Wed Jan 26 21:51:20 2005 Subject: [Python-Dev] I think setup.py needs major rework In-Reply-To: <16887.7421.746339.594221@montanaro.dyndns.org> (Skip Montanaro's message of "Tue, 25 Jan 2005 22:30:53 -0600") References: <16887.7421.746339.594221@montanaro.dyndns.org> Message-ID: <87acqvzzgx.fsf@hydra.bayview.thirdcreek.com> Skip Montanaro writes: > Python's setup.py has grown way out of control. I'm trying to > build and install Python 2.4.0 on a Solaris system with Tcl/Tk > installed in a non-standard place and I can't figure out the > incantation to tell setup.py to look where they are installed. This may be more due to the complexity of distutils than to setup.py itself. Special cases are special cases, after all, e.g. look at Autotools. setup.py is billed as "Autodetecting setup.py script for building the Python extensions" but exactly how to override it without hacking it isn't very well documented, when possible at all. "Distributing Python Modules" helped me, but the reference section is missing, so it's utsl from there. So one improvement would be to better document overriding setup.py in README. Your solution may be as simple as adding to Makefile:342 (approx) --include-dirs=xxxx --library-dirs=yyyy where setup.py is called. (distutils/command/build_ext.py) *But* I suspect build() may not pass the options through to build_ext()! So, a config file approach: .../src/setup.cfg: [build_ext] include-dirs=xxxx library-dirs=yyyy In setup.py, PyBuildExt.build_extension() does most of the special casing. The last thing done is to call PyBuildExt.detect_tkinter() which handles a bunch of platform incompatibilities. e.g. 
# OpenBSD and FreeBSD use Tcl/Tk library names like libtcl83.a, but # the include subdirs are named like .../include/tcl8.3. If the previous ideas flub, you could hack your detect_tkinter() and append your include and lib dirs to inc_dirs and lib_dirs at the beginning of the method. All else fails, use Modules/Setup.dist to install Tcl/Tk? Or maybe symlink your non-standard location? -- KBK From bac at OCF.Berkeley.EDU Wed Jan 26 22:35:35 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Wed Jan 26 22:35:52 2005 Subject: [Python-Dev] I think setup.py needs major rework In-Reply-To: <200501260118.47192.fdrake@acm.org> References: <16887.7421.746339.594221@montanaro.dyndns.org> <200501260118.47192.fdrake@acm.org> Message-ID: <41F80D27.1090100@ocf.berkeley.edu> Fred L. Drake, Jr. wrote: > On Tuesday 25 January 2005 23:30, Skip Montanaro wrote: > > Python's setup.py has grown way out of control. I'm trying to build > > and install Python 2.4.0 on a Solaris system with Tcl/Tk installed in a > > non-standard place and I can't figure out the incantation to tell setup.py > > to look where they are installed. > ... > > This might be an excellent sprint topic for PyCon. > > Indeed it would be! > ... and now it is listed as one. Started a section on the sprint wiki page for orphaned sprint topics with a subsection on core stuff. Listed a rewrite of setup.py along with a rewrite of site.py and the usual bug/patch squashing. URL is http://www.python.org/moin/PyConDC2005/Sprints for those not wanting to go hunting for it. 
-Brett From abo at minkirri.apana.org.au Thu Jan 27 00:51:43 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Thu Jan 27 00:52:28 2005 Subject: [Python-Dev] Strange segfault in Python threads and linux kernel 2.6 In-Reply-To: <200501260153.21672.anthony@interlink.com.au> References: <1106111769.3822.52.camel@schizo> <200501251657.57682.anthony@interlink.com.au> <004b01c502e6$6db77380$24ed0ccb@apana.org.au> <200501260153.21672.anthony@interlink.com.au> Message-ID: <1106783503.3889.12.camel@schizo> On Wed, 2005-01-26 at 01:53 +1100, Anthony Baxter wrote: > On Wednesday 26 January 2005 01:01, Donovan Baarda wrote: > > In this case it turns out to be "don't do exec() in a thread, because what > > you exec can have all its signals masked". That turns out to be a hell of > > a lot of things; popen, os.system, etc. They all only work OK in a > > threaded application if what you are exec'ing doesn't use any signals. > > Yep. You just have to be aware of it. We do a bit of this at work, and we > either spool via a database table, or a directory full of spool files. > > > Actually, I've noticed that zope often has a sorta zombie "which" process > > which it spawns. I wonder if this is a stuck thread waiting for some > > signal... > > Quite likely. For the record, it seems that the java version also contributes. This problem only occurs when you have the following combination; Linux >=2.6 Python <=2.3 j2re1.4 =1.4.2.01-1 | kaffe 2:1.1.4xxx If you use Linux 2.4, it goes away. If you use Python 2.4 it goes away. If you use j2re1.4.1.01-1 it goes away. For the problem to occur the following combination needs to occur; 1) Linux uses the thread's sigmask instead of the main thread/process sigmask for the exec'ed process (ie, 2.6 does this, 2.4 doesn't). 2) Python needs to screw with the sigmask in threads (python 2.3 does, python 2.4 doesn't). 3) The exec'ed process needs to rely on threads (j2re1.4 1.4.2.01-1 does, j2re1.4 1.4.1.01-1 doesn't).
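[Point (1) -- a child inheriting the spawning thread's signal mask across exec -- can be demonstrated with the modern signal API. This sketch assumes Python 3.3+ on POSIX, and the /proc check is Linux-specific; it illustrates the mechanism, not the 2.3-era code:]

```python
import signal
import subprocess

# Block SIGUSR1 in the calling thread; a child exec'ed from this thread
# inherits the blocked mask, which is what bit 2.3-era threads -- they
# spawned children with essentially all signals masked.
old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
try:
    # Read the child's blocked-signal bitmap from /proc (Linux only).
    out = subprocess.run(
        ["sh", "-c", "grep SigBlk /proc/self/status"],
        capture_output=True, text=True,
    ).stdout
finally:
    # Restore the original mask so the rest of the process is unaffected.
    signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)

print(out.strip())  # SigBlk line with the SIGUSR1 bit set
```

A child that waits on the blocked signal will then hang forever -- the "sorta zombie" processes described above.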
It is hard to find old Debian deb's of j2re1.4 (1.4.1.01-1), and when you do, you will also need the now non-existent j2se-common 1.1 package. I don't know if this qualifies as a potential bug against j2re1.4 1.4.2.01-1. For now my solution is to roll back to the older j2re1.4. -- Donovan Baarda http://minkirri.apana.org.au/~abo/ From alan.green at gmail.com Thu Jan 27 01:27:03 2005 From: alan.green at gmail.com (Alan Green) Date: Thu Jan 27 01:27:06 2005 Subject: [Python-Dev] Patch review: [ 1100942 ] datetime.strptime constructor added Message-ID: I see a need for this patch - I've had to write "datetime(*(time.strptime(date_string, format)[0:6]))" far too many times. I don't understand the C API well enough to check if reference counts are handled properly, but otherwise the implementation looks straightforward. Documentation looks good and the test passes on my machine. Two suggestions: 1. In the time module, the strptime() function's format parameter is optional. For consistency's sake, I'd expect datetime.strptime()'s format parameter also to be optional. (On the other hand, the default value for the format is not very useful.) 2. Since strftime is supported by datetime.time, datetime.date and datetime.datetime, I'd also expect strptime to be supported by all three classes. Could you add that now, or would it be better to do it as a separate patch? Alan. -- Alan Green alan.green@cardboard.nu - http://cardboard.nu From alan.green at gmail.com Thu Jan 27 01:32:58 2005 From: alan.green at gmail.com (Alan Green) Date: Thu Jan 27 01:33:03 2005 Subject: [Python-Dev] Patch review: [ 1094542 ] add Bunch type to collections module Message-ID: Steven Bethard is proposing a new collection class named Bunch. I had a few suggestions which I attached as comments to the patch - but what is really required is a bit more work on the draft PEP, and then discussion on the python-dev mailing list.
http://sourceforge.net/tracker/?func=detail&aid=1100942&group_id=5470&atid=305470 Alan. -- Alan Green alan.green@cardboard.nu - http://cardboard.nu From steven.bethard at gmail.com Thu Jan 27 01:40:06 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Thu Jan 27 01:40:08 2005 Subject: [Python-Dev] Patch review: [ 1094542 ] add Bunch type to collections module In-Reply-To: References: Message-ID: Alan Green wrote: > Steven Bethard is proposing a new collection class named Bunch. I had > a few suggestions which I attached as comments to the patch - but what > is really required is a bit more work on the draft PEP, and then > discussion on the python-dev mailing list. > > http://sourceforge.net/tracker/?func=detail&aid=1100942&group_id=5470&atid=305470 I believe the correct tracker is: http://sourceforge.net/tracker/index.php?func=detail&aid=1094542&group_id=5470&atid=305470 There was a substantial discussion about this on the python-list before I put the PEP together: http://mail.python.org/pipermail/python-list/2004-November/252621.html I applied for a PEP number on 2 Jan 2005 and haven't heard back yet, but the patch was posted so people could easily play around with it if they liked. My intentions were to post the PEP to python-dev as soon as I got a PEP number for it. Steve -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy From alan.green at gmail.com Thu Jan 27 02:44:08 2005 From: alan.green at gmail.com (Alan Green) Date: Thu Jan 27 02:44:10 2005 Subject: [Python-Dev] Patch review: [ 1009811 ] Add missing types to __builtin__ Message-ID: Last August, James Knight posted to python-dev, "There's a fair number of classes that claim they are defined in __builtin__, but do not actually appear there". There was a discussion and James submitted this patch: http://sourceforge.net/tracker/index.php?func=detail&aid=1009811&group_id=5470&atid=305470 The final result of the discussion is unclear. 
Guido declared himself +0.5 on the concept, but nobody has reviewed the patch in detail yet. The original email thread starts here:

http://mail.python.org/pipermail/python-dev/2004-August/047477.html

The patch still applies, and test cases still run OK afterwards. Now that 2.4 has been released it is perhaps a good time to discuss it on python-dev again. If it isn't discussed, then the patch should be closed due to lack of interest.

Alan.
--
Alan Green alan.green@cardboard.nu - http://cardboard.nu

From skip at pobox.com Thu Jan 27 04:17:50 2005
From: skip at pobox.com (Skip Montanaro)
Date: Thu Jan 27 04:18:00 2005
Subject: [Python-Dev] Patch review: [ 1100942 ] datetime.strptime constructor added
In-Reply-To:
References:
Message-ID: <16888.23902.44511.8382@montanaro.dyndns.org>

    Alan> 1. In the time module, the strptime() function's format
    Alan> parameter is optional. For consistency's sake, I'd expect
    Alan> datetime.strptime()'s format parameter also to be optional. (On
    Alan> the other hand, the default value for the format is not very
    Alan> useful.)

Correct. No need to propagate a mistake.

    Alan> 2. Since strftime is supported by datetime.time,
    Alan> datetime.date and datetime.datetime, I'd also expect strptime to
    Alan> be supported by all three classes. Could you add that now, or
    Alan> would it be better to do it as a separate patch?

That can probably be done, but I'm not convinced strftime really belongs on either date or time objects given the information those objects are missing:

    >>> t = datetime.datetime.now().time()
    >>> t.strftime("%Y-%m-%d")
    '1900-01-01'
    >>> d = datetime.datetime.now().date()
    >>> d.strftime("%H:%M:%S")
    '00:00:00'

I would be happy for strftime to only be available for datetime objects (assuming there was a good way to get from time or date objects to datetime objects short of extracting their individual attributes).
In any case, going from datetime to date or time objects is trivial:

    >>> dt = datetime.datetime.now()
    >>> dt.time()
    datetime.time(21, 12, 18, 705365)

so parsing a string into a datetime object then splitting out date and time objects seems reasonable.

Skip

From python at rcn.com Thu Jan 27 05:55:41 2005
From: python at rcn.com (Raymond Hettinger)
Date: Thu Jan 27 05:59:18 2005
Subject: [Python-Dev] Patch review: [ 1009811 ] Add missing types to __builtin__
In-Reply-To:
Message-ID: <001f01c5042c$73efe1a0$b038fea9@oemcomputer>

> Last August, James Knight posted to python-dev, "There's a fair number
> of classes that claim they are defined in __builtin__, but do not
> actually appear there". There was a discussion and James submitted
> this patch:
>
> http://sourceforge.net/tracker/index.php?func=detail&aid=1009811&group_id=5470&atid=305470

I'm -1 on adding these to __builtin__. They are just distractors and have almost no use in real Python programs. Worse, if you do use them, then you are likely to be programming badly -- we don't want to encourage that.

Also, I take some of these, such as dictproxy and cell, to be implementation details that are subject to change. Adding them to __builtin__ would unnecessarily immortalize them.

> The final result of the discussion is unclear. Guido declared himself
> +0.5 on the concept, but nobody has reviewed the patch in detail yet.

Even if Guido were suffering from time machine induced hallucinations that day, he still knew better than to go a full +1.

Raymond

From martin at v.loewis.de Thu Jan 27 07:20:48 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu Jan 27 07:20:47 2005
Subject: [Python-Dev] Patch review: [ 1009811 ] Add missing types to __builtin__
In-Reply-To: <001f01c5042c$73efe1a0$b038fea9@oemcomputer>
References: <001f01c5042c$73efe1a0$b038fea9@oemcomputer>
Message-ID: <41F88840.7070105@v.loewis.de>

Raymond Hettinger wrote:
> I'm -1 on adding these to __builtin__.
> They are just distractors and
> have almost no use in real Python programs. Worse, if you do use them,
> then you are likely to be programming badly -- we don't want to
> encourage that.

I agree. Because of the BDFL pronouncement, I cannot reject the patch, but I won't accept it, either. So it seems that this patch will have to sit in the SF tracker until either Guido processes it, or it is withdrawn.

Regards,
Martin

From foom at fuhm.net Thu Jan 27 08:01:20 2005
From: foom at fuhm.net (James Y Knight)
Date: Thu Jan 27 08:01:43 2005
Subject: [Python-Dev] Patch review: [ 1009811 ] Add missing types to __builtin__
In-Reply-To: <41F88840.7070105@v.loewis.de>
References: <001f01c5042c$73efe1a0$b038fea9@oemcomputer> <41F88840.7070105@v.loewis.de>
Message-ID: <3F36B0D6-7031-11D9-B9BF-000A95A50FB2@fuhm.net>

On Jan 27, 2005, at 1:20 AM, Martin v. Löwis wrote:
> I agree. Because of the BDFL pronouncement, I cannot reject the patch,
> but I won't accept it, either. So it seems that this patch will have
> to sit in the SF tracker until either Guido processes it, or it is
> withdrawn.

If people want to restart this discussion, I'd like to start back with the following message, rather than simply accepting/rejecting the patch. From the two comments so far, it seems like it's not the patch that needs reviewing, but still the concept.

On August 10, 2004 12:17:14 PM EDT, I wrote:
> Sooo should (for 'generator' in objects that claim to be in
> __builtins__ but aren't),
> 1) 'generator' be added to __builtins__
> 2) 'generator' be added to types.py and its __module__ be set to
> 'types'
> 3) 'generator' be added to <modulename>.py and its __module__ be set to
> '<modulename>' (and a name for the module chosen)

Basically, I'd like to see them be given a binding somewhere, and have their claimed module agree with that, but am not particular as to where. Option #2 seemed to be rejected last time, and option #1 was given approval, so that's what I wrote a patch for.
It sounds like it's getting pretty strong "no" votes this time around, however. Therefore, I would like to suggest option #3, with <modulename> being, say, 'internals'.

James

From fperez.net at gmail.com Thu Jan 27 09:07:06 2005
From: fperez.net at gmail.com (Fernando Perez)
Date: Thu Jan 27 09:20:55 2005
Subject: [Python-Dev] Re: Patch review: [ 1094542 ] add Bunch type to collections module
References:
Message-ID:

Hi all,

Steven Bethard wrote:
> Alan Green wrote:
>> Steven Bethard is proposing a new collection class named Bunch. I had
>> a few suggestions which I attached as comments to the patch - but what
>> is really required is a bit more work on the draft PEP, and then
>> discussion on the python-dev mailing list.
>>
>> http://sourceforge.net/tracker/?func=detail&aid=1100942&group_id=5470&atid=305470
>
> I believe the correct tracker is:
>
> http://sourceforge.net/tracker/index.php?func=detail&aid=1094542&group_id=5470&atid=305470

A while back, when I started writing ipython, I had to write this same class (I called it Struct), and I ended up building a fairly powerful one for handling ipython's recursive configuration system robustly. The design has some nasty problems which I'd change if I were doing this today (I was just learning the language at the time). But it also taught me a few things about what one ends up needing from such a beast in complex situations.

I've posted the code here as plain text and syntax-highlighted html, in case anyone is interested:

http://amath.colorado.edu/faculty/fperez/python/Struct.py
http://amath.colorado.edu/faculty/fperez/python/Struct.py.html

One glaring problem of my class is the blocking of dictionary method names as attributes; this would have to be addressed differently. But one thing which I really find necessary from a useful 'Bunch' class is the ability to access attributes via foo[name] (which requires implementing __getitem__). Named access is convenient when you _know_ the name you need (foo.attr).
However, if the name of the attribute is held in a variable, IMHO foo[name] beats getattr(foo,name) in clarity and feels much more 'pythonic'.

Another useful feature of this Struct class is the 'merge' method. While mine is probably overly flexible and complex for the stdlib (though it is incredibly useful in many situations), I'd really like dicts/Structs to have another way of updating with a single method, which was non-destructive (update automatically overwrites with the new data). Currently this is best done with a loop, but a 'merge' method which would work like 'update', but without overwriting, would be a great improvement, I think.

Finally, my values() method allows an optional keys argument, which I also find very useful. If this keys sequence is given, values are returned only for those keys. I don't know if anyone else would find such a feature useful, but I do :). It allows a kind of 'slicing' of dicts which can be really convenient.

I understand that my Struct is much more of a dict/Bunch hybrid than what you have in mind. But in heavy usage, I quickly realized that at least having __getitem__ implemented was an immediate need in many cases. Finally, the Bunch class should have a way of returning its values easily as a plain dictionary, for cases when you want to pass this data into a function which expects a true dict. Otherwise, it will 'lock' your information in.

I really would like to see such a class in the stdlib, as it's something that pretty much everyone ends up rewriting. I certainly don't claim my implementation to be a good reference (it isn't). But perhaps it can be useful to the discussion as an example of a 'battle-tested' such class, flaws and all.

I think the current pre-PEP version is a bit too limited to be generally useful in complex, real-world situations. It would be a good starting point to subclass for more demanding situations, but IMHO it would be worth considering a more powerful default class.
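[The wish list above (attribute access, __getitem__, a non-destructive merge, key-filtered values, and easy conversion back to a plain dict) can be sketched in a few lines. This is a hypothetical toy, not the actual ipython Struct; the values_for name is invented here to dodge the dict.values clash mentioned above:]

```python
class Struct(dict):
    """Dictionary whose keys are also readable and writable as attributes.

    NOTE: real dict method names (keys, items, update, ...) are found on
    the class first and so still shadow same-named keys for attribute
    access -- the very problem noted above.
    """

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self[name] = value

    def merge(self, other):
        """Like update(), but never overwrites existing keys."""
        for key, value in other.items():
            self.setdefault(key, value)

    def values_for(self, keys):
        """'Slice' the struct: values for the given keys only."""
        return [self[key] for key in keys]


s = Struct(a=1, b=2)
s.c = 3                      # attribute access...
print(s['c'])                # ...and item access see the same data: 3
s.merge({'a': 99, 'd': 4})   # existing 'a' is preserved, 'd' is added
print(s.a, s.d)              # 1 4
print(dict(s))               # a Struct converts trivially to a plain dict
```

Since Struct subclasses dict, passing one to an API that expects a true mapping needs no conversion at all, which addresses the 'lock your information in' concern directly.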
Regards,

From p.f.moore at gmail.com Thu Jan 27 10:49:48 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu Jan 27 10:49:51 2005
Subject: [Python-Dev] PEP 309 (Was: Patch review: [ 1094542 ] add Bunch type to collections module)
In-Reply-To:
References:
Message-ID: <79990c6b05012701492440d0c0@mail.gmail.com>

On Thu, 27 Jan 2005 01:07:06 -0700, Fernando Perez wrote:
> I really would like to see such a class in the stdlib, as it's something that
> pretty much everyone ends up rewriting. I certainly don't claim my
> implementation to be a good reference (it isn't). But perhaps it can be
> useful to the discussion as an example of a 'battle-tested' such class, flaws
> and all.

On the subject of "things everyone ends up rewriting", what needs to be done to restart discussion on PEP 309 (Partial Function Application)? The PEP is marked "Accepted" and various patches exist:

941881 - C implementation
1006948 - Windows build changes
931010 - Unit Tests
931007 - Documentation
931005 - Reference implementation (in Python)

I get the impression that there are some outstanding tweaks required to the Windows build, but I don't have VS.NET to check and/or update the patch. Does this just need a core developer to pick it up?

I guess I'll go off and do some patch/bug reviews...

Paul.

From gvanrossum at gmail.com Thu Jan 27 16:48:07 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu Jan 27 16:48:14 2005
Subject: [Python-Dev] Patch review: [ 1009811 ] Add missing types to __builtin__
In-Reply-To: <3F36B0D6-7031-11D9-B9BF-000A95A50FB2@fuhm.net>
References: <001f01c5042c$73efe1a0$b038fea9@oemcomputer> <41F88840.7070105@v.loewis.de> <3F36B0D6-7031-11D9-B9BF-000A95A50FB2@fuhm.net>
Message-ID:

On Thu, 27 Jan 2005 02:01:20 -0500, James Y Knight wrote:
> Basically, I'd like to see them be given a binding somewhere, and have
> their claimed module agree with that, but am not particular as to
> where.
> Option #2 seemed to be rejected last time, and option #1 was
> given approval, so that's what I wrote a patch for. It sounds like it's
> getting pretty strong "no" votes this time around, however. Therefore,
> I would like to suggest option #3, with <modulename> being, say,
> 'internals'.

+1

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From python at rcn.com Thu Jan 27 18:11:06 2005
From: python at rcn.com (Raymond Hettinger)
Date: Thu Jan 27 18:14:50 2005
Subject: [Python-Dev] Patch review: [ 1009811 ] Add missing types to __builtin__
In-Reply-To:
Message-ID: <004301c50493$32e80fe0$433fc797@oemcomputer>

[James Y Knight]
> > Basically, I'd like to see them be given a binding somewhere, and have
> > their claimed module agree with that, but am not particular as to
> > where. Option #2 seemed to be rejected last time, and option #1 was
> > given approval, so that's what I wrote a patch for. It sounds like it's
> > getting pretty strong "no" votes this time around, however. Therefore,
> > I would like to suggest option #3, with <modulename> being, say,
> > 'internals'.

[GvR]
> +1

That gives them a place to live and doesn't clutter __builtin__. However, it should be named __internals__.

The next question is how to document it. My preference is to be clear that it is implementation specific (Jython won't have cell, PyCFunction, and dictproxy types); that it is subject to change between versions (so as not to prematurely immortalize design/implementation accidents); and that they have only esoteric application (99.9% of programs won't need them and should avoid them like the plague). Calling it __internals__ will help emphasize that we are exposing parts of the implementation that were consciously left semi-private or undocumented.
Raymond

From steven.bethard at gmail.com Thu Jan 27 18:52:33 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Thu Jan 27 18:52:37 2005
Subject: [Python-Dev] Re: Patch review: [ 1094542 ] add Bunch type to collections module
In-Reply-To:
References:
Message-ID:

Fernando Perez wrote:
> > Alan Green wrote:
> >> Steven Bethard is proposing a new collection class named Bunch. I had
> >> a few suggestions which I attached as comments to the patch - but what
> >> is really required is a bit more work on the draft PEP, and then
> >> discussion on the python-dev mailing list.
>
> But one thing which I really find necessary from a useful 'Bunch' class, is
> the ability to access attributes via foo[name] (which requires implementing
> __getitem__). Named access is convenient when you _know_ the name you need
> (foo.attr). However, if the name of the attribute is held in a variable, IMHO
> foo[name] beats getattr(foo,name) in clarity and feels much more 'pythonic'.

My feeling about this is that if the name of the attribute is held in a variable, you should be using a dict, not a Bunch/Struct. If you have a Bunch/Struct and decide you want a dict instead, you can just use vars:

py> b = Bunch(a=1, b=2, c=3)
py> vars(b)
{'a': 1, 'c': 3, 'b': 2}

> Another useful feature of this Struct class is the 'merge' method.
[snip]
> my values() method allows an optional keys argument, which I also
> find very useful.

Both of these features sound useful, but I don't see that they're particularly more useful in the case of a Bunch/Struct than they are for dict. If dict gets such methods, then I'd be happy to add them to Bunch/Struct, but for consistency's sake, I think at the moment I'd prefer that people who want this functionality subclass Bunch/Struct and add the methods themselves.

> I think the current pre-PEP version is a bit too limited to be generally
> useful in complex, real-world situations.
> It would be a good starting point
> to subclass for more demanding situations, but IMHO it would be worth
> considering a more powerful default class.

I'm probably not willing to budge much on adding dict-style methods -- if you want a dict, use a dict. But if people think they're necessary, there are a few methods from Struct that I wouldn't be too upset if I had to add, e.g. clear, copy, etc. But I'm going to need more feedback before I make any changes like this.

Steve
--
You can wordify anything if you just verb it.
--- Bucky Katt, Get Fuzzy

From martin at v.loewis.de Thu Jan 27 23:41:30 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu Jan 27 23:41:24 2005
Subject: [Python-Dev] Patch review: [ 1009811 ] Add missing types to __builtin__
In-Reply-To: <3F36B0D6-7031-11D9-B9BF-000A95A50FB2@fuhm.net>
References: <001f01c5042c$73efe1a0$b038fea9@oemcomputer> <41F88840.7070105@v.loewis.de> <3F36B0D6-7031-11D9-B9BF-000A95A50FB2@fuhm.net>
Message-ID: <41F96E1A.3020701@v.loewis.de>

James Y Knight wrote:
>> Sooo should (for 'generator' in objects that claim to be in
>> __builtins__ but aren't),
>> 1) 'generator' be added to __builtins__
>> 2) 'generator' be added to types.py and its __module__ be set to 'types'
>> 3) 'generator' be added to <modulename>.py and its __module__ be set to
>> '<modulename>' (and a name for the module chosen)

There are more alternatives:

4) the __module__ of these types could be absent (i.e. accessing __module__ could give an AttributeError)
5) the __module__ could be present and have a value of None
6) anything could be left as is. The __module__ value of these types might be somewhat confusing, but not enough so to justify changing it to any of the alternatives, which might also be confusing (each in their own way).

> Basically, I'd like to see them be given a binding somewhere, and have
> their claimed module agree with that, but am not particular as to where.
I think I cannot agree with this as a goal regardless of the consequences.

> Option #2 seemed to be rejected last time, and option #1 was given
> approval, so that's what I wrote a patch for. It sounds like it's
> getting pretty strong "no" votes this time around, however. Therefore, I
> would like to suggest option #3, with <modulename> being, say, 'internals'.

-1. 'internals' is not any better than 'sys', 'new', or 'types'. It is worse, as new modules are confusing to users - one more thing they have to learn.

Regards,
Martin

From python at rcn.com Thu Jan 27 23:52:17 2005
From: python at rcn.com (Raymond Hettinger)
Date: Thu Jan 27 23:56:00 2005
Subject: [Python-Dev] Patch review: [ 1009811 ] Add missing types to __builtin__
In-Reply-To: <41F96E1A.3020701@v.loewis.de>
Message-ID: <006001c504c2$da9cc1c0$433fc797@oemcomputer>

> > Basically, I'd like to see them be given a binding somewhere, and have
> > their claimed module agree with that, but am not particular as to where.
>
> I think I cannot agree with this as a goal regardless of the consequences.

Other than a vague feeling of completeness is there any reason this needs to be done? Is there anything useful that currently cannot be expressed without this new module?

Raymond

From martin at v.loewis.de Fri Jan 28 00:24:51 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri Jan 28 00:24:46 2005
Subject: [Python-Dev] Patch review: [ 1009811 ] Add missing types to __builtin__
In-Reply-To: <006001c504c2$da9cc1c0$433fc797@oemcomputer>
References: <006001c504c2$da9cc1c0$433fc797@oemcomputer>
Message-ID: <41F97843.8070902@v.loewis.de>

Raymond Hettinger wrote:
> Other than a vague feeling of completeness is there any reason this
> needs to be done? Is there anything useful that currently cannot be
> expressed without this new module?

That I wonder myself, too.
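[For concreteness, the mismatch James originally reported is easy to demonstrate. The sketch below uses modern Python 3 spellings, where __builtin__ became builtins; at the time of this thread both the module and the __module__ value were spelled '__builtin__':]

```python
import builtins

# The generator type claims to live in the builtins module...
generator = type(i for i in ())
print(generator.__name__, generator.__module__)   # generator builtins

# ...but the builtins module has no such name bound in it.
print(hasattr(builtins, 'generator'))             # False

# So code that needs the type must conjure it from an instance, as above.
def make_gen():
    yield 1

print(isinstance(make_gen(), generator))          # True
```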
Regards,
Martin

From fperez.net at gmail.com Fri Jan 28 02:16:24 2005
From: fperez.net at gmail.com (Fernando Perez)
Date: Fri Jan 28 02:16:32 2005
Subject: [Python-Dev] Re: Re: Patch review: [ 1094542 ] add Bunch type to collections module
References:
Message-ID:

Steven Bethard wrote:
> Fernando Perez wrote:
> My feeling about this is that if the name of the attribute is held in
> a variable, you should be using a dict, not a Bunch/Struct. If you
> have a Bunch/Struct and decide you want a dict instead, you can just
> use vars:
>
> py> b = Bunch(a=1, b=2, c=3)
> py> vars(b)
> {'a': 1, 'c': 3, 'b': 2}

Well, the problem I see here is that often, you need to mix both kinds of usage. It's reasonable to have code for which Bunch is exactly what you need in most places, but where you have a number of accesses via variables whose value is resolved at runtime. Granted, you can use getattr(bunch,varname), or make an on-the-fly dict as you indicated above. But since Bunch is above all a convenience class for common idioms, I think supporting a common need is a reasonable idea. Again, just my opinion.

>> Another useful feature of this Struct class is the 'merge' method.
> [snip]
>> my values() method allows an optional keys argument, which I also
>> find very useful.
>
> Both of these features sound useful, but I don't see that they're
> particularly more useful in the case of a Bunch/Struct than they are
> for dict. If dict gets such methods, then I'd be happy to add them to
> Bunch/Struct, but for consistency's sake, I think at the moment I'd
> prefer that people who want this functionality subclass Bunch/Struct
> and add the methods themselves.

It's very true that these are almost a request for a dict extension. Frankly, I'm too swamped to follow up with a pep/patch for it, though. Pity, because they can be really useful... Takers?

> I'm probably not willing to budge much on adding dict-style methods --
> if you want a dict, use a dict.
> But if people think they're
> necessary, there are a few methods from Struct that I wouldn't be too
> upset if I had to add, e.g. clear, copy, etc. But I'm going to need
> more feedback before I make any changes like this.

You already have update(), which by the way precludes a bunch storing an 'update' attribute. My class suffers from the same problem, just with many more names. I've thought about this, and my favorite solution so far would be to provide whichever dict-like methods end up implemented (update, merge (?), etc) with a leading single underscore. I simply don't see any other way to cleanly distinguish between a bunch which holds an 'update' attribute and the update method.

I guess making them classmethods (or is it staticmethods? I don't use those so I may be confusing terminology) might be a clean way out:

Bunch.update(mybunch, othermapping) -> modifies mybunch.

Less traditional OO syntax for bunches, but this would sidestep the potential name conflicts. Anyway, these are just some thoughts. Feel free to take what you like.

Regards,
f

From steven.bethard at gmail.com Fri Jan 28 02:25:57 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Fri Jan 28 02:26:01 2005
Subject: [Python-Dev] Re: Re: Patch review: [ 1094542 ] add Bunch type to collections module
In-Reply-To:
References:
Message-ID:

Fernando Perez wrote:
> Steven Bethard wrote:
> > I'm probably not willing to budge much on adding dict-style methods --
> > if you want a dict, use a dict. But if people think they're
> > necessary, there are a few methods from Struct that I wouldn't be too
> > upset if I had to add, e.g. clear, copy, etc. But I'm going to need
> > more feedback before I make any changes like this.
>
> You already have update(), which by the way precludes a bunch storing an
> 'update' attribute.
Well, actually, you can have an update attribute, but then you have to call update from the class instead of the instance:

py> from bunch import Bunch
py> b = Bunch(update=3)
py> b.update
3
py> b.update(Bunch(hi=4))
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: 'int' object is not callable
py> Bunch.update(b, Bunch(hi=4))
py> b.hi
4

> Bunch.update(mybunch, othermapping) -> modifies mybunch.

Yup, that works currently. As is normal for new-style classes (AFAIK), the methods are stored in the class object, so assigning an 'update' attribute to an instance just hides the method in the class. You can still reach the method by invoking it from the class and passing it an instance as the first argument.

Steve
--
You can wordify anything if you just verb it.
--- Bucky Katt, Get Fuzzy

From fperez.net at gmail.com Fri Jan 28 02:31:55 2005
From: fperez.net at gmail.com (Fernando Perez)
Date: Fri Jan 28 02:32:10 2005
Subject: [Python-Dev] Re: Re: Re: Patch review: [ 1094542 ] add Bunch type to collections module
References:
Message-ID:

Steven Bethard wrote:
> Fernando Perez wrote:
>> Steven Bethard wrote:
>> > I'm probably not willing to budge much on adding dict-style methods --
>> > if you want a dict, use a dict. But if people think they're
>> > necessary, there are a few methods from Struct that I wouldn't be too
>> > upset if I had to add, e.g. clear, copy, etc. But I'm going to need
>> > more feedback before I make any changes like this.
>>
>> You already have update(), which by the way precludes a bunch storing an
>> 'update' attribute.
>
> Well, actually, you can have an update attribute, but then you have to
> call update from the class instead of the instance:
[...]

Of course, you are right. However, I think it would perhaps be best to advertise any methods of Bunch as strictly classmethods from day 1. Otherwise, you can have:

b = Bunch()
b.update(otherdict) -> otherdict happens to have an 'update' key

...
more code

b.update(someotherdict) -> boom! update is not callable

If all Bunch methods are officially presented always as classmethods, users can simply expect that all attributes of a bunch are meant to store data, without any instance methods at all.

Regards,
f

From steven.bethard at gmail.com Fri Jan 28 02:54:04 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Fri Jan 28 02:54:07 2005
Subject: [Python-Dev] Re: Re: Re: Patch review: [ 1094542 ] add Bunch type to collections module
In-Reply-To:
References:
Message-ID:

On Thu, 27 Jan 2005 18:31:55 -0700, Fernando Perez wrote:
> However, I think it would perhaps be best to advertise any methods of Bunch as
> strictly classmethods from day 1. Otherwise, you can have:
>
> b = Bunch()
> b.update(otherdict) -> otherdict happens to have an 'update' key
>
> ... more code
>
> b.update(someotherdict) -> boom! update is not callable
>
> If all Bunch methods are officially presented always as classmethods, users can
> simply expect that all attributes of a bunch are meant to store data, without
> any instance methods at all.

That sounds reasonable to me. I'll fix update to be a staticmethod. If people want other methods, I'll make sure they're staticmethods too.[1]

Steve

[1] In all the cases I can think of, staticmethod is sufficient -- the methods don't need to access any attributes of the Bunch class. If anyone has a good reason to make them classmethods instead, let me know...

--
You can wordify anything if you just verb it.
--- Bucky Katt, Get Fuzzy

From gvanrossum at gmail.com Fri Jan 28 03:48:00 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Fri Jan 28 03:48:04 2005
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/idlelib EditorWindow.py, 1.65, 1.66 NEWS.txt, 1.53, 1.54 config-keys.def, 1.21, 1.22 configHandler.py, 1.37, 1.38
In-Reply-To:
References:
Message-ID:

Thanks!!!
On Thu, 27 Jan 2005 16:16:19 -0800, kbk@users.sourceforge.net wrote: > Update of /cvsroot/python/python/dist/src/Lib/idlelib > In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv5316 > > Modified Files: > EditorWindow.py NEWS.txt config-keys.def configHandler.py > Log Message: > Add keybindings for del-word-left and del-word-right. > > M EditorWindow.py > M NEWS.txt > M config-keys.def > M configHandler.py > > Index: EditorWindow.py > =================================================================== > RCS file: /cvsroot/python/python/dist/src/Lib/idlelib/EditorWindow.py,v > retrieving revision 1.65 > retrieving revision 1.66 > diff -u -d -r1.65 -r1.66 > --- EditorWindow.py 19 Jan 2005 00:22:54 -0000 1.65 > +++ EditorWindow.py 28 Jan 2005 00:16:15 -0000 1.66 > @@ -141,6 +141,8 @@ > text.bind("<>",self.change_indentwidth_event) > text.bind("", self.move_at_edge_if_selection(0)) > text.bind("", self.move_at_edge_if_selection(1)) > + text.bind("<>", self.del_word_left) > + text.bind("<>", self.del_word_right) > > if flist: > flist.inversedict[self] = key > @@ -386,6 +388,14 @@ > pass > return move_at_edge > > + def del_word_left(self, event): > + self.text.event_generate('') > + return "break" > + > + def del_word_right(self, event): > + self.text.event_generate('') > + return "break" > + > def find_event(self, event): > SearchDialog.find(self.text) > return "break" > > Index: NEWS.txt > =================================================================== > RCS file: /cvsroot/python/python/dist/src/Lib/idlelib/NEWS.txt,v > retrieving revision 1.53 > retrieving revision 1.54 > diff -u -d -r1.53 -r1.54 > --- NEWS.txt 19 Jan 2005 00:22:57 -0000 1.53 > +++ NEWS.txt 28 Jan 2005 00:16:16 -0000 1.54 > @@ -3,17 +3,24 @@ > > *Release date: XX-XXX-2005* > > +- Add keybindings for del-word-left and del-word-right. > + > - Discourage using an indent width other than 8 when using tabs to indent > Python code. 
> > - Restore use of EditorWindow.set_indentation_params(), was dead code since > - Autoindent was merged into EditorWindow. > + Autoindent was merged into EditorWindow. This allows IDLE to conform to the > + indentation width of a loaded file. (But it still will not switch to tabs > + even if the file uses tabs.) Any change in indent width is local to that > + window. > > - Add Tabnanny check before Run/F5, not just when Checking module. > > - If an extension can't be loaded, print warning and skip it instead of > erroring out. > > +- Improve error handling when .idlerc can't be created (warn and exit). > + > - The GUI was hanging if the shell window was closed while a raw_input() > was pending. Restored the quit() of the readline() mainloop(). > http://mail.python.org/pipermail/idle-dev/2004-December/002307.html > > Index: config-keys.def > =================================================================== > RCS file: /cvsroot/python/python/dist/src/Lib/idlelib/config-keys.def,v > retrieving revision 1.21 > retrieving revision 1.22 > diff -u -d -r1.21 -r1.22 > --- config-keys.def 17 Aug 2004 08:01:19 -0000 1.21 > +++ config-keys.def 28 Jan 2005 00:16:16 -0000 1.22 > @@ -55,6 +55,8 @@ > untabify-region= > toggle-tabs= > change-indentwidth= > +del-word-left= > +del-word-right= > > [IDLE Classic Unix] > copy= > @@ -104,6 +106,8 @@ > untabify-region= > toggle-tabs= > change-indentwidth= > +del-word-left= > +del-word-right= > > [IDLE Classic Mac] > copy= > @@ -153,3 +157,5 @@ > untabify-region= > toggle-tabs= > change-indentwidth= > +del-word-left= > +del-word-right= > > Index: configHandler.py > =================================================================== > RCS file: /cvsroot/python/python/dist/src/Lib/idlelib/configHandler.py,v > retrieving revision 1.37 > retrieving revision 1.38 > diff -u -d -r1.37 -r1.38 > --- configHandler.py 13 Jan 2005 17:37:38 -0000 1.37 > +++ configHandler.py 28 Jan 2005 00:16:16 -0000 1.38 > @@ -579,7 +579,9 @@ > '<>': [''], > '<>': 
[''],
> '<>': [''],
> - '<>': ['']
> + '<>': [''],
> + '<>': [''],
> + '<>': ['']
> }
> if keySetName:
> for event in keyBindings.keys():
>
> _______________________________________________
> Python-checkins mailing list
> Python-checkins@python.org
> http://mail.python.org/mailman/listinfo/python-checkins
>

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jeff at taupro.com Fri Jan 28 06:50:53 2005
From: jeff at taupro.com (Jeff Rush)
Date: Fri Jan 28 06:51:02 2005
Subject: [Python-Dev] Patch review: [ 1009811 ] Add missing types to __builtin__
In-Reply-To: <41F97843.8070902@v.loewis.de>
References: <006001c504c2$da9cc1c0$433fc797@oemcomputer> <41F97843.8070902@v.loewis.de>
Message-ID: <1106891453.13830.67.camel@vault.timecastle.net>

On Thu, 2005-01-27 at 17:24, "Martin v. Löwis" wrote:
> Raymond Hettinger wrote:
> > Other than a vague feeling of completeness is there any reason this
> > needs to be done? Is there anything useful that currently cannot be
> > expressed without this new module?
>
> That I wonder myself, too.

One reason is correct documentation. If the code is rejected, there should be a patch proposed to remove the erroneous documentation references that indicate things are in __builtins__ when they are in fact not. If they are put into __builtins__, the documentation won't need updating. ;-)

-Jeff Rush

From fperez.net at gmail.com Fri Jan 28 10:06:17 2005
From: fperez.net at gmail.com (Fernando Perez)
Date: Fri Jan 28 10:06:56 2005
Subject: [Python-Dev] Re: Re: Re: Re: Patch review: [ 1094542 ] add Bunch type to collections module
References:
Message-ID:

Steven Bethard wrote:
> That sounds reasonable to me. I'll fix update to be a staticmethod.
> If people want other methods, I'll make sure they're staticmethods
> too.[1]
>
> Steve
>
> [1] In all the cases I can think of, staticmethod is sufficient -- the
> methods don't need to access any attributes of the Bunch class.
If > anyone has a good reason to make them classmethods instead, let me > know... Great. I probably meant staticmethod. I don't use either much, so I don't really know the difference in the terminology. For a long time I stuck to 2.1 features for ipython and my other codes, and I seem to recall those appeared in 2.2. But you got what I meant :) Cheers, f From ejones at uwaterloo.ca Fri Jan 28 14:17:07 2005 From: ejones at uwaterloo.ca (Evan Jones) Date: Fri Jan 28 14:16:49 2005 Subject: [Python-Dev] Python Interpreter Thread Safety? Message-ID: Due to the issue of thread safety in the Python memory allocator, I have been wondering about thread safety in the rest of the Python interpreter. I understand that the interpreter is not thread safe, but I'm not sure that I have seen a discussion of all the areas where this is an issue. Here are the areas I know of: 1. The memory allocator. 2. Reference counts. 3. The cyclic garbage collector. 4. Current interpreter state is pointed to by a single shared pointer. 5. Many modules may not be thread safe (?). Ignoring the issue of #5 for the moment, are there any other areas where this is a problem? I'm curious about how much work it would be to allow concurrent execution of Python code. Evan Jones Note: One of the reasons I am asking is that my memory allocator patch changes the current allocator from "sort of" thread safe to obviously unsafe. One way to eliminate this issue is to make the allocator completely thread safe, but that would require some fairly significant changes to avoid a major performance penalty. However, if it was one of the components that permitted the interpreter to go multi-threaded, then it would be worth it.
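Item 2 on Evan's list is there because a reference-count bump is a plain read-modify-write, not an atomic operation; without the GIL serializing it, two threads can interleave and lose an update. A deterministic pure-Python simulation of that classic interleaving (the "threads" here are just sequential steps laid out in the bad order):

```python
# Simulate two threads incrementing a refcount with no lock:
# each "thread" loads the value, then stores value + 1. If both
# loads happen before either store, one increment is lost.
refcnt = 1
t1 = refcnt       # thread 1 reads 1
t2 = refcnt       # thread 2 reads 1, before thread 1 stores
refcnt = t1 + 1   # thread 1 writes 2
refcnt = t2 + 1   # thread 2 also writes 2
assert refcnt == 2   # two increments ran, but the count only grew by 1
```

This is the failure mode that either a lock like the GIL or atomic increment/decrement instructions must prevent.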
From steve at holdenweb.com Fri Jan 28 19:24:03 2005 From: steve at holdenweb.com (Steve Holden) Date: Fri Jan 28 19:29:27 2005 Subject: [Python-Dev] Re: [PyCon] Reg: Registration In-Reply-To: <20050128182019.GB19054@panix.com> References: <3e475f5b0501280929717ce3b3@mail.gmail.com> <20050128182019.GB19054@panix.com> Message-ID: <41FA8343.8020007@holdenweb.com> Aahz, writing as pycon@python.org, wrote: > It's still January 28 here -- register now! I don't know if we'll be > able to extend the registration price beyond that. Just in case anybody else might be wondering when the early bird registration deadline is, I've asked the registration team to allow the early bird price as long as it's January 28th somewhere in the world. There have been rumors that Guido will not be attending PyCon this year. I am happy to scotch them by pointing out that Guido van Rossum's keynote address will be on its traditional Thursday morning. I look forward to joining you all to hear Guido speak on "The State of Python". regards Steve -- Steve Holden http://www.holdenweb.com/ Python Web Programming http://pydish.holdenweb.com/ Holden Web LLC +1 703 861 4237 +1 800 494 3119 From greg.ewing at canterbury.ac.nz Fri Jan 28 23:27:13 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri Jan 28 23:35:27 2005 Subject: [Python-Dev] Getting rid of unbound methods: patch available References: Message-ID: <41FABC41.3060406@canterbury.ac.nz> Guido van Rossum wrote: > Here's a patch that gets rid of unbound methods, as > discussed here before. A function's __get__ method > now returns the function unchanged when called without > an instance, instead of returning an unbound method object. I thought the main reason for existence of unbound methods (for user-defined classes, at least) was so that if you screw up a super call by forgetting to pass self, or passing the wrong type of object, you get a more helpful error message. 
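The change Guido describes, a function's __get__ returning the function unchanged when called without an instance, is exactly what Python 3 later shipped, so the post-patch semantics (including the loss of the type check on self that produced the helpful error message) can be sketched under modern Python:

```python
class C:
    def meth(self):
        return "hi"

f = C.__dict__['meth']
# Accessing the method through the class yields the plain function,
# not an unbound-method wrapper:
assert C.meth is f
assert f.__get__(None, C) is f
# ...and any object is now accepted as "self"; no TypeError is raised:
assert C.meth(42) == "hi"
```

Under 2.4's unbound methods, the last line would instead raise a TypeError complaining that meth() must be called with a C instance as its first argument.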
I remember a discussion about this some years ago, in which you seemed to think the ability to produce this message was important enough to justify the existence of unbound methods, even though it meant you couldn't easily have static methods (this was long before staticmethod() was created). Have you changed your mind about that? Also, surely unbound methods will still have to exist for C methods? Otherwise there will be nothing to ensure that C code is getting the object type it expects for self. -- Greg From gvanrossum at gmail.com Fri Jan 28 23:45:00 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Jan 28 23:45:19 2005 Subject: [Python-Dev] Getting rid of unbound methods: patch available In-Reply-To: <41FABC41.3060406@canterbury.ac.nz> References: <41FABC41.3060406@canterbury.ac.nz> Message-ID: [Guido] > > Here's a patch that gets rid of unbound methods, as > > discussed here before. A function's __get__ method > > now returns the function unchanged when called without > > an instance, instead of returning an unbound method object. [Greg] > I thought the main reason for existence of unbound > methods (for user-defined classes, at least) was so that > if you screw up a super call by forgetting to pass self, > or passing the wrong type of object, you get a more > helpful error message. Yes, Tim reminded me of this too. But he said he could live without it. :-) > I remember a discussion about this some years ago, in > which you seemed to think the ability to produce this > message was important enough to justify the existence > of unbound methods, even though it meant you couldn't > easily have static methods (this was long before > staticmethod() was created). > > Have you changed your mind about that? After all those years, I think the added complexity of unbound methods doesn't warrant having the convenience of the error message. > Also, surely unbound methods will still have to exist > for C methods? 
Otherwise there will be nothing to ensure > that C code is getting the object type it expects for > self. No, C methods have their own object type for that (which is logically equivalent to an unbound method). But there was a use case for unbound methods having to do with C methods of classic classes, in the implementation of built-in exceptions. Anyway, it's all moot because I withdrew the patch, due to the large amount of code that would break due to the missing im_class attribute -- all fixable, but enough not to want to break it all when 2.5 comes out. So I'm salting the idea up for 3.0. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Fri Jan 28 23:46:12 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri Jan 28 23:54:25 2005 Subject: [Python-Dev] Let's get rid of unbound methods References: <1f7befae05010410576effd024@mail.gmail.com> <20050104154707.927B.JCARLSON@uci.edu> Message-ID: <41FAC0B4.8080301@canterbury.ac.nz> Josiah Carlson wrote: > While it seems that super() is the 'modern paradigm' for this, > I have been using base.method(self, ...) for years now, and have been > quite happy with it. I too would be very disappointed if base.method(self, ...) became somehow deprecated. Cooperative super calls are a different beast altogether and have different use cases. In fact I'm having difficulty finding *any* use cases at all for super() in my code. I thought I had found one once, but on further reflection I changed my mind. And I have found that the type checking of self provided by unbound methods has caught a few bugs that would probably have produced more mysterious symptoms otherwise. But I can't say for sure whether they would have been greatly more mysterious -- perhaps not. 
-- Greg From greg.ewing at canterbury.ac.nz Fri Jan 28 23:51:00 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri Jan 28 23:59:12 2005 Subject: [Python-Dev] Let's get rid of unbound methods References: <1f7befae05010410576effd024@mail.gmail.com> <20050104154707.927B.JCARLSON@uci.edu> <1f7befae050104180711743ebd@mail.gmail.com> Message-ID: <41FAC1D4.6070106@canterbury.ac.nz> Tim Peters wrote: > I expect that's because he stopped working on Zope code, so actually > thinks it's odd again to see a gazillion methods like: > > class Registerer(my_base): > def register(*args, **kws): > my_base.register(*args, **kws) I second that! My PyGUI code is *full* of __init__ methods like that, because of my convention for supplying initial values of properties as keyword arguments. -- Greg From martin at v.loewis.de Sat Jan 29 00:12:23 2005 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat Jan 29 00:12:16 2005 Subject: [Python-Dev] Patch review: [ 1009811 ] Add missing types to__builtin__ In-Reply-To: <1106891453.13830.67.camel@vault.timecastle.net> References: <006001c504c2$da9cc1c0$433fc797@oemcomputer> <41F97843.8070902@v.loewis.de> <1106891453.13830.67.camel@vault.timecastle.net> Message-ID: <41FAC6D7.6070704@v.loewis.de> Jeff Rush wrote: > If they are put into __builtins__, the documentation won't need > updating. ;-) In that case, I'd rather prefer to correct the documentation. Regards, Martin From martin at v.loewis.de Sat Jan 29 00:24:13 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat Jan 29 00:24:15 2005 Subject: [Python-Dev] Python Interpreter Thread Safety? In-Reply-To: References: Message-ID: <41FAC99D.3070502@v.loewis.de> Evan Jones wrote: > Due to the issue of thread safety in the Python memory allocator, I have > been wondering about thread safety in the rest of the Python > interpreter. 
I understand that the interpreter is not thread safe, but > I'm not sure that I have seen a discussion of all the areas where this > is an issue. Here are the areas I know of: This very much depends on the definition of "thread safe". If this means "working correctly in the presence of threads", then your statement is wrong - the interpreter *is* thread-safe. The global interpreter lock (GIL) guarantees that only one thread at any time can execute critical operations. Since all threads acquire the GIL before performing such operations, the interpreter is thread-safe. > 1. The memory allocator. > 2. Reference counts. > 3. The cyclic garbage collector. > 4. Current interpreter state is pointed to by a single shared pointer. This is all protected by the GIL. > 5. Many modules may not be thread safe (?). Modules often release the GIL through BEGIN_ALLOW_THREADS, if they know that would be safe if another thread would enter the Python interpreter. > Ignoring the issue of #5 for the moment, are there any other areas where > this is a problem? I'm curious about how much work it would be to allow > concurrent execution of Python code. Define "concurrent". Webster's offers 1. operating or occurring at the same time Clearly, on a single-processor system, no two activities can execute concurrently - the processor can do at most one activity at any point in time. Perhaps you are asking whether it would be possible to change the current coarse-grained lock into a finer-grained lock (as working without locks is not implementable). This is also known as "free threading". There have been attempts to implement free threading, and they have failed. > Note: One of the reasons I am asking is that my memory allocator patch > changes the current allocator from "sort of" thread safe to > obviously unsafe. The allocator is thread-safe in the presence of the GIL - you are supposed to hold the GIL before entering the allocator.
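Martin's point that the GIL serializes critical operations is visible from pure Python: a single operation such as list.append never interleaves with another thread's, so this lock-free stress test is deterministic under CPython (a small check, not a guarantee offered by the language spec):

```python
import threading

items = []

def worker():
    for _ in range(10000):
        items.append(1)   # one operation, atomic under the GIL

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert len(items) == 40000   # no lost appends, despite no explicit lock
```

Compound read-modify-write operations on shared state span several bytecodes and do still need a lock of their own.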
Due to some unfortunate historical reasons, there is code which enters free() without holding the GIL - and that is what the allocator specifically deals with. Except for this single case, all callers of the allocator are required to hold the GIL. > However, if it > was one of the components that permitted the interpreter to go > multi-threaded, then it would be worth it. Again, the interpreter supports multi-threading today. Removing the GIL is more difficult, though - nearly any container object (list, dictionary, etc) would have to change, plus the reference counting (which would have to grow atomic increment/decrement). Regards, Martin From ejones at uwaterloo.ca Sat Jan 29 01:22:52 2005 From: ejones at uwaterloo.ca (Evan Jones) Date: Sat Jan 29 01:22:41 2005 Subject: [Python-Dev] Python Interpreter Thread Safety? In-Reply-To: <41FAC99D.3070502@v.loewis.de> References: <41FAC99D.3070502@v.loewis.de> Message-ID: On Jan 28, 2005, at 18:24, Martin v. Löwis wrote: >> 5. Many modules may not be thread safe (?). > Modules often release the GIL through BEGIN_ALLOW_THREADS, if they know > that would be safe if another thread would enter the Python > interpreter. Right, I guess that the modules already have to deal with being reentrant and thread-safe, since Python threads could already cause issues. >> Ignoring the issue of #5 for the moment, are there any other areas >> where this is a problem? I'm curious about how much work it would be >> to allow concurrent execution of Python code. > Define "concurrent". Webster's offers Sorry, I really meant *parallel* execution of Python code: Multiple threads simultaneously executing a Python program, potentially on different CPUs. > There have been attempts to implement free threading, and > they have failed. What I was trying to ask with my last email was what are the trouble areas? There are probably many that I am unaware of, due to my unfamiliarity with the Python internals.
> Due to some > unfortunate historical reasons, there is code which enters free() > without holding the GIL - and that is what the allocator specifically > deals with. Right, but as said in a previous post, I'm not convinced that the current implementation is completely correct anyway. > Again, the interpreter supports multi-threading today. Removing > the GIL is more difficult, though - nearly any container object > (list, dictionary, etc) would have to change, plus the reference > counting (which would have to grow atomic increment/decrement). Wouldn't it be up to the programmer to ensure that accesses to shared objects, like containers, are serialized? For example, with Java's collections, there are both synchronized and unsynchronized versions. Evan Jones From martin at v.loewis.de Sat Jan 29 01:44:09 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat Jan 29 01:44:01 2005 Subject: [Python-Dev] Python Interpreter Thread Safety? In-Reply-To: References: <41FAC99D.3070502@v.loewis.de> Message-ID: <41FADC59.3050609@v.loewis.de> Evan Jones wrote: > Sorry, I really meant *parallel* execution of Python code: Multiple > threads simultaneously executing a Python program, potentially on > different CPUs. There cannot be parallel threads on a single CPU - for threads to be truly parallel, you *must* have two CPUs, at a minimum. Python threads can run truly parallel, as long as one of them invokes BEGIN_ALLOW_THREADS. > What I was trying to ask with my last email was what are the trouble > areas? There are probably many that I am unaware of, due to my > unfamiliarity with the Python internals. I think nobody really remembers - ask Google for "Python free threading". Greg Stein did the patch, and the main problem apparently was that the performance became unacceptable - apparently primarily because of dictionary locking. > Right, but as said in a previous post, I'm not convinced that the > current implementation is completely correct anyway.
Why do you think so? (I see in your previous post that you claim it is not completely correct, but I don't see any proof). > Wouldn't it be up to the programmer to ensure that accesses to shared > objects, like containers, are serialized? In a truly parallel Python, two arbitrary threads could access the same container, and it would still work. If some containers cannot be used simultaneously in multiple threads, this would be asking for disaster. > For example, with Java's > collections, there are both synchronized and unsynchronized versions. I don't think this approach can apply to Python. Python users are used to completely thread-safe containers, and lots of programs would break if the containers suddenly threw exceptions. Furthermore, the question is what kind of failure you'd expect if an unsynchronized dictionary is used from multiple threads. Apparently, Java guarantees something (e.g. that the interpreter won't crash) but even this guarantee would be difficult to make. For example, for lists, the C API allows direct access to the pointers in the list. If the elements of the list could change in-between, an object in the list might go away after you got the pointer, but before you had a chance to INCREF it. This would cause a crash shortly afterwards. Even if that was changed to always return a new reference, lots of code would break, as it would create large memory leaks (code would have needed to decref the list items, but currently doesn't - nor is it currently necessary). Regards, Martin From abo at minkirri.apana.org.au Sat Jan 29 01:44:21 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Sat Jan 29 01:44:34 2005 Subject: [Python-Dev] Python Interpreter Thread Safety? In-Reply-To: <41FAC99D.3070502@v.loewis.de> References: <41FAC99D.3070502@v.loewis.de> Message-ID: <1106959461.4557.21.camel@localhost> On Sat, 2005-01-29 at 00:24 +0100, "Martin v. Löwis" wrote: > Evan Jones wrote: [...]
> The allocator is thread-safe in the presence of the GIL - you are > supposed to hold the GIL before entering the allocator. Due to some > unfortunate historical reasons, there is code which enters free() > without holding the GIL - and that is what the allocator specifically > deals with. Except for this single case, all callers of the allocator > are required to hold the GIL. Just curious; is that "one case" a bug that needs fixing, or is there some reason this case can't be changed to use the GIL? Surely making it mandatory for all free() calls to hold the GIL is easier than making the allocator deal with the one case where this isn't done. I like the GIL :-) so much so I'd like to see it visible at the Python level. Then you could write your own atomic methods in Python. BTW, if what Evan is hoping for is concurrent threads running on different processors in a multiprocessor system, then don't :-) It's been a while since I looked at multiprocessor architectures, but I believe threading's shared memory paradigm will always be hard to distribute efficiently over multiple CPU's. If you want to run on multiple processors, use processes, not threads. -- Donovan Baarda http://minkirri.apana.org.au/~abo/ From mcherm at mcherm.com Sat Jan 29 02:14:59 2005 From: mcherm at mcherm.com (Michael Chermside) Date: Sat Jan 29 02:15:15 2005 Subject: [Python-Dev] Python Interpreter Thread Safety? Message-ID: <1106961299.41fae39341ec9@mcherm.com> Martin v. Löwis writes: > Due to some > unfortunate historical reasons, there is code which enters free() > without holding the GIL - and that is what the allocator specifically > deals with. Except for this single case, all callers of the allocator > are required to hold the GIL. Donovan Baarda writes: > Just curious; is that "one case" a bug that needs fixing, or is there some > reason this case can't be changed to use the GIL?
Surely making it > mandatory for all free() calls to hold the GIL is easier than making the > allocator deal with the one case where this isn't done. What Martin is trying to say here is that it _IS_ mandatory to hold the GIL when calling free(). However, there is some very old code in existence (written by other people) which calls free() without holding the GIL. We work very hard to provide backward compatibility, so we are jumping through hoops to ensure that even this old code which is violating the rules doesn't get broken. -- Michael Chermside From tim.peters at gmail.com Sat Jan 29 02:27:25 2005 From: tim.peters at gmail.com (Tim Peters) Date: Sat Jan 29 02:27:28 2005 Subject: [Python-Dev] Python Interpreter Thread Safety? In-Reply-To: References: <41FAC99D.3070502@v.loewis.de> Message-ID: <1f7befae05012817277bbc292b@mail.gmail.com> ... [Evan Jones] > What I was trying to ask with my last email was what are the trouble > areas? There are probably many that I am unaware of, due to my > unfamiliarity with the Python internals. Google on "Python free threading". That's not meant to be curt, it's just meant to recognize that the task is daunting and has been discussed often before. [Martin v. Löwis] >> Due to some unfortunate historical reasons, there is code which enters >> free() without holding the GIL - and that is what the allocator specifically >> deals with. > Right, but as said in a previous post, I'm not convinced that the > current implementation is completely correct anyway. Sorry, I haven't had time for this. From your earlier post: > For example, is it possible to call PyMem_Free from two threads > simultaneously? Possible but not legal; undefined behavior if you try. See the "Thread State and the Global Interpreter Lock" section of the Python C API manual. ...
only the thread that has acquired the global interpreter lock may operate on Python objects or call Python/C API functions There are only a handful of exceptions to the last part of that rule, concerned with interpreter and thread startup and shutdown, and they're explicitly listed in that section. The memory-management functions aren't among them. In addition, it's not legal to call PyMem_Free regardless unless the pointer passed to it was originally obtained from another function in the PyMem_* family (that specifically excludes memory obtained from a PyObject_* function). In a release build, all of the PyMem_* allocators resolve directly to the platform malloc or realloc, and all PyMem_Free has to determine is that they *were* so allocated and thus call the platform free() directly (which is presumably safe to call without holding the GIL). The hacks in PyObject_Free (== PyMem_Free) are there solely so that question can be answered correctly in the absence of holding the GIL. "That question" == "does pymalloc control the pointer passed to me, or does the system malloc?". In return, that hack is there solely because in much earlier versions of Python extension writers got into the horrible habit of allocating object memory with PyObject_New but releasing it with PyMem_Free, and because indeed Python didn't *have* a PyObject_Free function then. Other extension writers were just nuts, mixing PyMem_* calls with direct calls to system free/malloc/realloc, and ignoring GIL issues for all of those. When pymalloc was new, we went to insane lengths to avoid breaking that stuff, but enough is enough. > Since the problem is that threads could call PyMem_Free without > holding the GIL, it seems to be that it is possible. Yes, but not specific to PyMem_Free. It's clearly _possible_ to call _any_ function from multiple threads without holding the GIL. > Shouldn't it also be supported? No. If what they want is the system malloc/realloc/free, that's what they should call. 
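The question Tim describes, "does pymalloc control the pointer passed to me, or does the system malloc?", is at heart an arena-membership test. A toy sketch of the idea in Python (the names and addresses here are illustrative, not obmalloc's actual data structures; the 256 KiB arena size matches pymalloc of this era):

```python
ARENA_SIZE = 256 * 1024              # pymalloc arena size (256 KiB)
arena_bases = [0x100000, 0x900000]   # hypothetical arena base addresses

def address_in_range(addr):
    """Toy analogue of Py_ADDRESS_IN_RANGE: did pymalloc hand out addr?"""
    return any(base <= addr < base + ARENA_SIZE for base in arena_bases)

assert address_in_range(0x100010)    # inside an arena: pymalloc's block
assert not address_in_range(0x42)    # not ours: pass straight to free()
```

The real macro answers the question in constant time by reading a pool header instead of scanning, which is why the array of arena base addresses must stay readable even by callers that do not hold the GIL.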
> In the current memory allocator, I believe that situation can lead to > inconsistent state. Certainly, but only if pymalloc controls the memory blocks. If they were actually obtained from the system malloc, the only part of pymalloc that has to work correctly is the Py_ADDRESS_IN_RANGE() macro. When that returns false, the only other thing PyObject_Free() does is call the system free() immediately, then return. None of pymalloc's data structures are involved, apart from the hacks ensuring that the arena of base addresses is safe to access despite potentially concurrent mutation-by-appending. > ... > Basically, if a concurrent memory allocator is the requirement, It isn't. The attempt to _exploit_ the GIL by doing no internal locking of its own is 100% deliberate in pymalloc -- it's a significant speed win (albeit on some platforms more than others). > then I think some other approach is necessary. If it became necessary, that's what this section of obmalloc is for: SIMPLELOCK_DECL(_malloc_lock) #define LOCK() SIMPLELOCK_LOCK(_malloc_lock) #define UNLOCK() SIMPLELOCK_UNLOCK(_malloc_lock) #define LOCK_INIT() SIMPLELOCK_INIT(_malloc_lock) #define LOCK_FINI() SIMPLELOCK_FINI(_malloc_lock) You'll see that PyObject_Free() calls LOCK() and UNLOCK() at appropriate places already, but they have empty expansions now. Back to the present: [Martin] >> Again, the interpreter supports multi-threading today. Removing >> the GIL is more difficult, though - nearly any container object >> (list, dictionary, etc) would have to change, plus the reference >> counting (which would have to grow atomic increment/decrement). [Evan] > Wouldn't it be up to the programmer to ensure that accesses to shared > objects, like containers, are serialized? For example, with Java's > collections, there are both synchronized and unsynchronized versions. Enormous mounds of existing threaded Python code freely manipulates lists and dicts without explicit locking now.
We can't break that -- and wouldn't want to. Writing threaded code is especially easy (a relative stmt, not absolute) in Python because of it. From ejones at uwaterloo.ca Sat Jan 29 03:15:33 2005 From: ejones at uwaterloo.ca (Evan Jones) Date: Sat Jan 29 03:15:45 2005 Subject: [Python-Dev] Python Interpreter Thread Safety? In-Reply-To: <1f7befae05012817277bbc292b@mail.gmail.com> References: <41FAC99D.3070502@v.loewis.de> <1f7befae05012817277bbc292b@mail.gmail.com> Message-ID: On Jan 28, 2005, at 20:27, Tim Peters wrote: > The hacks in PyObject_Free (== PyMem_Free) > are there solely so that question can be answered correctly in the > absence of holding the GIL. "That question" == "does pymalloc > control the pointer passed to me, or does the system malloc?". Ah! *Now* I get it. And yes, it will be possible to still support this in my patched version of the allocator. It just means that I have to leak the "arenas" array just like it did before, and then do some hard thinking about memory models and consistency to decide if the "arenas" pointer needs to be volatile. > When pymalloc was new, we went to insane lengths to > avoid breaking that stuff, but enough is enough. So you don't think we need to bother supporting that any more? > Back to the present: >> Wouldn't it be up to the programmer to ensure that accesses to shared >> objects, like containers, are serialized? For example, with Java's >> collections, there are both synchronized and unsynchronized versions. > Enormous mounds of existing threaded Python code freely manipulates > lists and dicts without explicit locking now. We can't break that -- > and wouldn't want to. Writing threaded code is especially easy (a > relative stmt, not absolute) in Python because of it. Right, because currently Python switches threads on a granularity of opcodes, which gives you this serialization with the cost of never having parallel execution. 
Evan Jones From ejones at uwaterloo.ca Sat Jan 29 03:17:54 2005 From: ejones at uwaterloo.ca (Evan Jones) Date: Sat Jan 29 03:17:50 2005 Subject: [Python-Dev] Python Interpreter Thread Safety? In-Reply-To: <41FADC59.3050609@v.loewis.de> References: <41FAC99D.3070502@v.loewis.de> <41FADC59.3050609@v.loewis.de> Message-ID: On Jan 28, 2005, at 19:44, Martin v. Löwis wrote: > Python threads can run truly parallel, as long as one of them > invokes BEGIN_ALLOW_THREADS. Except that they are really executing C code, not Python code. > I think nobody really remembers - ask Google for "Python free > threading". Greg Stein did the patch, and the main problem apparently > was that the performance became unacceptable - apparently primarily > because of dictionary locking. Thanks, I found the threads discussing it. >> Right, but as said in a previous post, I'm not convinced that the >> current implementation is completely correct anyway. > Why do you think so? (I see in your previous post that you claim > it is not completely correct, but I don't see any proof). There are a number of issues actually, but as Tim points out, only if the blocks are managed by PyMalloc. I had written a description of three of them here, but they are not relevant. If the issue is calling PyMem_Free with a pointer that was allocated with malloc() while PyMalloc is doing other stuff, then no problem: That is possible to support, but I'll have to think rather hard about some of the issues. > For example, for lists, the C API allows direct access to the pointers > in the list. If the elements of the list could change in-between, an > object in the list might go away after you got the pointer, but before > you had a chance to INCREF it. This would cause a crash shortly > afterwards.
Even if that was changed to always return a new refence, > lots of code would break, as it would create large memory leaks > (code would have needed to decref the list items, but currently > doesn't - nor is it currently necessary). Ah! Right. In Java, the collections are all actually written in Java, and run on the VM. Thus, when some concurrent weirdness happens, it just corrupts the application, not the VM. However, in Python, this could actually corrupt the interpreter itself, crashing the entire thing with a very ungraceful Segmentation Fault or something similar. Evan Jones From kbk at shore.net Sat Jan 29 20:10:30 2005 From: kbk at shore.net (Kurt B. Kaiser) Date: Sat Jan 29 20:10:47 2005 Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200501291910.j0TJAUhi007769@bayview.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 280 open ( +7) / 2747 closed ( +1) / 3027 total ( +8) Bugs : 803 open ( +6) / 4799 closed (+10) / 5602 total (+16) RFE : 167 open ( +1) / 141 closed ( +0) / 308 total ( +1) New / Reopened Patches ______________________ tarfile.ExFileObject iterators (2005-01-23) http://python.org/sf/1107973 opened by Mitch Chapman Allow slicing of any iterator by default (2005-01-24) http://python.org/sf/1108272 opened by Nick Coghlan fix .split() separator doc, update .rsplit() docs (2005-01-24) CLOSED http://python.org/sf/1108303 opened by Wummel type conversion methods and subclasses (2005-01-25) http://python.org/sf/1109424 opened by Walter D?rwald distutils dry-run breaks when attempting to bytecompile (2005-01-26) http://python.org/sf/1109658 opened by Anthony Baxter patch for idlelib (2005-01-26) http://python.org/sf/1110205 opened by sowjanya patch for gzip.GzipFile.flush() (2005-01-26) http://python.org/sf/1110248 opened by David Schnepper HEAD/PUT/DELETE support for urllib2.py (2005-01-28) http://python.org/sf/1111653 opened by Terrel Shumway Patches Closed ______________ fix .split() maxsplit doc, update .rsplit() 
docs (2005-01-24) http://python.org/sf/1108303 closed by rhettinger New / Reopened Bugs ___________________ "\0" not listed as a valid escape in the lang reference (2005-01-24) CLOSED http://python.org/sf/1108060 opened by Andrew Bennetts broken link in tkinter docs (2005-01-24) http://python.org/sf/1108490 opened by Ilya Sandler Cookie.py produces invalid code (2005-01-25) http://python.org/sf/1108948 opened by Simon Dahlbacka idle freezes when run over ssh (2005-01-25) http://python.org/sf/1108992 opened by Mark Poolman Time module missing from latest module index (2005-01-25) http://python.org/sf/1109523 opened by Skip Montanaro Need some setup.py sanity (2005-01-25) http://python.org/sf/1109602 opened by Skip Montanaro distutils argument parsing is bogus (2005-01-26) http://python.org/sf/1109659 opened by Anthony Baxter bdist_wininst ignores build_lib from build command (2005-01-26) http://python.org/sf/1109963 opened by Anthony Tuininga Cannot ./configure on FC3 with gcc 3.4.2 (2005-01-26) CLOSED http://python.org/sf/1110007 opened by Paul Watson recursion core dumps (2005-01-26) http://python.org/sf/1110055 opened by Jacob Engelbrecht gzip.GzipFile.flush() does not flush all internal buffers (2005-01-26) http://python.org/sf/1110242 opened by David Schnepper os.environ.update doesn't work (2005-01-27) CLOSED http://python.org/sf/1110478 opened by June Kim list comprehension scope (2005-01-27) CLOSED http://python.org/sf/1110705 opened by Simon Dahlbacka RLock logging mispells "success" (2005-01-27) CLOSED http://python.org/sf/1110998 opened by Matthew Bogosian csv reader barfs encountering quote when quote_none is set (2005-01-27) http://python.org/sf/1111100 opened by washington irving tkSimpleDialog broken on MacOS X (Aqua Tk) (2005-01-27) http://python.org/sf/1111130 opened by Russell Owen Bugs Closed ___________ bug with idle's stdout when executing load_source (2005-01-20) http://python.org/sf/1105950 closed by kbk "\0" not listed as a valid escape in 
the lang reference (2005-01-23) http://python.org/sf/1108060 closed by tim_one Undocumented implicit strip() in split(None) string method (2005-01-19) http://python.org/sf/1105286 closed by rhettinger split() takes no keyword arguments (2005-01-21) http://python.org/sf/1106694 closed by rhettinger Cannot ./configure on FC3 with gcc 3.4.2 (2005-01-26) http://python.org/sf/1110007 closed by loewis os.environ.update doesn't work (2005-01-27) http://python.org/sf/1110478 closed by loewis Scripts started with CGIHTTPServer: missing cgi environment (2005-01-11) http://python.org/sf/1100235 closed by loewis list comprehension scope (2005-01-27) http://python.org/sf/1110705 closed by rhettinger RLock logging mispells "success" (2005-01-27) http://python.org/sf/1110998 closed by bcannon README of 2.4 source download says 2.4a3 (2005-01-20) http://python.org/sf/1106057 closed by loewis New / Reopened RFE __________________ 'attrmap' function, attrmap(x)['attname'] = x.attname (2005-01-26) http://python.org/sf/1110010 opened by Gregory Smith From p.f.moore at gmail.com Sat Jan 29 23:15:40 2005 From: p.f.moore at gmail.com (Paul Moore) Date: Sat Jan 29 23:15:44 2005 Subject: [Python-Dev] Re: PEP 309 (Was: Patch review: [ 1094542 ] add Bunch type to collections module) In-Reply-To: <79990c6b05012701492440d0c0@mail.gmail.com> References: <79990c6b05012701492440d0c0@mail.gmail.com> Message-ID: <79990c6b05012914156800e5bc@mail.gmail.com> On Thu, 27 Jan 2005 09:49:48 +0000, Paul Moore wrote: > On the subject of "things everyone ends up rewriting", what needs to > be done to restart discussion on PEP 309 (Partial Function > Application)? 
The PEP is marked "Accepted" and various patches exist: > > 941881 - C implementation > 1006948 - Windows build changes > 931010 - Unit Tests > 931007 - Documentation > 931005 - Reference implementation (in Python) > > I get the impression that there are some outstanding tweaks required > to the Windows build, but I don't have VS.NET to check and/or update > the patch. > > Does this just need a core developer to pick it up? I guess I'll go > off and do some patch/bug reviews... OK, I reviewed some bugs. Could I ask that someone review 941881 (Martin would be ideal, as he knows the Windows build process - 1006948 should probably be included as well). If I'm being cheeky by effectively asking that a suite of patches be reviewed in exchange for 5 bugs, then I'll review some more - I don't have time now, unfortunately. I justify myself by claiming that the suite of patches is in effect one big patch split into multiple tracker items... :-) Bugs reviewed: 1083306 - looks fine to me, I recommend applying. I've added a patch for CVS HEAD. 1058960 - already fixed in CVS HEAD (rev 1.45) - can be closed. Backport candidate? 1033422 - This is standard Windows behaviour, and should be retained. I recommend closing "Won't Fix". 1016563 - The patch looks fine, I've added a patch against CVS HEAD. The change was introduced in revision 1.32 from patch 527518. It looks accidental. I can't reproduce the problem, but I can see that it could be an issue. I recommend applying the patch. 977250 - Not a bug. I've given an explanation in the tracker item, and would recommend closing "Won't Fix". Also, while looking at patches I noticed 1077106. It doesn't apply to me - I don't use Linux - but it looks like this may have simply been forgotten. The last comment is in December from Michael Hudson, saying in effect "I'll commit this tomorrow". Michael? Paul.
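[Editorial note: for readers unfamiliar with PEP 309, the partial-function-application behavior the thread discusses can be sketched in a few lines of pure Python. This is an illustrative sketch in the spirit of the PEP's reference implementation, not the actual patch 931005; the example names are invented.]

```python
class partial:
    """Freeze some leading positional arguments and keywords of a callable."""

    def __init__(self, func, *args, **kwargs):
        self.func, self.args, self.kwargs = func, args, kwargs

    def __call__(self, *more_args, **more_kwargs):
        # Later keywords override the frozen ones; positional args append.
        merged = dict(self.kwargs)
        merged.update(more_kwargs)
        return self.func(*(self.args + more_args), **merged)

# Example: specialize pow() with its first argument frozen to 2.
two_to_the = partial(pow, 2)
print(two_to_the(10))  # pow(2, 10) == 1024
```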
From martin at v.loewis.de Sun Jan 30 00:54:12 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun Jan 30 00:54:01 2005 Subject: [Python-Dev] Re: PEP 309 (Was: Patch review: [ 1094542 ] add Bunch type to collections module) In-Reply-To: <79990c6b05012914156800e5bc@mail.gmail.com> References: <79990c6b05012701492440d0c0@mail.gmail.com> <79990c6b05012914156800e5bc@mail.gmail.com> Message-ID: <41FC2224.4000107@v.loewis.de> Paul Moore wrote: > OK, I reviewed some bugs. Could I ask that someone review 941881 > (Martin would be ideal, as he knows the Windows build process - > 1006948 should probably be included as well). Thanks for the reviews. I won't be available next week to look into the PEP, but I promise to do so some time in February. I've dealt with the easy reviews already: > 1058960 - already fixed in CVS HEAD (rev 1.45) - can be closed. > 1033422 - This is standard Windows behaviour, and should be retained. > I recommend closing "Won't Fix". > 977250 - Not a bug. I've given an explanation in the tracker item, and > would recommend closing "Won't Fix". I've closed all of them. Regards, Martin From martin at v.loewis.de Sun Jan 30 11:31:58 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun Jan 30 11:31:46 2005 Subject: [Python-Dev] Improving the Python Memory Allocator In-Reply-To: References: <41F581C8.6070109@v.loewis.de> Message-ID: <41FCB79E.4050605@v.loewis.de> Evan Jones wrote: > Sure. This should be done even for patches which should absolutely not > be committed? Difficult question. I think the answer goes like this: "Patches that should absolutely not be committed should not be published at all". There are different shades of gray, of course - but people typically dislike receiving patches through a mailing list. OTOH, I'm guilty of committing a patch myself which was explicitly marked as not-to-be-committed on SF, so I cannot really advise to use SF in this case. 
Putting it on your own web server would be best. Regards, Martin From nnorwitz at gmail.com Sun Jan 30 21:40:35 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Sun Jan 30 21:40:38 2005 Subject: [Python-Dev] Speed up function calls In-Reply-To: <004a01c503b5$fd167b00$18fccc97@oemcomputer> References: <004a01c503b5$fd167b00$18fccc97@oemcomputer> Message-ID: On Wed, 26 Jan 2005 09:47:41 -0500, Raymond Hettinger wrote: > > I agree that METH_O and METH_NOARGS are near > > optimal wrt performance. But if we could have one METH_UNPACKED > > instead of 3 METH_*, I think that would be a win. > . . . > > Sorry, I meant eliminated w/3.0. > > So, leave METH_O and METH_NOARGS alone. They can't be dropped until 3.0 > and they can't be improved speedwise. I was just trying to point out possible directions. I wasn't trying to suggest that the patch as a whole should be integrated now. > > Ultimately, I think we can speed things up more by having 9 different > > op codes, ie, one for each # of arguments. CALL_FUNCTION_0, > > CALL_FUNCTION_1, ... > > (9 is still arbitrary and subject to change) > > How is the compiler to know the arity of the target function? If I call > pow(3,5), how would the compiler know that pow() can take an optional > third argument which would need to be initialized to NULL? The compiler wouldn't know anything about pow(). It would only know that 2 arguments are passed. That would help get rid of the first switch statement. I need to think more about the NULL initialization. I may have mixed 2 separate issues. > > Then we would have N little functions, each with the exact # of > > parameters. Each would still need a switch to call the C function > > because there may be optional parameters. Ultimately, it's possible > > the code would be small enough to stick it into the eval_frame loop. > > Each of these steps would need to be tested, but that's a possible > > longer term direction. > . . .
> > There would only be an if to check if it was a C function or not. > > Maybe we could even get rid of this by more fixup at import time. > > This is what I mean about the patch taking on a life of its own. It's > an optimization patch that slows down METH_O and METH_NOARGS. It's an > incremental change that throws away backwards compatibility. It's a > simplification that introduces a bazillion new code paths. It's a > simplification that can't be realized until 3.0. It's a minor change > that entails new opcodes, compiler changes, and changes in all > extensions that have ever been written. I really didn't want to do this now (or necessarily in 2.5). I was just trying to provide insight into future direction. This brings up another discussion about working towards 3.0. But I'll make a new thread for that. At this point, it seems there aren't many disagreements about the general idea. There is primarily a question about what is acceptable now. I will rework the patch based on Raymond's feedback and continue to update the tracker. Unless anyone disagrees, I don't see a reason to continue the remainder of this discussion on py-dev. Neal From nnorwitz at gmail.com Sun Jan 30 21:55:08 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Sun Jan 30 21:55:10 2005 Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up function calls) In-Reply-To: <004a01c503b5$fd167b00$18fccc97@oemcomputer> References: <004a01c503b5$fd167b00$18fccc97@oemcomputer> Message-ID: On Wed, 26 Jan 2005 09:47:41 -0500, Raymond Hettinger wrote: > > This is what I mean about the patch taking on a life of its own. It's > an optimization patch that slows down METH_O and METH_NOARGS. It's an > incremental change that throws away backwards compatibility. It's a > simplification that introduces a bazillion new code paths. It's a > simplification that can't be realized until 3.0. I've been thinking about how to move towards 3.0.
There are many changes that are desirable and unlikely to occur prior to 3.0. But if we defer so many enhancements, the changes will be voluminous, potentially difficult to manage, and possibly error prone. There is a risk that many small warts will not be fixed, only because they fell through the cracks. I thought about making a p3k branch in CVS. It could be worked on slowly and would be the implementation of PEP 3000. However, if a branch was created all changes would need to be forward ported to it and it would need to be kept up to date. I know I wouldn't have enough time to maintain this. The benefit is that people could test the portability of their applications with 3.0 sooner rather than later. They could see if the switch to iterators created problems, or integer division, or new-style exceptions, etc. We could try to improve performance by simplifying architecture. We could see how much of a problem it would be to (re)move some builtins. Any ideas how we could start to realize some benefits of Py3.0 before it arrives? I'm not sure if this is worth it, if it's premature, or if there are other ways to achieve the goal of easing transition for users and simplifying developers' tasks (by spreading over a longer period of time) and reducing the possibility of not fixing warts. Neal From t-meyer at ihug.co.nz Sun Jan 30 23:05:57 2005 From: t-meyer at ihug.co.nz (Tony Meyer) Date: Sun Jan 30 23:07:08 2005 Subject: [Python-Dev] Should Python's library modules be written to help the freeze tools? In-Reply-To: Message-ID: The Python 2.4 Lib/bsddb/__init__.py contains this: """ # for backwards compatibility with python versions older than 2.3, the # iterator interface is dynamically defined and added using a mixin # class. old python can't tokenize it due to the yield keyword. if sys.version >= '2.3': exec """ import UserDict from weakref import ref class _iter_mixin(UserDict.DictMixin): ... """ Because the imports are inside an exec, modulefinder (e.g.
when using bsddb with a py2exe built application) does not realise that the imports are required. (The requirement can be manually specified, of course, if you know that you need to do so). I believe that changing the above code to: """ if sys.version >= '2.3': import UserDict from weakref import ref exec """ class _iter_mixin(UserDict.DictMixin): """ Would still have the intended effect and would let modulefinder do its work. The main question (to steal Thomas's words) is whether the library modules should be written to help the freeze tools - if the answer is 'yes', then I'll submit the above as a patch for 2.5. Thanks! =Tony.Meyer From martin at v.loewis.de Sun Jan 30 23:50:53 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun Jan 30 23:50:39 2005 Subject: [Python-Dev] Should Python's library modules be written to help the freeze tools? In-Reply-To: References: Message-ID: <41FD64CD.5050307@v.loewis.de> Tony Meyer wrote: > The main question (to steal Thomas's words) is whether the library modules > should be written to help the freeze tools - if the answer is 'yes', then > I'll submit the above as a patch for 2.5. The answer to this question certainly is "yes, if possible". In this specific case, I wonder whether the backwards compatibility is still required in the first place. According to PEP 291, Greg Smith and Barry Warsaw decide on this, so I think they would need to comment first before any patch can be integrated. If they comment that 2.1 compatibility is still desirable, your patch would be fine (I guess); if they say that the compatibility requirement can be dropped for 2.5, I suggest that the entire exec statement is removed, along with the conditional clause. Regards, Martin From t-meyer at ihug.co.nz Mon Jan 31 00:28:42 2005 From: t-meyer at ihug.co.nz (Tony Meyer) Date: Mon Jan 31 00:28:46 2005 Subject: [Python-Dev] Should Python's library modules be written to help the freeze tools?
In-Reply-To: Message-ID: [Tony Meyer] > The main question (to steal Thomas's words) is whether the > library modules should be written to help the freeze tools > - if the answer is 'yes', then I'll submit the above as a > patch for 2.5. [Martin v. Löwis] > The answer to this question certainly is "yes, if possible". In this > specific case, I wonder whether the backwards compatibility is still > required in the first place. According to PEP 291, Greg Smith and > Barry Warsaw decide on this, so I think they would need to comment > first before any patch can be integrated. [...] Thanks! I've gone ahead and submitted a patch, in that case: [ 1112812 ] Patch for Lib/bsddb/__init__.py to work with modulefinder I realise that neither of the people that need to look at this are part of the '5 for 1' deal, so I need to wait for one of them to have time to look at it (plenty of time left before 2.5 anyway) but I'll do 5 reviews for the karma anyway, today or tomorrow. =Tony.Meyer From bob at redivi.com Mon Jan 31 00:56:17 2005 From: bob at redivi.com (Bob Ippolito) Date: Mon Jan 31 00:56:32 2005 Subject: [Python-Dev] Should Python's library modules be written to help the freeze tools? In-Reply-To: <41FD64CD.5050307@v.loewis.de> References: <41FD64CD.5050307@v.loewis.de> Message-ID: <97B1F1B1-9004-4002-A2A2-11259D4EE007@redivi.com> On Jan 30, 2005, at 5:50 PM, Martin v. Löwis wrote: > Tony Meyer wrote: >> The main question (to steal Thomas's words) is whether the library >> modules >> should be written to help the freeze tools - if the answer is 'yes', >> then >> I'll submit the above as a patch for 2.5. > > The answer to this question certainly is "yes, if possible". In this > specific case, I wonder whether the backwards compatibility is still > required in the first place. According to PEP 291, Greg Smith and > Barry Warsaw decide on this, so I think they would need to comment > first before any patch can be integrated.
If they comment that 2.1 > compatibility is still desirable, your patch would be fine (I guess); > if they say that the compatibility requirement can be dropped for 2.5, > I suggest that the entire exec statement is removed, along with the > conditional clause. py2app handles this situation by using a much richer way to analyze module dependencies, in that it can use hooks (called "recipes") to trigger arbitrary behavior when the responsible recipe sees that a certain module is in the dependency graph. This is actually necessary to do the right thing in the case of extensions and modules that are not friendly with bytecode analysis. Though there are not many of these in the standard library, a few common packages such as PIL have a real need for this. Also, since modulegraph uses a graph data structure it is much better suited to pruning the dependency graph. For example, pydoc imports Tkinter to support an obscure feature, but this is almost never desired in the context of an application freeze tool. py2app ships with a recipe that automatically breaks the edge between pydoc and Tkinter , so if Tkinter is not explicitly included or used by anything else in the dependency graph, it is correctly excluded from the resultant application bundle. In order to correctly cover the Python API, I needed to ALWAYS include: unicodedata, warnings, encodings, and weakref because they can be used by the implementation of Python itself without any "import" hints (which, if py2exe also did this, would've probably solved Tony's issue with bsddb). 
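[Editorial note: the failure mode Tony and Bob describe is easy to demonstrate. Tools like modulefinder locate dependencies by scanning compiled bytecode for import operations, and an import buried inside an exec'd string never produces one. A minimal sketch using the modern dis API; the module names are arbitrary:]

```python
import dis

def imported_names(source):
    # Freeze-tool dependency scanners look for IMPORT_NAME opcodes
    # in the compiled bytecode, much like this.
    code = compile(source, "<example>", "exec")
    return [ins.argval for ins in dis.get_instructions(code)
            if ins.opname == "IMPORT_NAME"]

print(imported_names("import weakref"))          # ['weakref']
print(imported_names("exec('import weakref')"))  # [] -- the import hides in a string
```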
Also, I did an analysis of the Python standard library and I discovered that the following (hopefully rather complete) list of implicit dependencies (from ): { # imports done from builtin modules in C code (untrackable by modulegraph) "time": ["_strptime"], "datetime": ["time"], "MacOS": ["macresource"], "cPickle": ["copy_reg", "cStringIO"], "parser": ["copy_reg"], "codecs": ["encodings"], "cStringIO": ["copy_reg"], "_sre": ["copy", "string", "sre"], "zipimport": ["zlib"], # mactoolboxglue can do a bunch more of these # that are far harder to predict, these should be tracked # manually for now. # this isn't C, but it uses __import__ "anydbm": ["dbhash", "gdbm", "dbm", "dumbdbm", "whichdb"], } I would like to write a PEP for modulegraph as a replacement for modulefinder at some point, but I can't budget the time for it right now. The current implementation is also largely untested on other platforms. I hear it has been used by a Twisted developer to create Windows NT services using py2exe (augmenting py2exe's rather simple dependency resolution mechanism), however I'm not sure if that is in public svn or not. If the authors of the other freeze tools are interested, they can feel free to use modulegraph from py2app -- it is cross-platform code under MIT license, but I can dual-license if necessary (however I think it should be compatible with cx_freeze, py2exe, and Python itself). The API is purposefully different than modulefinder, but it is close enough such that most of the work involved is just removing unnecessary kludges. -bob From bac at OCF.Berkeley.EDU Mon Jan 31 03:29:25 2005 From: bac at OCF.Berkeley.EDU (Brett C.) 
Date: Mon Jan 31 03:29:33 2005 Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up function calls) In-Reply-To: References: <004a01c503b5$fd167b00$18fccc97@oemcomputer> Message-ID: <41FD9805.1090309@ocf.berkeley.edu> Neal Norwitz wrote: > On Wed, 26 Jan 2005 09:47:41 -0500, Raymond Hettinger wrote: > [SNIP] > Any ideas how we could start to realize some benefits of Py3.0 before > it arrives? I'm not sure if this is worth it, if it's premature, or > if there are other ways to achieve the goal of easing transition for > users and simplifying developers' tasks (by spreading over a longer > period of time) and reducing the possibility of not fixing warts. > The way I always imagined Python 3.0 would come about would be through preview releases. Once the final 2.x version was released and went into maintenance we would start developing Python 3.0. During that development, when a major semantic change was checked in and seemed to work we could do a quick preview release for people to use to see if the new features up to that preview release would break their code. Any other way, though, through concurrent development, seems painful. As you mentioned, Neal, branches require merges eventually and that can be painful. I suspect people will just have to put up with a longer dev time for Python 3.0. That longer dev time might actually be a good thing in the end. It would enable us to really develop a very stable 2.x version of Python that we all know will be in use for quite some time by old code.
-Brett From python at rcn.com Mon Jan 31 04:26:47 2005 From: python at rcn.com (Raymond Hettinger) Date: Mon Jan 31 04:30:27 2005 Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up function calls) In-Reply-To: Message-ID: <000d01c50744$b2395700$fe26a044@oemcomputer> Neal Norwitz > I thought about making a p3k branch in CVS I had hoped for the core of p3k to be built from scratch so that even the most pervasive and fundamental implementation choices would be open for discussion: * Possibly write in C++. * Possibly replace bytecode with Forth style threaded code. * Possibly toss ref counting in favor of some kind of GC. * Consider ways to leverage multiple processor environments. * Consider alternative ways to implement exception handling (long jumps, signals, etc.) * Look at alternate ways of building, passing, and parsing function arguments. * Use b-trees instead of dictionaries (just kidding). Raymond From skip at pobox.com Mon Jan 31 06:00:21 2005 From: skip at pobox.com (Skip Montanaro) Date: Mon Jan 31 06:00:27 2005 Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up function calls) In-Reply-To: <000d01c50744$b2395700$fe26a044@oemcomputer> References: <000d01c50744$b2395700$fe26a044@oemcomputer> Message-ID: <16893.47973.850283.413462@montanaro.dyndns.org> Raymond> I had hoped for the core of p3k to be built from scratch ... Then we should just create a new CVS module for it (or go whole hog and try a new revision control system altogether - svn, darcs, arch, whatever). Skip From gvanrossum at gmail.com Mon Jan 31 06:17:15 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon Jan 31 06:17:19 2005 Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up function calls) In-Reply-To: <000d01c50744$b2395700$fe26a044@oemcomputer> References: <000d01c50744$b2395700$fe26a044@oemcomputer> Message-ID: > I had hoped for the core of p3k to be built from scratch [...] Stop right there.
I used to think that was a good idea too, and was hoping to do exactly that (after retirement :). However, the more I think about it, the more I believe it would be throwing away too much valuable work. Please read this article by Joel Spolsky (if you're not yet in the habit of reading "Joel on Software", you're missing something): http://joelonsoftware.com/articles/fog0000000069.html Then tell me if you still want to start over. I expect that if we do piecemeal replacement of modules rather than starting from scratch we'll be more productive sooner with less effort. After all, the Python 3000 effort shouldn't be as pervasive as the Perl 6 design -- we're not redesigning the language from scratch, we're just tweaking (albeit allowing backwards incompatibilities). > * Possibly write in C++. > * Possibly replace bytecode with Forth style threaded code. > * Possibly toss ref counting in favor of some kind of GC. > * Consider ways to leverage multiple processor environments. > * Consider alternative ways to implement exception handling (long jumps, > etc, signals, etc.) > * Look at alternate ways of building, passing, and parsing function > arguments. > * Use b-trees instead of dictionaries (just kidding). The "just kidding" applies to the whole list, right? None of these strike me as good ideas, except for improvements to function argument passing. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tameyer at ihug.co.nz Mon Jan 31 01:48:35 2005 From: tameyer at ihug.co.nz (Tony Meyer) Date: Mon Jan 31 07:55:58 2005 Subject: [Python-Dev] Bug tracker reviews Message-ID: As promised, here are five bug reviews with recommendations. If they help [ 1112812 ] Patch for Lib/bsddb/__init__.py to work with modulefinder get reviewed, then that'd be great. Otherwise I'll just take the good karma and run :) ----- [ 531205 ] Bugs in rfc822.parseaddr() What to do when an email address contains spaces, when RFC2822 says it can't. 
At the moment the spaces are stripped. Recommend closing "Won't Fix", for reasons outlined in the tracker by Tim Roberts. [ 768419 ] Subtle bug in os.path.realpath on Cygwin Agree with Sjoerd that this is a Cygwin bug rather than a Python one (and no response from OP for a very long time). Recommend closing "Won't Fix". [ 803413 ] uu.decode prints to stderr The question is whether it is ok for library modules to print to stderr if a recoverable error occurs. Looking at other modules, it seems uncommon, but ok, so recommend closing "Won't fix", but making the suggested documentation change. (Alternatively, change from printing to stderr to using warnings.warn, which would be a simple change and possibly more correct, although giving the same result). [ 989333 ] Empty curses module is loaded in win32 Importing curses loads an empty module instead of raising ImportError on win32. I cannot duplicate this: recommend closing as "Invalid". [ 1090076 ] Defaults in ConfigParser.get overrides section values Behaviour of ConfigParser doesn't match the documentation. The included patch for ConfigParser does fix the problem, but might break existing code. A decision needs to be made which is the desired behaviour, and the tracker can be closed either "Won't Fix" or "Fixed" (and the fix applied for 2.5 and 2.4.1). =Tony.Meyer From barry at python.org Mon Jan 31 14:12:38 2005 From: barry at python.org (Barry Warsaw) Date: Mon Jan 31 14:12:47 2005 Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up function calls) In-Reply-To: References: <000d01c50744$b2395700$fe26a044@oemcomputer> Message-ID: <1107177157.14649.125.camel@presto.wooz.org> On Mon, 2005-01-31 at 00:17, Guido van Rossum wrote: > > I had hoped for the core of p3k to be built for scratch [...] > > Stop right there. Phew! -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050131/5f6e756b/attachment.pgp From barry at python.org Mon Jan 31 14:13:32 2005 From: barry at python.org (Barry Warsaw) Date: Mon Jan 31 14:13:34 2005 Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up function calls) In-Reply-To: <16893.47973.850283.413462@montanaro.dyndns.org> References: <000d01c50744$b2395700$fe26a044@oemcomputer> <16893.47973.850283.413462@montanaro.dyndns.org> Message-ID: <1107177212.14651.127.camel@presto.wooz.org> On Mon, 2005-01-31 at 00:00, Skip Montanaro wrote: > Raymond> I had hoped for the core of p3k to be built for scratch ... > > Then we should just create a new CVS module for it (or go whole hog and try > a new revision control system altogether - svn, darcs, arch, whatever). I've heard rumors that SF was going to be making svn available. Anybody know more about that? I'd be +1 on moving from cvs to svn. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050131/cb8fb3f5/attachment.pgp From ejones at uwaterloo.ca Mon Jan 31 16:43:53 2005 From: ejones at uwaterloo.ca (Evan Jones) Date: Mon Jan 31 16:43:41 2005 Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up function calls) In-Reply-To: References: <000d01c50744$b2395700$fe26a044@oemcomputer> Message-ID: On Jan 31, 2005, at 0:17, Guido van Rossum wrote: > The "just kidding" applies to the whole list, right? None of these > strike me as good ideas, except for improvements to function argument > passing. Really? You see no advantage to moving to garbage collection, nor allowing Python to leverage multiple processor environments? 
I'd be curious to hear your reasons why not. My knowledge about garbage collection is weak, but I have read a little bit of Hans Boehm's work on garbage collection. For example, his "Memory Allocation Myths and Half Truths" presentation (http://www.hpl.hp.com/personal/Hans_Boehm/gc/myths.ps) is quite interesting. On page 25 he examines reference counting. The biggest disadvantage mentioned is that simple pointer assignments end up becoming "increment ref count" operations as well, which can "involve at least 4 potential memory references." The next page has a micro-benchmark that shows reference counting performing very poorly. Not to mention that Python has a garbage collector *anyway,* so wouldn't it make sense to get rid of the reference counting? My only argument for making Python capable of leveraging multiple processor environments is that multithreading seems to be where the big performance increases will be in the next few years. I am currently using Python for some relatively large simulations, so performance is important to me. Evan Jones From bob at redivi.com Mon Jan 31 17:04:51 2005 From: bob at redivi.com (Bob Ippolito) Date: Mon Jan 31 17:04:57 2005 Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up function calls) In-Reply-To: References: <000d01c50744$b2395700$fe26a044@oemcomputer> Message-ID: On Jan 31, 2005, at 10:43, Evan Jones wrote: > On Jan 31, 2005, at 0:17, Guido van Rossum wrote: >> The "just kidding" applies to the whole list, right? None of these >> strike me as good ideas, except for improvements to function argument >> passing. > > Really? You see no advantage to moving to garbage collection, nor > allowing Python to leverage multiple processor environments? I'd be > curious to hear your reasons why not. > > My knowledge about garbage collection is weak, but I have read a > little bit of Hans Boehm's work on garbage collection. 
For example, > his "Memory Allocation Myths and Half Truths" presentation > (http://www.hpl.hp.com/personal/Hans_Boehm/gc/myths.ps) is quite > interesting. On page 25 he examines reference counting. The biggest > disadvantage mentioned is that simple pointer assignments end up > becoming "increment ref count" operations as well, which can "involve > at least 4 potential memory references." The next page has a > micro-benchmark that shows reference counting performing very poorly. > Not to mention that Python has a garbage collector *anyway,* so > wouldn't it make sense to get rid of the reference counting? > > My only argument for making Python capable of leveraging multiple > processor environments is that multithreading seems to be where the > big performance increases will be in the next few years. I am > currently using Python for some relatively large simulations, so > performance is important to me. Wouldn't it be nicer to have a facility that let you send messages between processes and manage concurrency properly instead? You'll need most of this anyway to do multithreading sanely, and the benefit to the multiple process model is that you can scale to multiple machines, not just processors. For brokering data between processes on the same machine, you can use mapped memory if you can't afford to copy it around, which gives you basically all the benefits of threads with fewer pitfalls. -bob From fredrik at pythonware.com Mon Jan 31 17:09:02 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon Jan 31 17:10:22 2005 Subject: [Python-Dev] Re: Moving towards Python 3.0 (was Re: Speed up functioncalls) References: <000d01c50744$b2395700$fe26a044@oemcomputer> Message-ID: Bob Ippolito wrote: > Wouldn't it be nicer to have a facility that let you send messages between processes and manage > concurrency properly instead? 
You'll need most of this anyway to do multithreading sanely, and > the benefit to the multiple process model is that you can scale to multiple machines, not just > processors. yes, please! > For brokering data between processes on the same machine, you can use > mapped memory if you can't afford to copy it around this mechanism should be reasonably hidden, of course, at least for "normal use". From mcherm at mcherm.com Mon Jan 31 17:51:05 2005 From: mcherm at mcherm.com (Michael Chermside) Date: Mon Jan 31 17:51:09 2005 Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up functioncalls) Message-ID: <1107190265.41fe61f9caab8@mcherm.com> Evan Jones writes: > My knowledge about garbage collection is weak, but I have read a little > bit of Hans Boehm's work on garbage collection. [...] The biggest > disadvantage mentioned is that simple pointer assignments end up > becoming "increment ref count" operations as well... Hans Boehm certainly has some excellent points. I believe a little searching through the Python dev archives will reveal that attempts have been made in the past to use his GC tools with CPython, and that the results have been disapointing. That may be because other parts of CPython are optimized for reference counting, or it may be just because this stuff is so bloody difficult! However, remember that changing away from reference counting is a change to the semantics of CPython. Right now, people can (and often do) assume that objects which don't participate in a reference loop are collected as soon as they go out of scope. They write code that depends on this... idioms like: >>> text_of_file = open(file_name, 'r').read() Perhaps such idioms aren't a good practice (they'd fail in Jython or in IronPython), but they ARE common. So we shouldn't stop using reference counting unless we can demonstrate that the alternative is clearly better. 
Of course, we'd also need to devise a way for extensions to cooperate (which is a problem Jython, at least, doesn't face). So it's NOT an obvious call, and so far numerous attempts to review other GC strategies have failed. I wouldn't be so quick to dismiss reference counting. > My only argument for making Python capable of leveraging multiple > processor environments is that multithreading seems to be where the big > performance increases will be in the next few years. I am currently > using Python for some relatively large simulations, so performance is > important to me. CPython CAN leverage such environments, and it IS used that way. However, this requires using multiple Python processes and inter-process communication of some sort (there are lots of choices, take your pick). It's a technique which is more trouble for the programmer, but in my experience usually has less likelihood of containing subtle parallel processing bugs. Sure, it'd be great if Python threads could make use of separate CPUs, but if the cost of that were that Python dictionaries performed as poorly as a Java HashTable or synchronized HashMap, then it wouldn't be worth the cost. There's a reason why Java moved away from HashTable (the threadsafe data structure) to HashMap (not threadsafe). Perhaps the REAL solution is just a really good IPC library that makes it easier to write programs that launch "threads" as separate processes and communicate with them. No change to the internals, just a new library to encourage people to use the technique that already works. 
-- Michael Chermside From skip at pobox.com Mon Jan 31 18:02:27 2005 From: skip at pobox.com (Skip Montanaro) Date: Mon Jan 31 18:02:59 2005 Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up functioncalls) In-Reply-To: <1107190265.41fe61f9caab8@mcherm.com> References: <1107190265.41fe61f9caab8@mcherm.com> Message-ID: <16894.25763.467209.961676@montanaro.dyndns.org> Michael> CPython CAN leverage such environments, and it IS used that Michael> way. However, this requires using multiple Python processes Michael> and inter-process communication of some sort (there are lots of Michael> choices, take your pick). It's a technique which is more Michael> trouble for the programmer, but in my experience usually has Michael> less likelihood of containing subtle parallel processing Michael> bugs. In my experience, when people suggest that "threads are easier than ipc", it means that their code is sprinkled with "subtle parallel processing bugs". Michael> Perhaps the REAL solution is just a really good IPC library Michael> that makes it easier to write programs that launch "threads" as Michael> separate processes and communicate with them. Tuple space, anyone? Skip From mwh at python.net Mon Jan 31 18:20:46 2005 From: mwh at python.net (Michael Hudson) Date: Mon Jan 31 18:20:49 2005 Subject: [Python-Dev] Re: PEP 309 In-Reply-To: <79990c6b05012914156800e5bc@mail.gmail.com> (Paul Moore's message of "Sat, 29 Jan 2005 22:15:40 +0000") References: <79990c6b05012701492440d0c0@mail.gmail.com> <79990c6b05012914156800e5bc@mail.gmail.com> Message-ID: <2mr7k18qhd.fsf@starship.python.net> Paul Moore writes: > Also, while looking at patches I noticed 1077106. It doesn't apply to > me - I don't use Linux - but it looks like this may have simply been > forgotten. The last comment is in December from Michael Hudson, > saying in effect "I'll commit this tomorrow". Michael? Argh. Committed. Cheers, mwh -- LINTILLA: You could take some evening classes. ARTHUR: What, here?
LINTILLA: Yes, I've got a bottle of them. Little pink ones. -- The Hitch-Hikers Guide to the Galaxy, Episode 12 From glyph at divmod.com Mon Jan 31 20:08:24 2005 From: glyph at divmod.com (Glyph Lefkowitz) Date: Mon Jan 31 20:07:50 2005 Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up functioncalls) In-Reply-To: <1107190265.41fe61f9caab8@mcherm.com> References: <1107190265.41fe61f9caab8@mcherm.com> Message-ID: <1107198504.4185.5.camel@localhost> On Mon, 2005-01-31 at 08:51 -0800, Michael Chermside wrote: > However, remember that changing away from reference counting is a change > to the semantics of CPython. Right now, people can (and often do) assume > that objects which don't participate in a reference loop are collected > as soon as they go out of scope. They write code that depends on > this... idioms like: > > >>> text_of_file = open(file_name, 'r').read() > > Perhaps such idioms aren't a good practice (they'd fail in Jython or > in IronPython), but they ARE common. So we shouldn't stop using > reference counting unless we can demonstrate that the alternative is > clearly better. Of course, we'd also need to devise a way for extensions > to cooperate (which is a problem Jython, at least, doesn't face). I agree that the issue is highly subtle, but this reason strikes me as kind of bogus. The problem here is not that the semantics are really different, but that Python doesn't treat file descriptors as an allocatable resource, and therefore doesn't trigger the GC when they are exhausted. As it stands, this idiom works most of the time, and if an EMFILE errno triggered the GC, it would always work. Obviously this would be difficult to implement pervasively, but maybe it should be a guideline for alternative implementations to follow so as not to fall into situations where tricks like this one, which are perfectly valid both semantically and in regular python, would fail due to an interaction with the OS...? 
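To make the idiom under discussion concrete, here is a small illustrative sketch (not code from the thread; the function names are invented for the example): the first version relies on the temporary file object being reclaimed -- and therefore closed -- the instant its refcount drops to zero, while the second closes the descriptor deterministically and so behaves identically under a lazy collector such as Jython's.

```python
# The refcounting-dependent idiom: the file object becomes garbage as
# soon as read() returns, and CPython closes it immediately.  Under a
# lazily-collecting runtime the descriptor may stay open until the
# next collection -- which is how EMFILE exhaustion can sneak in.
def read_refcount_style(path):
    return open(path).read()

# Deterministic under any collector: the descriptor is closed before
# the function returns, no matter when the object itself is reclaimed.
def read_portable(path):
    f = open(path)
    try:
        return f.read()
    finally:
        f.close()
```

Both return the same text; the difference is only in *when* the OS-level file descriptor is released.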
From martin at v.loewis.de Mon Jan 31 20:21:10 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon Jan 31 20:20:53 2005 Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up function calls) In-Reply-To: References: <000d01c50744$b2395700$fe26a044@oemcomputer> Message-ID: <41FE8526.4000907@v.loewis.de> Evan Jones wrote: > The next page has a > micro-benchmark that shows reference counting performing very poorly. > Not to mention that Python has a garbage collector *anyway,* so wouldn't > it make sense to get rid of the reference counting? It's not clear what these numbers exactly mean, but I don't believe them. With the Python GIL, the increments/decrements don't have to be atomic, which already helps in a multiprocessor system (as you don't need a buslock). The actual costs of GC occur when a collection happens - and it should always be possible to construct cases where the collection needs longer, because it has to look at so much memory. I like reference counting because of its predictability. I deliberately do data = open(filename).read() without having to worry about closing the file - just because reference counting does it for me. I guess a lot of code will break when you drop refcounting - perhaps unless an fopen failure will trigger a GC. Regards, Martin From martin at v.loewis.de Mon Jan 31 20:23:18 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon Jan 31 20:23:02 2005 Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up function calls) In-Reply-To: <1107177212.14651.127.camel@presto.wooz.org> References: <000d01c50744$b2395700$fe26a044@oemcomputer> <16893.47973.850283.413462@montanaro.dyndns.org> <1107177212.14651.127.camel@presto.wooz.org> Message-ID: <41FE85A6.10903@v.loewis.de> Barry Warsaw wrote: > I've heard rumors that SF was going to be making svn available. Anybody > know more about that? I'd be +1 on moving from cvs to svn. 
It was on their "things we do in 2005" list. 2005 isn't over yet... I wouldn't be surprised if it gets moved to their "things we do in 2006" list in November (just predicting from past history, without any insight). Regards, Martin From mwh at python.net Mon Jan 31 20:40:48 2005 From: mwh at python.net (Michael Hudson) Date: Mon Jan 31 20:40:49 2005 Subject: [Python-Dev] Re: Moving towards Python 3.0 In-Reply-To: (Evan Jones's message of "Mon, 31 Jan 2005 10:43:53 -0500") References: <000d01c50744$b2395700$fe26a044@oemcomputer> Message-ID: <2mmzup8jzz.fsf@starship.python.net> Evan Jones writes: > On Jan 31, 2005, at 0:17, Guido van Rossum wrote: >> The "just kidding" applies to the whole list, right? None of these >> strike me as good ideas, except for improvements to function argument >> passing. > > Really? You see no advantage to moving to garbage collection, nor > allowing Python to leverage multiple processor environments? I'd be > curious to hear your reasons why not. Obviously, if one could wave a wand and make it so, we would. The argument is about whether the cost (in backwards compatibility, portability, uniprocessor performance, developer time, etc) outweighs the benefit. > My knowledge about garbage collection is weak, but I have read a > little bit of Hans Boehm's work on garbage collection. For example, > his "Memory Allocation Myths and Half Truths" presentation > (http://www.hpl.hp.com/personal/Hans_Boehm/gc/myths.ps) is quite > interesting. On page 25 he examines reference counting. The biggest > disadvantage mentioned is that simple pointer assignments end up > becoming "increment ref count" operations as well, which can "involve > at least 4 potential memory references." The next page has a > micro-benchmark that shows reference counting performing very > poorly.
Given the current implementation's *extreme* malloc-happiness I posit that it would be more-or-less impossible to make any form of non-copying garbage collector go faster for Python than refcounting. I may be wrong, but I don't think so and I have actually thought about this a little bit :) The "non-copying" bit is important for backwards compatibility of C extensions (unless there's something I don't know). > Not to mention that Python has a garbage collector *anyway,* so > wouldn't it make sense to get rid of the reference counting? Here you're confused. Python's cycle collector depends utterly on reference counting. (And what is it with this "let's ditch refcounting and use a garbage collector" thing that people always wheel out? Refcounting *is* a form of garbage collection by most reasonable definitions, esp. when you add Python's cycle collector). > My only argument for making Python capable of leveraging multiple > processor environments is that multithreading seems to be where the > big performance increases will be in the next few years. I am > currently using Python for some relatively large simulations, so > performance is important to me. I'm sure you're tired of hearing it, but I think processes are your friend... Cheers, mwh -- It is time-consuming to produce high-quality software. However, that should not alone be a reason to give up the high standards of Python development. -- Martin von Loewis, python-dev From apocalypznow at gmail.com Mon Jan 31 09:15:58 2005 From: apocalypznow at gmail.com (apocalypznow) Date: Mon Jan 31 21:10:20 2005 Subject: [Python-Dev] linux executable - how? Message-ID: How can I take my python scripts and create a linux executable out of them (to be distributed without having to also distribute python) ? From aahz at pythoncraft.com Mon Jan 31 22:11:16 2005 From: aahz at pythoncraft.com (Aahz) Date: Mon Jan 31 22:11:20 2005 Subject: [Python-Dev] linux executable - how?
In-Reply-To: References: Message-ID: <20050131211116.GA7518@panix.com> On Mon, Jan 31, 2005, apocalypznow wrote: > > How can I take my python scripts and create a linux executable out of it > (to be distributed without having to also distribute python) ? python-dev is for discussion of patches and bugs to Python itself. Please post your question on comp.lang.python. Thanks! -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "Given that C++ has pointers and typecasts, it's really hard to have a serious conversation about type safety with a C++ programmer and keep a straight face. It's kind of like having a guy who juggles chainsaws wearing body armor arguing with a guy who juggles rubber chickens wearing a T-shirt about who's in more danger." --Roy Smith, c.l.py, 2004.05.23 From bac at OCF.Berkeley.EDU Mon Jan 31 23:02:20 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Mon Jan 31 23:02:37 2005 Subject: [Python-Dev] python-dev Summary for 2004-12-16 through 2004-12-31 [draft] Message-ID: <41FEAAEC.5080805@ocf.berkeley.edu> Nice and short summary this time. Plan to send this off Wednesday or Thursday so get corrections in before then. ------------------------------ ===================== Summary Announcements ===================== You can still `register `__ for `PyCon`_. The `schedule of talks`_ is now online. Jim Hugunin is lined up to be the keynote speaker on the first day with Guido being the keynote on Thursday. Once again PyCon looks like it is going to be great. On a different note, as I am sure you are all aware I am still about a month behind in summaries. School this quarter for me has just turned out hectic. I think it is lack of motivation thanks to having finished my 14 doctoral applications just a little over a week ago (and no, that number is not a typo). 
I am going to, for the first time in my life, come up with a very regimented study schedule that will hopefully allow me to fit in weekly Python time so as to allow me to catch up on summaries. And this summary is not short because I rushed to finish it: 2.4 was released just before the period this summary covers, so most activity was bug fixes discovered after the release. .. _PyCon: http://www.pycon.org/ .. _schedule of talks: http://www.python.org/pycon/2005/schedule.html ======= Summary ======= ------------- PEP movements ------------- I introduced a `proto-PEP `__ to the list on how one can go about changing CPython's bytecode. It will need rewriting once the AST branch is merged into HEAD on CVS. Plus I need to get a PEP number assigned to me. =) Contributing threads: - ` proto-pep: How to change Python's bytecode <>`__ ------------------------------------ Handling versioning within a package ------------------------------------ The suggestion of extending import syntax to support explicit version importation came up. The idea was to have something along the lines of ``import foo version 2, 4`` so that a package could contain different versions of itself and provide an easy way to specify which version was desired. The idea didn't fly, though. The main objection was that import-as support was all you really needed; ``import foo_2_4 as foo``. And if you had a ton of references to a specific package and didn't want to burden yourself with explicit imports, one can always have a single place before code starts executing doing ``import foo_2_4; sys.modules["foo"] = foo_2_4``. And even that can be tucked away by creating a foo.py file that does the above for you. You can also look at how wxPython handles it at http://wiki.wxpython.org/index.cgi/MultiVersionInstalls . Contributing threads: - `Re: [Pythonmac-SIG] The versioning question...
<>`__ =============== Skipped Threads =============== - Problems compiling Python 2.3.3 on Solaris 10 with gcc 3.4.1 - 2.4 news reaches interesting places see `last summary`_ for coverage of this thread - RE: [Python-checkins] python/dist/src/Modules posixmodule.c, 2.300.8.10, 2.300.8.11 - mmap feature or bug? - Re: [Python-checkins] python/dist/src/Pythonmarshal.c, 1.79, 1.80 - Latex problem when trying to build documentation - Patches: 1 for the price of 10. - Python for Series 60 released - Website documentation - link to descriptor information - Build extensions for windows python 2.4 what are the compiler rules? - Re: [Python-checkins] python/dist/src setup.py, 1.208, 1.209 - Zipfile needs? fake 32-bit unsigned int overflow with ``x = x & 0xFFFFFFFFL`` and signed ints with the additional ``if x & 0x80000000L: x -= 0x100000000L`` . - Re: [Python-checkins] python/dist/src/Mac/OSX fixapplepython23.py, 1.1, 1.2 From binkertn at umich.edu Mon Jan 31 21:16:47 2005 From: binkertn at umich.edu (Nathan Binkert) Date: Mon Jan 31 23:05:39 2005 Subject: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up function calls) In-Reply-To: References: <000d01c50744$b2395700$fe26a044@oemcomputer> Message-ID: > Wouldn't it be nicer to have a facility that let you send messages > between processes and manage concurrency properly instead? You'll need > most of this anyway to do multithreading sanely, and the benefit to the > multiple process model is that you can scale to multiple machines, not > just processors. For brokering data between processes on the same > machine, you can use mapped memory if you can't afford to copy it > around, which gives you basically all the benefits of threads with > fewer pitfalls. I don't think this is an answered problem. There are plenty of researchers on both sides of this fence. It has not been proven at all that threads are a bad model.
http://capriccio.cs.berkeley.edu/pubs/threads-hotos-2003.pdf or even http://www.python.org/~jeremy/weblog/030912.html
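As an illustration of the "processes plus message passing" model debated in this thread, here is a hand-rolled sketch -- POSIX-only, with an invented `spawn_worker` helper, and a stand-in for the "really good IPC library" the thread calls for rather than anything in the stdlib of the time: a parent forks a worker process and receives a pickled result back over a pipe.

```python
import os
import pickle

def spawn_worker(func, arg):
    """Run func(arg) in a child process and return its result.

    Illustrative sketch only (POSIX-only: uses os.fork).  The child
    inherits func and arg through fork; only the result crosses the
    process boundary, serialized with pickle over a pipe.
    """
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: compute, send the pickled result to the parent, exit.
        os.close(r)
        result = func(arg)
        with os.fdopen(w, 'wb') as out:
            pickle.dump(result, out)
        os._exit(0)
    # Parent: read the pickled reply, then reap the child.
    os.close(w)
    with os.fdopen(r, 'rb') as inp:
        result = pickle.load(inp)
    os.waitpid(pid, 0)
    return result

if __name__ == '__main__':
    print(spawn_worker(sum, range(10)))  # -> 45
```

Scaling this beyond one machine means swapping the pipe for a socket; the message-passing shape of the code is unchanged, which is exactly the portability argument made above for processes over threads.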