From collinw at gmail.com Fri Sep 1 04:52:02 2006
From: collinw at gmail.com (Collin Winter)
Date: Thu, 31 Aug 2006 21:52:02 -0500
Subject: [Python-Dev] A test suite for unittest
Message-ID: <43aa6ff70608311952jd4cbb8ena17594458d480e8e@mail.gmail.com>

I've just uploaded a trio of unittest-related patches:

#1550272 (http://python.org/sf/1550272) is a test suite for the
mission-critical parts of unittest.

#1550273 (http://python.org/sf/1550273) fixes 6 issues uncovered while
writing the test suite. Several other items that I raised earlier
(http://mail.python.org/pipermail/python-dev/2006-August/068378.html)
were judged to be either non-issues or behaviours that, while
suboptimal, people have come to rely on.

#1550263 (http://python.org/sf/1550263) follows up on an earlier patch
I submitted for unittest's docs. This new patch corrects and clarifies
numerous sections of the module's documentation.

I'd appreciate it if these changes could make it into 2.5-final or at
least 2.5.1.

What follows is a list of the issues fixed in patch #1550273:

1) TestLoader.loadTestsFromName() failed to return a suite when
resolving a name to a callable that returns a TestCase instance.

2) Fix a bug in both TestSuite.addTest() and TestSuite.addTests()
concerning a lack of input checking on the input test case(s)/suite(s).

3) Fix a bug in both TestLoader.loadTestsFromName() and
TestLoader.loadTestsFromNames() that had ValueError being raised
instead of TypeError. The problem occurred when the given name resolved
to a callable and the callable returned something of the wrong type.

4) When a name resolves to a method on a TestCase subclass,
TestLoader.loadTestsFromName() did not return a suite as promised.

5) TestLoader.loadTestsFromName() would raise a ValueError (rather
than a TypeError) if a name resolved to an invalid object. This has
been fixed so that a TypeError is raised.

6) TestResult.shouldStop was being initialised to 0 in
TestResult.__init__.
Since this attribute is always used in a boolean context, it's better
to use the False spelling.

Thanks,
Collin Winter

From fdrake at acm.org Fri Sep 1 06:02:59 2006
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 1 Sep 2006 00:02:59 -0400
Subject: [Python-Dev] A test suite for unittest
In-Reply-To: <43aa6ff70608311952jd4cbb8ena17594458d480e8e@mail.gmail.com>
References: <43aa6ff70608311952jd4cbb8ena17594458d480e8e@mail.gmail.com>
Message-ID: <200609010003.00418.fdrake@acm.org>

On Thursday 31 August 2006 22:52, Collin Winter wrote:
> I've just uploaded a trio of unittest-related patches:

Thanks, Collin!

> #1550272 (http://python.org/sf/1550272) is a test suite for the
> mission-critical parts of unittest.
>
> #1550273 (http://python.org/sf/1550273) fixes 6 issues uncovered while
> writing the test suite. Several other items that I raised earlier
> (http://mail.python.org/pipermail/python-dev/2006-August/068378.html)
> were judged to be either non-issues or behaviours that, while
> suboptimal, people have come to rely on.

I'm hesitant to commit even tests at this point (the release candidate
has already been released, and there's no plan for a second). I've not
reviewed the patches.

> #1550263 (http://python.org/sf/1550263) follows up on an earlier patch
> I submitted for unittest's docs. This new patch corrects and clarifies
> numerous sections of the module's documentation.

Anthony did approve documentation changes for 2.5, so I've committed
this for 2.5 and on the trunk (2.6).

These should be considered for 2.4.4 as well. (The other two may be
appropriate as well.)

-Fred

-- 
Fred L. Drake, Jr.
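
As a rough illustration of the behaviours listed in patch #1550273, the
sketch below exercises them against a current Python 3 unittest rather
than the 2.5-era module the patches targeted; that substitution is an
assumption. The names DemoCase, demo, make_case, make_junk and
not_a_test are invented for the example and do not come from the
patches.

```python
import types
import unittest


class DemoCase(unittest.TestCase):
    def test_ok(self):
        self.assertTrue(True)


# Hypothetical throwaway module object, so names can be resolved
# without importing anything from disk.
demo = types.ModuleType("demo")
demo.DemoCase = DemoCase
demo.make_case = lambda: DemoCase("test_ok")  # callable returning a TestCase
demo.make_junk = lambda: 42                   # callable returning the wrong type
demo.not_a_test = 42                          # not a test-like object at all

loader = unittest.TestLoader()

# Issues 1 and 4: both lookups should come back wrapped in a suite.
from_callable = loader.loadTestsFromName("make_case", demo)
from_method = loader.loadTestsFromName("DemoCase.test_ok", demo)


def error_type(func, *args):
    """Return the class of the exception raised by func(*args), or None."""
    try:
        func(*args)
    except Exception as exc:
        return type(exc)
    return None


# Issue 2: addTest() checks its input.
bad_add = error_type(unittest.TestSuite().addTest, 42)
# Issues 3 and 5: TypeError rather than ValueError.
wrong_return = error_type(loader.loadTestsFromName, "make_junk", demo)
invalid_object = error_type(loader.loadTestsFromName, "not_a_test", demo)
# Issue 6: a real boolean rather than 0.
should_stop = unittest.TestResult().shouldStop
```

Passing a module object as the second argument keeps the example
independent of how the file itself is imported or executed.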
From anthony at interlink.com.au Fri Sep 1 06:35:19 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Fri, 1 Sep 2006 14:35:19 +1000
Subject: [Python-Dev] A test suite for unittest
In-Reply-To: <43aa6ff70608311952jd4cbb8ena17594458d480e8e@mail.gmail.com>
References: <43aa6ff70608311952jd4cbb8ena17594458d480e8e@mail.gmail.com>
Message-ID: <200609011435.21060.anthony@interlink.com.au>

At this point, I'd say the documentation patches should go in - the
other patches are probably appropriate for 2.5.1. I only want to accept
critical patches between now and 2.5 final.

Thanks for the patches (and particularly for the unittest! woooooo!)

Anthony

From fredrik at pythonware.com Fri Sep 1 10:08:18 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 1 Sep 2006 10:08:18 +0200
Subject: [Python-Dev] That library reference, yet again
References: <8233478f0608311255o7058a1feo55c710e7eb8e6b6c@mail.gmail.com>
Message-ID:

"Johann C. Rocholl" wrote:

> What is the status of http://effbot.org/lib/ ?
>
> I think it's a step in the right direction. Is it still in progress?

the pushback from the powers-that-be was massive, so we're currently
working "under the radar", using alternative deployment approaches
(see pytut.infogami.com and friends).

From jimjjewett at gmail.com Fri Sep 1 15:31:36 2006
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 1 Sep 2006 09:31:36 -0400
Subject: [Python-Dev] Fwd: [Python-checkins] r51674 - python/trunk/Misc/Vim/vimrc
In-Reply-To: <20060831224237.B872C1E4002@bag.python.org>
References: <20060831224237.B872C1E4002@bag.python.org>
Message-ID:

This 8 vs 4 is getting cruftier and cruftier. (And does it deal
properly with existing code that already has four spaces because it
was written recently?)

"Tim" regularly fixes whitespace already, with little damage.

Would it make sense to do a one-time cutover on the 2.6 trunk?
How about the bugfix branches?

If it is ever going to happen, then immediately after a release,
before unfreezing, is probably the best time.

-jJ

---------- Forwarded message ----------
From: brett.cannon
Date: Aug 31, 2006 6:42 PM
Subject: [Python-checkins] r51674 - python/trunk/Misc/Vim/vimrc
To: python-checkins at python.org

Author: brett.cannon
Date: Fri Sep 1 00:42:37 2006
New Revision: 51674

Modified:
   python/trunk/Misc/Vim/vimrc
Log:
Have pre-existing C files use 8 spaces indents (to match old PEP 7 style), but
have all new files use 4 spaces (to match current PEP 7 style).

Modified: python/trunk/Misc/Vim/vimrc
==============================================================================
--- python/trunk/Misc/Vim/vimrc (original)
+++ python/trunk/Misc/Vim/vimrc Fri Sep 1 00:42:37 2006
@@ -19,9 +19,10 @@
 " Number of spaces to use for an indent.
 " This will affect Ctrl-T and 'autoindent'.
 " Python: 4 spaces
-" C: 4 spaces
+" C: 8 spaces (pre-existing files) or 4 spaces (new files)
 au BufRead,BufNewFile *.py,*pyw set shiftwidth=4
-au BufRead,BufNewFile *.c,*.h set shiftwidth=4
+au BufRead *.c,*.h set shiftwidth=8
+au BufNewFile *.c,*.h set shiftwidth=4
 
 " Number of spaces that a pre-existing tab is equal to.
 " For the amount of space used for a new tab use shiftwidth.
_______________________________________________
Python-checkins mailing list
Python-checkins at python.org
http://mail.python.org/mailman/listinfo/python-checkins

From guido at python.org Fri Sep 1 17:02:37 2006
From: guido at python.org (Guido van Rossum)
Date: Fri, 1 Sep 2006 08:02:37 -0700
Subject: [Python-Dev] Fwd: [Python-checkins] r51674 - python/trunk/Misc/Vim/vimrc
In-Reply-To:
References: <20060831224237.B872C1E4002@bag.python.org>
Message-ID:

For 2.x we really don't want to reformat all code. I even think it's
questionable to use 4 spaces for new files since it will mean problems
for editors switching between files. For 3.0 we really do.
But as long as 2.x and 3.0 aren't too far apart I'd rather not reformat
everything because it would break all merge capabilities.

--Guido

On 9/1/06, Jim Jewett wrote:
> This 8 vs 4 is getting cruftier and cruftier. (And does it deal
> properly with existing code that already has four spaces because it
> was written recently?)
>
> "Tim" regularly fixes whitespace already, with little damage.
>
> Would it make sense to do a one-time cutover on the 2.6 trunk?
> How about the bugfix branches?
>
> If it is ever going to happen, then immediately after a release,
> before unfreezing, is probably the best time.
>
> -jJ
>
> ---------- Forwarded message ----------
> From: brett.cannon
> Date: Aug 31, 2006 6:42 PM
> Subject: [Python-checkins] r51674 - python/trunk/Misc/Vim/vimrc
> To: python-checkins at python.org
>
>
> Author: brett.cannon
> Date: Fri Sep 1 00:42:37 2006
> New Revision: 51674
>
> Modified:
>    python/trunk/Misc/Vim/vimrc
> Log:
> Have pre-existing C files use 8 spaces indents (to match old PEP 7 style), but
> have all new files use 4 spaces (to match current PEP 7 style).
>
>
> Modified: python/trunk/Misc/Vim/vimrc
> ==============================================================================
> --- python/trunk/Misc/Vim/vimrc (original)
> +++ python/trunk/Misc/Vim/vimrc Fri Sep 1 00:42:37 2006
> @@ -19,9 +19,10 @@
>  " Number of spaces to use for an indent.
>  " This will affect Ctrl-T and 'autoindent'.
>  " Python: 4 spaces
> -" C: 4 spaces
> +" C: 8 spaces (pre-existing files) or 4 spaces (new files)
>  au BufRead,BufNewFile *.py,*pyw set shiftwidth=4
> -au BufRead,BufNewFile *.c,*.h set shiftwidth=4
> +au BufRead *.c,*.h set shiftwidth=8
> +au BufNewFile *.c,*.h set shiftwidth=4
>
>  " Number of spaces that a pre-existing tab is equal to.
>  " For the amount of space used for a new tab use shiftwidth.
> _______________________________________________
> Python-checkins mailing list
> Python-checkins at python.org
> http://mail.python.org/mailman/listinfo/python-checkins
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org
>

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From rhettinger at ewtllc.com Fri Sep 1 19:56:17 2006
From: rhettinger at ewtllc.com (Raymond Hettinger)
Date: Fri, 01 Sep 2006 10:56:17 -0700
Subject: [Python-Dev] Py2.5 issue: decimal context manager misimplemented, misdesigned, and misdocumented
In-Reply-To: <44F6D12C.4040808@gmail.com>
References: <44F4D9D2.2040804@ewtllc.com> <44F6B524.6060504@gmail.com>
 <44F6D12C.4040808@gmail.com>
Message-ID: <44F87441.7060203@ewtllc.com>

>>> The right way to do it was presented in PEP343. The implementation
>>> was correct and the API was simple.
>>
>> Raymond's persuaded me that he's right on the API part at the very
>> least. The current API was a mechanical replacement of the initial
>> __context__ based API with a normal method, whereas I should have
>> reverted back to the module-level localcontext() function from PEP343
>> and thrown the method on Context objects away entirely.
>>
>> I can fix it on the trunk (and add those missing tests!), but I'll
>> need Anthony and/or Neal's permission to backport it and remove the
>> get_manager() method from Python 2.5 before we get stuck with it
>> forever.
>
> I committed this fix as 51664 on the trunk (although the docstrings
> are still example free because doctest doesn't understand __future__
> statements).
>

Thanks for getting this done.
Please make the following changes:

* rename ContextManager to _ContextManager and remove it from the
  __all__ listing
* move the copy() step from localcontext() to _ContextManager()
* make the trivial updates to the whatsnew25 example

Once those nits are fixed, I recommend this patch be backported to the
Py2.5 release.

Raymond

From rhettinger at ewtllc.com Sat Sep 2 01:47:21 2006
From: rhettinger at ewtllc.com (Raymond Hettinger)
Date: Fri, 01 Sep 2006 16:47:21 -0700
Subject: [Python-Dev] Problem withthe API for str.rpartition()
Message-ID: <44F8C689.6050804@ewtllc.com>

Currently, both the partition() and rpartition() methods return a (head,
sep, tail) tuple and the only difference between the two is whether the
partition element search starts from the beginning or end of the
string. When no separator is found, both methods return the string S
and two empty strings so that 'a'.partition('x') == 'a'.rpartition('x')
== ('a', '', '').

For rpartition() the notion of head and tail are backwards -- you
repeatedly search the tail, not the head. The distinction is vital
because the use cases for rpartition() are a mirror image of those for
partition(). Accordingly, rpartition()'s result should be interpreted
as (tail, sep, head) and the partition-not-found endcase needs to change
so that 'a'.rpartition('x') == ('', '', 'a').

The test invariant should be:
    For every s and p: s.partition(p) == s[::-1].rpartition(p)[::-1]

The following code demonstrates why the current choice is problematic:

    line = 'a.b.c.d'
    while line:
        field, sep, line = line.partition('.')
        print field

    line = 'a.b.c.d'
    while line:
        line, sep, field = line.rpartition('.')
        print field

The second fragment never terminates.

Since this is a critical API flaw rather than an implementation bug, I
think it should get fixed right away rather than waiting for Py2.5.1.
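
To make the termination argument concrete, here is a small sketch
written for Python 3 (an assumption; the fragments above use the
Python 2 print statement). It relies on the behaviour proposed above,
where a failed rpartition() yields ('', '', s), which is what current
Python releases do. The helper names fields_from_right and
mirrored_rpartition are invented for illustration. Read literally, the
invariant above only holds when every piece is a single character, so
the helper also reverses each piece and the separator; that is one
reading of the intent, not something stated in the message.

```python
def fields_from_right(line, sep="."):
    """Collect fields right to left; stops once rpartition() leaves ''."""
    fields = []
    while line:
        line, _, field = line.rpartition(sep)
        fields.append(field)
    return fields


def mirrored_rpartition(s, sep):
    """Compute s.partition(sep) by running rpartition() on the mirror image."""
    head, found, tail = s[::-1].rpartition(sep[::-1])
    # Undo the mirroring: swap the ends and reverse every piece.
    return tail[::-1], found[::-1], head[::-1]


right_to_left = fields_from_right("a.b.c.d")
no_separator = "a".rpartition("x")

samples = ["a.b.c.d", "a", "", "key::value::more", ".leading", "trailing."]
invariant_holds = all(
    s.partition(p) == mirrored_rpartition(s, p)
    for s in samples
    for p in (".", "::", "x")
)
```

With the not-found case returning the whole string in the last slot,
the loop variable becomes empty on the final pass, so the right-to-left
loop mirrors the left-to-right one.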
Raymond

From guido at python.org Sat Sep 2 02:04:12 2006
From: guido at python.org (Guido van Rossum)
Date: Fri, 1 Sep 2006 17:04:12 -0700
Subject: [Python-Dev] Problem withthe API for str.rpartition()
In-Reply-To: <44F8C689.6050804@ewtllc.com>
References: <44F8C689.6050804@ewtllc.com>
Message-ID:

+1

On 9/1/06, Raymond Hettinger wrote:
> Currently, both the partition() and rpartition() methods return a (head,
> sep, tail) tuple and the only difference between the two is whether the
> partition element search starts from the beginning or end of the
> string. When no separator is found, both methods return the string S
> and two empty strings so that 'a'.partition('x') == 'a'.rpartition('x')
> == ('a', '', '').
>
> For rpartition() the notion of head and tail are backwards -- you
> repeatedly search the tail, not the head. The distinction is vital
> because the use cases for rpartition() are a mirror image of those for
> partition(). Accordingly, rpartition()'s result should be interpreted
> as (tail, sep, head) and the partition-not-found endcase needs change so
> that 'a'.rpartition('x') == ('', '', 'a') .
>
> The test invariant should be:
>     For every s and p: s.partition(p) == s[::-1].rpartition(p)[::-1]
>
> The following code demonstrates why the current choice is problematic:
>
>     line = 'a.b.c.d'
>     while line:
>         field, sep, line = line.partition('.')
>         print field
>
>     line = 'a.b.c.d'
>     while line:
>         line, sep, field = line.rpartition('.')
>         print field
>
> The second fragment never terminates.
>
> Since this is a critical API flaw rather than a implementation bug, I
> think it should get fixed right away rather than waiting for Py2.5.1.
>
> Raymond
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org
>

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From kbk at shore.net Sat Sep 2 03:28:44 2006
From: kbk at shore.net (Kurt B. Kaiser)
Date: Fri, 1 Sep 2006 21:28:44 -0400 (EDT)
Subject: [Python-Dev] Weekly Python Patch/Bug Summary
Message-ID: <200609020128.k821SicT001270@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches :  412 open ( +5) /  3397 closed ( +4) /  3809 total ( +9)
Bugs    :  900 open (+12) /  6149 closed ( +4) /  7049 total (+16)
RFE     :  233 open ( +1) /   236 closed ( +0) /   469 total ( +1)

New / Reopened Patches
______________________

set literals (2006-08-28)
CLOSED http://python.org/sf/1547796 opened by Georg Brandl

"for x in setliteral" peepholer optimization (2006-08-28)
CLOSED http://python.org/sf/1548082 opened by Georg Brandl

set comprehensions (2006-08-29)
       http://python.org/sf/1548388 opened by Georg Brandl

Fix for structmember conversion issues (2006-08-29)
       http://python.org/sf/1549049 opened by Roger Upole

Implementation of PEP 3102 Keyword Only Argument (2006-08-31)
       http://python.org/sf/1549670 opened by Jiwon Seo

Add a test suite for test_unittest (2006-08-31)
       http://python.org/sf/1550272 opened by Collin Winter

Fix numerous bugs in unittest (2006-08-31)
       http://python.org/sf/1550273 opened by Collin Winter

Ellipsis literal "..." (2006-09-01)
       http://python.org/sf/1550786 opened by Georg Brandl

make exec a function (2006-09-01)
       http://python.org/sf/1550800 opened by Georg Brandl

Patches Closed
______________

Allow os.listdir to accept file names longer than MAX_PATH (2006-04-26)
       http://python.org/sf/1477350 closed by rupole

set literals (2006-08-28)
       http://python.org/sf/1547796 closed by gbrandl

pybench.py error reporting broken for bad -s filename (2006-08-25)
       http://python.org/sf/1546372 closed by lemburg

"if x in setliteral" peepholer optimization (2006-08-28)
       http://python.org/sf/1548082 closed by gvanrossum

New / Reopened Bugs
___________________

Typo in Language Reference Section 3.2 Class Instances (2006-08-28)
       http://python.org/sf/1547931 opened by whesse_at_clarkson

curses module segfaults on invalid tparm arguments (2006-08-28)
       http://python.org/sf/1548092 opened by Marien Zwart

Add 'find' method to sequence types (2006-08-28)
       http://python.org/sf/1548178 opened by kovan

Recursion limit exceeded in the match function (2006-08-29)
CLOSED http://python.org/sf/1548252 opened by wojtekwu

sgmllib.sgmlparser is not thread safe (2006-08-28)
       http://python.org/sf/1548288 opened by Andres Riancho

whichdb too dumb (2006-08-28)
       http://python.org/sf/1548332 opened by Curtis Doty

filterwarnings('error') has no effect (2006-08-29)
       http://python.org/sf/1548371 opened by Roger Upole

C modules reloaded on certain failed imports (2006-08-29)
       http://python.org/sf/1548687 opened by Josiah Carlson

shlex (or perhaps cStringIO) and unicode strings (2006-08-29)
       http://python.org/sf/1548891 opened by Erwin S. Andreasen

bug in classlevel variabels (2006-08-30)
CLOSED http://python.org/sf/1549499 opened by Thomas Dybdahl Ahle

Pdb parser bug (2006-08-30)
       http://python.org/sf/1549574 opened by Alexander Belopolsky

urlparse return exchanged values (2006-08-30)
CLOSED http://python.org/sf/1549589 opened by Oscar Acena

Enhance and correct unittest's docs (redux) (2006-08-31)
       http://python.org/sf/1550263 reopened by fdrake

Enhance and correct unittest's docs (redux) (2006-08-31)
       http://python.org/sf/1550263 opened by Collin Winter

inspect module and class startlineno (2006-09-01)
       http://python.org/sf/1550524 opened by Ali Gholami Rudi

SWIG wrappers incompatible with 2.5c1 (2006-09-01)
       http://python.org/sf/1550559 opened by Andrew Gregory

itertools.tee raises SystemError (2006-09-01)
       http://python.org/sf/1550714 opened by Alexander Belopolsky

itertools.tee raises SystemError (2006-09-01)
CLOSED http://python.org/sf/1550761 opened by Alexander Belopolsky

Bugs Closed
___________

x!=y and [x]=[y] (!) (2006-08-22)
       http://python.org/sf/1544762 closed by rhettinger

Recursion limit exceeded in the match function (2006-08-29)
       http://python.org/sf/1548252 closed by gbrandl

bug in classlevel variabels (2006-08-30)
       http://python.org/sf/1549499 closed by gbrandl

urlparse return exchanged values (2006-08-30)
       http://python.org/sf/1549589 closed by gbrandl

Enhance and correct unittest's docs (redux) (2006-08-31)
       http://python.org/sf/1550263 closed by fdrake

itertools.tee raises SystemError (2006-09-01)
       http://python.org/sf/1550761 deleted by belopolsky

From ncoghlan at gmail.com Sat Sep 2 06:47:30 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 02 Sep 2006 14:47:30 +1000
Subject: [Python-Dev] Py2.5 issue: decimal context manager misimplemented, misdesigned, and misdocumented
In-Reply-To: <44F715DC.1090001@ewtllc.com>
References: <44F4D9D2.2040804@ewtllc.com> <44F6B524.6060504@gmail.com>
 <44F715DC.1090001@ewtllc.com>
Message-ID: <44F90CE2.2050200@gmail.com>

Raymond Hettinger wrote:
>
> Please go ahead and get the patch together for localcontext(). This
> should be an easy sell:
>
> * simple bugs can be fixed in Py2.5.1 but API mistakes are forever.
> * currently, all of the docs, docstrings, and whatsnew are incorrect.
> * the solution has already been worked-out in PEP343 -- it's nothing new.
> * nothing else, anywhere depends on this code -- it is as safe a change
>   as we could hope for.
>
> Neal is tough, but he's not heartless ;-)

I backported the changes and assigned the patch to Neal:
http://www.python.org/sf/1550886

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org

From gjcarneiro at gmail.com Sat Sep 2 14:10:04 2006
From: gjcarneiro at gmail.com (Gustavo Carneiro)
Date: Sat, 2 Sep 2006 13:10:04 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
Message-ID:

We have to resort to timeouts in pygtk in order to catch unix signals
in threaded mode.

The reason is this. We call gtk_main() (mainloop function) which
blocks forever. Suppose there are threads in the program; then any
thread can receive a signal (e.g. SIGINT). Python catches the signal,
but doesn't do anything; it simply sets a flag in a global structure
and calls Py_AddPendingCall(), and I guess it expects someone to call
Py_MakePendingCalls(). However, the main thread is blocked calling a
C function and has no way of being notified it needs to give control
back to python to handle the signal. Hence, we use a 100ms timeout
for polling. Unfortunately, timeouts needlessly consume CPU time and
drain laptop batteries.

According to [1], all python needs to do to avoid this problem is
block all signals in all but the main thread; then we can guarantee
signal handlers are always called from the main thread, and pygtk
doesn't need a timeout.

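
The first alternative described above (blocking signals everywhere
except the main thread) can be sketched from Python code on a POSIX
system. This is only an illustration, not the pygtk workaround or any
interpreter patch: it assumes Python 3.3 or later, where
signal.pthread_sigmask() is available, and the names WATCHED,
spawn_with_signals_blocked and record_mask are made up for the example.
It relies on a new thread inheriting the signal mask of the thread that
creates it.

```python
import signal
import threading

# Hypothetical choice of signals to keep away from worker threads.
WATCHED = {signal.SIGINT, signal.SIGTERM}


def spawn_with_signals_blocked(target, *args):
    """Start a thread whose inherited mask has WATCHED blocked."""
    # Block in the creating thread so the child inherits the mask...
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, WATCHED)
    try:
        thread = threading.Thread(target=target, args=args)
        thread.start()
    finally:
        # ...then restore the creator's own mask straight away.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return thread


def record_mask(out):
    """Append the calling thread's current signal mask to out."""
    out.append(signal.pthread_sigmask(signal.SIG_BLOCK, []))


initial_mask = signal.pthread_sigmask(signal.SIG_BLOCK, [])
seen = []
worker = spawn_with_signals_blocked(record_mask, seen)
worker.join()
final_mask = signal.pthread_sigmask(signal.SIG_BLOCK, [])
```

Whether the kernel then always delivers process-directed signals to the
main thread is platform dependent, so this should be read as reducing
the problem rather than as a guaranteed fix.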
Another alternative would be to add a new API like
Py_AddPendingCallNotification, which would let python notify
extensions that new pending calls exist and need to be processed.

But I would really prefer the first alternative, as it could be
fixed within python 2.5; no need to wait for 2.6.

Please, let's make Python ready for the enterprise! [2]

[1] https://bugzilla.redhat.com/bugzilla/process_bug.cgi#c3
[2] http://perkypants.org/blog/2006/09/02/rfte-python/

From nmm1 at cus.cam.ac.uk Sat Sep 2 15:02:43 2006
From: nmm1 at cus.cam.ac.uk (Nick Maclaren)
Date: Sat, 02 Sep 2006 14:02:43 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: Your message of "Sat, 02 Sep 2006 13:10:04 BST."
Message-ID:

"Gustavo Carneiro" wrote:
>
> We have to resort to timeouts in pygtk in order to catch unix signals
> in threaded mode.

A common defect of modern designs - TCP/IP is particularly objectionable
in this respect, but that battle was lost and won over two decades ago :-(

> The reason is this. We call gtk_main() (mainloop function) which
> blocks forever. Suppose there are threads in the program; then any
> thread can receive a signal (e.g. SIGINT). Python catches the signal,
> but doesn't do anything; it simply sets a flag in a global structure
> and calls Py_AddPendingCall(), and I guess it expects someone to call
> Py_MakePendingCalls(). However, the main thread is blocked calling a
> C function and has no way of being notified it needs to give control
> back to python to handle the signal. Hence, we use a 100ms timeout
> for polling. Unfortunately, timeouts needlessly consume CPU time and
> drain laptop batteries.

Yup.

> According to [1], all python needs to do to avoid this problem is
> block all signals in all but the main thread; then we can guarantee
> signal handlers are always called from the main thread, and pygtk
> doesn't need a timeout.

1) That page is password protected, so I can't see what it says, and
am disinclined to register myself to yet another such site.

2) No way, Jose, anyway. The POSIX signal handling model was broken
beyond redemption, even before threading was added, and the combination
is evil almost beyond belief. That procedure is good practice, yes,
but that is NOT all that you have to do - it may be all that you CAN
do, but that is not the same.

Come back MVS (or even VMS) - all is forgiven! That is only partly
a joke.

> Another alternative would be to add a new API like
> Py_AddPendingCallNotification, which would let python notify
> extensions that new pending calls exist and need to be processed.

Nope. Sorry, but you can't solve a broken design by adding interfaces.

> But I would really prefer the first alternative, as it could be
> fixed within python 2.5; no need to wait for 2.6.

It clearly should be done, assuming that Python's model is that it
doesn't want to get involved with subthread signalling (and I really,
but REALLY, recommend not doing so). The best that can be done is to
say that all signal handling is the business of the main thread and
that, when the system bypasses that, all bets are off.

> Please, let's make Python ready for the enterprise! [2]

Given that no Unix variant or Microsoft system is, isn't that rather
an unreasonable demand?

I am probably one of the last half-dozen people still employed in a
technical capacity who has implemented run-time systems that supported
user-level signal handling with threads/asynchronicity and allowing
for signals received while in system calls.

It would be possible to modify/extend POSIX or Microsoft designs to
support this, but currently they don't make it possible. There is
NOTHING that Python can do but to minimise the chaos.

Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email: nmm1 at cam.ac.uk
Tel.: +44 1223 334761 Fax: +44 1223 334679

From jjl at pobox.com Sat Sep 2 17:01:52 2006
From: jjl at pobox.com (John J Lee)
Date: Sat, 2 Sep 2006 15:01:52 +0000 (UTC)
Subject: [Python-Dev] Py2.5 issue: decimal context manager misimplemented, misdesigned, and misdocumented
In-Reply-To: <44F6D12C.4040808@gmail.com>
References: <44F4D9D2.2040804@ewtllc.com> <44F6B524.6060504@gmail.com>
 <44F6D12C.4040808@gmail.com>
Message-ID:

On Thu, 31 Aug 2006, Nick Coghlan wrote:
[...]
> I committed this fix as 51664 on the trunk (although the docstrings are still
> example free because doctest doesn't understand __future__ statements).
[...]

Assuming doctest doesn't try to parse the Python code when SKIP is
specified, I guess this would solve that little problem:

http://docs.python.org/dev/lib/doctest-options.html

"""
SKIP

    When specified, do not run the example at all. This can be useful in
    contexts where doctest examples serve as both documentation and test
    cases, and an example should be included for documentation purposes,
    but should not be checked. E.g., the example's output might be random;
    or the example might depend on resources which would be unavailable to
    the test driver.

    The SKIP flag can also be used for temporarily "commenting out"
    examples.

...

Changed in version 2.5: Constant SKIP was added.
"""

John

From ncoghlan at gmail.com Sat Sep 2 17:27:03 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 03 Sep 2006 01:27:03 +1000
Subject: [Python-Dev] Py2.5 issue: decimal context manager misimplemented, misdesigned, and misdocumented
In-Reply-To:
References: <44F4D9D2.2040804@ewtllc.com> <44F6B524.6060504@gmail.com>
 <44F6D12C.4040808@gmail.com>
Message-ID: <44F9A2C7.5060803@gmail.com>

John J Lee wrote:
> On Thu, 31 Aug 2006, Nick Coghlan wrote:
> [...]
>> I committed this fix as 51664 on the trunk (although the docstrings are still
>> example free because doctest doesn't understand __future__ statements).
> [...]
>
> Assuming doctest doesn't try to parse the Python code when SKIP is
> specified, I guess this would solve that little problem:
>
> http://docs.python.org/dev/lib/doctest-options.html
>
> """
> SKIP

A quick experiment suggests that using SKIP will solve the problem -
fixing that can wait until 2.5.1 though. The localcontext() docstring
does actually contain an example - it just isn't in a form that doctest
will try to execute.

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org

From alan.mcintyre at gmail.com Sat Sep 2 18:31:54 2006
From: alan.mcintyre at gmail.com (Alan McIntyre)
Date: Sat, 02 Sep 2006 12:31:54 -0400
Subject: [Python-Dev] Windows build slave down until Tuesday-ish
Message-ID: <44F9B1FA.4010305@gmail.com>

The "x86 XP trunk" build slave will be down for a bit longer,
unfortunately. Tropical storm Ernesto got in the way of my DSL
installation - I don't have a new install date yet, but I'm assuming
it's going to be Tuesday or later.

Alan

From gjcarneiro at gmail.com Sat Sep 2 18:39:51 2006
From: gjcarneiro at gmail.com (Gustavo Carneiro)
Date: Sat, 2 Sep 2006 17:39:51 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To:
References:
Message-ID:

On 9/2/06, Nick Maclaren wrote:
> > According to [1], all python needs to do to avoid this problem is
> > block all signals in all but the main thread; then we can guarantee
> > signal handlers are always called from the main thread, and pygtk
> > doesn't need a timeout.
>
> 1) That page is password protected, so I can't see what it says, and
> am disinclined to register myself to yet another such site.

Oh, sorry, here's the comment:

(comment by Arjan van de Ven):
| afaik the kernel only sends signals to threads that don't have them blocked.
| If python doesn't want anyone but the main thread to get signals, it should just
| block signals on all but the main thread and then by nature, all signals will go
| to the main thread....

> 2) No way, Jose, anyway. The POSIX signal handling model was broken
> beyond redemption, even before threading was added, and the combination
> is evil almost beyond belief. That procedure is good practice, yes,
> but that is NOT all that you have to do - it may be all that you CAN
> do, but that is not the same.
>
> Nope. Sorry, but you can't solve a broken design by adding interfaces.

Well, Python has a broken design too; it postpones tasks and expects
to magically regain control in order to finish the job. That often
doesn't happen!

>
> > But I would really prefer the first alternative, as it could be
> > fixed within python 2.5; no need to wait for 2.6.
>
> It clearly should be done, assuming that Python's model is that it
> doesn't want to get involved with subthread signalling (and I really,
> but REALLY, recommend not doing so). The best that can be done is to
> say that all signal handling is the business of the main thread and
> that, when the system bypasses that, all bets are off.

Python is halfway there; it assumes signals are to be handled in the
main thread. However, it _catches_ them in any thread, sets a flag,
and just waits for the next opportunity when it runs again in the main
thread.
It is precisely this "split handling" of signals that is failing now.

Anyway, attached a patch that should fix the problem in posix
threads systems, in case anyone wants to review.

Cheers.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pythreads.diff
Type: text/x-patch
Size: 1030 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20060902/b91250df/attachment.bin

From raymond.hettinger at verizon.net Sat Sep 2 19:11:58 2006
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sat, 02 Sep 2006 10:11:58 -0700
Subject: [Python-Dev] Py2.5 issue: decimal context manager misimplemented, misdesigned, and misdocumented
References: <44F4D9D2.2040804@ewtllc.com> <44F6B524.6060504@gmail.com>
 <44F715DC.1090001@ewtllc.com> <44F90CE2.2050200@gmail.com>
Message-ID: <006901c6ceb2$e692f8d0$4c00000a@RaymondLaptop1>

[Neal]
> Please review the patch and make a comment. I did a diff between HEAD
> and 2.4 and am fine with this going in once you are happy.

I fixed a couple of documentation nits in rev 51688.
The patch is ready-to-go.
Nick, please go ahead and backport.

Raymond

From nmm1 at cus.cam.ac.uk Sat Sep 2 20:41:59 2006
From: nmm1 at cus.cam.ac.uk (Nick Maclaren)
Date: Sat, 02 Sep 2006 19:41:59 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: Your message of "Sat, 02 Sep 2006 17:39:51 BST."
Message-ID:

"Gustavo Carneiro" wrote:
>
> Oh, sorry, here's the comment:
>
> (comment by Arjan van de Ven):
> | afaik the kernel only sends signals to threads that don't have them blocked.
> | If python doesn't want anyone but the main thread to get signals, it should just
> | block signals on all but the main thread and then by nature, all signals will go
> | to the main thread....

Well, THAT'S wrong, I am afraid!
Things ain't that simple :-( Yes, POSIX implies that things work that way, but there are so many get-out clauses and problems with trying to implement that specification that such behaviour can't be relied on. > Well, Python has a broken design too; it postpones tasks and expects > to magically regain control in order to finish the job. That often > doesn't happen! Very true. And that is another problem with POSIX :-( > Python is halfway there; it assumes signals are to be handled in the > main thread. However, it _catches_ them in any thread, sets a flag, > and just waits for the next opportunity when it runs again in the main > thread. It is precisely this "split handling" of signals that is > failing now. I agree that is not how to do it, but that code should not be removed. Despite best attempts, there may well be circumstances under which signals are received in a subthread, despite all attempts of the program to ensure that the main thread gets them. > Anyway, attached a patch that should fix the problem in posix > threads systems, in case anyone wants to review. Not "fix" - "improve" :-) I haven't looked at it, but I agree that what you have said is the way to proceed. The best solution is to enable the main thread for all relevant signals, disable all subthreads, but to not rely on any of that working in all cases. It won't help with the problem where merely receiving a signal causes chaos, or where blocking them does so, but there is nothing that Python can do about that, in general. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. 
Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From anthony at interlink.com.au Sun Sep 3 05:58:40 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Sun, 3 Sep 2006 13:58:40 +1000 Subject: [Python-Dev] Py2.5 issue: decimal context manager misimplemented, misdesigned, and misdocumented In-Reply-To: <006901c6ceb2$e692f8d0$4c00000a@RaymondLaptop1> References: <44F4D9D2.2040804@ewtllc.com> <006901c6ceb2$e692f8d0$4c00000a@RaymondLaptop1> Message-ID: <200609031358.42774.anthony@interlink.com.au> On Sunday 03 September 2006 03:11, Raymond Hettinger wrote: > [Neal] > > > Please review the patch and make a comment. I did a diff between HEAD > > and 2.4 and am fine with this going in once you are happy. > > I fixed a couple of documentation nits in rev 51688. > The patch is ready-to-go. > Nick, please go ahead and backport. I think this is suitable for 2.5. I'm thinking, though, that we need a second release candidate, given the number of changes since rc1. -- Anthony Baxter It's never too late to have a happy childhood. From aahz at pythoncraft.com Sun Sep 3 06:06:27 2006 From: aahz at pythoncraft.com (Aahz) Date: Sat, 2 Sep 2006 21:06:27 -0700 Subject: [Python-Dev] Py2.5 issue: decimal context manager misimplemented, misdesigned, and misdocumented In-Reply-To: <200609031358.42774.anthony@interlink.com.au> References: <44F4D9D2.2040804@ewtllc.com> <006901c6ceb2$e692f8d0$4c00000a@RaymondLaptop1> <200609031358.42774.anthony@interlink.com.au> Message-ID: <20060903040627.GA21743@panix.com> On Sun, Sep 03, 2006, Anthony Baxter wrote: > > I think this is suitable for 2.5. I'm thinking, though, that we need > a second release candidate, given the number of changes since rc1. +1 -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ I support the RKAB From fdrake at acm.org Sun Sep 3 07:01:50 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) 
Date: Sun, 3 Sep 2006 01:01:50 -0400 Subject: [Python-Dev] Py2.5 issue: decimal context manager misimplemented, misdesigned, and misdocumented In-Reply-To: <200609031358.42774.anthony@interlink.com.au> References: <44F4D9D2.2040804@ewtllc.com> <006901c6ceb2$e692f8d0$4c00000a@RaymondLaptop1> <200609031358.42774.anthony@interlink.com.au> Message-ID: <200609030101.51129.fdrake@acm.org> On Saturday 02 September 2006 23:58, Anthony Baxter wrote: > I think this is suitable for 2.5. I'm thinking, though, that we need a > second release candidate, given the number of changes since rc1. +1 -Fred -- Fred L. Drake, Jr. From chrism at plope.com Mon Sep 4 04:36:23 2006 From: chrism at plope.com (Chris McDonough) Date: Sun, 3 Sep 2006 22:36:23 -0400 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: Message-ID: Would adding an API for sigprocmask help here? (Although it has been tried before -- http://mail.python.org/pipermail/python-dev/2003-February/033016.html and died in the womb due to threading-related issues -- http://mail.mems-exchange.org/durusmail/quixote-users/1248/) - C On Sep 2, 2006, at 8:10 AM, Gustavo Carneiro wrote: > We have to resort to timeouts in pygtk in order to catch unix signals > in threaded mode. > The reason is this. We call gtk_main() (mainloop function) which > blocks forever. Suppose there are threads in the program; then any > thread can receive a signal (e.g. SIGINT). Python catches the signal, > but doesn't do anything; it simply sets a flag in a global structure > and calls Py_AddPendingCall(), and I guess it expects someone to call > Py_MakePendingCalls(). However, the main thread is blocked calling a > C function and has no way of being notified it needs to give control > back to python to handle the signal. Hence, we use a 100ms timeout > for polling. Unfortunately, timeouts needlessly consume CPU time and > drain laptop batteries.
> > According to [1], all python needs to do to avoid this problem is > block all signals in all but the main thread; then we can guarantee > signal handlers are always called from the main thread, and pygtk > doesn't need a timeout. > > Another alternative would be to add a new API like > Py_AddPendingCallNotification, which would let python notify > extensions that new pending calls exist and need to be processed. > > But I would really prefer the first alternative, as it could be > fixed within python 2.5; no need to wait for 2.6. > > Please, let's make Python ready for the enterprise! [2] > > [1] https://bugzilla.redhat.com/bugzilla/process_bug.cgi#c3 > [2] http://perkypants.org/blog/2006/09/02/rfte-python/ > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/lists%40plope.com > From anthony at interlink.com.au Mon Sep 4 09:19:39 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Mon, 4 Sep 2006 17:19:39 +1000 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: Message-ID: <200609041719.41488.anthony@interlink.com.au> On Saturday 02 September 2006 22:10, Gustavo Carneiro wrote: > According to [1], all python needs to do to avoid this problem is > block all signals in all but the main thread; then we can guarantee > signal handlers are always called from the main thread, and pygtk > doesn't need a timeout. > But I would really prefer the first alternative, as it could be > fixed within python 2.5; no need to wait for 2.6. Assuming "the first alternative" is the "just block all signals in all but the main thread" option, there is absolutely no chance of this going into 2.5. Signals and threads combined are a complete *nightmare* of platform-specific behaviour. I'm -1000 on trying to change this code now, _after_ the first release candidate.
To say that "that path lies madness" is like saying "Pacific Ocean large, wet, full of fish". -- Anthony Baxter It's never too late to have a happy childhood. From rasky at develer.com Mon Sep 4 12:29:51 2006 From: rasky at develer.com (Giovanni Bajo) Date: Mon, 4 Sep 2006 12:29:51 +0200 Subject: [Python-Dev] Error while building 2.5rc1 pythoncore_pgo on VC8 References: <8dd9fd0608310336q45d2d3d3re203e871c7b384b8@mail.gmail.com> <8dd9fd0608310446o6008240x8bfa852b41595eab@mail.gmail.com> Message-ID: <01ca01c6d00d$0dd14a90$b803030a@trilan> Fredrik Lundh wrote: >> That error mentioned in that post was in "pythoncore" module. >> My error is while compiling "pythoncore_pgo" module. > > iirc, that's a partially experimental alternative build for playing > with performance guided optimizations. are you sure you need > that module ? Oh yes, it's a 30% improvement in pystone, for free. -- Giovanni Bajo From mwh at python.net Mon Sep 4 15:30:41 2006 From: mwh at python.net (Michael Hudson) Date: Mon, 04 Sep 2006 14:30:41 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: (Gustavo Carneiro's message of "Sat, 2 Sep 2006 13:10:04 +0100") References: Message-ID: <2mpseboj26.fsf@starship.python.net> "Gustavo Carneiro" writes: > According to [1], all python needs to do to avoid this problem is > block all signals in all but the main thread; Argh, no: then people who call system() from non-main threads end up running subprocesses with all signals masked, which breaks other things in very mysterious ways. Been there... No time to read the rest of the post, maybe in a few days... Cheers, mwh -- Arrrrgh, the braindamage! It's not unlike the massively non-brilliant decision to use the period in abbreviations as well as a sentence terminator. Had these people no imagination at _all_? 
-- Erik Naggum, comp.lang.lisp From gjcarneiro at gmail.com Mon Sep 4 15:48:54 2006 From: gjcarneiro at gmail.com (Gustavo Carneiro) Date: Mon, 4 Sep 2006 13:48:54 +0000 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: <2mpseboj26.fsf@starship.python.net> References: <2mpseboj26.fsf@starship.python.net> Message-ID: On 9/4/06, Michael Hudson wrote: > "Gustavo Carneiro" writes: > > > According to [1], all python needs to do to avoid this problem is > > block all signals in all but the main thread; > > Argh, no: then people who call system() from non-main threads end up > running subprocesses with all signals masked, which breaks other > things in very mysterious ways. Been there... That's a very good point; I wasn't aware that child processes inherited the signals mask from their parent processes. > No time to read the rest of the post, maybe in a few days... Don't worry. From the feedback received so far it seems that any proposed solution has to wait for Python 2.6 :-( I am now thinking of something along these lines:
    typedef void (*PyPendingCallNotify)(void *user_data);
    PyAPI_FUNC(void) Py_AddPendingCallNotify(PyPendingCallNotify callback, void *user_data);
    PyAPI_FUNC(void) Py_RemovePendingCallNotify(PyPendingCallNotify callback, void *user_data);
Regards. From nmm1 at cus.cam.ac.uk Mon Sep 4 16:05:56 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Mon, 04 Sep 2006 15:05:56 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions Message-ID: "Gustavo Carneiro" wrote: > > That's a very good point; I wasn't aware that child processes > inherited the signals mask from their parent processes. That's one of the few places where POSIX does describe what happens. Well, usually. You really don't want to know what happens when you call something revolting, like csh or a setuid program.
This particular mess is why I had to write my own nohup - the new POSIX interfaces broke the existing one, and it remains broken today on almost all systems. > I am now thinking of something along these lines: > typedef void (*PyPendingCallNotify)(void *user_data); > PyAPI_FUNC(void) Py_AddPendingCallNotify(PyPendingCallNotify callback, > void *user_data); > PyAPI_FUNC(void) Py_RemovePendingCallNotify(PyPendingCallNotify > callback, void *user_data); Why would that help? The problems are semantic, not syntactic. Anthony Baxter isn't exaggerating the problem, despite what you may think from his posting. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From nmm1 at cus.cam.ac.uk Mon Sep 4 16:07:17 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Mon, 04 Sep 2006 15:07:17 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions Message-ID: Chris McDonough wrote: > > Would adding an API for sigprocmask help here? No. sigprocmask is a large part of the problem. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From anthony at interlink.com.au Mon Sep 4 16:22:22 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Tue, 5 Sep 2006 00:22:22 +1000 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: Message-ID: <200609050022.23944.anthony@interlink.com.au> On Tuesday 05 September 2006 00:05, Nick Maclaren wrote: > Anthony Baxter isn't exaggerating the problem, despite what you may > think from his posting. If the SF bugtracker had a better search interface, you could see why I have such a bleak view of this area of Python. 
What's there now *mostly* works (I exclude freakshows like certain versions of HP/UX, AIX, SCO and the like). It took a hell of a lot of effort to get it to this point. threads + signals == tears. Anthony From gjcarneiro at gmail.com Mon Sep 4 16:52:36 2006 From: gjcarneiro at gmail.com (Gustavo Carneiro) Date: Mon, 4 Sep 2006 14:52:36 +0000 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: Message-ID: On 9/4/06, Nick Maclaren wrote: > "Gustavo Carneiro" wrote: > > I am now thinking of something along these lines: > > typedef void (*PyPendingCallNotify)(void *user_data); > > PyAPI_FUNC(void) Py_AddPendingCallNotify(PyPendingCallNotify callback, > > void *user_data); > > PyAPI_FUNC(void) Py_RemovePendingCallNotify(PyPendingCallNotify > > callback, void *user_data); > > Why would that help? The problems are semantic, not syntactic. > > Anthony Baxter isn't exaggerating the problem, despite what you may > think from his posting. You guys are tough customers to please. I am just trying to solve a problem here, not create a new one; you have to believe me. OK, let's review what we know about current python, signals, and threads: 1. Python launches threads without touching sigprocmask; 2. Python installs signal handlers for all signals; 3. Signals can be delivered to any thread, let's assume (because of point #1 and not others not mentioned) that we have no control over which threads receive which signals, might as well be random for all we know; 4. Python signal handlers do almost nothing: just sets a flag, and calls Py_AddPendingCall, to postpone the job of handling a signal until a "safer" time. 5. The function Py_MakePendingCalls() should eventually get called at a "safer" time by user or python code. 6. It follows that until Py_MakePendingCalls() is called, the signal will not be handled at all! Now, back to explaining the problem. 1. 
In PyGTK we have a gobject.MainLoop.run() method, which blocks essentially forever in a poll() system call, and only wakes if/when it has to process a timeout or IO event; 2. When we only have one thread, we can guarantee that e.g. SIGINT will always be caught by the thread running the g_main_loop_run(), so we know poll() will be interrupted and an EINTR will be generated, giving us control temporarily back to check for python signals; 3. When we have multiple threads, we cannot make this assumption, so instead we install a timeout to periodically check for signals. We want to get rid of timeouts. Now my idea: add a Python API to say: "dear Python, please call me when you start having pending calls, even if from a signal handler context, ok?" From that point on, signals will get handled by Python, python calls PyGTK, PyGTK calls a special API to safely wake up the main loop even from a thread or signal handler, then main loop checks for signal by calling PyErr_CheckSignals(), it is handled by Python, and the process lives happily ever after, or dies trying. I sincerely hope my explanation was satisfactory this time. Best regards. PS: there's a "funny" comment in Py_AddPendingCall that suggests it is not very safe against reentrancy problems: /* XXX Begin critical section */ /* XXX If you want this to be safe against nested XXX asynchronous calls, you'll have to work harder! */ Are signal handlers guaranteed to not be interrupted by another signal, at least? What about threads? From anthony at interlink.com.au Mon Sep 4 17:30:11 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Tue, 5 Sep 2006 01:30:11 +1000 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: Message-ID: <200609050130.13189.anthony@interlink.com.au> On Tuesday 05 September 2006 00:52, Gustavo Carneiro wrote: > 3.
Signals can be delivered to any thread, let's assume (because > of point #1 and not others not mentioned) that we have no control over > which threads receive which signals, might as well be random for all > we know; Note that some Unix variants only deliver signals to the main thread (or so the manpages allege, anyway). Anthony From exarkun at divmod.com Mon Sep 4 17:56:00 2006 From: exarkun at divmod.com (Jean-Paul Calderone) Date: Mon, 4 Sep 2006 11:56:00 -0400 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: Message-ID: <20060904155600.1717.605687145.divmod.quotient.38950@ohm> On Mon, 04 Sep 2006 15:05:56 +0100, Nick Maclaren wrote: >"Gustavo Carneiro" wrote: >> >> That's a very good point; I wasn't aware that child processes >> inherited the signals mask from their parent processes. > >That's one of the few places where POSIX does describe what happens. >Well, usually. You really don't want to know what happens when you >call something revolting, like csh or a setuid program. This >particular mess is why I had to write my own nohup - the new POSIX >interfaces broke the existing one, and it remains broken today on >almost all systems. > >> I am now thinking of something along these lines: >> typedef void (*PyPendingCallNotify)(void *user_data); >> PyAPI_FUNC(void) Py_AddPendingCallNotify(PyPendingCallNotify callback, >> void *user_data); >> PyAPI_FUNC(void) Py_RemovePendingCallNotify(PyPendingCallNotify >> callback, void *user_data); > >Why would that help? The problems are semantic, not syntactic. > >Anthony Baxter isn't exaggerating the problem, despite what you may >think from his posting. > Dealing with threads and signals is certainly hairy. However, that barely has anything to do with what Gustavo is talking about. By the time Gustavo's proposed API springs into action, the threads already exist and the signal is already being handled by one. So, let's forget about threads and signals for a moment. 
The problem to be solved is that one piece of code wants to communicate a piece of information to another piece of code. The first piece of code is in Python itself. The second piece of code could be from any third-party library, and Python has no way of knowing about it - now. Gustavo is suggesting adding a registration API so that these third-party libraries can tell Python that they exist and are interested in this piece of information. Simple, no? PyGTK would presumably implement its pending call callback by writing a byte to a pipe which it is also passing to poll(). This lets them handle signals in a very timely manner without constantly waking up from poll() to see if Python wants to do any work. This is far from a new idea - it's basically the bog standard way of handling this situation. It strikes me as a very useful API to add to Python (although at this point in the 2.5 release process, not to 2.5, sorry Gustavo). Jean-Paul From david.nospam.hopwood at blueyonder.co.uk Mon Sep 4 18:19:27 2006 From: david.nospam.hopwood at blueyonder.co.uk (David Hopwood) Date: Mon, 04 Sep 2006 17:19:27 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: Message-ID: <44FC520F.3070307@blueyonder.co.uk> Gustavo Carneiro wrote: > OK, let's review what we know about current python, signals, and threads: > > 1. Python launches threads without touching sigprocmask; > 2. Python installs signal handlers for all signals; > 3. Signals can be delivered to any thread, let's assume (because > of point #1 and not others not mentioned) that we have no control over > which threads receive which signals, might as well be random for all > we know; > 4. Python signal handlers do almost nothing: just sets a flag, > and calls Py_AddPendingCall, to postpone the job of handling a signal > until a "safer" time. > 5. The function Py_MakePendingCalls() should eventually get > called at a "safer" time by user or python code. > 6. 
It follows that until Py_MakePendingCalls() is called, the > signal will not be handled at all! > > Now, back to explaining the problem. > > 1. In PyGTK we have a gobject.MainLoop.run() method, which blocks > essentially forever in a poll() system call, and only wakes if/when it > has to process a timeout or IO event; > 2. When we only have one thread, we can guarantee that e.g. > SIGINT will always be caught by the thread running the > g_main_loop_run(), so we know poll() will be interrupted and an EINTR > will be generated, giving us control temporarily back to check for > python signals; > 3. When we have multiple threads, we cannot make this assumption, > so instead we install a timeout to periodically check for signals. > > We want to get rid of timeouts. Now my idea: add a Python API to say: > "dear Python, please call me when you start having pending calls, > even if from a signal handler context, ok?" What can be safely done from a signal handler context is *very* limited. Calling back arbitrary Python code is certainly not safe. Reliable asynchronous interruption of arbitrary code is a difficult problem, but POSIX and POSIX implementations botch it particularly badly. I don't know how to implement what you want here, but I'd endorse the comments of Nick Maclaren and Anthony Baxter against making precipitate changes. -- David Hopwood From nmm1 at cus.cam.ac.uk Mon Sep 4 18:24:27 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Mon, 04 Sep 2006 17:24:27 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: Your message of "Mon, 04 Sep 2006 14:52:36 -0000." Message-ID: "Gustavo Carneiro" wrote: > > You guys are tough customers to please. I am just trying to solve a > problem here, not create a new one; you have to believe me. Oh, I believe you. Look at it this way. You are trying to resolve the problem that your farm is littered with cluster bombs, and your cows keep blowing their legs off.
Your solution is effectively saying "well, let's travel around and pick them all up then". > We want to get rid of timeouts. Now my idea: add a Python API to say: > "dear Python, please call me when you start having pending calls, > even if from a signal handler context, ok?" Yes, I know. I have been there and done that, both academically and (observing, as a consultant) to the vendor. And that was on a system that was a damn sight better engineered than any of the main ones that Python runs on today. I have attempted to do much EASIER tasks under both Unix and (earlier) versions of Microsoft Windows, and failed dismally because the system wasn't up to it. > From that point on, signals will get handled by Python, python calls > PyGTK, PyGTK calls a special API to safely wake up the main loop even > from a thread or signal handler, then main loop checks for signal by > calling PyErr_CheckSignals(), it is handled by Python, and the process > lives happily ever after, or die trying. The first thing that will happen to that beautiful theory when it goes out into Unix County or Microsoft City is that a gang of ugly facts will find it and beat it into a pulp. > I sincerely hope my explanation was satisfactory this time. Oh, it was last time. It isn't that that is the problem. > Are signal handlers guaranteed to not be interrupted by another > signal, at least? What about threads? No and no. In theory, what POSIX says about blocking threads should be reliable; in my experience, it almost is, except under precisely the circumstances that you most want it to work. Look, I am agreeing that your basic design is right. What I am saying is that (a) you cannot make delivery reliable and abolish timeouts and (b) that it is such a revoltingly system-dependent mess that I would much rather Python didn't fiddle with it. Do you know how signalling is misimplemented at the hardware level? 
And that it is possible for a handler to be called with any of its critical pointers (INCLUDING the global code and data pointers) in undefined states? Do you know how to program round that sort of thing? I can answer "yes" to all three - for my sins, which must be many and grievous, for that to be the case :-( Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From david.nospam.hopwood at blueyonder.co.uk Mon Sep 4 18:24:56 2006 From: david.nospam.hopwood at blueyonder.co.uk (David Hopwood) Date: Mon, 04 Sep 2006 17:24:56 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: <20060904155600.1717.605687145.divmod.quotient.38950@ohm> References: <20060904155600.1717.605687145.divmod.quotient.38950@ohm> Message-ID: <44FC5358.70806@blueyonder.co.uk> Jean-Paul Calderone wrote: > PyGTK would presumably implement its pending call callback by writing a > byte to a pipe which it is also passing to poll(). But doing that in a signal handler context invokes undefined behaviour according to POSIX. -- David Hopwood From exarkun at divmod.com Mon Sep 4 18:46:22 2006 From: exarkun at divmod.com (Jean-Paul Calderone) Date: Mon, 4 Sep 2006 12:46:22 -0400 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: <44FC5358.70806@blueyonder.co.uk> Message-ID: <20060904164622.1717.895455315.divmod.quotient.38999@ohm> On Mon, 04 Sep 2006 17:24:56 +0100, David Hopwood wrote: >Jean-Paul Calderone wrote: >> PyGTK would presumably implement its pending call callback by writing a >> byte to a pipe which it is also passing to poll(). > >But doing that in a signal handler context invokes undefined behaviour >according to POSIX. write(2) is explicitly listed as async-signal safe in IEEE Std 1003.1, 2004. Was this changed in a later edition? Otherwise, I don't understand what you mean by this. 
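For reference, the pipe-based wakeup described above can be sketched in a few lines. This is a minimal sketch only, not PyGTK's or GLib's implementation, and it assumes a current Python 3 on a POSIX system rather than the 2.5 under discussion: `signal.set_wakeup_fd()` did not exist in 2.5. The helper names `install_wakeup_pipe` and `wait_for_signal` are hypothetical.

```python
import os
import select
import signal


def install_wakeup_pipe(signum):
    """Create a pipe that the C-level signal handler writes to.

    Hypothetical helper for illustration; call it from the main thread.
    """
    read_fd, write_fd = os.pipe()
    # Both ends are non-blocking so neither the handler nor the reader stalls.
    os.set_blocking(read_fd, False)
    os.set_blocking(write_fd, False)
    # Install a do-nothing Python handler; without one the default action
    # for the signal (often process termination) would still apply.
    signal.signal(signum, lambda received, frame: None)
    # Ask the interpreter's C handler to write the signal number to the pipe.
    signal.set_wakeup_fd(write_fd)
    return read_fd, write_fd


def wait_for_signal(read_fd, timeout):
    """Block in select() until a signal byte arrives or the timeout expires.

    Returns the set of signal numbers read from the pipe (possibly empty).
    """
    ready, _, _ = select.select([read_fd], [], [], timeout)
    if not ready:
        return set()
    return set(os.read(read_fd, 4096))
```

A main loop would add `read_fd` to the descriptors it already polls, so it needs no periodic timeout. Whether the write in the handler is reliable on every platform is exactly the point in dispute in this thread.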
Jean-Paul From nmm1 at cus.cam.ac.uk Mon Sep 4 19:18:41 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Mon, 04 Sep 2006 18:18:41 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions Message-ID: Jean-Paul Calderone wrote: > On Mon, 04 Sep 2006 17:24:56 +0100, David Hopwood wrote: > >Jean-Paul Calderone wrote: > >> PyGTK would presumably implement its pending call callback by writing a > >> byte to a pipe which it is also passing to poll(). > > > >But doing that in a signal handler context invokes undefined behaviour > >according to POSIX. > > write(2) is explicitly listed as async-signal safe in IEEE Std 1003.1, 2004. > Was this changed in a later edition? Otherwise, I don't understand what you > mean by this. Try looking at the C90 or C99 standard, for a start :-( NOTHING may safely be done in a real signal handler, except possibly setting a value of type static volatile sig_atomic_t. And even that can be problematic. And note that POSIX defers to C on what the C language defines. So, even if the function is async-signal-safe, the code that calls it can't be! POSIX's lists are complete fantasy, anyway. Look at the one that defines thread-safety, and then try to get your mind around what exit being thread-safe actually implies (especially with regard to atexit functions). Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From david.nospam.hopwood at blueyonder.co.uk Mon Sep 4 19:24:38 2006 From: david.nospam.hopwood at blueyonder.co.uk (David Hopwood) Date: Mon, 04 Sep 2006 18:24:38 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: <20060904164622.1717.895455315.divmod.quotient.38999@ohm> References: <20060904164622.1717.895455315.divmod.quotient.38999@ohm> Message-ID: <44FC6156.3000708@blueyonder.co.uk> Jean-Paul Calderone wrote: > On Mon, 04 Sep 2006 17:24:56 +0100, David Hopwood wrote: > >>Jean-Paul Calderone wrote: >> >>>PyGTK would presumably implement its pending call callback by writing a >>>byte to a pipe which it is also passing to poll(). >> >>But doing that in a signal handler context invokes undefined behaviour >>according to POSIX. > > write(2) is explicitly listed as async-signal safe in IEEE Std 1003.1, 2004. I stand corrected. I must have misremembered this. -- David Hopwood From exarkun at divmod.com Mon Sep 4 19:55:41 2006 From: exarkun at divmod.com (Jean-Paul Calderone) Date: Mon, 4 Sep 2006 13:55:41 -0400 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: Message-ID: <20060904175541.1717.1728502156.divmod.quotient.39053@ohm> On Mon, 04 Sep 2006 18:18:41 +0100, Nick Maclaren wrote: >Jean-Paul Calderone wrote: >> On Mon, 04 Sep 2006 17:24:56 +0100, David Hopwood wrote: >> >Jean-Paul Calderone wrote: >> >> PyGTK would presumably implement its pending call callback by writing a >> >> byte to a pipe which it is also passing to poll(). >> > >> >But doing that in a signal handler context invokes undefined behaviour >> >according to POSIX. >> >> write(2) is explicitly listed as async-signal safe in IEEE Std 1003.1, 2004. >> Was this changed in a later edition? Otherwise, I don't understand what you >> mean by this.
> >Try looking at the C90 or C99 standard, for a start :-( > >NOTHING may safely be done in a real signal handler, except possibly >setting a value of type static volatile sig_atomic_t. And even that >can be problematic. And note that POSIX defers to C on what the C >language defines. So, even if the function is async-signal-safe, >the code that calls it can't be! > >POSIX's lists are complete fantasy, anyway. Look at the one that >defines thread-safety, and then try to get your mind around what >exit being thread-safe actually implies (especially with regard to >atexit functions). > Thanks for expounding. Given that it is basically impossible to do anything useful in a signal handler according to the relevant standards (does Python's current signal handler even avoid relying on undefined behavior?), how would you suggest addressing this issue? It seems to me that it is actually possible to do useful things in a signal handler, so long as one accepts that doing so is relying on platform specific behavior. How hard would it be to implement this for the platforms Python supports, rather than for a hypothetical standards-exact platform? Jean-Paul From nmm1 at cus.cam.ac.uk Mon Sep 4 20:44:30 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Mon, 04 Sep 2006 19:44:30 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: Your message of "Mon, 04 Sep 2006 13:55:41 EDT." <20060904175541.1717.1728502156.divmod.quotient.39053@ohm> Message-ID: Jean-Paul Calderone wrote: > > Thanks for expounding. Given that it is basically impossible to do > anything useful in a signal handler according to the relevant standards > (does Python's current signal handler even avoid relying on undefined > behavior?), how would you suggest addressing this issue? Much as you are doing, and I described, but the first step would be to find out what 'most' Python people need for signal handling in threaded programs.
This is because there is an unavoidable conflict between portability/reliability and functionality. I would definitely block all signals in threads, except for those that are likely to be generated ON the thread (SIGFPE etc.). It is a very good idea not to touch the handling of several of those, because doing so can cause chaos. I would have at least two 'standard' handlers, one of which would simply set a flag and return, and the other of which would abort. Now, NEITHER is a very useful specification, but providing ANY information is risky, which is why it is critical to know what people need. I would not TRUST the blocking of signals, so would set up handlers even when I blocked them, and would do the minimum fiddling in the main thread compatible with decent functionality. I would provide a call to test if the signal flag was set, and another to test and clear it. This would be callable ONLY from the main thread, and that would be checked. It is possible to do better, but that starts needing serious research. > It seems to me that it is actually possible to do useful things in a > signal handler, so long as one accepts that doing so is relying on > platform specific behavior. Unfortunately, that is wrong. That was true under MVS and VMS, but in Unix and Microsoft systems, the problem is that the behaviour is both platform- and circumstance-dependent. What you can do reliably depends mostly on what is going on at the time. For example, on many Unix and Microsoft platforms, signals received while you are in the middle of certain functions or system calls, or certain particular signals (often SIGFPE), call the C handler with a bad set of global pointers or similar. I believe that this is one of the reasons (perhaps the main one) that some such failures so often cause debuggers to be unable to find the stack pointer.
I have tracked a few of those down, and have occasionally identified the cause (and even got it fixed!), but it is a murderous task, and I know of few other people who have ever succeeded. > How hard would it be to implement this for the platforms Python supports, > rather than for a hypothetical standards-exact platform? I have seen this effect on OSF/1, IRIX, Solaris, Linux and versions of Microsoft Windows. I have never used a modern BSD, haven't used HP-UX since release 9, and haven't used Microsoft systems seriously in years (though I did hang my new laptop in its GUI fairly easily). As I say, this isn't so much a platform issue as a circumstance one. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From andreas.raab at gmx.de Mon Sep 4 23:36:19 2006 From: andreas.raab at gmx.de (Andreas Raab) Date: Mon, 04 Sep 2006 14:36:19 -0700 Subject: [Python-Dev] Cross-platform math functions? Message-ID: <44FC9C53.5060304@gmx.de> Hi - I'm curious if there is any interest in the Python community to achieve better cross-platform math behavior. A quick test[1] shows a non-surprising difference between the platform implementations. Question: Is there any interest in changing the behavior to produce identical results across platforms (for example by utilizing fdlibm [2])? 
Since I have need for a set of cross-platform math functions I'll probably start with a math-compatible fdlibm module (unless somebody has done that already ;-)

Cheers,
  - Andreas

[1] Using Python 2.4:

>>> import math
>>> math.cos(1.0e32)

WinXP: -0.39929634612021897
LinuxX86: -0.49093671143542561

[2] http://www.netlib.org/fdlibm/

From gjcarneiro at gmail.com Tue Sep 5 01:31:06 2006
From: gjcarneiro at gmail.com (Gustavo Carneiro)
Date: Tue, 5 Sep 2006 00:31:06 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To:
References: <20060904175541.1717.1728502156.divmod.quotient.39053@ohm>
Message-ID:

In GLib we have a child watch notification feature that relies on the
following signal handler:

    static void
    g_child_watch_signal_handler (int signum)
    {
      child_watch_count ++;

      if (child_watch_init_state == CHILD_WATCH_INITIALIZED_THREADED)
        {
          write (child_watch_wake_up_pipe[1], "B", 1);
        }
      else
        {
          /* We count on the signal interrupting the poll in the same thread. */
        }
    }

Now, we've had this API for a long time already (at least 2.5
years).  I'm pretty sure it works well enough on most *nix systems.
Even if it works 99% of the time, it's way better than *failing*
*100%* of the time, which is what happens now with Python.

All I ask is an API to add a callback that Python signal handlers
call, from signal context.  That much I'm sure is safe.  What happens
from there on will be out of Python's hands, so Python
purist^H^H^H^H^H^H developers cannot be blamed for anything that
happens next.  You can laugh at PyGTK and GLib all you want for having
"unsafe signal handling", I don't care.

Regards.

On 9/4/06, Nick Maclaren wrote:
> Jean-Paul Calderone wrote:
> >
> > Thanks for expounding.  Given that it is basically impossible to do
> > anything useful in a signal handler according to the relevant standards
> > (does Python's current signal handler even avoid relying on undefined
> > behavior?), how would you suggest addressing this issue?
> > Much as you are doing, and I described, but the first step would be > to find out what 'most' Python people need for signal handling in > threaded programs. This is because there is an unavoidable conflict > between portability/reliability and functionality. > > I would definitely block all signals in threads, except for those that > are likely to be generated ON the thread (SIGFPE etc.) It is a very > good idea not to touch the handling of several of those, because doing > so can cause chaos. > > I would have at least two 'standard' handlers, one of which would simply > set a flag and return, and the other of which would abort. Now, NEITHER > is a very useful specification, but providing ANY information is risky, > which is why it is critical to know what people need. > > I would not TRUST the blocking of signals, so would set up handlers even > when I blocked them, and would do the minimum fiddling in the main > thread compatible with decent functionality. > > I would provide a call to test if the signal flag was set, and another > to test and clear it. This would be callable ONLY from the main thread, > and that would be checked. > > It is possible to do better, but that starts needing serious research. > > > It seems to me that it is actually possible to do useful things in a > > signal handler, so long as one accepts that doing so is relying on > > platform specific behavior. > > Unfortunately, that is wrong. That was true under MVS and VMS, but > in Unix and Microsoft systems, the problem is that the behaviour is > both platform and circumstance-dependent. What you can do reliably > depends mostly on what is going on at the time. > > For example, on many Unix and Microsoft platforms, signals received > while you are in the middle of certain functions or system calls, or > certain particular signals (often SIGFPE), call the C handler with a > bad set of global pointers or similar. 
I believe that this is one of > reasons (perhaps the main one) that some such failures so often cause > debuggers to be unable to find the stack pointer. > > I have tracked a few of those down, and have occasionally identified > the cause (and even got it fixed!), but it is a murderous task, and > I know of few other people who have ever succeeded. > > > How hard would it be to implement this for the platforms Python supports, > > rather than for a hypothetical standards-exact platform? > > I have seen this effect on OSF/1, IRIX, Solaris, Linux and versions > of Microsoft Windows. I have never used a modern BSD, haven't used > HP-UX since release 9, and haven't used Microsoft systems seriously > in years (though I did hang my new laptop in its GUI fairly easily). > > As I say, this isn't so much a platform issue as a circumstance one. > > > Regards, > Nick Maclaren, > University of Cambridge Computing Service, > New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. > Email: nmm1 at cam.ac.uk > Tel.: +44 1223 334761 Fax: +44 1223 334679 > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/gjcarneiro%40gmail.com > From tim.peters at gmail.com Tue Sep 5 01:06:50 2006 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 4 Sep 2006 19:06:50 -0400 Subject: [Python-Dev] Cross-platform math functions? In-Reply-To: <44FC9C53.5060304@gmx.de> References: <44FC9C53.5060304@gmx.de> Message-ID: <1f7befae0609041606m13c5c24bm59ce229b27f32e9d@mail.gmail.com> [Andreas Raab] > I'm curious if there is any interest in the Python community to achieve > better cross-platform math behavior. A quick test[1] shows a > non-surprising difference between the platform implementations. 
> Question: Is there any interest in changing the behavior to produce > identical results across platforms (for example by utilizing fdlibm > [2])? Since I have need for a set of cross-platform math functions I'll > probably start with a math-compatible fdlibm module (unless somebody has > done that already ;-) Package a Python wrapper and see how popular it becomes. Some reasons against trying to standardize on fdlibm were explained here: http://mail.python.org/pipermail/python-list/2005-July/290164.html Bottom line is I suspect that when it comes to bit-for-bit reproducibility, fewer people care about that x-platform than care about it x-language on the box they use. Nothing wrong with different modules for people with different desires. From tim.peters at gmail.com Tue Sep 5 04:25:01 2006 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 4 Sep 2006 22:25:01 -0400 Subject: [Python-Dev] gcc 4.2 exposes signed integer overflows In-Reply-To: <200608301242.28648.anthony@interlink.com.au> References: <20060826190600.0E75911002B@bromo.msbb.uc.edu> <20060829201022.GA22579@code0.codespeak.net> <1f7befae0608291557l5b04a8f6wd1371e62a5c9c69c@mail.gmail.com> <200608301242.28648.anthony@interlink.com.au> Message-ID: <1f7befae0609041925h61c184f1m8716951740b00b39@mail.gmail.com> [Tim Peters] >> Speaking of which, I saw no feedback on the proposed patch in >> >> http://mail.python.org/pipermail/python-dev/2006-August/068502.html >> >> so I'll just check that in tomorrow. [Anthony Baxter] > This should also be backported to release24-maint and release23-maint. Let me > know if you can't do the backport... Done in rev 51711 on the 2.5 branch. Done in rev 51715 on the 2.4 branch. Done in rev 51716 on the trunk, although in the LONG_MIN way (which is less obscure, but a more "radical" code change). I don't care about the 2.3 branch, so leaving that to someone who does. Merge rev 51711 from the 2.5 branch. It will generate a conflict on Misc/NEWS. 
Easiest to revert Misc/NEWS then and just copy/paste the little blurb from 2.5 news at the appropriate place:

"""
- Overflow checking code in integer division ran afoul of new gcc
  optimizations.  Changed to be more standard-conforming.
"""

From rhamph at gmail.com Tue Sep 5 05:28:37 2006
From: rhamph at gmail.com (Adam Olsen)
Date: Mon, 4 Sep 2006 21:28:37 -0600
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To:
References:
Message-ID:

On 9/4/06, Nick Maclaren wrote:
> Jean-Paul Calderone wrote:
> > On Mon, 04 Sep 2006 17:24:56 +0100,
> > David Hopwood > der.co.uk> wrote:
> > >Jean-Paul Calderone wrote:
> > >> PyGTK would presumably implement its pending call callback by writing a
> > >> byte to a pipe which it is also passing to poll().
> > >
> > >But doing that in a signal handler context invokes undefined behaviour
> > >according to POSIX.
> >
> > write(2) is explicitly listed as async-signal safe in IEEE Std 1003.1, 2004.
> > Was this changed in a later edition?  Otherwise, I don't understand what you
> > mean by this.
>
> Try looking at the C90 or C99 standard, for a start :-(
>
> NOTHING may safely be done in a real signal handler, except possibly
> setting a value of type static volatile sig_atomic_t.  And even that
> can be problematic.  And note that POSIX defers to C on what the C
> language defines.  So, even if the function is async-signal-safe,
> the code that calls it can't be!

I don't believe that is true.  It says (or at least SUSv3 says) that:

"""
3.26 Async-Signal-Safe Function

A function that may be invoked, without restriction, from
signal-catching functions.  No function is async-signal-safe unless
explicitly described as such.
"""

Sure, it doesn't give me a warm-fuzzy feeling of knowing why it works,
but we can expect that it magically does.  My understanding is that
threading in general is the same way...

Of course that doesn't preclude bugs in the various implementations,
but those trump the standards anyway.
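For readers following the pipe-writing approach debated in this thread, here is a minimal Python-level sketch of the pattern: a handler writes one byte to a pipe, and the main loop waits on the read end. The helper names install_wakeup_pipe and wait_for_signal are hypothetical, and a Unix-like platform is assumed. A Python-level handler runs later in the main thread rather than in real C signal context, so this only illustrates the wake-up pattern and does not settle the async-signal-safety question argued above.

```python
import os
import select
import signal


def install_wakeup_pipe(signum):
    """Install a handler for signum that writes one byte to a pipe.

    Hypothetical helper: returns the read end of the pipe, which can be
    handed to select() or poll() alongside other file descriptors.
    """
    read_fd, write_fd = os.pipe()
    os.set_blocking(write_fd, False)  # the handler must never block

    def handler(received_signum, frame):
        try:
            os.write(write_fd, b"B")
        except BlockingIOError:
            pass  # pipe is full, so a wake-up is already pending

    signal.signal(signum, handler)
    return read_fd


def wait_for_signal(read_fd, timeout):
    """Return True if a wake-up byte arrived within timeout seconds."""
    readable, _, _ = select.select([read_fd], [], [], timeout)
    if not readable:
        return False
    os.read(read_fd, 1)  # consume the wake-up byte
    return True
```

With this shape the event loop needs no short polling timeout: it blocks on the read end until a byte shows up.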
-- Adam Olsen, aka Rhamphoryncus From rhamph at gmail.com Tue Sep 5 05:41:13 2006 From: rhamph at gmail.com (Adam Olsen) Date: Mon, 4 Sep 2006 21:41:13 -0600 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: <20060904175541.1717.1728502156.divmod.quotient.39053@ohm> Message-ID: On 9/4/06, Gustavo Carneiro wrote: > Now, we've had this API for a long time already (at least 2.5 > years). I'm pretty sure it works well enough on most *nix systems. > Event if it works 99% of the times, it's way better than *failing* > *100%* of the times, which is what happens now with Python. Failing 99% of the time is as bad as failing 100% of the time, if your goal is to eliminate the short timeout on poll(). 1% is quite a lot, and it would probably have an annoying tendency to trigger repeatedly when the user does certain things (not reproducible by you of course). That said, I do hope we can get 100%, or at least enough nines that we can increase the timeout significantly. -- Adam Olsen, aka Rhamphoryncus From nnorwitz at gmail.com Tue Sep 5 06:12:43 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Mon, 4 Sep 2006 21:12:43 -0700 Subject: [Python-Dev] [Python-checkins] TRUNK IS UNFROZEN, available for 2.6 work if you are so inclined In-Reply-To: References: <200608180023.14037.anthony@interlink.com.au> Message-ID: On 8/18/06, Georg Brandl wrote: > > I'd like to commit this. It fixes bug 1542051. > > Index: Objects/exceptions.c ... Georg, Did you still want to fix this? I don't remember anything happening with it. I don't see where _PyObject_GC_TRACK is called, so I'm not sure why _PyObject_GC_UNTRACK is necessary. You should probably add the patch to the bug report and we can discuss there. 
n

From nnorwitz at gmail.com Tue Sep 5 06:14:34 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Mon, 4 Sep 2006 21:14:34 -0700
Subject: [Python-Dev] no remaining issues blocking 2.5 release
In-Reply-To: <20060815164114.GB23991@niemeyer.net>
References: <20060815164114.GB23991@niemeyer.net>
Message-ID:

Gustavo,

Did you still want this addressed?  Anthony and I made some comments
on the bug/patch, but nothing has been updated.

n
--

On 8/15/06, Gustavo Niemeyer wrote:
> > If you have issues, respond ASAP!  The release candidate is planned to
> > be cut this Thursday/Friday.  There are only a few more days before
> > code freeze.  A branch will be made when the release candidate is cut.
>
> I'd like to see problem #1531862 fixed.  The bug is clear and the
> fix should be trivial.  I can commit a fix tonight, if the subprocess
> module author/maintainer is unavailable to check it out.
>
> --
> Gustavo Niemeyer
> http://niemeyer.net
>

From nnorwitz at gmail.com Tue Sep 5 06:24:16 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Mon, 4 Sep 2006 21:24:16 -0700
Subject: [Python-Dev] 2.5 status
Message-ID:

There are 3 bugs currently listed in PEP 356 as blocking:

    http://python.org/sf/1551432 - __unicode__ breaks on exception classes
    http://python.org/sf/1550938 - improper exception w/relative import
    http://python.org/sf/1541697 - sgmllib regexp bug causes hang

Does anyone want to fix the sgmllib issue?  If not, we should revert
this week before c2 is cut.  I'm hoping that we will have *no changes*
in 2.5 final from c2.  Should there be any bugs/patches added to or
removed from the list?

The buildbots are currently humming along, but I believe all 3
versions (2.4, 2.5, and 2.6) are fine.

Test out 2.5c1+ and report all bugs!

n

From andreas.raab at gmx.de Tue Sep 5 07:03:11 2006
From: andreas.raab at gmx.de (Andreas Raab)
Date: Mon, 04 Sep 2006 22:03:11 -0700
Subject: [Python-Dev] Cross-platform math functions?
In-Reply-To: <1f7befae0609041606m13c5c24bm59ce229b27f32e9d@mail.gmail.com> References: <44FC9C53.5060304@gmx.de> <1f7befae0609041606m13c5c24bm59ce229b27f32e9d@mail.gmail.com> Message-ID: <44FD050F.20901@gmx.de> Tim Peters wrote: > Package a Python wrapper and see how popular it becomes. Some reasons > against trying to standardize on fdlibm were explained here: > > http://mail.python.org/pipermail/python-list/2005-July/290164.html Thanks, these are good points. About speed, do you have any good benchmarks available? In my experience fdlibm is quite reasonable for speed in the context of use by dynamic languages (i.e., counting allocation overheads, lookup and send performance etc) but since I'm not a Python expert I'd appreciate some help with realistic benchmarks. > Bottom line is I suspect that when it comes to bit-for-bit > reproducibility, fewer people care about that x-platform than care > about it x-language on the box they use. Nothing wrong with different > modules for people with different desires. Agreed. Thus my question if someone had already done this ;-) Cheers, - Andreas From nmm1 at cus.cam.ac.uk Tue Sep 5 10:51:43 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Tue, 05 Sep 2006 09:51:43 +0100 Subject: [Python-Dev] Cross-platform math functions? Message-ID: Andreas Raab wrote: > > I'm curious if there is any interest in the Python community to achieve > better cross-platform math behavior. A quick test[1] shows a > non-surprising difference between the platform implementations. > Question: Is there any interest in changing the behavior to produce > identical results across platforms (for example by utilizing fdlibm > [2])? 
Since I have need for a set of cross-platform math functions I'll > probably start with a math-compatible fdlibm module (unless somebody has > done that already ;-) > > [1] Using Python 2.4: > >>> import math > >>> math.cos(1.0e32) > > WinXP: -0.39929634612021897 > LinuxX86: -0.49093671143542561 Well, I hope not, but I am afraid that there is :-( The word "better" is emotive and inaccurate. Such calculations are numerically meaningless, and merely encourage the confusion between consistency and correctness. There is a strong sense in which giving random results between -1 and 1 would be better. Now, I am not saying that you don't have a requirement for consistency but I am saying that confusing it with correctness (as has been fostered by IEEE 754, Java etc.) is harmful. One of the great advantages of the wide variety of arithmetics available in the 1970s is that numerical testing was easier and more reliable - if you got wildly different results on two platforms, you got a strong pointer to numerical problems. That viewpoint is regarded as heresy nowadays, but used not to be! Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From nmm1 at cus.cam.ac.uk Tue Sep 5 11:07:12 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Tue, 05 Sep 2006 10:07:12 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions Message-ID: "Adam Olsen" wrote: > On 9/4/06, Gustavo Carneiro wrote: > > > Now, we've had this API for a long time already (at least 2.5 > > years). I'm pretty sure it works well enough on most *nix systems. > > Event if it works 99% of the times, it's way better than *failing* > > *100%* of the times, which is what happens now with Python. > > Failing 99% of the time is as bad as failing 100% of the time, if your > goal is to eliminate the short timeout on poll(). 
1% is quite a lot,
> and it would probably have an annoying tendency to trigger repeatedly
> when the user does certain things (not reproducible by you of course).

That can make it a lot WORSE than repeated failure.  At least with hard
failures, you have some hope of tracking them down in a reasonable time.
The problem with exception handling code that goes off very rarely,
under non-reproducible circumstances, is that it is almost untestable
and that bugs in it are positive nightmares.  I have been inflicted
with quite a large number in my time, and have a fairly good success
rate, but the number of people who know the tricks is decreasing.

Consider the (real) case where an unpredictable process on a large
server (64 CPUs) was failing about twice a week (detectably), with no
indication of how many failures were giving wrong answers.  We replaced
dozens of DIMMs, took days of down time and got nowhere; it then went
hard (i.e. one failure a day).  After a week's total down time, with
me spending 100% of my time on it and the vendor allocating an expert
at high priority, we cracked it.  We were very lucky to find it so fast.

I could give you other examples that were/are there years and decades
later, because the pain threshold never got high enough to dedicate
the time (and the VERY few people with experience).  I know of at least
one such problem in generic TCP/IP (i.e. on Linux, IRIX, AIX and
possibly Solaris) that has been there for decades and causes occasional
failure in most networked applications/protocols.

Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email: nmm1 at cam.ac.uk
Tel.: +44 1223 334761 Fax: +44 1223 334679

From andreas.raab at gmx.de Tue Sep 5 11:17:25 2006
From: andreas.raab at gmx.de (Andreas Raab)
Date: Tue, 05 Sep 2006 02:17:25 -0700
Subject: [Python-Dev] Cross-platform math functions?
In-Reply-To: References: Message-ID: <44FD40A5.8090406@gmx.de> Nick Maclaren wrote: > The word "better" is emotive and inaccurate. Such calculations are > numerically meaningless, and merely encourage the confusion between > consistency and correctness. There is a strong sense in which giving > random results between -1 and 1 would be better. I did, of course, mean more consistent (and yes, random consistent results would be "better" by this definition and indeed I would prefer that over inconsistent but more accurate results ;-) Cheers, - Andreas From gjcarneiro at gmail.com Tue Sep 5 15:44:14 2006 From: gjcarneiro at gmail.com (Gustavo Carneiro) Date: Tue, 5 Sep 2006 14:44:14 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: <20060904175541.1717.1728502156.divmod.quotient.39053@ohm> Message-ID: On 9/5/06, Adam Olsen wrote: > On 9/4/06, Gustavo Carneiro wrote: > > Now, we've had this API for a long time already (at least 2.5 > > years). I'm pretty sure it works well enough on most *nix systems. > > Event if it works 99% of the times, it's way better than *failing* > > *100%* of the times, which is what happens now with Python. > > Failing 99% of the time is as bad as failing 100% of the time, if your > goal is to eliminate the short timeout on poll(). 1% is quite a lot, > and it would probably have an annoying tendency to trigger repeatedly > when the user does certain things (not reproducible by you of course). > > That said, I do hope we can get 100%, or at least enough nines that we > can increase the timeout significantly. Anyway, I was speaking hypothetically. I'm pretty sure writing to a pipe is async signal safe. It is the oldest trick in the book, everyone uses it. I don't have to see a written signed contract to know that it works. 
Here's a list of web sites google found me that talk about this problem: This one describes the pipe writing technique: http://www.cocoadev.com/index.pl?SignalSafety This one presents a list of "The only routines that POSIX guarantees to be Async-Signal-Safe": http://docsun.cites.uiuc.edu/sun_docs/C/solaris_9/SUNWdev/MTP/p40.html#GEN-95948 Also here: http://www.cs.usyd.edu.au/cgi-bin/man.cgi?section=5&topic=attributes This is all the evidence that I need. And again I reiterate that whether or not async safety can be achieved in practice for all platforms is not Python's problem. Although I believe writing to a pipe is 100% reliable for most platforms. Even if it is not, any mission critical application relying on signals for correct behaviour should be rewritten to use unix sockets instead; end of argument. From nmm1 at cus.cam.ac.uk Tue Sep 5 15:53:45 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Tue, 05 Sep 2006 14:53:45 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions Message-ID: "Gustavo Carneiro" wrote: > > Anyway, I was speaking hypothetically. I'm pretty sure writing to a > pipe is async signal safe. It is the oldest trick in the book, > everyone uses it. I don't have to see a written signed contract to > know that it works. Ah. Well, I can assure you that it's not the oldest trick in the book, and not everyone uses it. > This is all the evidence that I need. And again I reiterate that > whether or not async safety can be achieved in practice for all > platforms is not Python's problem. I wish you the joy of trying to report a case where it doesn't work to a large vendor and get them to accept that it is a bug. > Although I believe writing to a > pipe is 100% reliable for most platforms. Even if it is not, any > mission critical application relying on signals for correct behaviour > should be rewritten to use unix sockets instead; end of argument. Er, no. 
There are lots of circumstances where that isn't feasible, such as wanting to close down an application cleanly when the scheduler sends it a SIGXCPU. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From gustavo at niemeyer.net Tue Sep 5 17:28:33 2006 From: gustavo at niemeyer.net (Gustavo Niemeyer) Date: Tue, 5 Sep 2006 12:28:33 -0300 Subject: [Python-Dev] no remaining issues blocking 2.5 release In-Reply-To: References: <20060815164114.GB23991@niemeyer.net> Message-ID: <20060905152833.GA12378@niemeyer.net> > Did you still want this addressed? Anthony and I made some comments > on the bug/patch, but nothing has been updated. I was waiting because I got unassigned from the bug, so I thought the maintainer was stepping up. I'll commit a fix for it today. Thanks for pinging me, -- Gustavo Niemeyer http://niemeyer.net From jimjjewett at gmail.com Tue Sep 5 18:08:19 2006 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue, 5 Sep 2006 12:08:19 -0400 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: References: Message-ID: Reversing the order of the return tuple will break the alignment with split/rsplit. Why not just change which of the three strings holds the remainder in the not-found case? In rc1, "d".rpartition(".") --> ('d', '', '') If that changes to "d".rpartition(".") --> ('', '', 'd') then (1) the loop will terminate (2) rpartition will be more parallel to partition (and split), (3) people who used rpartition without looping to termination (and therefore didn't catch the problem) will still be able to use their existing working code. (4) the existing docstring would remain correct, though it could still be improved. (It says "returns S and two empty strings", but doesn't specify the order.) 
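To make the not-found case concrete, here is a small sketch of the behaviour proposed above, which is also what current Python versions do. The helper pieces_from_right is hypothetical and exists only to show the loop-termination argument in point (1): when the separator is missing, the remainder lands in the last slot, the first slot becomes empty, and the loop stops.

```python
def pieces_from_right(s, sep):
    """Peel pieces off the right-hand end of s until nothing is left.

    Hypothetical helper for illustration.  It terminates because a
    not-found rpartition() leaves an empty string in the first slot.
    """
    pieces = []
    while s:
        s, _, piece = s.rpartition(sep)
        pieces.append(piece)
    return pieces


# Not-found results: partition keeps the string on the left,
# rpartition keeps it on the right.
not_found_left = "d".partition(".")
not_found_right = "d".rpartition(".")
```

Had the not-found result stayed as ('d', '', ''), the loop variable would never shrink and the loop above would spin forever.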
-jJ From rhettinger at ewtllc.com Tue Sep 5 18:13:49 2006 From: rhettinger at ewtllc.com (Raymond Hettinger) Date: Tue, 05 Sep 2006 09:13:49 -0700 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: References: Message-ID: <44FDA23D.2060602@ewtllc.com> Jim Jewett wrote: > >Why not just change which of the three strings holds the remainder in >the not-found case? > > That was the only change submitted. Are you happy with what was checked-in? Raymond From jdahlin at async.com.br Tue Sep 5 18:18:20 2006 From: jdahlin at async.com.br (Johan Dahlin) Date: Tue, 05 Sep 2006 13:18:20 -0300 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: Message-ID: <44FDA34C.6030605@async.com.br> Nick Maclaren wrote: > "Gustavo Carneiro" wrote: >> Anyway, I was speaking hypothetically. I'm pretty sure writing to a >> pipe is async signal safe. It is the oldest trick in the book, >> everyone uses it. I don't have to see a written signed contract to >> know that it works. > > Ah. Well, I can assure you that it's not the oldest trick in the book, > and not everyone uses it. > >> This is all the evidence that I need. And again I reiterate that >> whether or not async safety can be achieved in practice for all >> platforms is not Python's problem. > > I wish you the joy of trying to report a case where it doesn't work > to a large vendor and get them to accept that it is a bug. Are you saying that we should let less commonly used platforms dictate features and functionality for the popular ones? I mean, who uses HP/UX, SCO and [insert your favorite flavor] as a modern desktop system where this particular bug makes a difference? Can't this just be enabled for platforms where it's known to work and let Python as it currently is for the users of these legacy systems ? 
Johan

From jimjjewett at gmail.com Tue Sep 5 18:47:26 2006
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 5 Sep 2006 12:47:26 -0400
Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition()
In-Reply-To: <44FDA23D.2060602@ewtllc.com>
References: <44FDA23D.2060602@ewtllc.com>
Message-ID:

> Jim Jewett wrote:
> >Why not just change which of the three strings holds the remainder in
> >the not-found case?

On 9/5/06, Raymond Hettinger wrote:
> That was the only change submitted.
> Are you happy with what was checked-in?

This change looks wrong:

 PyDoc_STRVAR(rpartition__doc__,
-"S.rpartition(sep) -> (head, sep, tail)\n\
+"S.rpartition(sep) -> (tail, sep, head)\n\

It looks like the code itself does the right thing, but I wasn't quite
confident of that.

-jJ

From rhettinger at ewtllc.com Tue Sep 5 19:10:47 2006
From: rhettinger at ewtllc.com (Raymond Hettinger)
Date: Tue, 05 Sep 2006 10:10:47 -0700
Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition()
In-Reply-To:
References: <44FDA23D.2060602@ewtllc.com>
Message-ID: <44FDAF97.3050502@ewtllc.com>

>
> This change looks wrong:
>
> PyDoc_STRVAR(rpartition__doc__,
> -"S.rpartition(sep) -> (head, sep, tail)\n\
> +"S.rpartition(sep) -> (tail, sep, head)\n\
>
> It looks like the code itself does the right thing, but I wasn't quite
> confident of that.
>

It is correct.  There may be some confusion in terminology.  Head and
tail do not mean left-side or right-side.  Instead, they refer to the
"small part chopped-off" and "the rest that is still choppable".  Think
of head and tail in the sense of car and cdr.

A post-condition invariant for both str.partition() and
str.rpartition() is:

    assert sep not in head

For non-looping cases, users are likely to use different variable names
when they unpack the tuple:

    left, middle, right = s.rpartition(p)

But when they perform multiple partitions, the "tail" or "rest"
terminology is more appropriate for the part of the string that may
still contain separators.
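The invariant above can be checked directly. This is a minimal sketch with a hypothetical helper, chopped_heads, that returns the "small part chopped-off" for each method: the left element for partition() and the right element for rpartition(). The positions are an assumption based on how current Python versions order the result tuples.

```python
def chopped_heads(s, sep):
    """Return the chopped-off piece from partition() and from rpartition().

    Hypothetical helper for illustration.  In the terminology above,
    the "head" is the left element for partition() and the right
    element for rpartition().
    """
    head_from_left, _, _ = s.partition(sep)
    _, _, head_from_right = s.rpartition(sep)
    return head_from_left, head_from_right


# The separator never appears in either chopped-off piece.
left_head, right_head = chopped_heads("usr/local/lib", "/")
```

Here left_head is the text before the first separator and right_head is the text after the last one, so neither can contain the separator.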
Raymond

From mcherm at mcherm.com Tue Sep 5 19:24:46 2006
From: mcherm at mcherm.com (Michael Chermside)
Date: Tue, 05 Sep 2006 10:24:46 -0700
Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition()
Message-ID: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>

Jim Jewett writes:
> This change [in docs] looks wrong:
>
> PyDoc_STRVAR(rpartition__doc__,
> -"S.rpartition(sep) -> (head, sep, tail)\n\
> +"S.rpartition(sep) -> (tail, sep, head)\n\

Raymond Hettinger replies:
> It is correct.  There may be some confusion in terminology.  Head
> and tail do not mean left-side or right-side.  Instead, they refer to
> the "small part chopped-off" and "the rest that is still choppable".
> Think of head and tail in the sense of car and cdr.

It is incorrect.  The purpose of documentation is to explain
things to users, and documentation which fails to achieve this
is not "correct".  The level of confusion generated by using "head"
to refer to the last part of the string and "tail" to refer to
the beginning, is quite significant.

How about something like this:

    S.partition(sep) -> (head, sep, tail)
    S.rpartition(sep) -> (tail, sep, rest)

Perhaps someone else can find something clearer than my suggestion,
but in my own head, the terms "head" and "tail" are tightly bound
with the idea of beginning and end (respectively) rather than with
the idea of "small part chopped off" and "big part that is still
choppable".
-- Michael Chermside From barry at python.org Tue Sep 5 19:26:15 2006 From: barry at python.org (Barry Warsaw) Date: Tue, 5 Sep 2006 13:26:15 -0400 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <44FDAF97.3050502@ewtllc.com> References: <44FDA23D.2060602@ewtllc.com> <44FDAF97.3050502@ewtllc.com> Message-ID: <2BFAE30C-5B69-416A-AD76-7C5AD7D53DC7@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Sep 5, 2006, at 1:10 PM, Raymond Hettinger wrote: >> This change looks wrong: >> >> PyDoc_STRVAR(rpartition__doc__, >> -"S.rpartition(sep) -> (head, sep, tail)\n\ >> +"S.rpartition(sep) -> (tail, sep, head)\n\ >> >> It looks like the code itself does the right thing, but I wasn't >> quite >> confident of that. >> > It is correct. There may be some confusion in terminology. Head and > tail do not mean left-side or right-side. Instead, they refer to the > "small part chopped-off" and "the rest that is still choppable". Think > of head and tail in the sense of car and cdr. > > A post-condition invariant for both str.partition() and > str.rpartition() is: > > assert sep not in head > > For non-looping cases, users will likely to use different variable > names > when they unpack the tuple: > > left, middle, right = s.rpartition(p) > > But when they perform multiple partitions, the "tail" or "rest" > terminology is more appropriate for the part of the string that may > still contain separators. ISTM this is just begging for newbie (and maybe not-so-newbie) confusion. Why not just document both as returning (left, sep, right) which seems the most obvious description of what the methods return? 
- -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRP2zPHEjvBPtnXfVAQKpvQP/X1Vg9G4gZLl9R7/fnevmfeszTbqVk1Bq V7aXYm5pTFiD27cKV2e7MKZPifob6Pg8NPjsvAh6jZU5Uj0BUQhIwgDXZpcivsTM MykyPz8oVpSLRhu5xfYU1IZjbogoKfPQ04FkqWgtM2QUqKjiLcvwzPnzLNLVxx9r v2LplvrqJyc= =Tckf -----END PGP SIGNATURE----- From rhettinger at ewtllc.com Tue Sep 5 19:46:01 2006 From: rhettinger at ewtllc.com (Raymond Hettinger) Date: Tue, 05 Sep 2006 10:46:01 -0700 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <2BFAE30C-5B69-416A-AD76-7C5AD7D53DC7@python.org> References: <44FDA23D.2060602@ewtllc.com> <44FDAF97.3050502@ewtllc.com> <2BFAE30C-5B69-416A-AD76-7C5AD7D53DC7@python.org> Message-ID: <44FDB7D9.5040108@ewtllc.com> > ISTM this is just begging for newbie (and maybe not-so-newbie) > confusion. Why not just document both as returning (left, sep, > right) which seems the most obvious description of what the methods > return? I'm fine with that (though it's a little sad that we think the rather basic concepts of head and tail are beyond the grasp of typical pythonistas). Changing to left/sep/right will certainly disambiguate questions about the ordering of the return tuple. OTOH, there is some small loss in that the head/tail terminology is highly suggestive of how to use the function when making successive partitions. Raymond From fdrake at acm.org Tue Sep 5 19:51:49 2006 From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Tue, 5 Sep 2006 13:51:49 -0400 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> Message-ID: <200609051351.50494.fdrake@acm.org> On Tuesday 05 September 2006 13:24, Michael Chermside wrote: > How about something like this: > > S.partition(sep) -> (head, sep, tail) > S.rpartition(sep) -> (tail, sep, rest) I think I prefer: S.partition(sep) -> (head, sep, rest) S.rpartition(sep) -> (tail, sep, rest) Here, "rest" is always used for "what remains"; head/tail are somewhat more clear here I think. -Fred -- Fred L. Drake, Jr. From barry at python.org Tue Sep 5 19:52:45 2006 From: barry at python.org (Barry Warsaw) Date: Tue, 5 Sep 2006 13:52:45 -0400 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <44FDB7D9.5040108@ewtllc.com> References: <44FDA23D.2060602@ewtllc.com> <44FDAF97.3050502@ewtllc.com> <2BFAE30C-5B69-416A-AD76-7C5AD7D53DC7@python.org> <44FDB7D9.5040108@ewtllc.com> Message-ID: <76BC85F2-2184-476C-8059-A1944BBDD194@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Sep 5, 2006, at 1:46 PM, Raymond Hettinger wrote: >> ISTM this is just begging for newbie (and maybe not-so-newbie) >> confusion. Why not just document both as returning (left, sep, >> right) which seems the most obvious description of what the >> methods return? > > > I'm fine with that (though it's a little sad that we think the > rather basic concepts of head and tail are beyond the grasp of > typical pythonistas). > > Changing to left/sep/right will certainly disambiguate questions > about the ordering of the return tuple. OTOH, there is some small > loss in that the head/tail terminology is highly suggestive of how > to use the function when making succesive partitions. Personally, I'd rather the docstring be clear and concise rather than suggestive of use cases. 
IMO, the latter would be better served as an example in the LaTeX documentation. - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRP25cXEjvBPtnXfVAQJ4EwQAuKnVxtyabdtAv/Eu9CcZ8EkcwCJYOoAT DmgMWeml861Sn4qN6NV1vMKbXljxiKqoSBgbKdpU+FRb6TeNiCisuWA0Q9xoOfsj Jyvy3XN54WXCUBNBnfsfUROPqxjiNGnKxYUzx2a+pjkeSSSZxDzbuplU+2ijB6w4 HJWIT4JLldA= =u6iU -----END PGP SIGNATURE----- From fdrake at acm.org Tue Sep 5 19:55:17 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Tue, 5 Sep 2006 13:55:17 -0400 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <44FDB7D9.5040108@ewtllc.com> References: <2BFAE30C-5B69-416A-AD76-7C5AD7D53DC7@python.org> <44FDB7D9.5040108@ewtllc.com> Message-ID: <200609051355.18117.fdrake@acm.org> On Tuesday 05 September 2006 13:46, Raymond Hettinger wrote: > Changing to left/sep/right will certainly disambiguate questions about left/right is definitely not helpful. It's also ambiguous in the case of .rpartition(), where left and right in the input and result are different. > the ordering of the return tuple. OTOH, there is some small loss in > that the head/tail terminology is highly suggestive of how to use the > function when making successive partitions. See my previous note in this thread for another suggestion. -Fred -- Fred L. Drake, Jr. From jimjjewett at gmail.com Tue Sep 5 20:02:31 2006 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue, 5 Sep 2006 14:02:31 -0400 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <200609051351.50494.fdrake@acm.org> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> Message-ID: On 9/5/06, Fred L. Drake, Jr. wrote: > S.partition(sep) -> (head, sep, rest) > S.rpartition(sep) -> (tail, sep, rest) > Here, "rest" is always used for "what remains"; head/tail are somewhat more > clear here I think.
Then shouldn't rpartition be S.rpartition(sep) -> (rest, sep, tail) Another possibility is data (for head/tail) and unparsed (for rest). S.partition(sep) -> (data, sep, unparsed) S.rpartition(sep) -> (unparsed, sep, data) I'm not sure which is worse -- (1) distinguishing between tail and rest (2) using (overly generic) jargon like unparsed and data. Whatever the final decision, it would probably be best to add an example to the docstring. "a.b.c".rpartition(".") -> ("a.b", ".", "c") -jJ From rhettinger at ewtllc.com Tue Sep 5 20:06:19 2006 From: rhettinger at ewtllc.com (Raymond Hettinger) Date: Tue, 05 Sep 2006 11:06:19 -0700 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> Message-ID: <44FDBC9B.6050406@ewtllc.com> > > Then shouldn't rpartition be S.rpartition(sep) -> (rest, sep, tail) Gads, the cure is worse than the disease. car and cdr are starting to look pretty good ;-) Raymond From fdrake at acm.org Tue Sep 5 20:10:33 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Tue, 5 Sep 2006 14:10:33 -0400 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> Message-ID: <200609051410.34201.fdrake@acm.org> On Tuesday 05 September 2006 14:02, Jim Jewett wrote: > Then shouldn't rpartition be S.rpartition(sep) -> (rest, sep, tail) Whichever matches reality, sure. I've lost track of the rpartition() result order. --sigh-- > Another possibility is data (for head/tail) and unparsed (for rest). > > S.partition(sep) -> (data, sep, unparsed) > S.rpartition(sep) -> (unparsed, sep, data) It's all data, so I think that's too contrived. > I'm not sure which is worse -- > (1) distinguishing between tail and rest > (2) using (overly generic) jargon like unparsed and data. 
I don't see the distinction between tail and rest as problematic. But I've not used lisp for a long time. > Whatever the final decision, it would probably be best to add an > example to the docstring. "a.b.c".rpartition(".") -> ("a.b", ".", > "c") Agreed. -Fred -- Fred L. Drake, Jr. From barry at python.org Tue Sep 5 20:12:16 2006 From: barry at python.org (Barry Warsaw) Date: Tue, 5 Sep 2006 14:12:16 -0400 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <44FDBC9B.6050406@ewtllc.com> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDBC9B.6050406@ewtllc.com> Message-ID: <6B0AEAE3-A77E-4CE3-956E-14CF31F26FD8@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Sep 5, 2006, at 2:06 PM, Raymond Hettinger wrote: >> Then shouldn't rpartition be S.rpartition(sep) -> (rest, sep, tail) > > Gads, the cure is worse than the disease. > > car and cdr are starting to look pretty good ;-) LOL, the lisper in me likes that too, but I don't think it'll work. :) Fred's disagreement notwithstanding, I still like (left, sep, right), but another alternative comes to mind after actually reading the docstring for rpartition : (before, sep, after). Now, that's not ambiguous is it? Seems to work for both partition and rpartition. 
- -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRP2+AHEjvBPtnXfVAQLiPAP+N80jHkoT5VNTtX1h2cqD4pONz+j2maCI QXDBoODucxLDPrig8FJ3c6IcT+Uapifu8Rrvd7Vm8gSPMUsMqAgAqhqNDbXTkHVH xLk31en2k2fdiCQKQyKJSjE1R1CaFCezByV29FK3fWvqrrxObISRnsxf/wXB6Czu pOUNSA9LLKo= =g+iz -----END PGP SIGNATURE----- From Scott.Daniels at Acm.Org Tue Sep 5 20:16:56 2006 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Tue, 05 Sep 2006 11:16:56 -0700 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: <44FDA34C.6030605@async.com.br> References: <44FDA34C.6030605@async.com.br> Message-ID: Johan Dahlin wrote: > Nick Maclaren wrote: >> "Gustavo Carneiro" wrote: >>> .... I'm pretty sure writing to a pipe is async signal safe. It is the >>> oldest trick in the book, everyone uses it. I ... know that it works. >> Ah. Well, I can assure you that it's not the oldest trick in the book, >> and not everyone uses it. > ... > Can't this just be enabled for platforms where it's known to work and let > Python as it currently is for the users of these legacy systems ? Ah, but that _is_ the current state of affairs. .5 :-) -- Scott David Daniels Scott.Daniels at Acm.Org From jjl at pobox.com Tue Sep 5 20:22:11 2006 From: jjl at pobox.com (John J Lee) Date: Tue, 5 Sep 2006 19:22:11 +0100 (GMT Standard Time) Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <200609051351.50494.fdrake@acm.org> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> Message-ID: On Tue, 5 Sep 2006, Fred L. Drake, Jr. 
wrote: > On Tuesday 05 September 2006 13:24, Michael Chermside wrote: > > How about something like this: > > > > S.partition(sep) -> (head, sep, tail) > > S.rpartition(sep) -> (tail, sep, rest) > > I think I prefer: > > S.partition(sep) -> (head, sep, rest) > S.rpartition(sep) -> (tail, sep, rest) > > Here, "rest" is always used for "what remains"; head/tail are somewhat more > clear here I think. But isn't rest in the wrong place there, for rpartition: that's not the string that you might typically call .rpartition() on a second time. How about: S.partition(sep) -> (left, sep, rest) S.rpartition(sep) -> (rest, sep, right) John From brett at python.org Tue Sep 5 20:25:53 2006 From: brett at python.org (Brett Cannon) Date: Tue, 5 Sep 2006 11:25:53 -0700 Subject: [Python-Dev] 2.5 status In-Reply-To: References: Message-ID: On 9/4/06, Neal Norwitz wrote: > > There are 3 bugs currently listed in PEP 356 as blocking: > http://python.org/sf/1551432 - __unicode__ breaks on exception > classes I replied on the bug report, but might as well comment here. The problem with this bug is that BaseException now defines a __unicode__() method in its PyMethodDef. That intercepts the unicode() call on the class and it complains it was not handed an instance. I guess the only way to fix this is to toss out the __unicode__() method and change the tp_str function to return Unicode as needed (unless someone else has a better idea). Or the bug can be closed as Won't Fix. http://python.org/sf/1550938 - improper exception w/relative import > http://python.org/sf/1541697 - sgmllib regexp bug causes hang > > Does anyone want to fix the sgmllib issue? If not, we should revert > this week before c2 is cut. I'm hoping that we will have *no changes* > in 2.5 final from c2. Should there be any bugs/patches added to or > removed from the list? > > The buildbots are currently humming along, but I believe all 3 > versions (2.4, 2.5, and 2.6) are fine. > > Test out 2.5c1+ and report all bugs!
> > n From seojiwon at gmail.com Tue Sep 5 20:33:59 2006 From: seojiwon at gmail.com (Jiwon Seo) Date: Tue, 5 Sep 2006 11:33:59 -0700 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <6B0AEAE3-A77E-4CE3-956E-14CF31F26FD8@python.org> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDBC9B.6050406@ewtllc.com> <6B0AEAE3-A77E-4CE3-956E-14CF31F26FD8@python.org> Message-ID: On 9/5/06, Barry Warsaw wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On Sep 5, 2006, at 2:06 PM, Raymond Hettinger wrote: > > >> Then shouldn't rpartition be S.rpartition(sep) -> (rest, sep, tail) > > > > Gads, the cure is worse than the disease. > > > > car and cdr are starting to look pretty good ;-) > > LOL, the lisper in me likes that too, but I don't think it'll work. :) > but when it comes to cadr, cddr, cdar... ;^) I personally prefer (left, sep, right ) since it's most clear and there are many Python programmers whose first language is not English. > Fred's disagreement notwithstanding, I still like (left, sep, right), > but another alternative comes to mind after actually reading the > docstring for rpartition : (before, sep, after). Now, that's > not ambiguous is it? Seems to work for both partition and rpartition.
> > - -Barry > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.5 (Darwin) > > iQCVAwUBRP2+AHEjvBPtnXfVAQLiPAP+N80jHkoT5VNTtX1h2cqD4pONz+j2maCI > QXDBoODucxLDPrig8FJ3c6IcT+Uapifu8Rrvd7Vm8gSPMUsMqAgAqhqNDbXTkHVH > xLk31en2k2fdiCQKQyKJSjE1R1CaFCezByV29FK3fWvqrrxObISRnsxf/wXB6Czu > pOUNSA9LLKo= > =g+iz > -----END PGP SIGNATURE----- From rhettinger at ewtllc.com Tue Sep 5 20:32:46 2006 From: rhettinger at ewtllc.com (Raymond Hettinger) Date: Tue, 05 Sep 2006 11:32:46 -0700 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> Message-ID: <44FDC2CE.1040902@ewtllc.com> Jim Jewett wrote: > > Another possibility is data (for head/tail) and unparsed (for rest). > > S.partition(sep) -> (data, sep, unparsed) > S.rpartition(sep) -> (unparsed, sep, data) This communicates very little about the ordering of the return tuple. Beware of overly general terms like "data" that provide no hints about the semantics of the method. The one good part is that the terms are consistent between partition and rpartition so that the invariant can be stated: assert sep not in datum I recommend we just leave the existing head/tail wording and add an example which will make the meaning instantly clear: 'www.python.org'.rpartition('.') --> ('www.python', '.', 'org') Also, remember that this discussion is being held in abstract. An actual user of rpartition() is already thinking in terms of parsing from the end of the string. Another thought is that strings don't really have a left and right. They have a beginning and end. The left/right or top/bottom distinction is culture specific.
Raymond BTW, if someone chops your ankles, does it matter which way you're facing to decide whether it was your feet or your head that had been cut-off? From rrr at ronadam.com Tue Sep 5 20:35:40 2006 From: rrr at ronadam.com (Ron Adam) Date: Tue, 05 Sep 2006 13:35:40 -0500 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> Message-ID: <44FDC37C.80304@ronadam.com> Michael Chermside wrote: > Jim Jewett writes: >> This change [in docs] looks wrong: >> >> PyDoc_STRVAR(rpartition__doc__, >> -"S.rpartition(sep) -> (head, sep, tail)\n\ >> +"S.rpartition(sep) -> (tail, sep, head)\n\ > > Raymond Hettinger replies: >> It is correct. There may be some confusion in terminology. Head >> and tail do not mean left-side or right-side. Instead, they refer to >> the "small part chopped-off" and "the rest that is still choppable". >> Think of head and tail in the sense of car and cdr. > > > It is incorrect. The purpose of documentation is to explain > things to users, and documentation which fails to achieve this > is not "correct". The level of confusion generated by using "head" > to refer to the last part of the string and "tail" to refer to > the beginning, is quite significant. > > How about something like this: > > S.partition(sep) -> (head, sep, tail) > S.rpartition(sep) -> (tail, sep, rest) This isn't immediately clear to me what I will get. s.partition(sep) -> (left, sep, right) s.rpartition(sep) -> (left, sep, right) Would be clearer, along with an explanation of what left, and right are. I hope this discussion is only about the words used and the documentation and not about the actual order of what is received. I would expect both the following should be true, and it is the current behavior. 
''.join(s.partition(sep)) -> s ''.join(s.rpartition(sep)) -> s > Perhaps someone else can find something clearer than my suggestion, > but in my own head, the terms "head" and "tail" are tightly bound > with the idea of beginning and end (respectively) rather than with > the idea of "small part chopped off" and "big part that is still > choppable". Maybe this? partition(...) S.partition(sep) -> (left, sep, right) Partition a string at the first occurrence of sep from the left into a tuple of left, sep, and right parts. Returns (S, '', '') if sep is not found in S. rpartition(...) S.rpartition(sep) -> (left, sep, right) Partition a string at the first occurrence of sep from the right into a tuple of left, sep, and right parts. Returns ('', '', S) if sep is not found in S. I feel the terms head and tail, rest etc... should be used in examples where their meaning will be clear by the context they are used in. But not in the definition where their meanings are not obvious. Cheers, Ron From rrr at ronadam.com Tue Sep 5 20:44:40 2006 From: rrr at ronadam.com (Ron Adam) Date: Tue, 05 Sep 2006 13:44:40 -0500 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <44FDC2C5.2080709@ronadam.com> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <44FDC2C5.2080709@ronadam.com> Message-ID: <44FDC598.2000106@ronadam.com> Ron Adam wrote: Correcting myself... > I hope this discussion is only about the words used and the > documentation and not about the actual order of what is received. I > would expect both the following should be true, and it is the current > behavior. > > ''.join(s.partition(sep)) -> s > ''.join(s.rpartition(sep)) -> s >>> 'abcd'.partition('x') ('abcd', '', '') >>> 'abcd'.rpartition('x') ('abcd', '', '') >>> Ok, I see Raymond's point; they are not what I expected. Although the above is still true, the returned value for the not found condition is inconsistent.
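[Editorial sketch, not part of the original message.] The round-trip property and the separator-not-found case can be checked directly. Note that the interpreter transcript above reflects the pre-release build being discussed; released Python versions return the mirrored tuple for rpartition(), which is what this sketch assumes. The sample strings are made up for illustration.

```python
# Illustrative sample inputs; any strings would do.
samples = ["www.python.org", "abcd", ""]
sep = "."

for s in samples:
    # Joining the three parts always reconstructs the original string.
    assert "".join(s.partition(sep)) == s
    assert "".join(s.rpartition(sep)) == s

# When the separator is absent, the whole string lands at the end
# the scan started from, so the two methods mirror each other.
missing_fwd = "abcd".partition("x")
missing_rev = "abcd".rpartition("x")

print(missing_fwd)  # ('abcd', '', '')
print(missing_rev)  # ('', '', 'abcd')
```

The mirrored not-found result is what lets a loop over rpartition() terminate with an empty remainder, in the same way a loop over partition() does.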
_Ron From g.brandl at gmx.net Tue Sep 5 20:49:01 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 05 Sep 2006 20:49:01 +0200 Subject: [Python-Dev] 2.5 status In-Reply-To: References: Message-ID: Brett Cannon wrote: > > > On 9/4/06, *Neal Norwitz* > wrote: > > There are 3 bugs currently listed in PEP 356 as blocking: > http://python.org/sf/1551432 - __unicode__ breaks on > exception classes > > > I replied on the bug report, but might as well comment here. > > The problem with this bug is that BaseException now defines a > __unicode__() method in its PyMethodDef. That intercepts the unicode() > call on the class and it complains it was not handed an instance. I > guess the only way to fix this is to toss out the __unicode__() method > and change the tp_str function to return Unicode as needed (unless > someone else has a better idea). Or the bug can be closed as Won't Fix. Throwing out the __unicode__ method is fine with me -- exceptions didn't have one before the NeedForSpeed rewrite, so there would be no loss in functionality. Georg From barry at python.org Tue Sep 5 20:51:13 2006 From: barry at python.org (Barry Warsaw) Date: Tue, 5 Sep 2006 14:51:13 -0400 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <44FDC2CE.1040902@ewtllc.com> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Sep 5, 2006, at 2:32 PM, Raymond Hettinger wrote: > Another thought is that strings don't really have a left and right. > They have a beginning and end. The left/right or top/bottom > distinction > is culture specific. For the target of the method, this is true, but it's not true for the results which is what we're talking about describing here. 'left' is whatever is to the left of the separator and 'right' is whatever is to the right of the separator. Seems obvious to me. 
I believe (left, sep, right) will be the clearest description for all users, with little chance of confusion. - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRP3HIXEjvBPtnXfVAQIx5wP+MPF5tk4moX4jH0yhGvR6gKcGBusyN152 redIr0xiNqECfrIHkc756UDLn3HhB2WdEjR9pn06RzmbgePMPcGP19cjZdHGwjFK 3e4Qg8zW3cL0iCnybL4AEaoZksuHGwJpZbId9HF60GFqYdjNTKEMNIVRI7jTE9pP zbBO6Sscnl0= =HB4k -----END PGP SIGNATURE----- From rrr at ronadam.com Tue Sep 5 20:58:30 2006 From: rrr at ronadam.com (Ron Adam) Date: Tue, 05 Sep 2006 13:58:30 -0500 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <44FDC2CE.1040902@ewtllc.com> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com> Message-ID: <44FDC8D6.5090002@ronadam.com> Raymond Hettinger wrote: > Another thought is that strings don't really have a left and right. > They have a beginning and end. The left/right or top/bottom distinction > is culture specific. Well, it should have been epartition() and not rpartition() in that case. ;-) Is python ever edited in languages that don't use left to right lines? From rhettinger at ewtllc.com Tue Sep 5 21:06:03 2006 From: rhettinger at ewtllc.com (Raymond Hettinger) Date: Tue, 05 Sep 2006 12:06:03 -0700 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <44FDC37C.80304@ronadam.com> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <44FDC37C.80304@ronadam.com> Message-ID: <44FDCA9B.60101@ewtllc.com> Ron Adam wrote: >I hope this discussion is only about the words used and the >documentation and not about the actual order of what is received. I >would expect both the following should be true, and it is the current >behavior. > > ''.join(s.partition(sep)) -> s > ''.join(s.rpartition(sep)) -> s > > > Right. The only thing in question is wording for the documentation. 
The viable options on the table are: * Leave the current wording and add a clarifying example. * Switch to left/sep/right and add a clarifying example. The former tells you which part can still contain a separator and suggests how to use the tool when successive partitions are needed. The latter makes the left/right ordering clear and tells you nothing about which part can still have the separators in it. That has some import because the use cases for rpartition() all involve strings with multiple separators -- if there were only one, you would just use partition(). BTW, the last check-in fixed the return value for the sep-not-found case, so that now: 'a'.partition('x') --> ('a', '', '') 'a'.rpartition('x') --> ('', '', 'a') This was necessary so that looping/recursion would work and so that rpartition() acts as a mirror-image of partition(). Raymond From tim.peters at gmail.com Tue Sep 5 21:07:43 2006 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 5 Sep 2006 15:07:43 -0400 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com> Message-ID: <1f7befae0609051207t479b8711g4ff3b719e46ca17@mail.gmail.com> upto, sep, rest in whatever order they apply. I think of a partition-like function as starting at some position and matching "up to" the first occurrence of the separator (be that left or right or diagonally, "up to" is relative to the search direction), and leaving "the rest" alone.
The docs should match that, since my mental model is correct ;-) From brett at python.org Tue Sep 5 21:19:52 2006 From: brett at python.org (Brett Cannon) Date: Tue, 5 Sep 2006 12:19:52 -0700 Subject: [Python-Dev] 2.5 status In-Reply-To: References: Message-ID: On 9/5/06, Georg Brandl wrote: > > Brett Cannon wrote: > > > > > > On 9/4/06, *Neal Norwitz* > > wrote: > > > > There are 3 bugs currently listed in PEP 356 as blocking: > > http://python.org/sf/1551432 - __unicode__ breaks on > > exception classes > > > > > > I replied on the bug report, but might as well comment here. > > > > The problem with this bug is that BaseException now defines a > > __unicode__() method in its PyMethodDef. That intercepts the unicode() > > call on the class and it complains it was not handed an instance. I > > guess the only way to fix this is to toss out the __unicode__() method > > and change the tp_str function to return Unicode as needed (unless > > someone else has a better idea). Or the bug can be closed as Won't Fix. > > Throwing out the __unicode__ method is fine with me -- exceptions didn't > have one before the NeedForSpeed rewrite, so there would be no loss in > functionality. If this step is done and the tp_str function is not changed to return Unicode as needed, PEP 352 will need to be updated. -Brett From mal at egenix.com Tue Sep 5 21:33:54 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 05 Sep 2006 21:33:54 +0200 Subject: [Python-Dev] 2.5 status In-Reply-To: References: Message-ID: <44FDD122.3000809@egenix.com> Brett Cannon wrote: > On 9/4/06, Neal Norwitz wrote: >> >> There are 3 bugs currently listed in PEP 356 as blocking: >> http://python.org/sf/1551432 - __unicode__ breaks on exception >> classes > > > I replied on the bug report, but might as well comment here.
> > The problem with this bug is that BaseException now defines a __unicode__() > method in its PyMethodDef. That intercepts the unicode() call on the class > and it complains it was not handed an instance. I guess the only way to > fix this is to toss out the __unicode__() method and change the tp_str function > to return Unicode as needed (unless someone else has a better idea). Or > the bug can be closed as Won't Fix. The proper fix would be to introduce a tp_unicode slot and let this decide what to do, ie. call .__unicode__() methods on instances and use the .__name__ on classes. I think this would be the right way to go for Python 2.6. For Python 2.5, just dropping this .__unicode__ method on exceptions is probably the right thing to do. The reason why the PyObject_Unicode() function tries to be smart here is that we don't have a tp_unicode slot (to complement tp_str). It's obvious that this is not perfect, but only a work-around. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 05 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From brett at python.org Tue Sep 5 21:41:49 2006 From: brett at python.org (Brett Cannon) Date: Tue, 5 Sep 2006 12:41:49 -0700 Subject: [Python-Dev] 2.5 status In-Reply-To: <44FDD122.3000809@egenix.com> References: <44FDD122.3000809@egenix.com> Message-ID: On 9/5/06, M.-A. Lemburg wrote: > > Brett Cannon wrote: > > On 9/4/06, Neal Norwitz wrote: > >> > >> There are 3 bugs currently listed in PEP 356 as blocking: > >> http://python.org/sf/1551432 - __unicode__ breaks on exception > >> classes > > > > > > I replied on the bug report, but might as well comment here. 
> > > > The problem with this bug is that BaseException now defines a > __unicode__() > > method in its PyMethodDef. That intercepts the unicode() call on the > class > > and it complains it was not handed an instance. I guess the only way to > > fix this is to toss out the __unicode__() method and change the tp_str > function > > to return Unicode as needed (unless someone else has a better idea). Or > > the bug can be closed as Won't Fix. > > The proper fix would be to introduce a tp_unicode slot and let > this decide what to do, ie. call .__unicode__() methods on instances > and use the .__name__ on classes. That was my bug reaction and what I said on the bug report. Kind of surprised one doesn't already exist. > I think this would be the right way to go for Python 2.6. For > Python 2.5, just dropping this .__unicode__ method on exceptions > is probably the right thing to do. Neal, do you want to rip it out or should I? -Brett From p.f.moore at gmail.com Tue Sep 5 21:41:58 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 5 Sep 2006 20:41:58 +0100 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <1f7befae0609051207t479b8711g4ff3b719e46ca17@mail.gmail.com> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com> <1f7befae0609051207t479b8711g4ff3b719e46ca17@mail.gmail.com> Message-ID: <79990c6b0609051241x35bfd75fia7a9d8bb095e1019@mail.gmail.com> On 9/5/06, Tim Peters wrote: > upto, sep, rest > > in whatever order they apply. I think of a partition-like function as > starting at some position and matching "up to" the first occurrence of > the separator (be that left or right or diagonally, "up to" is > relative to the search direction), and leaving "the rest" alone.
The > docs should match that, since my mental model is correct ;-) +1 Paul From nmm1 at cus.cam.ac.uk Tue Sep 5 21:44:50 2006 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Tue, 05 Sep 2006 20:44:50 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions Message-ID: Johan Dahlin wrote: > > Are you saying that we should let less commonly used platforms dictate > features and functionality for the popular ones? > I mean, who uses HP/UX, SCO and [insert your favorite flavor] as a modern > desktop system where this particular bug makes a difference? You haven't been following the thread. As I posted, this problem occurs to a greater or lesser degree on all platforms. This will be my last posting on the topic, but I shall try to explain. The first problem is in the hardware and operating system. A signal interrupts the thread, and passes control to a handler with a very partial environment and (usually) information on the environment when it was interrupted. If it interrupted the thread in the middle of a system call or other library routine that uses non-Python conventions, the registers and other state may be weird. There ARE solutions to this, but they are unbelievably foul, and even Linux on x86 gas had trouble with this. And, on return, everything has to be reversed entirely transparently! It is VERY common for there to be bugs in the C run-time system and not rare for there to be ones in the kernel (that area of Linux has been rewritten MANY times, for this reason). In many cases, the run-time system simply doesn't pretend to handle interrupts in arbitrary code (which is where the C undefined behaviour is used by vendors). The second problem is that what you can do depends both on what you were doing and how your 'primitive' is implemented. For example, if you call something that takes out even a very short term lock or uses a spin loop to emulate an atomic operation, you had better not use it if you interrupted code that was doing the same. 
Your thread may hang, crash or otherwise go bananas. Can you guarantee
that even write is free of such things? No, and certainly not if you
are using a debugger, a profiling library or even tracing system calls.
I have often used programs that crashed as soon as I did one of those :-(

Related to this is that it is EXTREMELY hard to write synchronisation
primitives (mutexes etc.) that are interrupt-safe - MUCH harder than to
write thread-safe ones - and few people are even aware of the issues.
There was a thread on some Linux kernel mailing list about this, and
even the kernel developers were having headaches thinking about the
issues.

Even if write is atomic, there are gotchas. What if the interrupted
code is doing something to that file at the time? Are you SURE that
an unexpected operation on it (in the same thread) won't cause the
library function or program to get confused? And can you be sure that
the write will terminate fast enough to not cause time-critical code
to fail? And have you studied the exact semantics of blocking on
pipes? They are truly horrible.

So this is NOT a matter of platform X is safe and platform Y isn't.
Even Linux x86 isn't entirely safe - or wasn't, the last time I heard.

Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From rhettinger at ewtllc.com Tue Sep 5 22:13:02 2006 From: rhettinger at ewtllc.com (Raymond Hettinger) Date: Tue, 05 Sep 2006 13:13:02 -0700 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <1f7befae0609051207t479b8711g4ff3b719e46ca17@mail.gmail.com> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com> <1f7befae0609051207t479b8711g4ff3b719e46ca17@mail.gmail.com> Message-ID: <44FDDA4E.2080506@ewtllc.com> Tim Peters wrote: > upto, sep, rest > >in whatever order they apply. > In the rpartition case, that would be (rest, sep, upto) which seems a bit cryptic. We need some choice of words that clearly mean: * the chopped-off snippet (guaranteed to not contain the separator) * the separator if found * the unchopped remainer of the string (which may contain a separator). Of course, if a clear example is added, the choice of words becomes much less important. Raymond From barry at python.org Tue Sep 5 22:17:20 2006 From: barry at python.org (Barry Warsaw) Date: Tue, 5 Sep 2006 16:17:20 -0400 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <44FDDA4E.2080506@ewtllc.com> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com> <1f7befae0609051207t479b8711g4ff3b719e46ca17@mail.gmail.com> <44FDDA4E.2080506@ewtllc.com> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Sep 5, 2006, at 4:13 PM, Raymond Hettinger wrote: > Tim Peters wrote: > >> upto, sep, rest >> >> in whatever order they apply. >> > In the rpartition case, that would be (rest, sep, upto) which seems a > bit cryptic. 
> > We need some choice of words that clearly mean: > * the chopped-off snippet (guaranteed to not contain the separator) > * the separator if found > * the unchopped remainer of the string (which may contain a > separator). > > Of course, if a clear example is added, the choice of words becomes > much > less important. Ideally too, the terminology (and order) for partition and rpartition would be the same. - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRP3bVXEjvBPtnXfVAQJSKwP9Ev3MPzum3kp4hNDJZyBmEShzPvL2WQv2 VThbxZX1MDfeDXupNwF22bFA5gF/9vZp3nToUqyAbOaPSd93hJSHOdeWdAhR2BdT EICkzBTGCtVkbqu3Ep1N/jb9GJUvgkgNAWtRZVuTWQtJc6AanV9ssTcF6F7ipc6p zgSWeAc0a3E= =W7LV -----END PGP SIGNATURE----- From jimjjewett at gmail.com Tue Sep 5 22:43:20 2006 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue, 5 Sep 2006 16:43:20 -0400 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com> Message-ID: I think I finally figured out where Raymond is coming from. For Raymond, "head" is where he started processing -- for rpartition, this is the .endswith part. For me, "head" is the start of the data structure -- always the .startswith part. We won't resolve that with anything suggesting a sequential order; we need something that makes it clear which part is the large leftover. S.partition(sep) -> (record, sep, remains) S.rpartition(sep) -> (remains, sep, record) I do like the plural (or collective) sound of "remains". I have no solid reasoning for "record" vs "rec" vs "onerec". I would welcome a word that did not suggest it would have further internal structure. 
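For illustration, a minimal sketch of the two result orders under discussion. The sample string and the helper records_from_right() are made up for this example; the names record/remains only follow the wording proposed above and do not come from any patch.

```python
# Sketch only: the (record, sep, remains) names follow the proposal above.
s = "a.b.c"

# partition() splits at the first separator; the large leftover is on the right.
record, sep, remains = s.partition(".")
assert (record, sep, remains) == ("a", ".", "b.c")

# rpartition() splits at the last separator; the large leftover is on the left.
remains_r, sep_r, record_r = s.rpartition(".")
assert (remains_r, sep_r, record_r) == ("a.b", ".", "c")


def records_from_right(text, separator):
    """Hypothetical helper: peel records off the right-hand end of text."""
    out = []
    remains, found, record = text.rpartition(separator)
    while found:
        out.append(record)
        # Only the leftover can still contain a separator, so repeat on it.
        remains, found, record = remains.rpartition(separator)
    out.append(record)
    return out
```

Whatever the three parts end up being called, the point of the example is that only one of them (the leftover) is worth partitioning again.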
-jJ From barry at python.org Tue Sep 5 22:55:44 2006 From: barry at python.org (Barry Warsaw) Date: Tue, 5 Sep 2006 16:55:44 -0400 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com> Message-ID: <118F763E-6B49-4AC2-91CB-961F14D504A0@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Sep 5, 2006, at 4:43 PM, Jim Jewett wrote: > I think I finally figured out where Raymond is coming from. > > For Raymond, "head" is where he started processing -- for rpartition, > this is the .endswith part. > > For me, "head" is the start of the data structure -- always the > .startswith part. > > We won't resolve that with anything suggesting a sequential order; we > need something that makes it clear which part is the large leftover. See, for me, it's all about the results of the operation, not how the results are (supposedly) used. The way I think about it is that I've got some string and I'm looking for some split point within that string. That split point is clearly the "middle" (but "sep" works too) and everything to the right of that split point gets returned in "right" while everything to the left gets returned in "left". I'm less concerned with repeated splits because I probably have as many existing cases where I'm looking for the first split point as where I'm looking repeatedly for split points (think RFC 2822 header splitting -- partition will be awesome for this). The bias with these terms is clearly the English left-to-right order. Actually, that brings up an interesting question: what would happen if you called rpartition on a unicode string representing Hebrew, Arabic, or other RTL language? Do partition and rpartition suddenly switch directions? If not, then I think left-sep-right are fine. If so, then yeah, we probably need something else. 
- -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iQCVAwUBRP3kUHEjvBPtnXfVAQJd6wP+OBtRR22O0A+s/uHF3ACgWhrdZJdEnzEW qimKEWmDCUuK7CFIUsJKteoNNSHjIBgZIMMdnsymgI7CPgPNuB6CUAp8KFFeYvMy PVpMIqNFOFXGUVYf4VA7ED9S7QbbDzHJv32kUUZvbuTniYK9DVMi0O7GStsv1Kg6 insyP+W1EcU= =4aar -----END PGP SIGNATURE----- From pje at telecommunity.com Tue Sep 5 23:07:17 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 05 Sep 2006 17:07:17 -0400 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <118F763E-6B49-4AC2-91CB-961F14D504A0@python.org> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com> Message-ID: <5.1.1.6.0.20060905170453.0269c4e8@sparrow.telecommunity.com> At 04:55 PM 9/5/2006 -0400, Barry Warsaw wrote: >On Sep 5, 2006, at 4:43 PM, Jim Jewett wrote: > > > I think I finally figured out where Raymond is coming from. > > > > For Raymond, "head" is where he started processing -- for rpartition, > > this is the .endswith part. > > > > For me, "head" is the start of the data structure -- always the > > .startswith part. > > > > We won't resolve that with anything suggesting a sequential order; we > > need something that makes it clear which part is the large leftover. > >See, for me, it's all about the results of the operation, not how the >results are (supposedly) used. The way I think about it is that I've >got some string and I'm looking for some split point within that >string. That split point is clearly the "middle" (but "sep" works >too) and everything to the right of that split point gets returned in >"right" while everything to the left gets returned in "left". +1 for left/sep/right for both operations. It's easier to remember a visual correlation (left,sep,right) than it is to try and think about an abstraction in which the order of results has something to do with what direction I found the separator in. 
If I'm repeating from right to left, then of course the "left" is the part I'll want to repeat on. From rhettinger at ewtllc.com Tue Sep 5 23:16:53 2006 From: rhettinger at ewtllc.com (Raymond Hettinger) Date: Tue, 05 Sep 2006 14:16:53 -0700 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <5.1.1.6.0.20060905170453.0269c4e8@sparrow.telecommunity.com> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com> <5.1.1.6.0.20060905170453.0269c4e8@sparrow.telecommunity.com> Message-ID: <44FDE945.7080801@ewtllc.com> An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20060905/9475cb5e/attachment.html From gjcarneiro at gmail.com Wed Sep 6 02:21:11 2006 From: gjcarneiro at gmail.com (Gustavo Carneiro) Date: Wed, 6 Sep 2006 01:21:11 +0100 Subject: [Python-Dev] Signals, threads, blocking C functions In-Reply-To: References: Message-ID: On 9/5/06, Nick Maclaren wrote: [...] > Even if write is atomic, there are gotchas. What if the interrupted > code is doing something to that file at the time? Are you SURE that > an unexpected operation on it (in the same thread) won't cause the > library function of program to get confused? Yes, I'm sure. The technique is based on writing any arbitrary byte onto a well known pipe. Any byte will do. All it matters is that we trick the kernel into realizing there is data to read on the other end of the pipe, so that it can wake up the poll() syscall waiting on it. Only signal handlers ever write to this file descriptor. If one signal handler interrupts another one, it's ok; all it takes is that at least one of them succeeds, and the data itself is irrelevant. Only the mainloop ever reads from the pipe. > And can you be sure that the write will terminate fast enough to not cause time-critical code to fail? Time critical code should block signals. Or should use a real-time OS. 
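The wakeup-pipe technique described above can be sketched in pure Python, under some loud assumptions: this is a Unix-only illustration, the names make_wakeup_pipe(), on_signal() and wait_for_wakeup() are invented here, and a Python-level handler runs inside the interpreter rather than in true C signal context, so it only mirrors the shape of what pygtk/glib would do in C.

```python
import fcntl
import os
import select
import signal


def make_wakeup_pipe():
    """Create a pipe and switch both ends to non-blocking mode."""
    read_fd, write_fd = os.pipe()
    for fd in (read_fd, write_fd):
        flags = fcntl.fcntl(fd, fcntl.F_GETFL)
        fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)
    return read_fd, write_fd


READ_FD, WRITE_FD = make_wakeup_pipe()


def on_signal(signum, frame):
    # Any byte will do. If the pipe is full, a wakeup is already pending,
    # so a failed write can be ignored.
    try:
        os.write(WRITE_FD, b"x")
    except OSError:
        pass


def wait_for_wakeup(timeout):
    """Stand-in for the main loop: sleep until the pipe becomes readable."""
    readable, _, _ = select.select([READ_FD], [], [], timeout)
    if not readable:
        return False
    try:
        os.read(READ_FD, 4096)  # drain; the data itself is irrelevant
    except OSError:
        pass
    return True


# Assumption for the example: SIGUSR1 stands in for whatever signal matters.
signal.signal(signal.SIGUSR1, on_signal)
```

Only the handler writes to the pipe and only the loop reads from it, which is the property the argument above relies on.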
> And have you studied the exact semantics of blocking > on pipes? They are truly horrible. The pipe is changed to async mode; never blocks. We don't care about any data being transferred at all, only the state on the file descriptor changing. > So this is NOT a matter of platform X is safe and platform Y isn't. > Even Linux x86 isn't entirely safe - or wasn't, the last time I heard. We can't prove write() is async safe, but you can't prove it isn't either. From all I know, write() doesn't use malloc(); it only loads a few registers and calls some interrupt (or syscall in amd64). It is plausible that it is perfectly async safe. And that's completely beside the point. We only ask python to call a function of ours every time it handles a signal. You are criticizing the way pygtk or glib will handle the notification, but we are here to discuss how will Python just give us a small hand in solving the signals problem. These are different problem domains. We don't ask Python developers to endorse any particular way of solving our problem. But since Python already snatches away our beloved signals, especially SIGINT, it should at least be courteous enough to give us just a notification when signals happen. There is _no_ other way. From david.nospam.hopwood at blueyonder.co.uk Wed Sep 6 03:08:03 2006 From: david.nospam.hopwood at blueyonder.co.uk (David Hopwood) Date: Wed, 06 Sep 2006 02:08:03 +0100 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <118F763E-6B49-4AC2-91CB-961F14D504A0@python.org> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com> <118F763E-6B49-4AC2-91CB-961F14D504A0@python.org> Message-ID: <44FE1F73.7020206@blueyonder.co.uk> Barry Warsaw wrote: > The bias with these terms is clearly the English left-to-right > order. 
Actually, that brings up an interesting question: what would > happen if you called rpartition on a unicode string representing > Hebrew, Arabic, or other RTL language? Do partition and rpartition > suddenly switch directions? What happens is that rpartition searches the string backwards in logical order (i.e. left to right as the text is written, assuming it only contains Hebrew or Arabic letters, and not numbers or a mixture of scripts). But this is not "switching directions"; it's still searching backwards. You really don't want to think of bidirectional text in terms of presentation, when you're doing processing that should be independent of presentation. > If not, then I think left-sep-right are fine. If so, then yeah, we > probably need something else. +1 for (upto, sep, rest) -- and I think it should be in that order for both partition and rpartition. -- David Hopwood From pje at telecommunity.com Wed Sep 6 03:14:18 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 05 Sep 2006 21:14:18 -0400 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <44FE1F73.7020206@blueyonder.co.uk> References: <118F763E-6B49-4AC2-91CB-961F14D504A0@python.org> <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com> <118F763E-6B49-4AC2-91CB-961F14D504A0@python.org> Message-ID: <5.1.1.6.0.20060905211030.026352c8@sparrow.telecommunity.com> At 02:08 AM 9/6/2006 +0100, David Hopwood wrote: >Barry Warsaw wrote: > > The bias with these terms is clearly the English left-to-right > > order. Actually, that brings up an interesting question: what would > > happen if you called rpartition on a unicode string representing > > Hebrew, Arabic, or other RTL language? Do partition and rpartition > > suddenly switch directions? > >What happens is that rpartition searches the string backwards in logical >order (i.e. 
left to right as the text is written, assuming it only contains >Hebrew or Arabic letters, and not numbers or a mixture of scripts). But this >is not "switching directions"; it's still searching backwards. You really >don't want to think of bidirectional text in terms of presentation, when >you're doing processing that should be independent of presentation. > > > If not, then I think left-sep-right are fine. If so, then yeah, we > > probably need something else. > >+1 for (upto, sep, rest) -- and I think it should be in that order for >both partition and rpartition. It appears the problem is that one group of people thinks in terms of the order of the string, and the other in terms of the order of processing. Both groups agree that both partition and rpartition should be "in the same order" -- but we disagree about what that means. :) Me, I want left/sep/right because I'm in the "string order" camp, and you want upto/sep/rest because you're in the "processing order" camp. From fperez.net at gmail.com Wed Sep 6 06:56:04 2006 From: fperez.net at gmail.com (Fernando Perez) Date: Tue, 05 Sep 2006 22:56:04 -0600 Subject: [Python-Dev] inspect.py very slow under 2.5 Message-ID: Hi all, I know that the 2.5 release is extremely close, so this will probably be 2.5.1 material. I discussed it briefly with Guido at scipy'06, and he asked for some profile-based info, which I've only now had time to gather. I hope this will be of some use, as I think the problem is rather serious. For context: I am the IPython lead developer (http://ipython.scipy.org), and ipython is used as the base shell for several interactive environments, one of which is the mathematics system SAGE (http://modular.math.washington.edu/sage). It was the SAGE lead who first ran into this problem while testing SAGE with 2.5. The issue is the following: ipython provides several exception reporting modes which give a lot more information than python's default tracebacks. 
In order to generate this info, it makes extensive use of the inspect module. The module in ipython responsible for these fancy tracebacks is: http://projects.scipy.org/ipython/ipython/browser/ipython/trunk/IPython/ultraTB.py which is an enhanced port of Ka Ping-Yee's old cgitb module. Under 2.5, the generation of one of these detailed tracebacks is /extremely/ expensive, and the cost goes up very quickly the more modules have been imported into the current session. While in a new ipython session the slowdown is not crippling, under SAGE (which starts with a lot of loaded modules) it is bad enough to make the system nearly unusable. I'm attaching a little script which can be run to show the problem, but you need IPython to be installed to run it. If any of you run ubuntu, fedora, suse or almost any other major linux distro, it's already available via the usual channels. In case you don't want to (or can't) run the attached code, here's a summary of what I see on my machine (ubuntu dapper). 
Using ipython under python 2.4.3, I get:

         2268 function calls (2225 primitive calls) in 0.020 CPU seconds

   Ordered by: call count
   List reduced from 127 to 32 due to restriction <0.25>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      305    0.000    0.000    0.000    0.000 :0(append)
  259/253    0.010    0.000    0.010    0.000 :0(len)
      177    0.000    0.000    0.000    0.000 :0(isinstance)
       90    0.000    0.000    0.000    0.000 :0(match)
       68    0.000    0.000    0.000    0.000 ultraTB.py:539(tokeneater)
       68    0.000    0.000    0.000    0.000 tokenize.py:16 (generate_tokens)
       61    0.000    0.000    0.000    0.000 :0(span)
       57    0.000    0.000    0.000    0.000 sre_parse.py:130(__getitem__)
       56    0.000    0.000    0.000    0.000 string.py:220(lower)

etc, while running the same script under ipython/python2.5 and no other
changes gives:

         230370 function calls (229754 primitive calls) in 3.340 CPU seconds

   Ordered by: call count
   List reduced from 83 to 21 due to restriction <0.25>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    55003    0.420    0.000    0.420    0.000 :0(startswith)
    45026    0.264    0.000    0.264    0.000 :0(endswith)
    20013    0.148    0.000    0.148    0.000 :0(append)
    12138    0.180    0.000    0.660    0.000 posixpath.py:156(islink)
    12138    0.192    0.000    0.192    0.000 :0(lstat)
    12138    0.180    0.000    0.288    0.000 stat.py:60(S_ISLNK)
    12138    0.108    0.000    0.108    0.000 stat.py:29(S_IFMT)
    11838    0.680    0.000    1.244    0.000 posixpath.py:56(join)
     4837    0.052    0.000    0.052    0.000 :0(len)
     4362    0.028    0.000    0.028    0.000 :0(split)
     4362    0.048    0.000    0.100    0.000 posixpath.py:47(isabs)
     3598    0.036    0.000    0.056    0.000 string.py:218(lower)
     3598    0.020    0.000    0.020    0.000 :0(lower)
     2815    0.032    0.000    0.032    0.000 :0(isinstance)
     2809    0.028    0.000    0.028    0.000 :0(join)
     2808    0.264    0.000    0.520    0.000 posixpath.py:374(normpath)
     2632    0.040    0.000    0.068    0.000 inspect.py:35(ismodule)
     2143    0.016    0.000    0.016    0.000 :0(hasattr)
     1884    0.028    0.000    0.444    0.000 posixpath.py:401(abspath)
     1557    0.016    0.000    0.016    0.000 :0(range)
     1078    0.008    0.000    0.044    0.000 inspect.py:342(getfile)

These enormous numbers of calls are the origin of the slowdown, and the
more modules have been imported, the worse it gets. I haven't had time to dive deep into inspect.py to try and fix this, but I figured it would be best to at least report it now. As far as IPython and its user projects is concerned, I'll probably hack things to overwrite inspect.py from 2.4 over the 2.5 version in the exception reporter, because the current code is simply unusable for detailed tracebacks. It would be great if this could be fixed in the trunk at some point. I'll be happy to provide further feedback or put this information elsewhere. Guido suggested initially posting here, but if you prefer it on the SF tracker (even as incomplete as this report is) I'll be glad to do so. Regards, f -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: traceback_timings.py Url: http://mail.python.org/pipermail/python-dev/attachments/20060905/fb0ac8bf/attachment.asc From steve at holdenweb.com Wed Sep 6 10:14:20 2006 From: steve at holdenweb.com (Steve Holden) Date: Wed, 06 Sep 2006 09:14:20 +0100 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <44FDE945.7080801@ewtllc.com> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com> <5.1.1.6.0.20060905170453.0269c4e8@sparrow.telecommunity.com> <44FDE945.7080801@ewtllc.com> Message-ID: Raymond Hettinger wrote: [...] > That's fine with me. I accept there will always be someone who stands > on their head [...] You'd have to be some kind of contortionist to stand on your head. 
willfully-misunderstanding-ly y'rs - steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From ncoghlan at gmail.com Wed Sep 6 10:21:54 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 06 Sep 2006 18:21:54 +1000 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <5.1.1.6.0.20060905170453.0269c4e8@sparrow.telecommunity.com> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com> <5.1.1.6.0.20060905170453.0269c4e8@sparrow.telecommunity.com> Message-ID: <44FE8522.5020703@gmail.com> Phillip J. Eby wrote: > At 04:55 PM 9/5/2006 -0400, Barry Warsaw wrote: >> On Sep 5, 2006, at 4:43 PM, Jim Jewett wrote: >> >>> I think I finally figured out where Raymond is coming from. >>> >>> For Raymond, "head" is where he started processing -- for rpartition, >>> this is the .endswith part. >>> >>> For me, "head" is the start of the data structure -- always the >>> .startswith part. >>> >>> We won't resolve that with anything suggesting a sequential order; we >>> need something that makes it clear which part is the large leftover. >> See, for me, it's all about the results of the operation, not how the >> results are (supposedly) used. The way I think about it is that I've >> got some string and I'm looking for some split point within that >> string. That split point is clearly the "middle" (but "sep" works >> too) and everything to the right of that split point gets returned in >> "right" while everything to the left gets returned in "left". > > +1 for left/sep/right for both operations. It's easier to remember a > visual correlation (left,sep,right) than it is to try and think about an > abstraction in which the order of results has something to do with what > direction I found the separator in. -1. 
The string docs are already lousy with left/right terminology that is flatout wrong when dealing with a script that is displayed with a right-to-left or vertical orientation*. In reality, strings are processed such that index 0 is the first character and index -1 is the last character, regardless of script orientation, but you could be forgiven for not realising that after reading the current string docs. Let's not make that particular problem any worse. I don't see anything wrong with Raymond's 'head, sep, tail' and 'tail, sep, head' terminology (although noting the common postcondition 'sep not in head' in the docstrings might be useful). However, if we're going to use the same result tuple for both, then I'd prefer 'before, sep, after', with the partition() postcondition being 'sep not in before' and the rpartition() postcondition being 'sep not in after'. Those terms are accurate regardless of script orientation. Either way, I suggest putting the postcondition in the docstring to make the difference between the two methods explicit. Regards, Nick. * I acknowledge that Python *code* is almost certainly going to be edited in a left-to-right text editor, because it's an English-based programming language. But the strings that string methods like partition() and rpartition() are used with are quite likely to be coming from or written to a or user interface that uses a native script orientation. 
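The two postconditions suggested above can be written out as a small sketch. The sample strings are invented for illustration, and before/sep/after are just the names floated in this message, not final documentation wording.

```python
# Sketch only: 'before, sep, after' are the names suggested above.
text = "key=value=more"

before, sep, after = text.partition("=")
assert sep not in before      # partition(): no separator before the split point
assert (before, after) == ("key", "value=more")

before, sep, after = text.rpartition("=")
assert sep not in after       # rpartition(): no separator after the split point
assert (before, after) == ("key=value", "more")

# Index 0 is the first character in logical order, however the script is
# displayed: here alef, "=", bet.
hebrew = u"\u05d0=\u05d1"
assert hebrew[0] == u"\u05d0"
assert hebrew.partition(u"=") == (u"\u05d0", u"=", u"\u05d1")
```

Nothing in the last three lines depends on display direction, which is why 'before' and 'after' stay accurate where 'left' and 'right' may not.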
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From steve at holdenweb.com Wed Sep 6 10:32:19 2006 From: steve at holdenweb.com (Steve Holden) Date: Wed, 06 Sep 2006 09:32:19 +0100 Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition() In-Reply-To: <44FE8522.5020703@gmail.com> References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com> <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com> <5.1.1.6.0.20060905170453.0269c4e8@sparrow.telecommunity.com> <44FE8522.5020703@gmail.com> Message-ID: Nick Coghlan wrote: > Phillip J. Eby wrote: > >>At 04:55 PM 9/5/2006 -0400, Barry Warsaw wrote: >> >>>On Sep 5, 2006, at 4:43 PM, Jim Jewett wrote: >>> >>> >>>>I think I finally figured out where Raymond is coming from. >>>> >>>>For Raymond, "head" is where he started processing -- for rpartition, >>>>this is the .endswith part. >>>> >>>>For me, "head" is the start of the data structure -- always the >>>>.startswith part. >>>> >>>>We won't resolve that with anything suggesting a sequential order; we >>>>need something that makes it clear which part is the large leftover. >>> >>>See, for me, it's all about the results of the operation, not how the >>>results are (supposedly) used. The way I think about it is that I've >>>got some string and I'm looking for some split point within that >>>string. That split point is clearly the "middle" (but "sep" works >>>too) and everything to the right of that split point gets returned in >>>"right" while everything to the left gets returned in "left". >> >>+1 for left/sep/right for both operations. It's easier to remember a >>visual correlation (left,sep,right) than it is to try and think about an >>abstraction in which the order of results has something to do with what >>direction I found the separator in. > > > -1. 
The string docs are already lousy with left/right terminology that is > flatout wrong when dealing with a script that is displayed with a > right-to-left or vertical orientation*. In reality, strings are processed such > that index 0 is the first character and index -1 is the last character, > regardless of script orientation, but you could be forgiven for not realising > that after reading the current string docs. Let's not make that particular > problem any worse. > > I don't see anything wrong with Raymond's 'head, sep, tail' and 'tail, sep, > head' terminology (although noting the common postcondition 'sep not in head' > in the docstrings might be useful). > > However, if we're going to use the same result tuple for both, then I'd prefer > 'before, sep, after', with the partition() postcondition being 'sep not in > before' and the rpartition() postcondition being 'sep not in after'. Those > terms are accurate regardless of script orientation. > > Either way, I suggest putting the postcondition in the docstring to make the > difference between the two methods explicit. > > Regards, > Nick. > > * I acknowledge that Python *code* is almost certainly going to be edited in a > left-to-right text editor, because it's an English-based programming language. > But the strings that string methods like partition() and rpartition() are used > with are quite likely to be coming from or written to a or user interface that > uses a native script orientation. > Perhaps we should be thinking "beginning" and "end" here, though it seems as though it won't be possible to find a terminology that will be intuitively obvious to everyone. 
regards
 Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://holdenweb.blogspot.com
Recent Ramblings http://del.icio.us/steve.holden

From g.brandl at gmx.net Wed Sep 6 10:39:07 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 06 Sep 2006 10:39:07 +0200
Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition()
In-Reply-To:
References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>
 <200609051351.50494.fdrake@acm.org> <44FDC2CE.1040902@ewtllc.com>
 <5.1.1.6.0.20060905170453.0269c4e8@sparrow.telecommunity.com>
 <44FE8522.5020703@gmail.com>
Message-ID:

Steve Holden wrote:

>> * I acknowledge that Python *code* is almost certainly going to be edited in a
>> left-to-right text editor, because it's an English-based programming language.
>> But the strings that string methods like partition() and rpartition() are used
>> with are quite likely to be coming from or written to a or user interface that
>> uses a native script orientation.
>>
> Perhaps we should be thinking "beginning" and "end" here, though it
> seems as though it won't be possible to find a terminology that will be
> intuitively obvious to everyone.

Which is why an example is absolutely necessary and will make things clear
for everyone.

Georg

From ralf at brainbot.com Wed Sep 6 12:14:09 2006
From: ralf at brainbot.com (Ralf Schmitt)
Date: Wed, 06 Sep 2006 12:14:09 +0200
Subject: [Python-Dev] inspect.py very slow under 2.5
In-Reply-To:
References:
Message-ID: <44FE9F71.3090903@brainbot.com>

Fernando Perez wrote:
>
> These enormous numbers of calls are the origin of the slowdown, and the more
> modules have been imported, the worse it gets.

--- /exp/lib/python2.5/inspect.py 2006-08-28 11:53:36.000000000 +0200
+++ inspect.py 2006-09-06 12:10:45.000000000 +0200
@@ -444,7 +444,8 @@
     in the file and the line number indexes a line in that list.  An IOError
     is raised if the source code cannot be retrieved."""
     file = getsourcefile(object) or getfile(object)
-    module = getmodule(object)
+    #module = getmodule(object)
+    module = None
     if module:
         lines = linecache.getlines(file, module.__dict__)
     else:

The problem seems to originate from the module=getmodule(object) in
findsource. If I comment out that code (or rather do a module=None),
things seem to be back as normal. (linecache.getlines has been called
with a None module in python 2.4's inspect.py).

- Ralf

From mwh at python.net Wed Sep 6 12:34:23 2006
From: mwh at python.net (Michael Hudson)
Date: Wed, 06 Sep 2006 11:34:23 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: (Gustavo Carneiro's message of "Mon, 4 Sep 2006 14:52:36 +0000")
References:
Message-ID: <2m8xkxnv0w.fsf@starship.python.net>

"Gustavo Carneiro" writes:

> On 9/4/06, Nick Maclaren wrote:
>> "Gustavo Carneiro" wrote:
>> > I am now thinking of something along these lines:
>> > typedef void (*PyPendingCallNotify)(void *user_data);
>> > PyAPI_FUNC(void) Py_AddPendingCallNotify(PyPendingCallNotify callback,
>> > void *user_data);
>> > PyAPI_FUNC(void) Py_RemovePendingCallNotify(PyPendingCallNotify
>> > callback, void *user_data);
>>
>> Why would that help? The problems are semantic, not syntactic.
>>
>> Anthony Baxter isn't exaggerating the problem, despite what you may
>> think from his posting.
>
> You guys are tough customers to please.

Yes.

> I am just trying to solve a problem here, not create a new one; you
> have to believe me.

We believe you, but you are stirring the ashes of old problems.

> 1. In PyGTK we have a gobject.MainLoop.run() method, which blocks
> essentially forever in a poll() system call, and only wakes if/when it
> has to process timeout or IO event;
> 2. When we only have one thread, we can guarantee that e.g.
> SIGINT will always be caught by the thread running the > g_main_loop_run(), so we know poll() will be interrupted and a EINTR > will be generated, giving us control temporarily back to check for > python signals; > 3. When we have multiple thread, we cannot make this assumption, > so instead we install a timeout to periodically check for signals. > > We want to get rid of timeouts. Now my idea: add a Python API to say: > "dear Python, please call me when you start having pending calls, > even if from a signal handler context, ok?" This seems a reasonable proposal. But it's totally a Python 2.6 thing, so how about taking a deep breath, working on a patch and submitting it when it's ready? Having to wake a process up a few times a second is ugly and annoying, sure, but it is not a release delaying problem. Cheers, mwh -- It is never worth a first class man's time to express a majority opinion. By definition, there are plenty of others to do that. -- G. H. Hardy From ncoghlan at gmail.com Wed Sep 6 12:54:45 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 06 Sep 2006 20:54:45 +1000 Subject: [Python-Dev] inspect.py very slow under 2.5 In-Reply-To: <44FE9F71.3090903@brainbot.com> References: <44FE9F71.3090903@brainbot.com> Message-ID: <44FEA8F5.1000700@gmail.com> Ralf Schmitt wrote: > The problem seems to originate from the module=getmodule(object) in > findsource. If I outcomment that code (or rather do a module=None), > things seem to be back as normal. (linecache.getlines has been called > with a None module in python 2.4's inspect.py). It looks like the problem is the call to getabspath() in getmodule(). This happens every time, even if the file name is already in the modulesbyfile cache. This calls os.path.abspath() and os.path.normpath() every time that inspect.findsource() is called. That can be fixed by having findsource() pass the filename argument to getmodule(), and adding a check of the modulesbyfile cache *before* the call to getabspath(). 
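As a rough sketch of the idea (not the patch itself): the function and cache names below are hypothetical stand-ins for inspect's internals. The cache lookup is done before any path normalisation; skipping modules whose __file__ has not changed is an extra assumption of this sketch.

```python
import os
import sys

# Hypothetical stand-ins for inspect's module-level caches; not the real patch.
_modules_by_file = {}     # absolute path -> module name
_files_by_modname = {}    # module name -> last seen __file__


def find_module_for_file(filename):
    """Return the module loaded from filename, or None if there is none."""
    # Cheap check first: no abspath()/normpath() work for a cached name.
    if filename in _modules_by_file:
        return sys.modules.get(_modules_by_file[filename])
    # Assumption: only modules that are new, or whose __file__ changed since
    # the last scan, need their path normalised again.
    for modname, module in list(sys.modules.items()):
        mod_file = getattr(module, "__file__", None)
        if not mod_file or _files_by_modname.get(modname) == mod_file:
            continue
        _files_by_modname[modname] = mod_file
        _modules_by_file[os.path.abspath(mod_file)] = modname
    return sys.modules.get(_modules_by_file.get(os.path.abspath(filename)))
```

The first lookup for a given file still pays for one scan of sys.modules; repeated lookups of the same file, which is what a traceback formatter does, go through the dictionary only.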
Can you try this patch and see if you get 2.4 level performance back on
Fernando's test?:

http://www.python.org/sf/1553314

(Assigned to Neal in the hopes of making 2.5rc2)

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From ralf at brainbot.com Wed Sep 6 13:22:45 2006
From: ralf at brainbot.com (Ralf Schmitt)
Date: Wed, 06 Sep 2006 13:22:45 +0200
Subject: [Python-Dev] inspect.py very slow under 2.5
In-Reply-To: <44FEA8F5.1000700@gmail.com>
References: <44FE9F71.3090903@brainbot.com> <44FEA8F5.1000700@gmail.com>
Message-ID: <44FEAF85.1000107@brainbot.com>

Nick Coghlan wrote:
>
> It looks like the problem is the call to getabspath() in getmodule(). This
> happens every time, even if the file name is already in the modulesbyfile
> cache. This calls os.path.abspath() and os.path.normpath() every time that
> inspect.findsource() is called.
>
> That can be fixed by having findsource() pass the filename argument to
> getmodule(), and adding a check of the modulesbyfile cache *before* the call
> to getabspath().
>
> Can you try this patch and see if you get 2.4 level performance back on
> Fernando's test?:

no. this doesn't work. getmodule always iterates over
sys.modules.values() and only returns None afterwards.
One would have to cache the bad file value, or only inspect new/changed
modules from sys.modules.

>
> http://www.python.org/sf/1553314
>
> (Assigned to Neal in the hopes of making 2.5rc2)
>
> Cheers,
> Nick.
>

From g.brandl at gmx.net Wed Sep 6 14:41:19 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 06 Sep 2006 14:41:19 +0200
Subject: [Python-Dev] Exception message for invalid with statement usage
Message-ID:

Current trunk:

>>> with 1:
...     print "1"
...
Traceback (most recent call last):
  File "", line 1, in
AttributeError: 'int' object has no attribute '__exit__'

Isn't that a bit crude?
For "for i in 1" there's a better
error message, so why shouldn't the above give a
TypeError: 'int' object is not a context manager

?

Georg

From ncoghlan at gmail.com Wed Sep 6 15:06:33 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 06 Sep 2006 23:06:33 +1000
Subject: [Python-Dev] inspect.py very slow under 2.5
In-Reply-To: <44FEAF85.1000107@brainbot.com>
References: <44FE9F71.3090903@brainbot.com> <44FEA8F5.1000700@gmail.com> <44FEAF85.1000107@brainbot.com>
Message-ID: <44FEC7D9.80500@gmail.com>

Ralf Schmitt wrote:
> Nick Coghlan wrote:
>>
>> It looks like the problem is the call to getabspath() in getmodule().
>> This happens every time, even if the file name is already in the
>> modulesbyfile cache. This calls os.path.abspath() and
>> os.path.normpath() every time that inspect.findsource() is called.
>>
>> That can be fixed by having findsource() pass the filename argument to
>> getmodule(), and adding a check of the modulesbyfile cache *before*
>> the call to getabspath().
>>
>> Can you try this patch and see if you get 2.4 level performance back
>> on Fernando's test?:
>
> no. this doesn't work. getmodule always iterates over
> sys.modules.values() and only returns None afterwards.
> One would have to cache the bad file value, or only inspect new/changed
> modules from sys.modules.

Good point. I modified the patch so it does the latter (it only calls
getabspath() again for a module if the value of module.__file__ changes).

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From ncoghlan at gmail.com Wed Sep 6 15:11:31 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 06 Sep 2006 23:11:31 +1000
Subject: [Python-Dev] Exception message for invalid with statement usage
In-Reply-To:
References:
Message-ID: <44FEC903.7060303@gmail.com>

Georg Brandl wrote:
> Current trunk:
>
> >>> with 1:
> ...     print "1"
> ...
> Traceback (most recent call last):
>   File "", line 1, in
> AttributeError: 'int' object has no attribute '__exit__'
>
> Isn't that a bit crude? For "for i in 1" there's a better
> error message, so why shouldn't the above give a
> TypeError: 'int' object is not a context manager

The for loop has a nice error message because it starts with its own opcode,
but the with statement translates pretty much to the code in PEP 343. There's
a special opcode at the end to help with unwinding the stack, but at the start
it's just normal attribute retrieval opcodes for __enter__ and __exit__.

>>> def f():
...     with 1:
...         pass
...
>>> dis.dis(f)
  2           0 LOAD_CONST               1 (1)
              3 DUP_TOP
              4 LOAD_ATTR                0 (__exit__)
              7 STORE_FAST               0 (_[1])
             10 LOAD_ATTR                1 (__enter__)
             13 CALL_FUNCTION            0
             16 POP_TOP
             17 SETUP_FINALLY            4 (to 24)

  3          20 POP_BLOCK
             21 LOAD_CONST               0 (None)
        >>   24 LOAD_FAST                0 (_[1])
             27 DELETE_FAST              0 (_[1])
             30 WITH_CLEANUP
             31 END_FINALLY
             32 LOAD_CONST               0 (None)
             35 RETURN_VALUE

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From ralf at brainbot.com Wed Sep 6 16:53:30 2006
From: ralf at brainbot.com (Ralf Schmitt)
Date: Wed, 06 Sep 2006 16:53:30 +0200
Subject: [Python-Dev] inspect.py very slow under 2.5
In-Reply-To: <44FEC7D9.80500@gmail.com>
References: <44FE9F71.3090903@brainbot.com> <44FEA8F5.1000700@gmail.com> <44FEAF85.1000107@brainbot.com> <44FEC7D9.80500@gmail.com>
Message-ID: <44FEE0EA.7000303@brainbot.com>

Nick Coghlan wrote:
> Ralf Schmitt wrote:
>> Nick Coghlan wrote:
>>> It looks like the problem is the call to getabspath() in getmodule().
>>> This happens every time, even if the file name is already in the
>>> modulesbyfile cache. This calls os.path.abspath() and
>>> os.path.normpath() every time that inspect.findsource() is called.
>>>
>>> That can be fixed by having findsource() pass the filename argument to
>>> getmodule(), and adding a check of the modulesbyfile cache *before*
>>> the call to getabspath().
>>>
>>> Can you try this patch and see if you get 2.4 level performance back
>>> on Fernando's test?:
>> no. this doesn't work. getmodule always iterates over
>> sys.modules.values() and only returns None afterwards.
>> One would have to cache the bad file value, or only inspect new/changed
>> modules from sys.modules.
>
> Good point. I modified the patch so it does the latter (it only calls
> getabspath() again for a module if the value of module.__file__ changes).

with _filesbymodname[modname] = file changed to
_filesbymodname[modname] = f
it seems to work ok.

diff -r d41ffd2faa28 inspect.py
--- a/inspect.py        Wed Sep 06 13:01:12 2006 +0200
+++ b/inspect.py        Wed Sep 06 16:52:39 2006 +0200
@@ -403,6 +403,7 @@ def getabsfile(object, _filename=None):
     return os.path.normcase(os.path.abspath(_filename))

 modulesbyfile = {}
+_filesbymodname = {}

 def getmodule(object, _filename=None):
     """Return the module an object was defined in, or None if not found."""
@@ -410,17 +411,23 @@ def getmodule(object, _filename=None):
         return object
     if hasattr(object, '__module__'):
         return sys.modules.get(object.__module__)
+    if _filename is not None and _filename in modulesbyfile:
+        return sys.modules.get(modulesbyfile[_filename])
     try:
         file = getabsfile(object, _filename)
     except TypeError:
         return None
     if file in modulesbyfile:
         return sys.modules.get(modulesbyfile[file])
-    for module in sys.modules.values():
+    for modname, module in sys.modules.iteritems():
         if ismodule(module) and hasattr(module, '__file__'):
+            f = module.__file__
+            if f == _filesbymodname.get(modname, None):
+                continue
+            _filesbymodname[modname] = f
             f = getabsfile(module)
             modulesbyfile[f] = modulesbyfile[
-                os.path.realpath(f)] = module.__name__
+                os.path.realpath(f)] = modname
     if file in modulesbyfile:
         return sys.modules.get(modulesbyfile[file])
     main = sys.modules['__main__']
@@ -444,7 +451,7 @@ def findsource(object):
     in the file and the line number indexes a line in that list.  An IOError
     is raised if the source code cannot be retrieved."""
     file = getsourcefile(object) or getfile(object)
-    module = getmodule(object)
+    module = getmodule(object, file)
     if module:
         lines = linecache.getlines(file, module.__dict__)
     else:

From guido at python.org Wed Sep 6 17:46:21 2006
From: guido at python.org (Guido van Rossum)
Date: Wed, 6 Sep 2006 08:46:21 -0700
Subject: [Python-Dev] Exception message for invalid with statement usage
In-Reply-To:
References:
Message-ID:

IMO it's fine. The only time you'll see this in reality is when
someone passed you the wrong type of object by mistake, and then the
type mentioned in the message is plenty help to debug it. Anyone with
even a slight understanding of 'with' knows it involves '__exit__',
and the line number should be a big fat hint, too.

On 9/6/06, Georg Brandl wrote:
> Current trunk:
>
> >>> with 1:
> ...     print "1"
> ...
> Traceback (most recent call last):
>   File "", line 1, in
> AttributeError: 'int' object has no attribute '__exit__'
>
> Isn't that a bit crude? For "for i in 1" there's a better
> error message, so why shouldn't the above give a
> TypeError: 'int' object is not a context manager
>
> ?
>
> Georg
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org
>

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim.peters at gmail.com Wed Sep 6 22:44:31 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 6 Sep 2006 16:44:31 -0400
Subject: [Python-Dev] Cross-platform math functions?
In-Reply-To: <44FD050F.20901@gmx.de>
References: <44FC9C53.5060304@gmx.de> <1f7befae0609041606m13c5c24bm59ce229b27f32e9d@mail.gmail.com> <44FD050F.20901@gmx.de>
Message-ID: <1f7befae0609061344x61b1ae87vdd523fceb32a12d7@mail.gmail.com>

[Tim Peters]
>> Package a Python wrapper and see how popular it becomes.  Some reasons
>> against trying to standardize on fdlibm were explained here:
>>
>> http://mail.python.org/pipermail/python-list/2005-July/290164.html

[Andreas Raab]
> Thanks, these are good points. About speed, do you have any good
> benchmarks available?

Certainly not for "typical Python use" -- doubt such a benchmark
exists.  Some people use sqrt once in a blue moon, others make heavy
use of many libm functions over millions & millions of floats, and in
some apps extremely heavy use is made where speed is everything and
accuracy doesn't much matter at all (e.g., gross plotting).  I'd ask
on numeric Python lists, and (e.g.) people working with visualization.

> In my experience fdlibm is quite reasonable for speed in the context of use
> by dynamic languages (i.e., counting allocation overheads, lookup and send
> performance etc)

"Reasonable" for which purpose(s), specifically?  Some people would
certainly care about a 5% slowdown, while most others wouldn't, but
one thing to avoid is pissing off the people who use a thing the most
;-)

> but since I'm not a Python expert I'd appreciate some help with realistic
> benchmarks.

As above, python-dev isn't a likely place to look for such answers.

> ...
> Agreed. Thus my question if someone had already done this ;-)

Not that I know of, although my understanding (which may be wrong) is
that glibc's current math functions started as a copy of fdlibm.
From gustavo at niemeyer.net Thu Sep 7 01:24:23 2006
From: gustavo at niemeyer.net (Gustavo Niemeyer)
Date: Wed, 6 Sep 2006 20:24:23 -0300
Subject: [Python-Dev] buildbot breakage
Message-ID: <20060906232422.GA8620@niemeyer.net>

Some buildbots will fail because they got revision r51793, and it
has a change I made to fix a problem in the subprocess module.

Please do not roll back any changes.  I'm handling the issue.

Also notice that there's no broken code there.  The problem is
that the issue in subprocess is related to stdout/stderr handling,
and I'm having trouble making buildbot happy while keeping the
new tests in place.

I apologise for any inconvenience this may cause.

--
Gustavo Niemeyer
http://niemeyer.net

From gustavo at niemeyer.net Thu Sep 7 01:45:50 2006
From: gustavo at niemeyer.net (Gustavo Niemeyer)
Date: Wed, 6 Sep 2006 20:45:50 -0300
Subject: [Python-Dev] buildbot breakage
In-Reply-To: <20060906232422.GA8620@niemeyer.net>
References: <20060906232422.GA8620@niemeyer.net>
Message-ID: <20060906234550.GA9265@niemeyer.net>

> Some buildbots will fail because they got revision r51793, and it
> has a change I made to fix a problem in the subprocess module.

I've removed the offending test in r51794 and buildbots should be
happy again.

One of the ways of exploring the issue reported is using sys.stdout
as the stdout keyword, such as:

  subprocess.call([...], stdout=sys.stdout)

It breaks because it ends up closing one of the standard descriptors
of the subprocess.  Unfortunately we can't test it that way because
buildbot uses a StringIO in sys.stdout.  I kept the test which uses
stdout=1, and removed the one expecting sys.stdout to be a "normal"
file.
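
The distinction drawn above, between a "normal" file and the StringIO that
buildbot installs as sys.stdout, can be illustrated with a small check. This
is a sketch written for current Python 3 rather than the 2006 test code, and
has_real_descriptor is a hypothetical helper name: an in-memory stream has no
OS-level file descriptor, so it cannot stand in for a real file when a
descriptor is needed.

```python
import io


def has_real_descriptor(stream):
    """Return True if stream is backed by an OS-level file descriptor."""
    try:
        stream.fileno()
    except (AttributeError, io.UnsupportedOperation):
        # In-memory streams either lack fileno() or refuse the call.
        return False
    return True
```

A test that depends on sys.stdout being a real file could use a check of
this kind to decide whether it is able to run in a given environment.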
Sorry for the trouble,

--
Gustavo Niemeyer
http://niemeyer.net

From python-dev at zesty.ca Thu Sep 7 05:38:07 2006
From: python-dev at zesty.ca (Ka-Ping Yee)
Date: Wed, 6 Sep 2006 22:38:07 -0500 (CDT)
Subject: [Python-Dev] new security doc using object-capabilities
In-Reply-To:
References:
Message-ID:

Hi Brett,

Here are some comments on your proposal.  Sorry this took so long.
I apologize if any of these comments are out of date (but also look
forward to your answers to some of the questions, as they'll help
me understand some more of the details of your proposal).  Thanks!

> Introduction
> ///////////////////////////////////////
[...]
> Throughout this document several terms are going to be used.  A
> "sandboxed interpreter" is one where the built-in namespace is not the
> same as that of an interpreter whose built-ins were unaltered, which
> is called an "unprotected interpreter".

Is this a definition or an implementation choice?  As in, are you
defining "sandboxed" to mean "with altered built-ins" or just
"restricted in some way", and does the above mean to imply that
altering the built-ins is what triggers other kinds of restrictions
(as it did in Python's old restricted execution mode)?

> A "bare interpreter" is one where the built-in namespace has been
> stripped down to the bare minimum needed to run any form of basic Python
> program.  This means that all atomic types (i.e., syntactically
> supported types), ``object``, and the exceptions provided by the
> ``exceptions`` module are considered in the built-in namespace.  There
> have also been no imports executed in the interpreter.

Is a "bare interpreter" just one example of a sandboxed interpreter,
or are all sandboxed interpreters in your design initially bare (i.e.
"sandboxed" = "bare" + zero or more granted authorities)?

> The "security domain" is the boundary at which security is cared
> about.  For this discussion, it is the interpreter.
It might be clearer to say (if i understand correctly) "Each interpreter
is a separate security domain."

Many interpreters can run within a single operating system process,
right?  Could you say a bit about what sort of concurrency model you
have in mind?  How would this interact (if at all) with use of the
existing threading functionality?

> The "powerbox" is the thing that possesses the ultimate power in the
> system.  In our case it is the Python process.

This could also be the application process, right?

> Rationale
> ///////////////////////////////////////
[...]
> For instance, think of an application that supports a plug-in system
> with Python as the language used for writing plug-ins.  You do not
> want to have to examine every plug-in you download to make sure that
> it does not alter your filesystem if you can help it.  With a proper
> security model and implementation in place this hindrance of having
> to examine all code you execute should be alleviated.

I'm glad to have this use case set out early in the document, so the
reader can keep it in mind as an example while reading about the model.

> Approaches to Security
> ///////////////////////////////////////
>
> There are essentially two types of security: who-I-am
> (permissions-based) security and what-I-have (authority-based)
> security.

As Mark Miller mentioned in another message, your descriptions of
"who-I-am" security and "what-I-have" security make sense, but
they don't correspond to "permission" vs. "authority".  They
correspond to "identity-based" vs. "authority-based" security.

> Difficulties in Python for Object-Capabilities
> //////////////////////////////////////////////
[...]
> Three key requirements for providing a proper perimeter defence are
> private namespaces, immutable shared state across domains, and
> unforgeable references.

Nice summary.

> Problem of No Private Namespace
> ===============================
[...]
> The Python language has no such thing as a private namespace.
Don't local scopes count as private namespaces?  It seems clear
that they aren't designed with the intention of being exposed,
unlike other namespaces in Python.

> It also makes providing security at the object level using
> object-capabilities non-existent in pure Python code.

I don't think this is necessarily the case.  No Python code i've
ever seen expects to be able to invade the local scopes of other
functions, so you could use them as private namespaces.  There
are two ways i've seen to invade local scopes:

(a) Use gc.get_referents to get back from a cell object
    to its contents.

(b) Compare the cell object to another cell object, thereby
    causing __eq__ to be invoked to compare the contents of
    the cells.

So you could protect local scopes by prohibiting these or by
simply turning off access to func_closure.  It's clear that hardly
any code depends on these introspection features, so it would be
reasonable to turn them off in a sandboxed interpreter.  (It seems
you would have to turn off some introspection features anyway in
order to have reliable import guards.)

> Problem of Mutable Shared State
> ===============================
[...]
> Regardless, sharing of state that can be influenced by another
> interpreter is not safe for object-capabilities.

Yup.

> Threat Model
> ///////////////////////////////////////

Good to see this specified here.  I like the way you've broken this
down.

> * An interpreter cannot gain abilities the Python process possesses
>   without explicitly being given those abilities.

It would be good to enumerate which abilities you're referring to in
this item.  For example, a bare interpreter should be able to allocate
memory and call most of the built-in functions, but should not be able
to open network connections.

> * An interpreter cannot influence another interpreter directly at the
>   Python level without explicitly allowing it.

You mean, without some other entity explicitly allowing it, right?
What would that other entity be -- presumably the interpreter that
spawned both of these sub-interpreters?

> * An interpreter cannot use operating system resources without being
>   explicitly given those resources.

Okay.

> * A bare Python interpreter is always trusted.

What does "trusted" mean in the above?

> * Python bytecode is always distrusted.
> * Pure Python source code is always safe on its own.

It would be helpful to clarify "safe" here.  I assume by "safe" you
mean that the Python source code can express whatever it wants,
including potentially dangerous activities, but when run in a bare
or sandboxed interpreter it cannot have harmful effects.  But then
in what sense does the "safety" have to do with the Python source code
rather than the restrictions on the interpreter?

Would it be correct to say:

  + We want to guarantee that Python source code cannot violate
    the restrictions in a restricted or bare interpreter.

  + We do not prevent arbitrary Python bytecode from violating
    these restrictions, and assume that it can.

>     + Malicious abilities are derived from C extension modules,
>       built-in modules, and unsafe types implemented in C, not from
>       pure Python source.

By "malicious" do you just mean "anything that isn't accessible to
a bare interpreter"?

> * A sub-interpreter started by another interpreter does not inherit
>   any state.

Do you envision a tree of interpreters and sub-interpreters?  Can the
levels of spawning get arbitrarily deep?

If i am visualizing your model correctly, maybe it would be useful to
introduce the term "parent", where each interpreter has as its parent
either the Python process or another interpreter.  Then you could say
that each interpreter acquires authority only by explicit granting from
its parent.

Then i have another question: can an interpreter acquire authorities
only when it is started, or can it acquire them while it is running,
and how?
> Implementation
> ///////////////////////////////////////
>
> Guiding Principles
> ========================
>
> To begin, the Python process garners all power as the powerbox.  It is
> up to the process to initially hand out access to resources and
> abilities to interpreters.  This might take the form of an interpreter
> with all abilities granted (i.e., a standard interpreter as launched
> when you execute Python), which then creates sub-interpreters with
> sandboxed abilities.  Another alternative is only creating
> interpreters with sandboxed abilities (i.e., Python being embedded in
> an application that only uses sandboxed interpreters).

This sounds like part of your design to me.  It might help to have
this earlier in the document (maybe even with an example diagram of a
tree of interpreters).

> All security measures should never have to ask who an interpreter is.
> This means that what abilities an interpreter has should not be stored
> at the interpreter level when the security can use a proxy to protect
> a resource.  This means that while supporting a memory cap can
> have a per-interpreter setting that is checked (because access to the
> operating system's memory allocator is not supported at the program
> level), protecting files and imports should not have such a
> per-interpreter protection at such a low level (because those can have
> extension module proxies to provide the security).

It might be good to declare two categories of resources -- those
protected by object hiding and those protected by a per-interpreter
setting -- and make lists.

> Backwards-compatibility will not be a hindrance upon the design or
> implementation of the security model.  Because the security model will
> inherently remove resources and abilities that existing code expects,
> it is not reasonable to expect existing code to work in a sandboxed
> interpreter.

You might qualify the last statement a bit.  For example, a Python
implementation of a pure algorithm (e.g.
string processing, data compression, etc.) would still work in a
sandboxed interpreter.

> Keeping Python "pythonic" is required for all design decisions.

As Lawrence Oluyede also mentioned, it would be helpful to say a
little more about what "pythonic" means.

> Restricting what is in the built-in namespace and safe-guarding
> the interpreter (which includes safe-guarding the built-in types) is
> where security will come from.

Sounds good.

> Abilities of a Standard Sandboxed Interpreter
> =============================================
>
[...]
> * You cannot open any files directly.
> * Importation
>     + You can import any pure Python module.
>     + You cannot import any Python bytecode module.
>     + You cannot import any C extension module.
>     + You cannot import any built-in module.
> * You cannot find out any information about the operating system you
>   are running on.
> * Only safe built-ins are provided.

This looks reasonable.  This is probably a good place to itemize
exactly which built-ins are considered safe.

> Imports
> -------
>
> A proxy for protecting imports will be provided.  This is done by
> setting the ``__import__()`` function in the built-in namespace of the
> sandboxed interpreter to a proxied version of the function.
>
> The planned proxy will take in a passed-in function to use for the
> import and a whitelist of C extension modules and built-in modules to
> allow importation of.

Presumably these are passed in to the proxy's constructor.

> If an import would lead to loading an extension
> or built-in module, it is checked against the whitelist and allowed
> to be imported based on that list.  All .pyc and .pyo files will not
> be imported.  All .py files will be imported.

I'm unclear about this.  Is the whitelist a list of module names only,
or of filenames with extensions?  Does the normal path-searching process
take place or can it be restricted in some way?
Would it simplify the security analysis to have the whitelist be a
dictionary that maps module names to absolute pathnames?

If both the .py and .pyc are present, the normal import would find the
.pyc file; would the import proxy reject such an import or ignore it
and recompile the .py instead?

> It must be warned that importing any C extension module is dangerous.

Right.

> Implementing Import in Python
> +++++++++++++++++++++++++++++
>
> To help facilitate in the exposure of more of what importation
> requires (and thus make implementing a proxy easier), the import
> machinery should be rewritten in Python.

This seems like a good idea.  Can you identify which minimum essential
pieces of the import machinery have to be written in C?

> Sanitizing Built-In Types
> -------------------------
[...]
> Constructors
> ++++++++++++
>
> Almost all of Python's built-in types
> contain a constructor that allows code to create a new instance of a
> type as long as you have the type itself.  Unfortunately this does not
> work in an object-capabilities system without either providing a proxy
> to the constructor or just turning it off.

The existence of the constructor isn't (by itself) the problem.
The problem is that both of the following are true:

(a) From any object you can get its type object.
(b) Using any type object you can construct a new instance.

So, you can control this either by hiding the type object, separating
the constructor from the type, or disabling the constructor.

> Types whose constructors are considered dangerous are:
>
> * ``file``
>     + Will definitely use the ``open()`` built-in.
> * code objects
> * XXX sockets?
> * XXX type?
> * XXX

Looks good so far.  Not sure i see what's dangerous about 'type'.

> Filesystem Information
> ++++++++++++++++++++++
>
> When running code in a sandboxed interpreter, POLA suggests that you
> do not want to expose information about your environment on top of
> protecting its use.
> This means that filesystem paths typically should
> not be exposed.  Unfortunately, Python exposes file paths all over the
> place:
>
> * Modules
>     + ``__file__`` attribute
> * Code objects
>     + ``co_filename`` attribute
> * Packages
>     + ``__path__`` attribute
> * XXX
>
> XXX how to expose safely?

It seems that in most cases, a single Python object is associated with
a single pathname.  If that's true in general, one solution would be
to provide an introspection function named 'getpath' or something
similar that would get the path associated with any object.  This
function might go in a module containing all the introspection
functions, so imports of that module could be easily restricted.

> Mutable Shared State
> ++++++++++++++++++++
>
> Because built-in types are shared between interpreters, they cannot
> expose any mutable shared state.  Unfortunately, as it stands, some
> do.  Below is a list of types that share some form of dangerous state,
> how they share it, and how to fix the problem:
>
> * ``object``
>     + ``__subclasses__()`` function
>         - Remove the function; never seen used in real-world code.
> * XXX

Okay, more to work out here.  :)

> Perimeter Defences Between a Created Interpreter and Its Creator
> ----------------------------------------------------------------
>
> The plan is to allow interpreters to instantiate sandboxed
> interpreters safely.  By using the creating interpreter's abilities to
> provide abilities to the created interpreter, you make sure there is
> no escalation in abilities.

Good.

> * ``__del__`` created in sandboxed interpreter but object is cleaned
>   up in unprotected interpreter.

How do you envision the launching of a sandboxed interpreter to look?
Could you sketch out some rough code examples?
Were you thinking of something like:

    sys.spawn(code, dict)
        code: a string containing Python source code
        dict: the global namespace in which to run the code

If you allow the parent interpreter to pass mutable objects into the
child interpreter, then the parent and child can already communicate
via the object, so '__del__' is a moot issue.  Do you want to prevent
all communication between parent and child?  It's not obvious to me
why that would be necessary.

> * Using frames to walk the frame stack back to another interpreter.

Could you just disable introspection of the frame stack?

> Making the ``sys`` Module Safe
> ------------------------------
[...]
> This means that the ``sys`` module needs to have its safe information
> separated out from the unsafe settings.

Yes.

> XXX separate modules, ``sys.settings`` and ``sys.info``, or strip
> ``sys`` to settings and put info somewhere else?  Or provide a method
> that will create a faked sys module that has the safe values copied
> into it?

I think the last suggestion above would lead to confusion.  The two
groups should have two distinct names and it should be clear which
attribute goes with which group.

> Protecting I/O
> ++++++++++++++
>
> The ``print`` keyword and the built-ins ``raw_input()`` and
> ``input()`` use the values stored in ``sys.stdout`` and ``sys.stdin``.
> By exposing these attributes to the creating interpreter, one can set
> them to safe objects, such as instances of ``StringIO``.

Sounds good.

> Safe Networking
> ---------------
>
> XXX proxy on socket module, modify open() to be the constructor, etc.

Lots more to think about here.  :)

> Protecting Memory Usage
> -----------------------
>
> To protect memory, low-level hooks into the memory allocator for
> Python are needed.  By hooking into the C API for memory allocation and
> deallocation a very rough running count of used memory can be kept.
> This can be used to prevent sandboxed interpreters from using so much
> memory that it impacts the overall performance of the system.

Preventing denial-of-service is in general quite difficult, but i
applaud the attempt.  I agree with your decision to separate this
work from the rest of the security model.

-- ?!ng

From nnorwitz at gmail.com Thu Sep 7 09:28:39 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Thu, 7 Sep 2006 00:28:39 -0700
Subject: [Python-Dev] 2.5 status
In-Reply-To:
References: <44FDD122.3000809@egenix.com>
Message-ID:

On 9/5/06, Brett Cannon wrote:
>
> > [MAL]
> > The proper fix would be to introduce a tp_unicode slot and let
> > this decide what to do, ie. call .__unicode__() methods on instances
> > and use the .__name__ on classes.
>
> That was my bug reaction and what I said on the bug report.  Kind of
> surprised one doesn't already exist.
>
> > I think this would be the right way to go for Python 2.6. For
> > Python 2.5, just dropping this .__unicode__ method on exceptions
> > is probably the right thing to do.
>
> Neal, do you want to rip it out or should I?

Is removing __unicode__ backwards compatible with 2.4 for both
instances and exception classes?

Does everyone agree this is the proper approach?  I'm not familiar
with this code.  Brett, if everyone agrees (ie, remains silent),
please fix this and add tests and a NEWS entry.

Everyone should be looking for incompatibilities with previous
versions.  Exceptions are new and deserve special attention.  Lots of
the internals of strings (8-bit and unicode) and the struct module
changed and should be tested thoroughly.  I'm sure there are a bunch
of other things I'm not remembering.  The compiler is also an obvious
target to verify your code still works.

We're stuck with anything that makes it into 2.5, so now is the time
to fix these problems.
n From ronaldoussoren at mac.com Thu Sep 7 11:17:37 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Thu, 7 Sep 2006 11:17:37 +0200 Subject: [Python-Dev] 2.5 status In-Reply-To: References: Message-ID: On 5-sep-2006, at 6:24, Neal Norwitz wrote: > There are 3 bugs currently listed in PEP 356 as blocking: > http://python.org/sf/1551432 - __unicode__ breaks on > exception classes > http://python.org/sf/1550938 - improper exception w/ > relative import > http://python.org/sf/1541697 - sgmllib regexp bug causes hang > > Does anyone want to fix the sgmlib issue? If not, we should revert > this week before c2 is cut. I'm hoping that we will have *no changes* > in 2.5 final from c2. Should there be any bugs/patches added to or > removed from the list? > > The buildbots are currently humming along, but I believe all 3 > versions (2.4, 2.5, and 2.6) are fine. > > Test out 2.5c1+ and report all bugs! I have another bug that I'd like to fix: Mac/ReadMe contains an error: it claims that you can build the frameworkinstall into a temporary directory and then move it into place, but that isn't actually true. The erroneous paragraph is this: Note that there are no references to the actual locations in the code or resource files, so you are free to move things around afterwards. For example, you could use --enable-framework=/tmp/newversion/Library/ Frameworks and use /tmp/newversion as the basis for an installer or something. My proposed fix is to drop this paragraph. There is no bugreport for this yet, I got notified of this issue in a private e-mail. Ronald From nnorwitz at gmail.com Thu Sep 7 11:19:35 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Thu, 7 Sep 2006 02:19:35 -0700 Subject: [Python-Dev] 2.5 status In-Reply-To: References:

Message-ID: Doc patches are fine, please fix. n -- On 9/7/06, Ronald Oussoren wrote: > > On 5-sep-2006, at 6:24, Neal Norwitz wrote: > > > There are 3 bugs currently listed in PEP 356 as blocking: > > http://python.org/sf/1551432 - __unicode__ breaks on > > exception classes > > http://python.org/sf/1550938 - improper exception w/ > > relative import > > http://python.org/sf/1541697 - sgmllib regexp bug causes hang > > > > Does anyone want to fix the sgmlib issue? If not, we should revert > > this week before c2 is cut. I'm hoping that we will have *no changes* > > in 2.5 final from c2. Should there be any bugs/patches added to or > > removed from the list? > > > > The buildbots are currently humming along, but I believe all 3 > > versions (2.4, 2.5, and 2.6) are fine. > > > > Test out 2.5c1+ and report all bugs! > > I have another bug that I'd like to fix: Mac/ReadMe contains an > error: it claims that you can build the frameworkinstall into a > temporary directory and then move it into place, but that isn't > actually true. The erroneous paragraph is this: > > Note that there are no references to the actual locations in the > code or > resource files, so you are free to move things around afterwards. > For example, > you could use --enable-framework=/tmp/newversion/Library/ > Frameworks and use > /tmp/newversion as the basis for an installer or something. > > My proposed fix is to drop this paragraph. There is no bugreport for > this yet, I got notified of this issue in a private e-mail. 
> > Ronald > From ncoghlan at gmail.com Thu Sep 7 12:59:01 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 07 Sep 2006 20:59:01 +1000 Subject: [Python-Dev] inspect.py very slow under 2.5 In-Reply-To: <44FEE0EA.7000303@brainbot.com> References: <44FE9F71.3090903@brainbot.com> <44FEA8F5.1000700@gmail.com> <44FEAF85.1000107@brainbot.com> <44FEC7D9.80500@gmail.com> <44FEE0EA.7000303@brainbot.com> Message-ID: <44FFFB75.3030903@gmail.com> Ralf Schmitt wrote: > Nick Coghlan wrote: >> Good point. I modified the patch so it does the latter (it only calls >> getabspath() again for a module if the value of module.__file__ changes). > > with _filesbymodname[modname] = file changed to _filesbymodname[modname] > = f > it seems to work ok. I checked the inspect module unit tests and discovered the test for this function was only covering one of the half dozen possible execution paths. I've updated the patch on SF, and committed the fix (including PJE's and Neal's comments) to the trunk. I'll backport it tomorrow night (assuming I don't hear any objections in the meantime :). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From murman at gmail.com Thu Sep 7 15:37:41 2006 From: murman at gmail.com (Michael Urman) Date: Thu, 7 Sep 2006 08:37:41 -0500 Subject: [Python-Dev] Change in file() behavior in 2.5 Message-ID: Hi folks, Between 2.4 and 2.5 the behavior of file or open with the mode 'wU' has changed. In 2.4 it silently works. in 2.5 it raises a ValueError. I can't find any more discussion on it in python-dev than tangential mentions in this thread: http://mail.python.org/pipermail/python-dev/2006-June/065939.html It is (buried) in NEWS. First I found: Bug #1462152: file() now checks more thoroughly for invalid mode strings and removes a possible "U" before passing the mode to the C library function. 
Which seems to imply different behavior than the actual entry: bug #967182: disallow opening files with 'wU' or 'aU' as specified by PEP 278. I don't see anything in pep278 about a timeline, and wanted to make sure that transitioning directly from working to raising an error was a desired change. This actually caught a bug in an application I work with, which used an explicit 'wU', that will currently stop working when people upgrade Python but not our application. Thanks, Michael -- Michael Urman http://www.tortall.net/mu/blog From mwh at python.net Thu Sep 7 16:15:35 2006 From: mwh at python.net (Michael Hudson) Date: Thu, 07 Sep 2006 15:15:35 +0100 Subject: [Python-Dev] Change in file() behavior in 2.5 In-Reply-To: (Michael Urman's message of "Thu, 7 Sep 2006 08:37:41 -0500") References: Message-ID: <2m4pvjoj94.fsf@starship.python.net> "Michael Urman" writes: > Hi folks, > > Between 2.4 and 2.5 the behavior of file or open with the mode 'wU' > has changed. In 2.4 it silently works. in 2.5 it raises a ValueError. > I can't find any more discussion on it in python-dev than tangential > mentions in this thread: > http://mail.python.org/pipermail/python-dev/2006-June/065939.html > > It is (buried) in NEWS. First I found: > Bug #1462152: file() now checks more thoroughly for invalid mode > strings and removes a possible "U" before passing the mode to the > C library function. > Which seems to imply different behavior than the actual entry: > bug #967182: disallow opening files with 'wU' or 'aU' as specified by PEP > 278. > > I don't see anything in pep278 about a timeline, and wanted to make > sure that transitioning directly from working to raising an error was > a desired change. That it was silently ignored was never intentional; it was a bug and it was fixed. I don't think having a release with deprecation warnings and so on is worth it. 
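A minimal sketch, not taken from the thread, of how an application could check at startup whether the running interpreter rejects a write mode combined with 'U'. The helper name open_mode_is_rejected is hypothetical; it relies only on the built-in open() raising ValueError for an invalid mode string, which is the 2.5 behaviour described above and also holds in current Python 3 releases.

```python
import os
import tempfile


def open_mode_is_rejected(mode):
    """Return True if the built-in open() raises ValueError for this mode.

    Hypothetical helper for illustration only.
    """
    # Create a scratch file so the probe never touches real application data.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        try:
            handle = open(path, mode)
        except ValueError:
            # The interpreter refused the mode string outright.
            return True
        handle.close()
        return False
    finally:
        os.remove(path)


if __name__ == "__main__":
    for candidate in ("w", "wU", "aU"):
        print(candidate, "rejected:", open_mode_is_rejected(candidate))
```

An application that still carries an explicit 'wU' could use a probe like this in its test suite to catch the problem before users upgrade Python.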
> This actually caught a bug in an application I work with, which used > an explicit 'wU', that will currently stop working when people > upgrade Python but not our application. I would hope they wouldn't do that without careful testing anyway. Cheers, mwh -- No. In fact, my eyeballs fell out just from reading this question, so it's a good thing I can touch-type. -- John Baez, sci.physics.research From fperez.net at gmail.com Thu Sep 7 17:31:20 2006 From: fperez.net at gmail.com (Fernando Perez) Date: Thu, 07 Sep 2006 09:31:20 -0600 Subject: [Python-Dev] inspect.py very slow under 2.5 References: <44FE9F71.3090903@brainbot.com> <44FEA8F5.1000700@gmail.com> <44FEAF85.1000107@brainbot.com> <44FEC7D9.80500@gmail.com> <44FEE0EA.7000303@brainbot.com> <44FFFB75.3030903@gmail.com> Message-ID: Nick Coghlan wrote: > I've updated the patch on SF, and committed the fix (including PJE's and > Neal's comments) to the trunk. > > I'll backport it tomorrow night (assuming I don't hear any objections in the > meantime :). I just wanted to thank you all for taking the time to work on this, even with my 11-th hour report. Greatly appreciated, really. Looking forward to 2.5! 
f From grig.gheorghiu at gmail.com Thu Sep 7 17:34:17 2006 From: grig.gheorghiu at gmail.com (Grig Gheorghiu) Date: Thu, 7 Sep 2006 08:34:17 -0700 Subject: [Python-Dev] 'with' bites Twisted Message-ID: <3f09d5a00609070834m35694c34u5af582dff3aa5bb4@mail.gmail.com> When the pybot buildslave for Twisted is trying to run the Twisted test suite via 'trial', it gets an exception: Traceback (most recent call last): File "/tmp/Twisted/bin/trial", line 23, in from twisted.scripts.trial import run File "/tmp/Twisted/twisted/scripts/trial.py", line 10, in from twisted.application import app File "/tmp/Twisted/twisted/application/app.py", line 10, in from twisted.application import service File "/tmp/Twisted/twisted/application/service.py", line 20, in from twisted.python import components File "/tmp/Twisted/twisted/python/components.py", line 37, in from zope.interface.adapter import AdapterRegistry File "/tmp/python-buildbot/local/lib/python2.6/site-packages/zope/interface/adapter.py", line 201 for with, objects in v.iteritems(): ^ SyntaxError: invalid syntax So the culprit in this case is really zope.interface. The full log is here: http://www.python.org/dev/buildbot/community/all/x86%20RedHat%209%20trunk/builds/97/step-shell/0 Grig -- http://agiletesting.blogspot.com -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20060907/148dda2b/attachment.htm From exarkun at divmod.com Thu Sep 7 18:06:13 2006 From: exarkun at divmod.com (Jean-Paul Calderone) Date: Thu, 7 Sep 2006 12:06:13 -0400 Subject: [Python-Dev] [Twisted-Python] Newbie question In-Reply-To: <972ec5bd0609070841j6cd2a600o4c6db5567dffd790@mail.gmail.com> Message-ID: <20060907160613.1717.1053187541.divmod.quotient.42002@ohm> On Thu, 7 Sep 2006 11:41:48 -0400, Timothy Fitz wrote: >On 9/5/06, Jean-Paul Calderone wrote: >>You cannot stop the reactor and then start it again. > >Why don't the reactors throw if this happens? 
This question comes up >almost once a month. > One could just as easily ask why no one bothers to read mailing list archives to see if their question has been answered before. No one will ever know, it is just one of the mysteries of the universe. Jean-Paul From aahz at pythoncraft.com Thu Sep 7 18:22:17 2006 From: aahz at pythoncraft.com (Aahz) Date: Thu, 7 Sep 2006 09:22:17 -0700 Subject: [Python-Dev] [Twisted-Python] Newbie question In-Reply-To: <20060907160613.1717.1053187541.divmod.quotient.42002@ohm> References: <972ec5bd0609070841j6cd2a600o4c6db5567dffd790@mail.gmail.com> <20060907160613.1717.1053187541.divmod.quotient.42002@ohm> Message-ID: <20060907162217.GA17623@panix.com> On Thu, Sep 07, 2006, Jean-Paul Calderone wrote: > On Thu, 7 Sep 2006 11:41:48 -0400, Timothy Fitz wrote: >>On 9/5/06, Jean-Paul Calderone wrote: >>> >>>You cannot stop the reactor and then start it again. >> >>Why don't the reactors throw if this happens? This question comes up >>almost once a month. > > One could just as easily ask why no one bothers to read mailing list > archives to see if their question has been answered before. > > No one will ever know, it is just one of the mysteries of the universe. One could also ask why this got x-posted to python-dev... -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ I support the RKAB From skip at pobox.com Thu Sep 7 18:31:42 2006 From: skip at pobox.com (skip at pobox.com) Date: Thu, 7 Sep 2006 11:31:42 -0500 Subject: [Python-Dev] [Twisted-Python] Newbie question In-Reply-To: <20060907160613.1717.1053187541.divmod.quotient.42002@ohm> References: <972ec5bd0609070841j6cd2a600o4c6db5567dffd790@mail.gmail.com> <20060907160613.1717.1053187541.divmod.quotient.42002@ohm> Message-ID: <17664.18798.756868.339094@montanaro.dyndns.org> Jean-Paul> One could just as easily ask why no one bothers to read Jean-Paul> mailing list archives to see if their question has been Jean-Paul> answered before. 
Jean-Paul> No one will ever know, it is just one of the mysteries of the
Jean-Paul> universe.

+1 QOTF...

Skip

From exarkun at divmod.com  Thu Sep  7 18:36:00 2006
From: exarkun at divmod.com (Jean-Paul Calderone)
Date: Thu, 7 Sep 2006 12:36:00 -0400
Subject: [Python-Dev] [Twisted-Python] Newbie question
In-Reply-To: <20060907162217.GA17623@panix.com>
Message-ID: <20060907163600.1717.1300037898.divmod.quotient.42020@ohm>

Sorry, brainfart.

Jean-Paul

From kristjan at ccpgames.com  Thu Sep  7 18:56:15 2006
From: kristjan at ccpgames.com (Kristján V. Jónsson)
Date: Thu, 7 Sep 2006 16:56:15 -0000
Subject: [Python-Dev] Unicode Imports
Message-ID: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>

Hello All.
I just added patch 1552880 to sourceforge.  It is a patch for 2.6 (and 2.5)
which allows unicode paths in sys.path and uses the unicode file api on
windows.  This is tried and tested on 2.5, and backported to 2.3 and is
currently running on clients in China and elsewhere.  It is minimally
intrusive to the importing mechanism, at the cost of some string conversion
overhead (to utf8 and then back to unicode).

Cheers,
Kristján

From skip at pobox.com  Thu Sep  7 19:23:39 2006
From: skip at pobox.com (skip at pobox.com)
Date: Thu, 7 Sep 2006 12:23:39 -0500
Subject: [Python-Dev] [Twisted-Python] Newbie question
In-Reply-To: <20060907163600.1717.1300037898.divmod.quotient.42020@ohm>
References: <20060907162217.GA17623@panix.com>
	<20060907163600.1717.1300037898.divmod.quotient.42020@ohm>
Message-ID: <17664.21915.553226.875941@montanaro.dyndns.org>

Jean-Paul> Sorry, brainfart.

But still... QOTF ;-)

S

From amk at amk.ca  Thu Sep  7 19:39:00 2006
From: amk at amk.ca (A.M.
Kuchling) Date: Thu, 7 Sep 2006 13:39:00 -0400 Subject: [Python-Dev] Arlington sprints to occur monthly Message-ID: <20060907173900.GA4691@rogue.amk.ca> Jeffrey Elkner has arranged things so that the 1-day Python sprints in Arlington VA will now be happening every month. Future sprints will be on September 23rd, October 21st, November 18th, and December 16th. See http://wiki.python.org/moin/ArlingtonSprint for directions and to sign up. --amk From brett at python.org Thu Sep 7 19:39:20 2006 From: brett at python.org (Brett Cannon) Date: Thu, 7 Sep 2006 10:39:20 -0700 Subject: [Python-Dev] 2.5 status In-Reply-To: References: <44FDD122.3000809@egenix.com> Message-ID: On 9/7/06, Neal Norwitz wrote: > > On 9/5/06, Brett Cannon wrote: > > > > > [MAL] > > > The proper fix would be to introduce a tp_unicode slot and let > > > this decide what to do, ie. call .__unicode__() methods on instances > > > and use the .__name__ on classes. > > > > That was my bug reaction and what I said on the bug report. Kind of > > surprised one doesn't already exist. > > > > > I think this would be the right way to go for Python 2.6. For > > > Python 2.5, just dropping this .__unicode__ method on exceptions > > > is probably the right thing to do. > > > > Neal, do you want to rip it out or should I? > > Is removing __unicode__ backwards compatible with 2.4 for both > instances and exception classes? Should be. There was no proper __unicode__() originally so that's why this whole problem came up in the first place. Does everyone agree this is the proper approach? I'm not familiar > with this code. I am not terribly anymore either since Georg and Richard rewrote the whole thing. =) Brett, if everyone agrees (ie, remains silent), > please fix this and add tests and a NEWS entry. OK. -Brett -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/python-dev/attachments/20060907/3def1449/attachment.htm

From anthony at interlink.com.au  Thu Sep  7 19:53:03 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Fri, 8 Sep 2006 03:53:03 +1000
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>
Message-ID: <200609080353.07502.anthony@interlink.com.au>

On Friday 08 September 2006 02:56, Kristján V. Jónsson wrote:
> Hello All.
> I just added patch 1552880 to sourceforge.  It is a patch for 2.6 (and 2.5)
> which allows unicode paths in sys.path and uses the unicode file api on
> windows.  This is tried and tested on 2.5, and backported to 2.3 and is
> currently running on clients in China and elsewhere.  It is minimally
> intrusive to the importing mechanism, at the cost of some string conversion
> overhead (to utf8 and then back to unicode).

As this can't be considered a bugfix (that I can see), I'd be against it
being checked into 2.5.

From brett at python.org  Thu Sep  7 20:26:53 2006
From: brett at python.org (Brett Cannon)
Date: Thu, 7 Sep 2006 11:26:53 -0700
Subject: [Python-Dev] new security doc using object-capabilities
In-Reply-To:
References:
Message-ID:

On 9/6/06, Ka-Ping Yee wrote:
>
> Hi Brett,
>
> Here are some comments on your proposal.  Sorry this took so long.
> I apologize if any of these comments are out of date (but also look
> forward to your answers to some of the questions, as they'll help
> me understand some more of the details of your proposal).

Thanks!  I think they are slightly outdated.  The latest version of the doc
is in the bcannon-objcap branch and is named securing_python.txt (
http://svn.python.org/view/python/branches/bcannon-objcap/securing_python.txt
).

> > Introduction
> > ///////////////////////////////////////
> [...]
> > Throughout this document several terms are going to be used.
A > > "sandboxed interpreter" is one where the built-in namespace is not the > > same as that of an interpreter whose built-ins were unaltered, which > > is called an "unprotected interpreter". > > Is this a definition or an implementation choice? As in, are you > defining "sandboxed" to mean "with altered built-ins" or just > "restricted in some way", and does the above mean to imply that > altering the built-ins is what triggers other kinds of restrictions > (as it did in Python's old restricted execution mode)? There is no "triggering" of other restrictions. This is an implementation choice. "Sandboxed" means "with altered built-ins". > A "bare interpreter" is one where the built-in namespace has been > > stripped down the bare minimum needed to run any form of basic Python > > program. This means that all atomic types (i.e., syntactically > > supported types), ``object``, and the exceptions provided by the > > ``exceptions`` module are considered in the built-in namespace. There > > have also been no imports executed in the interpreter. > > Is a "bare interpreter" just one example of a sandboxed interpreter, > or are all sandboxed interpreters in your design initially bare (i.e. > "sandboxed" = "bare" + zero or more granted authorities)? You build up from a bare interpreter by adding in authorities (e.g., providing a wrapped version of open()) to reach the level of security you want. > The "security domain" is the boundary at which security is cared > > about. For this dicussion, it is the interpreter. > > It might be clearer to say (if i understand correctly) "Each interpreter > is a separate security domain." > > Many interpreters can run within a single operating system process, > right? Yes. Could you say a bit about what sort of concurrency model you > have in mind? None specifically. Each new interpreter automatically runs in its own Python thread, so they have essentially the same concurrency as using the 'thread' module. 
How would this interact (if at all) with use of the > existing threading functionality? See above. > The "powerbox" is the thing that possesses the ultimate power in the > > system. In our case it is the Python process. > > This could also be the application process, right? If Python is embedded, yes. > Rationale > > /////////////////////////////////////// > [...] > > For instance, think of an application that supports a plug-in system > > with Python as the language used for writing plug-ins. You do not > > want to have to examine every plug-in you download to make sure that > > it does not alter your filesystem if you can help it. With a proper > > security model and implementation in place this hinderance of having > > to examine all code you execute should be alleviated. > > I'm glad to have this use case set out early in the document, so the > reader can keep it in mind as an example while reading about the model. > > > Approaches to Security > > /////////////////////////////////////// > > > > There are essentially two types of security: who-I-am > > (permissions-based) security and what-I-have (authority-based) > > security. > > As Mark Miller mentioned in another message, your descriptions of > "who-I-am" security and "what-I-have" security make sense, but > they don't correspond to "permission" vs. "authority". They > correspond to "identity-based" vs. "authority-based" security. Right. This was fixed the day Mark and Alan Karp made the comment. > Difficulties in Python for Object-Capabilities > > ////////////////////////////////////////////// > [...] > > Three key requirements for providing a proper perimeter defence is > > private namespaces, immutable shared state across domains, and > > unforgeable references. > > Nice summary. > > > Problem of No Private Namespace > > =============================== > [...] > > The Python language has no such thing as a private namespace. > > Don't local scopes count as private namespaces? 
It seems clear
> that they aren't designed with the intention of being exposed,
> unlike other namespaces in Python.

Sort of.  But you can still get access to them if you have an execution
frame and they are not persistent.  Generators are worse since they store
their execution frame with the generator itself, completely exposing the
local namespace.

> > It also makes providing security at the object level using
> > object-capabilities non-existent in pure Python code.
>
> I don't think this is necessarily the case.  No Python code i've
> ever seen expects to be able to invade the local scopes of other
> functions, so you could use them as private namespaces.  There
> are two ways i've seen to invade local scopes:
>
>     (a) Use gc.get_referents to get back from a cell object
>         to its contents.
>
>     (b) Compare the cell object to another cell object, thereby
>         causing __eq__ to be invoked to compare the contents of
>         the cells.

Or the execution frame which is exposed directly on generators.  But
regardless, the comment was meant to apply to Python as it stands, not that
it couldn't possibly be tweaked somehow.

> So you could protect local scopes by prohibiting these or by
> simply turning off access to func_closure.  It's clear that hardly
> any code depends on these introspection features, so it would be
> reasonable to turn them off in a sandboxed interpreter.  (It seems
> you would have to turn off some introspection features anyway in
> order to have reliable import guards.)

Maybe this can be changed in the future, but this is more than I need at the
moment so I am not going to go down that path right now.  But I added a
quick mention of this.

> > Problem of Mutable Shared State
> > ===============================
> [...]
> > Regardless, sharing of state that can be influenced by another
> > interpreter is not safe for object-capabilities.
>
> Yup.
>
> > Threat Model
> > ///////////////////////////////////////
>
> Good to see this specified here.
I like the way you've broken this > down. The current version has more details per point than the one you read. > * An interpreter cannot gain abilties the Python process possesses > > without explicitly being given those abilities. > > It would be good to enumerate which abilities you're referring to in > this item. For example, a bare interpreter should be able to allocate > memory and call most of the built-in functions, but should not be able > to open network connections. > > > * An interpreter cannot influence another interpreter directly at the > > Python level without explicitly allowing it. > > You mean, without some other entity explicitly allowing it, right? Yep. What would that other entity be -- presumably the interpreter that > spawned both of these sub-interpreters? Sure. You could stick something in the built-in namespace of the sub-interpreter to use for communicating. > * An interpreter cannot use operating system resources without being > > explicitly given those resources. > > Okay. > > > * A bare Python interpreter is always trusted. > > What does "trusted" mean in the above? It means that if Python source code can execute within a bare interpreter it is considered safe code. This is covered in the new version of the doc. > * Python bytecode is always distrusted. > > * Pure Python source code is always safe on its own. > > It would be helpful to clarify "safe" here. I assume by "safe" you > mean that the Python source code can express whatever it wants, > including potentially dangerous activities, but when run in a bare > or sandboxed interpreter it cannot have harmful effects. But then > in what sense does the "safety" have to do with the Python source code > rather than the restrictions on the interpreter? > > Would it be correct to say: > + We want to guarantee that Python source code cannot violate > the restrictions in a restricted or bare interpreter. 
> + We do not prevent arbitrary Python bytecode from violating > these restrictions, and assume that it can. > + Malicious abilities are derived from C extension modules, > > built-in modules, and unsafe types implemented in C, not from > > pure Python source. > > By "malicious" do you just mean "anything that isn't accessible to > a bare interpreter"? Anything that could harm the system or interpreter. > * A sub-interpreter started by another interpreter does not inherit > > any state. > > Do you envision a tree of interpreters and sub-interpreters? Can the > levels of spawning get arbitrarily deep? Yes and yes. If i am visualizing your model correctly, maybe it would be useful to > introduce the term "parent", where each interpreter has as its parent > either the Python process or another interpreter. Then you could say > that each interpreter acquires authority only by explicit granting from > its parent. You could, although there is not hierarchy at the implementation level. But it works in terms of who has a reference to whom and who gives each interpreter their authority. Then i have another question: can an interpreter acquire > authorities only when it is started, or can it acquire them while it is > running, and how? Well, whatever you want to do through the built-in namespace. So if you pass in a mutable object like a dict and add stuff to it on the fly, I don't see why you couldn't give new authorities on the fly. > Implementation > > /////////////////////////////////////// > > > > Guiding Principles > > ======================== > > > > To begin, the Python process garners all power as the powerbox. It is > > up to the process to initially hand out access to resources and > > abilities to interpreters. This might take the form of an interpreter > > with all abilities granted (i.e., a standard interpreter as launched > > when you execute Python), which then creates sub-interpreters with > > sandboxed abilities. 
Another alternative is only creating > > interpreters with sandboxed abilities (i.e., Python being embedded in > > an application that only uses sandboxed interpreters). > > This sounds like part of your design to me. It might help to have > this earlier in the document (maybe even with an example diagram of a > tree of interpreters). Made Guiding Principles its own section and split off the bottom part of the section and put it under Implementation. > All security measures should never have to ask who an interpreter is. > > This means that what abilities an interpreter has should not be stored > > at the interpreter level when the security can use a proxy to protect > > a resource. This means that while supporting a memory cap can > > have a per-interpreter setting that is checked (because access to the > > operating system's memory allocator is not supported at the program > > level), protecting files and imports should not such a per-interpreter > > protection at such a low level (because those can have extension > > module proxies to provide the security). > > It might be good to declare two categories of resources -- those > protected by object hiding and those protected by a per-interpreter > setting -- and make lists. That is rather unknown since I am constantly finding stuff that is global to the process compared to the interpreter, so making the list seems premature. > Backwards-compatibility will not be a hindrance upon the design or > > implementation of the security model. Because the security model will > > inherently remove resources and abilities that existing code expects, > > it is not reasonable to expect existing code to work in a sandboxed > > interpreter. > > You might qualify the last statement a bit. For example, a Python > implementation of a pure algorithm (e.g. string processing, data > compression, etc.) would still work in a sandboxed interpreter. I tossed in "all" to clarify. 
> Keeping Python "pythonic" is required for all design decisions. > > As Lawrence Oluyede also mentioned, it would be helpful to say a > little more about what "pythonic" means. Done in the current version. > Restricting what is in the built-in namespace and the safe-guarding > > the interpreter (which includes safe-guarding the built-in types) is > > where security will come from. > > Sounds good. > > > Abilities of a Standard Sandboxed Interpreter > > ============================================= > > > [...] > > * You cannot open any files directly. > > * Importation > > + You can import any pure Python module. > > + You cannot import any Python bytecode module. > > + You cannot import any C extension module. > > + You cannot import any built-in module. > > * You cannot find out any information about the operating system you > > are running on. > > * Only safe built-ins are provided. > > This looks reasonable. This is probably a good place to itemize > exactly which built-ins are considered safe. > > > Imports > > ------- > > > > A proxy for protecting imports will be provided. This is done by > > setting the ``__import__()`` function in the built-in namespace of the > > sandboxed interpreter to a proxied version of the function. > > > > The planned proxy will take in a passed-in function to use for the > > import and a whitelist of C extension modules and built-in modules to > > allow importation of. > > Presumably these are passed in to the proxy's constructor. Current plan is to expose the built-in namespace, imported modules, and sys module dict when creating an Interpreter instance. > If an import would lead to loading an extension > > or built-in module, it is checked against the whitelist and allowed > > to be imported based on that list. All .pyc and .pyo file will not > > be imported. All .py files will be imported. > > I'm unclear about this. Is the whitelist a list of module names only, > or of filenames with extensions? 
Have not decided, but probably module name.

> Does the normal path-searching process
> take place or can it be restricted in some way?

Have not decided.

> Would it simplify the
> security analysis to have the whitelist be a dictionary that maps module
> names to absolute pathnames?

Don't know.  Protecting imports is the last thing I am going to implement
since it is the trickiest.

> If both the .py and .pyc are present, the normal import would find the
> .pyc file; would the import proxy reject such an import or ignore it
> and recompile the .py instead?

Something along those lines.

> > It must be warned that importing any C extension module is dangerous.
>
> Right.
>
> > Implementing Import in Python
> > +++++++++++++++++++++++++++++
> >
> > To help facilitate in the exposure of more of what importation
> > requires (and thus make implementing a proxy easier), the import
> > machinery should be rewritten in Python.
>
> This seems like a good idea.  Can you identify which minimum essential
> pieces of the import machinery have to be written in C?

Loading of C extensions, stating files, reading files, etc.  Pretty much
anything that requires help from the OS.

> > Sanitizing Built-In Types
> > -------------------------
> [...]
> > Constructors
> > ++++++++++++
> >
> > Almost all of Python's built-in types
> > contain a constructor that allows code to create a new instance of a
> > type as long as you have the type itself.  Unfortunately this does not
> > work in an object-capabilities system without either providing a proxy
> > to the constructor or just turning it off.
>
> The existence of the constructor isn't (by itself) the problem.
> The problem is that both of the following are true:
>
>     (a) From any object you can get its type object.
>     (b) Using any type object you can construct a new instance.
>
> So, you can control this either by hiding the type object, separating
> the constructor from the type, or disabling the constructor.
I separated the constructor or initializer (tp_new or tp_init) into a factory function. > Types whose constructors are considered dangerous are: > > > > * ``file`` > > + Will definitely use the ``open()`` built-in. > > * code objects > > * XXX sockets? > > * XXX type? > > * XXX > > Looks good so far. Not sure i see what's dangerous about 'type'. That's why it has the question mark. =) > Filesystem Information > > ++++++++++++++++++++++ > > > > When running code in a sandboxed interpreter, POLA suggests that you > > do not want to expose information about your environment on top of > > protecting its use. This means that filesystem paths typically should > > not be exposed. Unfortunately, Python exposes file paths all over the > > place: > > > > * Modules > > + ``__file__`` attribute > > * Code objects > > + ``co_filename`` attribute > > * Packages > > + ``__path__`` attribute > > * XXX > > > > XXX how to expose safely? > > It seems that in most cases, a single Python object is associated with > a single pathname. If that's true in general, one solution would be > to provide an introspection function named 'getpath' or something > similar that would get the path associated with any object. This > function might go in a module containing all the introspection functions, > so imports of that module could be easily restricted. That is the current thinking. > Mutable Shared State > > ++++++++++++++++++++ > > > > Because built-in types are shared between interpreters, they cannot > > expose any mutable shared state. Unfortunately, as it stands, some > > do. Below is a list of types that share some form of dangerous state, > > how they share it, and how to fix the problem: > > > > * ``object`` > > + ``__subclasses__()`` function > > - Remove the function; never seen used in real-world code. > > * XXX > > Okay, more to work out here. :) Possibly. 
I might have to wait until I am much closer to being done to discover more places where mutable shared state is exposed in a bare interpreter because I have not been able to think of any more. > Perimeter Defences Between a Created Interpreter and Its Creator > > ---------------------------------------------------------------- > > > > The plan is to allow interpreters to instantiate sandboxed > > interpreters safely. By using the creating interpreter's abilities to > > provide abilities to the created interpreter, you make sure there is > > no escalation in abilities. > > Good. > > > * ``__del__`` created in sandboxed interpreter but object is cleaned > > up in unprotected interpreter. > > How do you envision the launching of a sandboxed interpreter to look? > Could you sketch out some rough code examples?
>>> interp = interpreter.Interpreter()
>>> interp.builtins['open'] = wrapped_open()
>>> interp.sys_dict['path'] = []
>>> interp.exec("2 + 3")
Were you thinking of > something like: > > sys.spawn(code, dict) > code: a string containing Python source code > dict: the global namespace in which to run the code > > If you allow the parent interpreter to pass mutable objects into the > child interpreter, then the parent and child can already communicate > via the object, so '__del__' is a moot issue. Do you want to prevent > all communication between parent and child? It's not obvious to me > why that would be necessary. No, I don't since there should be a secure way to allow that. The __del__ worry came up from Guido pointing out you might be able to screw with it. But if you pass in something implemented in C you should be okay. > * Using frames to walk the frame stack back to another interpreter. > > Could you just disable introspection of the frame stack? If you don't allow importing of 'sys' then yes, and that is planned. I just wanted to make sure I didn't forget this needs to be protected. I do need to check what a generator's frame exposes, though.
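The frame-stack worry can be made concrete with a small sketch in present-day CPython. It assumes frame introspection is reachable at all (here through inspect.currentframe(), which may return None on other implementations); the function names and the "secret" variable are made up for the illustration and say nothing about how the sandbox itself is implemented.

```python
import inspect


def untrusted():
    # Frame of this function, then one step back to whoever called it.
    frame = inspect.currentframe()
    caller = frame.f_back
    # The caller's local variables are readable from here.
    return caller.f_locals["secret"]


def trusted():
    secret = "token-123"  # never passed to untrusted() explicitly
    return untrusted()


leaked = trusted()
print(leaked)
```

Nothing was handed to untrusted(), yet it reads a local of its caller, which is why access to frames has to be cut off before code in one interpreter can be kept away from state in another.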
> Making the ``sys`` Module Safe > > ------------------------------ > [...] > > This means that the ``sys`` module needs to have its safe information > > separated out from the unsafe settings. > > Yes. > > > XXX separate modules, ``sys.settings`` and ``sys.info``, or strip > > ``sys`` to settings and put info somewhere else? Or provide a method > > that will create a faked sys module that has the safe values copied > > into it? > > I think the last suggestion above would lead to confusion. The two > groups should have two distinct names and it should be clear which > attribute goes with which group. This is also more complicated by the fact that some things are for the entire process while others are per interpreter. Might have to separate things out even more. > Protecting I/O > > ++++++++++++++ > > > > The ``print`` keyword and the built-ins ``raw_input()`` and > > ``input()`` use the values stored in ``sys.stdout`` and ``sys.stdin``. > > By exposing these attributes to the creating interpreter, one can set > > them to safe objects, such as instances of ``StringIO``. > > Sounds good. > > > Safe Networking > > --------------- > > > > XXX proxy on socket module, modify open() to be the constructor, etc. > > Lots more to think about here. :) Oh yeah. =) > Protecting Memory Usage > > ----------------------- > > > > To protect memory, low-level hooks into the memory allocator for > > Python is needed. By hooking into the C API for memory allocation and > > deallocation a very rough running count of used memory can kept. This > > can be used to prevent sandboxed interpreters from using so much > > memory that it impacts the overall performance of the system. > > Preventing denial-of-service is in general quite difficult, but i > applaud the attempt. I agree with your decision to separate this The memory tracking has a proof-of-concept done in the bcannon-sandboxing branch. 
Not perfect, but it does show how one could go about accounting for every byte of data in terms of what it is basically used for. -Brett From steve at holdenweb.com Fri Sep 8 10:24:03 2006 From: steve at holdenweb.com (Steve Holden) Date: Fri, 08 Sep 2006 09:24:03 +0100 Subject: [Python-Dev] Unicode Imports In-Reply-To: <200609080353.07502.anthony@interlink.com.au> References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609080353.07502.anthony@interlink.com.au> Message-ID: Anthony Baxter wrote: > On Friday 08 September 2006 02:56, Kristján V. Jónsson wrote: > >>Hello All. >>I just added patch 1552880 to sourceforge. It is a patch for 2.6 (and 2.5) >>which allows unicode paths in sys.path and uses the unicode file api on >>windows. This is tried and tested on 2.5, and backported to 2.3 and is >>currently running on clients in china and elsewhere. It is minimally >>intrusive to the importing mechanism, at the cost of some string conversion >>overhead (to utf8 and then back to unicode). > > > As this can't be considered a bugfix (that I can see), I'd be against it being > checked into 2.5. > Are you suggesting that Python's inability to correctly handle Unicode path elements isn't a bug? Or simply that this inability isn't currently described in a bug report on Sourceforge? I agree it's a relatively large patch for a release candidate but if prudence suggests deferring it, it should be a *definite* for 2.5.1 and subsequent releases.
regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From anthony at interlink.com.au Fri Sep 8 10:58:28 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri, 8 Sep 2006 18:58:28 +1000 Subject: [Python-Dev] Unicode Imports In-Reply-To: References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609080353.07502.anthony@interlink.com.au> Message-ID: <200609081858.32277.anthony@interlink.com.au> On Friday 08 September 2006 18:24, Steve Holden wrote: > > As this can't be considered a bugfix (that I can see), I'd be against it > > being checked into 2.5. > > Are you suggesting that Python's inability to correctly handle Unicode > path elements isn't a bug? Or simply that this inability isn't currently > described in a bug report on Sourceforge? I'm suggesting that adding the ability to handle unicode paths is a *new* *feature*. If people actually want to see 2.5 final ever released, they're going to have to accept that "oh, but just this _one_ _more_ _thing_" is not going to fly. We're _well_ past beta1, where new features should have been added. At this point, we have to cut another release candidate. This is far too much to add during the release candidate stage. > I agree it's a relatively large patch for a release candidate but if > prudence suggests deferring it, it should be a *definite* for 2.5.1 and > subsequent releases. Possibly. I remain unconvinced. -- Anthony Baxter It's never too late to have a happy childhood. 
From steve at holdenweb.com Fri Sep 8 11:19:08 2006 From: steve at holdenweb.com (Steve Holden) Date: Fri, 08 Sep 2006 10:19:08 +0100 Subject: [Python-Dev] Unicode Imports In-Reply-To: <200609081858.32277.anthony@interlink.com.au> References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609080353.07502.anthony@interlink.com.au> <200609081858.32277.anthony@interlink.com.au> Message-ID: Anthony Baxter wrote: > On Friday 08 September 2006 18:24, Steve Holden wrote: > >>>As this can't be considered a bugfix (that I can see), I'd be against it >>>being checked into 2.5. >> >>Are you suggesting that Python's inability to correctly handle Unicode >>path elements isn't a bug? Or simply that this inability isn't currently >>described in a bug report on Sourceforge? > > I'm suggesting that adding the ability to handle unicode paths is a *new* > *feature*. > That's certainly true. > If people actually want to see 2.5 final ever released, they're going to have > to accept that "oh, but just this _one_ _more_ _thing_" is not going to fly. > > We're _well_ past beta1, where new features should have been added. At this > point, we have to cut another release candidate. This is far too much to add > during the release candidate stage. > Right. I couldn't argue for putting this in to 2.5 - it would certainly represent unwarranted feature creep at the rc2 stage. > >>I agree it's a relatively large patch for a release candidate but if >>prudence suggests deferring it, it should be a *definite* for 2.5.1 and >>subsequent releases. > > > Possibly. I remain unconvinced. > But it *is* a desirable, albeit new, feature, so I'm surprised that you don't appear to perceive it as such for a downstream release. 
regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From ncoghlan at gmail.com Fri Sep 8 11:56:27 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 08 Sep 2006 19:56:27 +1000 Subject: [Python-Dev] Unicode Imports In-Reply-To: References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609080353.07502.anthony@interlink.com.au> <200609081858.32277.anthony@interlink.com.au> Message-ID: <45013E4B.4050802@gmail.com> Steve Holden wrote: > Anthony Baxter wrote: >> On Friday 08 September 2006 18:24, Steve Holden wrote: >>> I agree it's a relatively large patch for a release candidate but if >>> prudence suggests deferring it, it should be a *definite* for 2.5.1 and >>> subsequent releases. >> >> Possibly. I remain unconvinced. >> > > But it *is* a desirable, albeit new, feature, so I'm surprised that you > don't appear to perceive it as such for a downstream release. And unlike 2.2's True/False problem, it is an *environmental* feature, rather than a programmatic one. So while it's a new feature, it would merely mean that 2.5.1 works correctly in more environments than 2.5. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From anthony at interlink.com.au Fri Sep 8 11:48:51 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri, 8 Sep 2006 19:48:51 +1000 Subject: [Python-Dev] Unicode Imports In-Reply-To: References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609081858.32277.anthony@interlink.com.au> Message-ID: <200609081948.55218.anthony@interlink.com.au> On Friday 08 September 2006 19:19, Steve Holden wrote: > But it *is* a desirable, albeit new, feature, so I'm surprised that you > don't appear to perceive it as such for a downstream release. Point releases (2.x.1 and suchlike) are absolutely not for new features. They're for bugfixes, only. It's possible that this could be considered a bugfix, but as I said right now I'm dubious. Anthony -- Anthony Baxter It's never too late to have a happy childhood. From steve at holdenweb.com Fri Sep 8 12:28:27 2006 From: steve at holdenweb.com (Steve Holden) Date: Fri, 08 Sep 2006 11:28:27 +0100 Subject: [Python-Dev] Unicode Imports In-Reply-To: <200609081948.55218.anthony@interlink.com.au> References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609081858.32277.anthony@interlink.com.au> <200609081948.55218.anthony@interlink.com.au> Message-ID: <450145CB.3070601@holdenweb.com> Anthony Baxter wrote: > On Friday 08 September 2006 19:19, Steve Holden wrote: > >>But it *is* a desirable, albeit new, feature, so I'm surprised that you >>don't appear to perceive it as such for a downstream release. > > > Point releases (2.x.1 and suchlike) are absolutely not for new features. > They're for bugfixes, only. It's possible that this could be considered a > bugfix, but as I said right now I'm dubious. > OK, in that case I'm going to argue that the current behaviour is buggy. 
I suppose your point is that, assuming the patch is correct (and it seems the authors are relying on it for production purposes in tens of thousands of installations), it doesn't change the behaviour of the interpreter in existing cases, and therefore it is providing a new feature. I don't regard this as the provision of a new feature but as the removal of an unnecessary restriction (which I would prefer to call a bug). If it was *documented* somewhere that Unicode paths aren't legal I would find your arguments more convincing. As things stand new Python users would, IMHO, be within their rights to assume that arbitrary directories could be added to the path without breakage. Ultimately, your call, I guess. Would it help if I added "inability to import from Unicode directories" as a bug? Or would you prefer to change the documentation to state that some directories can't be used as path elements <0.3 wink>? regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://holdenweb.blogspot.com Recent Ramblings http://del.icio.us/steve.holden From guido at python.org Fri Sep 8 18:29:16 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 8 Sep 2006 09:29:16 -0700 Subject: [Python-Dev] Unicode Imports In-Reply-To: <450145CB.3070601@holdenweb.com> References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609081858.32277.anthony@interlink.com.au> <200609081948.55218.anthony@interlink.com.au> <450145CB.3070601@holdenweb.com> Message-ID: On 9/8/06, Steve Holden wrote: > Anthony Baxter wrote: > > On Friday 08 September 2006 19:19, Steve Holden wrote: > > > >>But it *is* a desirable, albeit new, feature, so I'm surprised that you > >>don't appear to perceive it as such for a downstream release. > > > > > > Point releases (2.x.1 and suchlike) are absolutely not for new features. > > They're for bugfixes, only. 
It's possible that this could be considered a > > bugfix, but as I said right now I'm dubious. > > > OK, in that case I'm going to argue that the current behaviour is buggy. > > I suppose your point is that, assuming the patch is correct (and it > seems the authors are relying on it for production purposes in tens of > thousands of installations), it doesn't change the behaviour of the > interpreter in existing cases, and therefore it is providing a new feature. > > I don't regard this as the provision of a new feature but as the removal > of an unnecessary restriction (which I would prefer to call a bug). If > it was *documented* somewhere that Unicode paths aren't legal I would > find your arguments more convincing. As things stand new Python users > would, IMHO, be within their rights to assume that arbitrary directories > could be added to the path without breakage. > > Ultimately, your call, I guess. Would it help if I added "inability to > import from Unicode directories" as a bug? Or would you prefer to change > the documentation to state that some directories can't be used as path > elements <0.3 wink>? We've all heard the arguments for both sides enough times I think. IMO it's the call of the release managers. Board members ought to trust the release managers and not apply undue pressure. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Fri Sep 8 18:41:44 2006 From: skip at pobox.com (skip at pobox.com) Date: Fri, 8 Sep 2006 11:41:44 -0500 Subject: [Python-Dev] Unicode Imports In-Reply-To: References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609081858.32277.anthony@interlink.com.au> <200609081948.55218.anthony@interlink.com.au> <450145CB.3070601@holdenweb.com> Message-ID: <17665.40264.242710.426290@montanaro.dyndns.org> Guido> IMO it's the call of the release managers. Board members ought to Guido> trust the release managers and not apply undue pressure. Indeed. 
Let's not go whacking people with boards. The Perl people would just laugh at us... Skip From rasky at develer.com Fri Sep 8 20:51:46 2006 From: rasky at develer.com (Giovanni Bajo) Date: Fri, 8 Sep 2006 20:51:46 +0200 Subject: [Python-Dev] Unicode Imports References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc><200609081858.32277.anthony@interlink.com.au><200609081948.55218.anthony@interlink.com.au><450145CB.3070601@holdenweb.com> Message-ID: <010301c6d377$d5df7bc0$46ba2997@bagio> Guido van Rossum wrote: > IMO it's the call of the release managers. Board members ought to > trust the release managers and not apply undue pressure. +1, but I would love to see a more formal definition of what a "bugfix" is, which would reduce the ambiguous cases, and thus reduce the number of times the release managers are called to pronounce. Other projects, for instance, describe point releases as "open for regression fixes only", which means that a patch, to be eligible for a point release, must fix a regression (something which used to work before, and doesn't anymore). Regressions are important because they affect people wanting to upgrade Python. If something never worked before (like this unicode path thingie), surely existing Python users are not affected by the bug (or they have already workarounds in place), so that NOT having the bug fixed in a point release is not a problem. Anyway, I'm not pushing for this specific policy (even if I like it): I'm just suggesting Release Managers to more formally define what should and what should not go in a point release. 
Giovanni Bajo From rhettinger at ewtllc.com Fri Sep 8 21:00:50 2006 From: rhettinger at ewtllc.com (Raymond Hettinger) Date: Fri, 08 Sep 2006 12:00:50 -0700 Subject: [Python-Dev] Unicode Imports In-Reply-To: <010301c6d377$d5df7bc0$46ba2997@bagio> References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc><200609081858.32277.anthony@interlink.com.au><200609081948.55218.anthony@interlink.com.au><450145CB.3070601@holdenweb.com> <010301c6d377$d5df7bc0$46ba2997@bagio> Message-ID: <4501BDE2.6020306@ewtllc.com> Giovanni Bajo wrote: > >+1, but I would love to see a more formal definition of what a "bugfix" is, >which would reduce the ambiguous cases, and thus reduce the number of times the >release managers are called to pronounce. > > Sorry, that is just a pipe-dream. To some degree, all bug-fixes are new features in that there is some behavioral difference, something will now work that wouldn't work before. While some cases are clear-cut (such as API changes), the ones that are interesting will defy definition and need a human judgment call as to whether a given change will help more than it hurts. The RMs are also strongly biased against extensive patches that haven't had a chance to go through a beta-cycle -- they don't want their releases mucked-up. Raymond From mal at egenix.com Fri Sep 8 21:12:33 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 08 Sep 2006 21:12:33 +0200 Subject: [Python-Dev] Unicode Imports In-Reply-To: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> Message-ID: <4501C0A1.4010600@egenix.com> Kristján V. Jónsson wrote: > Hello All. > I just added patch 1552880 to sourceforge. It is a patch for 2.6 (and 2.5) which allows unicode paths in sys.path and uses the unicode file api on windows. > This is tried and tested on 2.5, and backported to 2.3 and is currently running on clients in china and elsewhere.
It is minimally intrusive to the importing mechanism, at the cost of some string conversion overhead (to utf8 and then back to unicode). +1 on adding it to Python 2.6. -0 for Python 2.5.x: Applications/modules written for Python 2.4 and 2.5 won't be expecting Unicode strings in sys.path with all the consequences that go with it, so this is a true change in semantics, not just a nice to have additional feature or "bug" fix. OTOH, those applications will just break in a different place with the patch applied :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 08 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From martin at v.loewis.de Fri Sep 8 22:51:09 2006 From: martin at v.loewis.de (Martin v. Löwis) Date: Fri, 08 Sep 2006 22:51:09 +0200 Subject: [Python-Dev] Unicode Imports In-Reply-To: References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609080353.07502.anthony@interlink.com.au> Message-ID: <4501D7BD.1020006@v.loewis.de> Steve Holden wrote: >> As this can't be considered a bugfix (that I can see), I'd be against it being >> checked into 2.5. >> > Are you suggesting that Python's inability to correctly handle Unicode > path elements isn't a bug? Not sure whether Anthony suggests it, but I do. > Or simply that this inability isn't currently > described in a bug report on Sourceforge? No: sys.path is specified (originally) as containing a list of byte strings; it was extended to also support path importers (or whatever that PEP calls them). It was never extended to support Unicode strings.
That other PEP e > I agree it's a relatively large patch for a release candidate but if > prudence suggests deferring it, it should be a *definite* for 2.5.1 and > subsequent releases. I'm not so sure it should. It *is* a new feature: it makes applications possible which aren't possible today, and the documentation does not ever suggest that these applications should have been possible. In fact, it is common knowledge that this currently isn't supported. Regards, Martin From martin at v.loewis.de Fri Sep 8 22:52:26 2006 From: martin at v.loewis.de (Martin v. Löwis) Date: Fri, 08 Sep 2006 22:52:26 +0200 Subject: [Python-Dev] Unicode Imports In-Reply-To: References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609080353.07502.anthony@interlink.com.au> <200609081858.32277.anthony@interlink.com.au> Message-ID: <4501D80A.5050008@v.loewis.de> Steve Holden wrote: >>> I agree it's a relatively large patch for a release candidate but if >>> prudence suggests deferring it, it should be a *definite* for 2.5.1 and >>> subsequent releases. >> >> Possibly. I remain unconvinced. >> > > But it *is* a desirable, albeit new, feature, so I'm surprised that you > don't appear to perceive it as such for a downstream release. Because 2.5.1 shouldn't include any new features. If it is a new feature (which it is), it should go into 2.6.
Regards, Martin From martin at v.loewis.de Fri Sep 8 22:54:43 2006 From: martin at v.loewis.de (Martin v. Löwis) Date: Fri, 08 Sep 2006 22:54:43 +0200 Subject: [Python-Dev] Unicode Imports In-Reply-To: <45013E4B.4050802@gmail.com> References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609080353.07502.anthony@interlink.com.au> <200609081858.32277.anthony@interlink.com.au> <45013E4B.4050802@gmail.com> Message-ID: <4501D893.4090504@v.loewis.de> Nick Coghlan wrote: >> But it *is* a desirable, albeit new, feature, so I'm surprised that you >> don't appear to perceive it as such for a downstream release. > > And unlike 2.2's True/False problem, it is an *environmental* feature, rather > than a programmatic one. Not sure what you mean by that; if you mean "thus existing applications cannot break": this is not true. In fact, it seems that some applications are extremely susceptible to the types of objects on sys.path. Some applications apparently know exactly what you can and cannot find on sys.path; changing that might break them. Regards, Martin From martin at v.loewis.de Fri Sep 8 22:56:48 2006 From: martin at v.loewis.de (Martin v. Löwis) Date: Fri, 08 Sep 2006 22:56:48 +0200 Subject: [Python-Dev] Unicode Imports In-Reply-To: <450145CB.3070601@holdenweb.com> References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609081858.32277.anthony@interlink.com.au> <200609081948.55218.anthony@interlink.com.au> <450145CB.3070601@holdenweb.com> Message-ID: <4501D910.8020805@v.loewis.de> Steve Holden wrote: > I don't regard this as the provision of a new feature but as the removal > of an unnecessary restriction (which I would prefer to call a bug). You got the definition of "bug" wrong. Primarily, a bug is a deviation from the specification. Extending the domain of an argument to an existing function is a new feature.
Regards, Martin From martin at v.loewis.de Fri Sep 8 22:59:57 2006 From: martin at v.loewis.de (Martin v. Löwis) Date: Fri, 08 Sep 2006 22:59:57 +0200 Subject: [Python-Dev] Unicode Imports In-Reply-To: <010301c6d377$d5df7bc0$46ba2997@bagio> References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc><200609081858.32277.anthony@interlink.com.au><200609081948.55218.anthony@interlink.com.au><450145CB.3070601@holdenweb.com> <010301c6d377$d5df7bc0$46ba2997@bagio> Message-ID: <4501D9CD.80301@v.loewis.de> Giovanni Bajo wrote: > +1, but I would love to see a more formal definition of what a "bugfix" is, > which would reduce the ambiguous cases, and thus reduce the number of times the > release managers are called to pronounce. > > Other projects, for instance, describe point releases as "open for regression > fixes only", which means that a patch, to be eligible for a point release, must > fix a regression (something which used to work before, and doesn't anymore). In Python, the tradition has accepted bug fixes beyond that. For example, fixing a memory leak would also count as a bug fix. In general, I think a "bug" is a deviation from the specification (it might be necessary to interpret the specification first to find out whether the implementation deviates). A bug fix is then a behavior change so that the new behavior follows the specification, or a specification change so that it correctly describes the behavior. Regards, Martin From misa at redhat.com Sat Sep 9 00:06:05 2006 From: misa at redhat.com (Mihai Ibanescu) Date: Fri, 8 Sep 2006 18:06:05 -0400 Subject: [Python-Dev] Py_BuildValue and decref Message-ID: <20060908220605.GF990@abulafia.devel.redhat.com> Hi, Looking at: http://docs.python.org/api/arg-parsing.html The description for "O" is: "O" (object) [PyObject *] Store a Python object (without any conversion) in a C object pointer. The C program thus receives the actual object that was passed.
The object's reference count is not increased. The pointer stored is not NULL. There is no description of what happens when Py_BuildValue fails. Will it decref the python object passed in? Will it not? Looking at tupleobject.h: /* Another generally useful object type is a tuple of object pointers. For Python, this is an immutable type. C code can change the tuple items (but not their number), and even use tuples are general-purpose arrays of object references, but in general only brand new tuples should be mutated, not ones that might already have been exposed to Python code. *** WARNING *** PyTuple_SetItem does not increment the new item's reference count, but does decrement the reference count of the item it replaces, if not nil. It does *decrement* the reference count if it is *not* ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ inserted in the tuple. Similarly, PyTuple_GetItem does not increment the returned item's reference count. */ So, if the call to PyTuple_SetItem fails, the value passed in is lost. Should I expect the same thing with Py_BuildValue? Looking at how other modules deal with this, I picked typeobject.c: result = Py_BuildValue("[O]", (PyObject *)type); if (result == NULL) { Py_DECREF(to_merge); return NULL; } so no attempt to DECREF type in the error case. Further down... if (n) { state = Py_BuildValue("(NO)", state, slots); if (state == NULL) goto end; } and further down: end: Py_XDECREF(cls); Py_XDECREF(args); Py_XDECREF(args2); Py_XDECREF(slots); Py_XDECREF(state); Py_XDECREF(names); Py_XDECREF(listitems); Py_XDECREF(dictitems); Py_XDECREF(copy_reg); Py_XDECREF(newobj); return res; so it will attempt to DECREF the (non-NULL) slots in the error case. It's probably not a big issue since if Py_BuildValue fails, you have bigger issues than memory leaks, but it seems inconsistent to me. Can someone that knows the internal implementation clarify one way over the other? Thanks! 
Misa From barry at barrys-emacs.org Sat Sep 9 00:18:49 2006 From: barry at barrys-emacs.org (Barry Scott) Date: Fri, 8 Sep 2006 23:18:49 +0100 Subject: [Python-Dev] What windows tool chain do I need for python 2.5 extensions? Message-ID: <52A09F3B-0D3B-46E3-B7E5-02DC0D3BB170@barrys-emacs.org> I have the tool chains to build extensions against your binary python 2.2, 2.3 and 2.4 on windows. What are the tool chain requirements for building extensions against python 2.5 on windows? Barry From barry at python.org Sat Sep 9 00:27:08 2006 From: barry at python.org (Barry Warsaw) Date: Fri, 8 Sep 2006 18:27:08 -0400 Subject: [Python-Dev] Py_BuildValue and decref In-Reply-To: <20060908220605.GF990@abulafia.devel.redhat.com> References: <20060908220605.GF990@abulafia.devel.redhat.com> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Sep 8, 2006, at 6:06 PM, Mihai Ibanescu wrote: > There is no description of what happens when Py_BuildValue fails. > Will it > decref the python object passed in? Will it not? I just want to point out that the C API documentation is pretty silent about the refcounting side-effects in error conditions (and often in success conditions too) of most Python functions. For example, what is the refcounting side-effects of PyDict_SetItem() on val? What about if that function fails? Has val been incref'd or not? What about the side-effects on any value the new one replaces, both in success and failure? The C API documentation has improved in documenting the refcount behavior for return values of many of the functions, but the only reliable way to know what some other side-effects are is to read the code. After I perfect my human cloning techniques, I'll be assigning one of my minions to fix this situation (I'll bet my clean-the-kitty- litter-and-stalk-er-keep-tabs-on-Britney clone would love to take a break for a few weeks to work on this). 
- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Darwin)

iQCVAwUBRQHuQnEjvBPtnXfVAQJfFAP9GHIRhiVc7lkzwEkPtJgqNsrN8edQcKh3
l4edSlDD7JoJrIaOElqyIaEKcJSkjpKfJt6qdA1qIt8LD9x4pGvdxpxgodGVYfFo
VGPwm+pU9SH6JJIZcCOOf9bJbEmR9iqZKceAJMGgJvZjBnTnoVSyf52254q3JJGR
b9glwqbddi0=
=3iWf
-----END PGP SIGNATURE-----

From misa at redhat.com Sat Sep 9 00:35:58 2006
From: misa at redhat.com (Mihai Ibanescu)
Date: Fri, 8 Sep 2006 18:35:58 -0400
Subject: [Python-Dev] Py_BuildValue and decref
In-Reply-To:
References: <20060908220605.GF990@abulafia.devel.redhat.com>
Message-ID: <20060908223558.GG990@abulafia.devel.redhat.com>

On Fri, Sep 08, 2006 at 06:27:08PM -0400, Barry Warsaw wrote:
>
> On Sep 8, 2006, at 6:06 PM, Mihai Ibanescu wrote:
>
> >There is no description of what happens when Py_BuildValue fails.
> >Will it
> >decref the python object passed in? Will it not?
>
> I just want to point out that the C API documentation is pretty
> silent about the refcounting side-effects in error conditions (and
> often in success conditions too) of most Python functions. For
> example, what is the refcounting side-effects of PyDict_SetItem() on
> val? What about if that function fails? Has val been incref'd or
> not? What about the side-effects on any value the new one replaces,
> both in success and failure?

In this particular case, it doesn't decref it (or so I read the code).
Relevant code is in do_mkvalue from Python/modsupport.c

        case 'N':
        case 'S':
        case 'O':
        if (**p_format == '&') {
            typedef PyObject *(*converter)(void *);
            converter func = va_arg(*p_va, converter);
            void *arg = va_arg(*p_va, void *);
            ++*p_format;
            return (*func)(arg);
        }
        else {
            PyObject *v;
            v = va_arg(*p_va, PyObject *);
            if (v != NULL) {
                if (*(*p_format - 1) != 'N')
                    Py_INCREF(v);
            }
            else if (!PyErr_Occurred())
                /* If a NULL was passed
                 * because a call that should
                 * have constructed a value
                 * failed, that's OK, and we
                 * pass the error on; but if
                 * no error occurred it's not
                 * clear that the caller knew
                 * what she was doing. */
                PyErr_SetString(PyExc_SystemError,
                    "NULL object passed to Py_BuildValue");
            return v;
        }

Barry, where can I ship you my cloning machine? :-)

Misa

From jcarlson at uci.edu Sat Sep 9 00:48:59 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 08 Sep 2006 15:48:59 -0700
Subject: [Python-Dev] What windows tool chain do I need for python 2.5 extensions?
In-Reply-To: <52A09F3B-0D3B-46E3-B7E5-02DC0D3BB170@barrys-emacs.org>
References: <52A09F3B-0D3B-46E3-B7E5-02DC0D3BB170@barrys-emacs.org>
Message-ID: <20060908154754.F8DF.JCARLSON@uci.edu>

Barry Scott wrote:
>
> I have the tool chains to build extensions against your binary python
> 2.2, 2.3 and 2.4 on windows.
>
> What are the tool chain requirements for building extensions against
> python 2.5 on windows?

The compiler requirements for 2.5 on Windows are the same as 2.4.

- Josiah

From kbk at shore.net Sat Sep 9 03:35:24 2006
From: kbk at shore.net (Kurt B.
Kaiser)
Date: Fri, 8 Sep 2006 21:35:24 -0400 (EDT)
Subject: [Python-Dev] Weekly Python Patch/Bug Summary
Message-ID: <200609090135.k891ZOcT003051@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches : 413 open ( +1) / 3407 closed (+10) / 3820 total (+11)
Bugs    : 897 open ( -3) / 6167 closed (+18) / 7064 total (+15)
RFE     : 234 open ( +1) /  238 closed ( +2) /  472 total ( +3)

New / Reopened Patches
______________________

Fix decimal context management for 2.5 (2006-09-02)
CLOSED http://python.org/sf/1550886 opened by Nick Coghlan

Fix for rpartition() end-case (2006-09-03)
CLOSED http://python.org/sf/1551339 opened by Raymond Hettinger

Updated spec file for 2.5 release. (2006-09-03)
CLOSED http://python.org/sf/1551340 opened by Sean Reifschneider

unparse.py decorator support (2006-09-04)
       http://python.org/sf/1552024 opened by Adal Chiriliuc

eval docstring typo (2006-09-04)
CLOSED http://python.org/sf/1552093 opened by Ori Avtalion

Fix error checks and leaks in setobject.c (2006-09-05)
CLOSED http://python.org/sf/1552731 reopened by gbrandl

Fix error checks and leaks in setobject.c (2006-09-05)
CLOSED http://python.org/sf/1552731 opened by Raymond Hettinger

Unicode Imports (2006-09-05)
       http://python.org/sf/1552880 opened by Kristján Valur

Fix inspect.py 2.5 slowdown (2006-09-06)
CLOSED http://python.org/sf/1553314 opened by Nick Coghlan

locale.getdefaultlocale() bug when _locale is missing (2006-09-06)
       http://python.org/sf/1553427 opened by STINNER Victor

UserDict New Style (2006-09-09)
       http://python.org/sf/1555097 opened by Indy

Performance enhancements. (2006-09-09)
       http://python.org/sf/1555098 opened by Indy

Patches Closed
______________

Fix decimal context management for 2.5 (2006-09-02)
       http://python.org/sf/1550886 closed by ncoghlan

Fix for rpartition() end-case (2006-09-02)
       http://python.org/sf/1551339 closed by nnorwitz

Updated spec file for 2.5 release. (2006-09-02)
       http://python.org/sf/1551340 closed by nnorwitz

eval docstring typo (2006-09-04)
       http://python.org/sf/1552093 closed by nnorwitz

crash in dict_equal (2006-08-24)
       http://python.org/sf/1546288 closed by nnorwitz

Patches for OpenBSD 4.0 (2006-08-15)
       http://python.org/sf/1540470 closed by nnorwitz

Fix error checks and leaks in setobject.c (2006-09-05)
       http://python.org/sf/1552731 closed by rhettinger

Fix error checks and leaks in setobject.c (2006-09-05)
       http://python.org/sf/1552731 closed by gbrandl

make exec a function (2006-09-01)
       http://python.org/sf/1550800 closed by gbrandl

Ellipsis literal "..." (2006-09-01)
       http://python.org/sf/1550786 closed by gbrandl

Fix inspect.py 2.5 slowdown (2006-09-06)
       http://python.org/sf/1553314 closed by ncoghlan

New / Reopened Bugs
___________________

from . import bug (2006-09-02)
CLOSED http://python.org/sf/1550938 opened by ganges master

random.choice(setinstance) fails (2006-09-02)
CLOSED http://python.org/sf/1551113 opened by Alan

Build of 2.4.3 on fedora core 5 fails to find asm/msr.h (2006-09-02)
       http://python.org/sf/1551238 opened by George R. Goffe

tiny bug in win32_urandom (2006-09-03)
CLOSED http://python.org/sf/1551427 opened by Rocco Matano

__unicode__ breaks for exception class objects (2006-09-03)
       http://python.org/sf/1551432 opened by Marcin 'Qrczak' Kowalczyk

Wrong link to unicode database (2006-09-03)
CLOSED http://python.org/sf/1551669 opened by Yevgen Muntyan

unpack list of singleton tuples not unpacking (2006-07-11)
CLOSED http://python.org/sf/1520864 reopened by gbrandl

UnixCCompiler runtime_library_dir uses -R instead of -Wl,-R (2006-09-04)
CLOSED http://python.org/sf/1552304 opened by TFKyle

PEP 290 <-> normal docu... (2006-09-05)
CLOSED http://python.org/sf/1552618 opened by Jens Diemer

Python polls unecessarily every 0.1 when interactive (2006-09-05)
       http://python.org/sf/1552726 opened by Richard Boulton

Python polls unnecessarily every 0.1 second when interactive (2006-09-05)
       http://python.org/sf/1552726 reopened by akuchling

subprocess.Popen(cmd, stdout=sys.stdout) fails (2006-07-31)
CLOSED http://python.org/sf/1531862 reopened by nnorwitz

ConfigParser converts option names to lower case on set() (2006-09-05)
CLOSED http://python.org/sf/1552892 opened by daniel

Pythonw doesn't get rebuilt if version number changes (2006-09-05)
       http://python.org/sf/1552935 opened by Jack Jansen

python 2.5 install can't find tcl/tk in /usr/lib64 (2006-09-06)
       http://python.org/sf/1553166 opened by David Strozzi

logging.handlers.RotatingFileHandler - inconsistent mode (2006-09-06)
       http://python.org/sf/1553496 opened by Walker Hale

datetime.datetime.now() mangles tzinfo (2006-09-06)
       http://python.org/sf/1553577 opened by Skip Montanaro

Class instance apparently not destructed when expected (2006-09-06)
       http://python.org/sf/1553819 opened by Peter Donis

PyOS_InputHook() and related API funcs. not documented (2006-09-07)
       http://python.org/sf/1554133 opened by A.M. Kuchling

Bugs Closed
___________

itertools.tee raises SystemError (2006-09-01)
       http://python.org/sf/1550714 closed by nnorwitz

Typo in Language Reference Section 3.2 Class Instances (2006-08-28)
       http://python.org/sf/1547931 closed by nnorwitz

from . import bug (2006-09-02)
       http://python.org/sf/1550938 closed by gbrandl

tiny bug in win32_urandom (2006-09-03)
       http://python.org/sf/1551427 closed by gbrandl

sgmllib.sgmlparser is not thread safe (2006-08-29)
       http://python.org/sf/1548288 closed by gbrandl

test_anydbm segmentation fault (2006-08-21)
       http://python.org/sf/1544106 closed by greg

Wrong link to unicode database (2006-09-03)
       http://python.org/sf/1551669 closed by gbrandl

unpack list of singleton tuples not unpacking (2006-07-11)
       http://python.org/sf/1520864 closed by nnorwitz

UnixCCompiler runtime_library_dir uses -R instead of -Wl,-R (2006-09-04)
       http://python.org/sf/1552304 closed by tfkyle

gcc trunk (4.2) exposes a signed integer overflows (2006-08-23)
       http://python.org/sf/1545668 closed by nnorwitz

Exceptions don't call _PyObject_GC_UNTRACK(self) (2006-08-17)
       http://python.org/sf/1542051 closed by gbrandl

PEP 290 <-> normal docu... (2006-09-05)
       http://python.org/sf/1552618 closed by gbrandl

SimpleXMLRpcServer still uses sys.exc_value and sys.exc_type (2006-07-19)
       http://python.org/sf/1525469 closed by akuchling

unbalanced parentheses from command line crash pdb (2006-07-22)
       http://python.org/sf/1526834 closed by akuchling

Python polls unnecessarily every 0.1 second when interactive (2006-09-05)
       http://python.org/sf/1552726 closed by akuchling

subprocess.Popen(cmd, stdout=sys.stdout) fails (2006-07-31)
       http://python.org/sf/1531862 closed by niemeyer

subprocess.Popen(cmd, stdout=sys.stdout) fails (2006-07-31)
       http://python.org/sf/1531862 closed by niemeyer

ConfigParser converts option names to lower case on set() (2006-09-05)
       http://python.org/sf/1552892 closed by gbrandl

SWIG wrappers incompatible with 2.5c1 (2006-09-01)
       http://python.org/sf/1550559 closed by gbrandl

Building Python 2.4.3 on Solaris 9/10 with Sun Studio 11 (2006-05-28)
       http://python.org/sf/1496561 closed by andyfloe

Curses module doesn't install on Solaris 2.8 (2005-10-12)
       http://python.org/sf/1324799 closed by akuchling

New / Reopened RFE
__________________

Add traceback.print_full_exception() (2006-09-06)
       http://python.org/sf/1553375 opened by Michael Hoffman

Print full exceptions as they occur in logging (2006-09-06)
       http://python.org/sf/1553380 opened by Michael Hoffman

RFE Closed
__________

random.choice(setinstance) fails (2006-09-02)
       http://python.org/sf/1551113 closed by rhettinger

Add 'find' method to sequence types (2006-08-28)
       http://python.org/sf/1548178 closed by gbrandl

From jan-python at maka.demon.nl Sat Sep 9 04:07:02 2006
From: jan-python at maka.demon.nl (Jan Kanis)
Date: Sat, 09 Sep 2006 04:07:02 +0200
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <1f7befae0607152047u43993a15ue5180b990f9a530f@mail.gmail.com>
References: <09f901c6a72c$495f2690$12472597@bagio> <20060714112137.GA891@Andrew-iBook2.local> <44B8A90C.6070309@v.loewis.de> <1f7befae0607152047u43993a15ue5180b990f9a530f@mail.gmail.com>
Message-ID:

At the risk of waking up a thread that was already declared dead, but
perhaps this is useful.

So, what happens is that Python's signal handler sets a flag and registers
a callback. Then the main thread should check the flag and make the
callback to actually do something with the signal. However, the main thread
is blocked in GTK and can't check the flag.

Nick Maclaren wrote:
...lots of reasons why you can't do anything reliably from within a signal
handler...

As far as I understand it, what could work is this:
-PyGTK registers a callback.
-Python's signal handler does not change at all.
-All threads that run in the Python interpreter occasionally check the flag
which the signal handler sets, like the main thread does nowadays. If it
is set, the thread calls PyGTK's callback. It does not do anything else
with the signal.
-PyGTK's callback wakes up the main thread, which actually handles the
signal just like it does now.

PyGTK's callback could be called from any thread, but it would be called in
a normal context, not in a signal handler.
As the signal handler does not change, the risk of breaking anything or
causing chaos is as large/small as it is under the current scheme.
However, PyGTK's problem does get solved, as long as there is _a_ thread
that returns to the interpreter within some timeframe. It seems plausible
that this will happen.

From rhamph at gmail.com Sat Sep 9 06:52:42 2006
From: rhamph at gmail.com (Adam Olsen)
Date: Fri, 8 Sep 2006 22:52:42 -0600
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To:
References: <09f901c6a72c$495f2690$12472597@bagio> <20060714112137.GA891@Andrew-iBook2.local> <44B8A90C.6070309@v.loewis.de> <1f7befae0607152047u43993a15ue5180b990f9a530f@mail.gmail.com>
Message-ID:

On 9/8/06, Jan Kanis wrote:
> At the risk of waking up a thread that was already declared dead, but
> perhaps this is useful.

I don't think we should let this die, at least not yet. Nick seems to be
arguing that ANY signal handler is prone to random crashes or corruption
(due to bugs). However, we already have a signal handler, so we should
already be exposed to the random crashes/corruption.

If we're going to rely on signal handling being correct then I think we
should also rely on write() being correct. Note that I'm not suggesting an
API that allows arbitrary signal handlers, but rather one that calls
write() on an array of prepared file descriptors (ignoring errors).

Ensuring modifications to that array are atomic would be tricky, but I
think it would be doable if we use a read-copy-update approach (with
two alternating signal handler functions). Not sure how to ensure
there's no currently running signal handlers in another thread though.
Maybe have to rip the atomic read/write stuff out of the Linux
sources to ensure it's *always* defined behavior.

Looking into the existing signalmodule.c, I see no attempts to ensure
atomic access to the Handlers data structure. Is the current code
broken, at least on non-x86 platforms?
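
For illustration only, the idea of a handler that does nothing beyond
setting a flag and calling write() on a prepared descriptor might look
roughly like the POSIX C sketch below. This is an assumption-laden sketch,
not code from signalmodule.c: the names signal_pending, wakeup_fd,
wakeup_handler, install_wakeup and take_pending are all hypothetical, and it
keeps a single descriptor in a sig_atomic_t instead of the array of
descriptors proposed above, which only works as long as the descriptor
value fits in that type.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical names throughout; none of this is taken from signalmodule.c. */
static volatile sig_atomic_t signal_pending = 0; /* flag checked outside the handler */
static volatile sig_atomic_t wakeup_fd = -1;     /* write end of the pipe, -1 if unset */

/* The handler only sets the flag and calls write(), which is
 * async-signal-safe. Errors (e.g. a full pipe) are deliberately ignored. */
static void wakeup_handler(int signum)
{
    int saved_errno = errno;   /* write() may change errno */
    int fd = (int)wakeup_fd;

    (void)signum;
    signal_pending = 1;
    if (fd >= 0) {
        const char byte = 0;
        ssize_t rc = write(fd, &byte, 1);
        (void)rc;
    }
    errno = saved_errno;
}

static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Create the pipe and install the handler for signum.
 * Returns the read end of the pipe, or -1 on failure. */
int install_wakeup(int signum)
{
    int fds[2];
    struct sigaction sa;

    assert(signum > 0);
    if (pipe(fds) != 0)
        return -1;
    if (set_nonblocking(fds[0]) == -1 || set_nonblocking(fds[1]) == -1)
        goto fail;

    wakeup_fd = fds[1];        /* publish the descriptor before installing */
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = wakeup_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(signum, &sa, NULL) != 0)
        goto fail;
    return fds[0];

fail:
    wakeup_fd = -1;
    close(fds[0]);
    close(fds[1]);
    return -1;
}

/* Report whether a signal was seen and clear the flag. A signal arriving
 * between the read and the clear still leaves its byte in the pipe. */
int take_pending(void)
{
    int was_set = (signal_pending != 0);
    signal_pending = 0;
    return was_set;
}
```

A thread blocked in an event loop would add the descriptor returned by
install_wakeup() to the set it already waits on, drain it when it becomes
readable, and then do the ordinary flag check via something like
take_pending().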
--
Adam Olsen, aka Rhamphoryncus

From rhamph at gmail.com Sat Sep 9 06:59:48 2006
From: rhamph at gmail.com (Adam Olsen)
Date: Fri, 8 Sep 2006 22:59:48 -0600
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To:
References: <09f901c6a72c$495f2690$12472597@bagio> <20060714112137.GA891@Andrew-iBook2.local> <44B8A90C.6070309@v.loewis.de> <1f7befae0607152047u43993a15ue5180b990f9a530f@mail.gmail.com>
Message-ID:

On 9/8/06, Adam Olsen wrote:
> Ensuring modifications to that array are atomic would be tricky, but I
> think it would be doable if we use a read-copy-update approach (with
> two alternating signal handler functions). Not sure how to ensure
> there's no currently running signal handlers in another thread though.
> Maybe have to rip the atomic read/write stuff out of the Linux
> sources to ensure it's *always* defined behavior.

Doh, except that's exactly what sig_atomic_t is for. Ah well, can't
win them all.

--
Adam Olsen, aka Rhamphoryncus

From ncoghlan at gmail.com Sat Sep 9 07:55:56 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 09 Sep 2006 15:55:56 +1000
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <4501D7BD.1020006@v.loewis.de>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609080353.07502.anthony@interlink.com.au> <4501D7BD.1020006@v.loewis.de>
Message-ID: <4502576C.3060604@gmail.com>

Martin v. Löwis wrote:
> Steve Holden schrieb:
>> Or simply that this inability isn't currently
>> described in a bug report on Sourceforge?
>
> No: sys.path is specified (originally) as containing a list of byte
> strings; it was extended to also support path importers (or whatever
> that PEP calls them). It was never extended to support Unicode strings.

That other PEP being PEP 302. That said, Unicode strings *are* permitted on
sys.path - the import system will automatically encode them to an 8-bit
string using the default filesystem encoding as part of the import process.
This works fine on Unix systems that use UTF-8 encoded strings to handle
Unicode paths at the C API level, but is screwed on Windows because the
default mbcs filesystem encoding can't handle the full range of possible
Unicode path names (such as the Chinese directories that originally gave
Kristján grief). To get Unicode path names to work on Windows, you have to
use the Windows-specific wide character API instead of the normal C API, and
the import machinery doesn't do that.

So this is taking something that *already works properly on POSIX systems*
and making it work on Windows as well.

>> I agree it's a relatively large patch for a release candidate but if
>> prudence suggests deferring it, it should be a *definite* for 2.5.1 and
>> subsequent releases.
>
> I'm not so sure it should. It *is* a new feature: it makes applications
> possible which aren't possible today, and the documentation does not
> ever suggest that these applications should have been possible. In fact,
> it is common knowledge that this currently isn't supported.

It should already work fine on POSIX filesystems that use the default
filesystem encoding for path names. As far as I am aware, it is only Windows
where it doesn't work.

Cheers,
Nick.
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org

From martin at v.loewis.de Sat Sep 9 09:23:32 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 09 Sep 2006 09:23:32 +0200
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <4502576C.3060604@gmail.com>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc> <200609080353.07502.anthony@interlink.com.au> <4501D7BD.1020006@v.loewis.de> <4502576C.3060604@gmail.com>
Message-ID: <45026BF4.5080108@v.loewis.de>

Nick Coghlan schrieb:
> So this is taking something that *already works properly on POSIX
> systems* and making it work on Windows as well.

I doubt it does without side effects. For example, an application that
would go through sys.path, and encode everything with
sys.getfilesystemencoding() currently works, but will break if the patch
is applied and non-mbcs strings are put on sys.path.

Also, what will be the effect on __file__? What value will it have
if the module originates from a sys.path entry that is a non-mbcs
unicode string? I haven't tested the patch, but it looks like
__file__ becomes a unicode string on Windows, and remains a byte
string encoded with the file system encoding elsewhere. That's also
a change in behavior.

Regards,
Martin

From brett at python.org Sat Sep 9 09:23:54 2006
From: brett at python.org (Brett Cannon)
Date: Sat, 9 Sep 2006 00:23:54 -0700
Subject: [Python-Dev] 2.5 status
In-Reply-To:
References: <44FDD122.3000809@egenix.com>